Pivotal Knowledge Base

Follow

Loading from External Tables Error: "gpfdist Error - Line too Long in File"

Environment

 Product  Version
 Pivotal Greenplum  All
 OS  All Supported OS

Symptom

When loading data from external tables with gpfdist, the query failed with the following error:

ERROR: gpfdist error - line too long in file /tmp/1.log near (0 bytes) (url.c:1746) (seg13 slice1 sdw3:43001 pid=23691) (cdbdisp.c:1499)
DETAIL: External table new_test, file gpfdist://mdw:8090/1.log

Cause

Default row limit for external tables using gpfdist is 32KB as documented. If certain rows are longer than 32KB, the query will error out like above.

Here is more information regarding our test case:

msong=# create external table new_test ( a text) location('gpfdist://mdw:8090/1.log') FORMAT 'TEXT' (DELIMITER '|');
CREATE EXTERNAL TABLE
msong=# select * from new_test;
ERROR: gpfdist error - line too long in file /tmp/1.log near (0 bytes) (url.c:1746) (seg13 slice1 sdw3:43001 pid=23691) (cdbdisp.c:1499)
DETAIL: External table new_test, file gpfdist://mdw:8090/1.log
msong=#

gpfdist verbose logs contains following 500 session error:

ps -ef|grep gpfdist
gpadmin  20727 27150  0 13:37 pts/5    00:00:00 gpfdist -d /tmp -l /tmp/1.log -p 8090 -V
cat /tmp/1.log
[2014-06-18 13:39:14] ::ffff:172.28.8.7 - 500 session error
[44] request end
---------------------------------------------------
[2014-06-18 13:39:14] ::ffff:172.28.8.7 requests /1.log
[2014-06-18 13:39:14] [45] got a request: GET /1.log HTTP/1.1
[2014-06-18 13:39:14] request headers:Host:172.28.8.250:8090
Accept:*/*
X-GP-XID:1402370147-0000043790
X-GP-CID:0
X-GP-SN:0
X-GP-SEGMENT-ID:38
X-GP-SEGMENT-COUNT:48
X-GP-LINE-DELIM-LENGTH:-1
X-GP-PROTO:1
X-GP-MASTER_HOST:172.28.8.250
X-GP-MASTER_PORT:4300
X-GP-CSVOPT:m0x92q0h0
X-GP_SEG_PG_CONF:/data1/primary_4300/gpseg38/postgresql.conf
X-GP_SEG_DATADIR:/data1/primary_4300/gpseg38
X-GP-DATABASE:msong
X-GP-USER:gpadmin
X-GP-SEG-PORT:43002
X-GP-SESSION-ID:83085
[2014-06-18 13:39:14] ::ffff:172.28.8.7 - 500 session error
[45] request end
---------------------------------------------------

Resolution

Use -m option to increase the max row length for gpfdist, this value can be increased up to 256MB.

For example, increasing the value up to 64KB solved the issue in our test case.

gpfdist -t 600 -d /tmp -l /tmp/1.log  -p 8090 -V -m 655350 &

NOTE: If using gpload, you can pass the -m parameter value to gpfdist using the MAX_LINE_LENGTH parameter in the YAML file.

Comments

Powered by Zendesk