Pivotal Knowledge Base

Follow

Query gpfdist external table failed with "HTTP/1.0 400 invalid request"

Environment

Greenplum database 4.x

Symptom

When querying external table via the gpfdist protocol, it fails with error message “HTTP/1.0 400 invalid request.”

Error Message:

1. Query gpfdist based external table and failed

gpadmin=# create external table ext_tt1(c1 int, c2 int, c3 int) location ('gpfdist://192.0.2.1:8081/BATCH 1/*.csv') format 'CSV';
CREATE EXTERNAL TABLE
Time: 221.612 ms
gpadmin=# select * from ext_tt1;
ERROR: http response code 400 from gpfdist (gpfdist://192.0.2.1:8081/BATCH 1/*.csv): HTTP/1.0 400 invalid request (url_curl.c:454) (seg0 slice1 sdw3:49000 pid=5609) (cdbdisp.c:1322)

2. Logs from the Greenplum segment

2017-08-14 12:52:42.761391 CDT,"gpadmin","gpadmin",p5609,th-843743168,"192.0.2.1","47178",2017-08-14 12:52:38 CDT,379759,con54,cmd7,seg0,slice1,,x379759,sx1,"LOG","00000","quit fill_buffer due to still_running = 0, bot = 0, top = 0, want = 1, for_write = 0, error = 0, eof = 0, datalen = 0, maxfd = 18, nfds = 1, e = 0",,,,,,"select * from ext_tt1;",0,,"url_curl.c",715,
2017-08-14 12:52:43.828111 CDT,"gpadmin","gpadmin",p5609,th-843743168,"192.0.2.1","47178",2017-08-14 12:52:38 CDT,379759,con54,cmd7,seg0,slice1,,x379759,sx1,"ERROR","XX000","http response code 400 from gpfdist (gpfdist://192.0.2.1:8081/BATCH 1/*.csv): HTTP/1.0 400 invalid request (url_curl.c:454)",,,,,,"select * from ext_tt1;",0,,"url_curl.c",454,"Stack trace:

3. gpfdist logs

2017-08-14 12:52:24 24311 INFO [0:7:0:9] ::ffff:192.0.2.2 requests /BATCH
2017-08-14 12:52:24 24311 INFO [0:7:0:9] got a request at port 42902:
GET /BATCH 1/*.csv HTTP/1.1
2017-08-14 12:52:24 24311 INFO [0:7:0:9] request headers:
2017-08-14 12:52:24 24311 INFO [0:7:0:9] Host:192.0.2.1:8081
2017-08-14 12:52:24 24311 INFO [0:7:0:9] Accept:*/*
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-XID:1502731703-0000000045
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-CID:0
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-SN:1
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-SEGMENT-ID:0
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-SEGMENT-COUNT:24
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-LINE-DELIM-LENGTH:-1
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-PROTO:1
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-MASTER_HOST:127.0.0.2
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-MASTER_PORT:54320
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-CSVOPT:m1x34q34h0n0
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP_SEG_PG_CONF:/data1/primary/bloomtest/gpseg0/postgresql.conf
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP_SEG_DATADIR:/data1/primary/bloomtest/gpseg0
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-DATABASE:gpadmin
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-USER:gpadmin
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-SEG-PORT:49000
2017-08-14 12:52:24 24311 INFO [0:7:0:9] X-GP-SESSION-ID:54
2017-08-14 12:52:24 24311 INFO [0:12:0:16] reject invalid request from ::ffff:192.0.2.2
2017-08-14 12:52:24 24311 WARN [0:12:0:16] HTTP ERROR: ::ffff:192.0.2.2 - 400 invalid request 

Cause

A whitespace character is included in the LOCATION field in the definition of the external table. As shown in above example whitespace is included in "BATCH 1." 

According to GPDB document wildcards or other C-style pattern matching is accepted by gpfdist for denoting multiple files. And a whitespace character should be specified as [[:space:]].

Resolution

Follow the document to specify whitespace character as necessary. 

For issue in above example, revise DDL of an external table as follows. Then it could work.

gpadmin=# create external table ext_tt1(c1 int, c2 int, c3 int) location ('gpfdist://192.0.2.1:8081/BATCH[[:space:]]1/*.csv') format 'CSV';
CREATE EXTERNAL TABLE
gpadmin=# select * from ext_tt1;
c1 | c2 | c3
----+----+----
1 | 2 | 3
(1 row)

Additional Information

Greenplum Create External Table Documentation

Comments

Powered by Zendesk