Any DML operation on an external table encounters error like below:
ERROR: <port> substring size must not exceed 8 characters (seg5 slice1 sdw2:46001 pid=9343)
The above error was a result of SELECT operation . Here sdw2 is the segment host that has encountered error while trying to communicate to gpfdist host mentioned in LOCATION clause of EXTERNAL TABLE during creation as shown below:
template1=# \d+ ext_expenses
External table "public.ext_expenses" Column | Type | Modifiers | Storage | Description ----------+------+-----------+----------+------------- name | text | | extended | category | text | | extended | Type: readable Encoding: UTF8 Format type: text Format options: delimiter '|' null ' ' escape '\' External location: gpfdist://:etlhost-1:8081/*.txt
This issue is happening due to an error in the syntax with the definition of the external table. As you can see the External location above shows an extra colon (:) just before the hostname (etlhost-1).
There should not be any colon (:) before the hostname(OR IP address if used). Since there is a colon in front of the hostname, it is considering to be the port name, and port name cannot exceed 8 characters.
Drop and Create the external table again with proper syntax. Below is the example for both the cases(hostname and IP address)
-- Wrong syntax for LOCATION
LOCATION ( 'gpfdist://:etlhost-1:8081/*.txt'
LOCATION ( 'gpfdist://:172.28.8.20:8081/*.txt'
-- Correct syntax for LOCATION
LOCATION ( 'gpfdist://etlhost-1:8081/*.txt'
LOCATION ( 'gpfdist://172.28.8.20:8081/*.txt'
Complete correct DDL should be like :
CREATE EXTERNAL TABLE ext_expenses ( name text, category text ) LOCATION ('gpfdist://etlhost-1:8081/*.txt') FORMAT 'TEXT' ( DELIMITER '|' NULL ' ') ;