Pivotal Knowledge Base

Follow

gptransfer fails with message "ERROR: missing data for column {column name}"

Environment

Product Version
Pivotal Greenplum (GPDB) 4.3.5.x

Symptom

  • gptransfer fails with ERROR:  missing data for column "{column name}".
  • The column in the ordinal position before the column mentioned in the error message has a field with newline characters \n in it.
  • This issue does not occur on GPDB 4.3.6.x or above.

The error message is shown below:

20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[ERROR]:-Failed to transfer table gpadmin.public.gptrans
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[ERROR]:-error 'ERROR: missing data for column "col2" (seg1 slice1 gpdbsne:50001 pid=40399)
DETAIL: External table ext_gptrans_7c210e71297b9f1a96a9bff41b4cd927, line 1 of gpfdist://gpdbsne:8001/gpadmin.public.gptrans.pipe.1: "OR,33,6,0,N,1,0,W,7,"
' in 'INSERT INTO public.gptrans SELECT * FROM gptransfer.ext_gptrans_7c210e71297b9f1a96a9bff41b4cd927'
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[WARNING]:-Some tables failed to transfer. A list of these tables
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[WARNING]:-has been written to the file failed_transer_tables_20160229_135738.txt
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[WARNING]:-This file can be used with the -f option to continue
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[WARNING]:-the data transfer.
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Removing work directories...
20160229:13:58:16:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Finished.
[gpadmin@gpdbsne ~]$

Cause 

This is a known issue (MPP-23953) with gptransfer in versions older than Greenplum 4.3.6.1.

The Greenplum Database gptransfer utility, in versions older than 4.3.6.1, failed if the column data contained newline characters and in some cases when handling Null data and no data ('').

These issues have been resolved. The default gptransfer utility has been enhanced to use the CSV format to transfer data. The CSV format can handle data that contains newline characters and Null data.

For information about changes to gptransfer, see the 4.3.6.1 release notes under gptransfer Utility Changes

RCA 

To demonstrate the issue, we will first need a table to transfer.

Create a table such as the following:

CREATE TABLE gptrans (
col1 character varying(300),
col2 character varying(10),
col3 character varying(10)
) DISTRIBUTED BY (col1);

Copy some data into it and ensure some of the data has newline (\n) characters:

COPY gptrans (col1, col2, col3) FROM stdin;
OR,33,6,0,N,1,0,W,7,\nWA 33 T6N R1W 8,0 \N \N
\.

Now, we will attempt to transfer this with GPDB 4.3.5.1 to illustrate the error:

gptransfer -d gpadmin -t gpadmin.test.gptrans --dest-database=gptrans --truncate --source-port=5432
20160229:13:57:38:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Starting gptransfer with args: -d gpadmin -t gpadmin.test.gptrans --dest-database=gptrans --truncate --source-port=5432 .
<snip> .
20160229:13:57:58:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Starting gpfdist for readable external table for table gpadmin.public.gptrans...
20160229:13:57:59:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Creating source writable external table for source table gpadmin.public.gptrans...
20160229:13:58:00:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Creating external table for destination table gptrans.public.gptrans...
20160229:13:58:00:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Transfering data gpadmin.public.gptrans -> gptrans.public.gptrans...
20160229:13:58:04:038935 gptransfer:gpdbsne:gpadmin-[ERROR]:-Failed to transfer table gpadmin.amfamdba.gptrans
20160229:13:58:05:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Closing named pipes for table gpadmin.public.gptrans...
20160229:13:58:05:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Stopping gpfdist for source table gpadmin.public.gptrans...
20160229:13:58:06:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Removing FIFO pipes for source table gpadmin.public.gptrans...
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[ERROR]:-Failed to transfer table gpadmin.public.gptrans
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[ERROR]:-error 'ERROR: missing data for column "col2" (seg1 slice1 gpdbsne:50001 pid=40399)
DETAIL: External table ext_gptrans_7c210e71297b9f1a96a9bff41b4cd927, line 1 of gpfdist://gpdbsne:8001/gpadmin.public.gptrans.pipe.1: "OR,33,6,0,N,1,0,W,7,"
' in 'INSERT INTO public.gptrans SELECT * FROM gptransfer.ext_gptrans_7c210e71297b9f1a96a9bff41b4cd927'
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[WARNING]:-Some tables failed to transfer. A list of these tables
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[WARNING]:-has been written to the file failed_transer_tables_20160229_135738.txt
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[WARNING]:-This file can be used with the -f option to continue
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[WARNING]:-the data transfer.
20160229:13:58:11:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Removing work directories...
20160229:13:58:16:038935 gptransfer:gpdbsne:gpadmin-[INFO]:-Finished.

Now, we reattempt the transfer on GPDB 4.3.6.1 to illustrate a successful run:

gptransfer -t gpadmin.public.gptrans --dest-database=gptrans_done --truncate --source-port=5432
20160229:13:20:47:013466 gptransfer:gpdb-sandbox:gpadmin-[INFO]:-Starting gptransfer with args: -t gpadmin.public.gptrans --dest-database=gptrans_done --truncate --source-port=5432 .
<snip> .
20160229:13:20:49:013466 gptransfer:gpdb-sandbox:gpadmin-[INFO]:-Creating schema public in database gptrans_done...
20160229:13:20:50:013466 gptransfer:gpdb-sandbox:gpadmin-[INFO]:-Starting transfer of gpadmin.public.gptrans to gptrans_done.public.gptrans...
20160229:13:20:50:013466 gptransfer:gpdb-sandbox:gpadmin-[INFO]:-Truncating target table gptrans_done.public.gptrans...
20160229:13:20:51:013466 gptransfer:gpdb-sandbox:gpadmin-[INFO]:-Transfering data gpadmin.public.gptrans -> gptrans_done.public.gptrans...
20160229:13:21:02:013466 gptransfer:gpdb-sandbox:gpadmin-[INFO]:-Finished transferring table gpadmin.public.gptrans, remaining 0 of 1 tables
20160229:13:21:02:013466 gptransfer:gpdb-sandbox:gpadmin-[INFO]:-Removing work directories...
20160229:13:21:07:013466 gptransfer:gpdb-sandbox:gpadmin-[INFO]:-Finished. 

Resolution

  • Upgrade to Greenplum 4.3.6.1 or above to resolve this issue.
  • If upgrading is not an option; remove the newline characters from the data before transfering with gptransfer.

Additional Information

 

Comments

Powered by Zendesk