Pivotal Knowledge Base

Pivotal Greenplum external table query fails with "org.apache.hadoop.ipc.RPC$VersionMismatch"

Environment

Product             Version
Pivotal Greenplum   All Versions
Pivotal Hadoop      All Versions

Problem

Queries that use the gphdfs protocol to read from or write to an external table may fail with the error org.apache.hadoop.ipc.RPC$VersionMismatch, and the namenode logs may show a warning about an incorrect header or a version mismatch.

Snippet showing the query error:

lab=# select * from ext_exp; 
ERROR: external table gphdfs protocol command ended with error. 14/02/05 17:38:43 INFO security.UserGroupInformation: Login successful for user gpadmin@PHD.DEV.COM using keytab file /etc/gphd/hadoop/conf/gpadmin.keytab (seg25 slice1 sdw4:1026 pid=342100)
DETAIL: 14/02/05 17:38:43 ERROR security.UserGroupInformation: PriviledgedActionException as:gpadmin@PHD.DEV.COM (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(null): org.apache.hadoop.ipc.RPC$VersionMismatch
14/02/05 17:38:45 ERROR security.UserGroupInformation: PriviledgedActionException as:gpadmin@PHD.DEV.COM (auth:KERBEROS) ca
Command: 'gphdfs://host1.local.com:8020/data/hawq/hawq_test_file'
External table ext_exp, file gphdfs://host1.local.com:8020/data/hawq/hawq_test_file

Snippet from the namenode logs:

2014-02-05 17:29:07,985 WARN org.apache.hadoop.ipc.Server: Incorrect header or version mismatch from 10.181.22.132:37947 got version 7 expected version 8
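
To spot this warning quickly in a large namenode log, a small parser can help. The following is a minimal sketch, assuming the warning always has the wording shown in the line above; the function name parse_version_mismatch is hypothetical and not part of any Hadoop or Greenplum tool.

```python
import re

# Pattern built from the wording of the namenode warning quoted above.
# Assumption: other Hadoop releases may phrase this message differently.
MISMATCH_RE = re.compile(
    r"Incorrect header or version mismatch from (?P<client>\S+) "
    r"got version (?P<got>\d+) expected version (?P<expected>\d+)"
)


def parse_version_mismatch(line):
    """Return (client_address, client_rpc_version, expected_rpc_version),
    or None if the line is not a version-mismatch warning."""
    match = MISMATCH_RE.search(line)
    if match is None:
        return None
    return (
        match.group("client"),
        int(match.group("got")),
        int(match.group("expected")),
    )


if __name__ == "__main__":
    sample = (
        "2014-02-05 17:29:07,985 WARN org.apache.hadoop.ipc.Server: "
        "Incorrect header or version mismatch from 10.181.22.132:37947 "
        "got version 7 expected version 8"
    )
    print(parse_version_mismatch(sample))
```

The client address in the result identifies the host whose Hadoop client binaries differ from the cluster.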

Cause

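This error occurs when the Hadoop binaries installed on the Greenplum Database hosts (or, in general, on any HDFS client node) are a different version from those on the Pivotal Hadoop cluster nodes. In the namenode log above, the client sent RPC version 7 while the namenode expected version 8.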

Solution

Ensure that the Hadoop client binaries installed on the Greenplum Database master and segment servers are the same version as the binaries on the Pivotal Hadoop cluster.

In this instance, the Greenplum Database hosts had the PHD 1.1.0 Hadoop client binaries, while the Hadoop cluster was running the PHD 1.1.1 binaries.
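
One way to check for such a mismatch is to compare the output of the hadoop version command on every Greenplum host against the cluster. The following is a minimal sketch, assuming the command prints a line of the form "Hadoop <version>"; the helper names, host names, and version strings are hypothetical placeholders, not values from this case.

```python
import subprocess


def parse_hadoop_version(output):
    """Return the version from the first line of the form 'Hadoop <version>'.

    Assumption: `hadoop version` prints such a line; returns None otherwise.
    """
    for line in output.splitlines():
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "Hadoop":
            return parts[1]
    return None


def local_hadoop_version():
    """Run `hadoop version` on this host and return the parsed version."""
    result = subprocess.run(
        ["hadoop", "version"], capture_output=True, text=True, check=True
    )
    return parse_hadoop_version(result.stdout)


def find_mismatched_hosts(versions_by_host, cluster_version):
    """Return only the hosts whose client version differs from the cluster."""
    return {
        host: version
        for host, version in versions_by_host.items()
        if version != cluster_version
    }


if __name__ == "__main__":
    # Hypothetical values gathered by running local_hadoop_version() on each host.
    collected = {"mdw": "1.0.1-example", "sdw1": "1.0.0-example"}
    print(find_mismatched_hosts(collected, cluster_version="1.0.1-example"))
```

Any host reported by find_mismatched_hosts needs its Hadoop client binaries replaced with the cluster's version.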
