Pivotal Knowledge Base

HDB query fails with network issue against ISILON HDFS

Environment

  • ISILON 7.1.1
  • HDB 1.2.x

Symptom

During benchmark testing with HDB 1.2 and ISILON 7.1.1, many HDB queries fail with errors such as "Append-Only Storage Read could not open segment file ...", as shown below.

ERROR: Append-Only Storage Read could not open segment file 'hdfs://phdisi5-hdfs.all-nc.alliances.isilon.com:8020/hawq_data/gpseg12/16385/16555/16571.1' for relation 'date_dim' (seg12 slice1 ip-10-111-129-225.all-nc.alliances.isilon.com:40004 pid=387529)

"Connection reset by peer" errors appear in the HDB segment logs.

0711 21:07:00.673917 339732 HdfsInputStreamImpl.cpp:345] cannot seek to block 4428530951 on node10.111.129.25:8021 :Connection reset by peer, retry another node",,,,,,,,"SysLoggerMain","syslogger.c",520,
E0711 21:10:17.318006 343201 RpcChannel.cpp:478] Retrying connect to server: phdisi5-hdfs.all-nc.alliances.isilon.com:8020 due to Software caused connection abort

"Connection reset by peer" is also seen in ISILON HDFS logs.

all-nc-s-1(id1) isi_hdfs_d: shutdown(fd==36) error: Connection reset by peer

Cause

The benchmark causes HDB to open many TCP connections to the name node on ISILON in rapid succession. This release of ISILON allows only 12 new pending TCP connections in the listening queue. Once the listening queue is full, ISILON answers each new TCP connection attempt with a TCP reset, which causes the HDB query to fail.
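
The effect of a bounded listening queue can be illustrated with a small toy model. This is only a sketch, not ISILON's actual implementation: the function name, the tick-based timing, and the arrival and accept rates are hypothetical values chosen for illustration. Only the limit of 12 pending connections comes from the description above.

```python
def simulate_listen_queue(arrivals_per_tick, accepts_per_tick, ticks, backlog=12):
    """Toy model of a bounded listening queue (hypothetical, for illustration).

    Returns a tuple (accepted, still_pending, reset).
    """
    pending = 0   # connections waiting in the listening queue
    accepted = 0  # connections the server has taken off the queue
    reset = 0     # connections refused because the queue was full

    for _ in range(ticks):
        # New connection attempts arrive; any that find the queue full are reset.
        for _ in range(arrivals_per_tick):
            if pending < backlog:
                pending += 1
            else:
                reset += 1
        # The server accepts a limited number of queued connections per tick.
        served = min(pending, accepts_per_tick)
        pending -= served
        accepted += served

    return accepted, pending, reset


# Moderate load: the server keeps up and nothing is reset.
print(simulate_listen_queue(arrivals_per_tick=5, accepts_per_tick=5, ticks=10))
# -> (50, 0, 0)

# Burst load (assumed rates): arrivals outpace accepts and the queue overflows.
print(simulate_listen_queue(arrivals_per_tick=20, accepts_per_tick=5, ticks=3))
# -> (15, 7, 38)
```

In the second run most connection attempts are reset even though the server never stops accepting, which matches the pattern of intermittent "Connection reset by peer" errors under benchmark load.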

Fix

ISILON can provide a patch that increases the listening queue limit. For more information, please contact ISILON Support <LINK: https://support.emc.com >.
