Pivotal Knowledge Base


Packet loss found in flooding ping test from PHD compute node to Isilon node

Environment

Product                                                      Version
Pivotal HD (PHD) (compute node running HBase, Hive, HAWQ)    All Versions
Isilon (data storage hosting HDFS)                           All Versions

Symptom

Flood ping tests from a PHD compute node to an Isilon storage node show packet loss. Two examples follow:

1. Ping is run in flood mode (-f); 35 of 2000 packets are dropped.

[root@hdc3 ~]# ping -f isi-sc.gphd.local -c 2000

PING isi-sc.gphd.local (172.28.8.224) 56(84) bytes of data.
...................................
--- isi-sc.gphd.local ping statistics ---
2000 packets transmitted, 1965 received, 1% packet loss, time 640ms
rtt min/avg/max/mdev = 0.016/0.022/1.962/0.044 ms, ipg/ewma 0.320/0.021 ms

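2. Ping is run with an interval of 0.001 seconds (-i .001); 2 of 2000 packets are dropped. The summary still reports 0% packet loss because ping truncates the percentage to a whole number.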

[root@hdc3 ~]# ping -i .001 isi-sc.gphd.local -c 2000

--- isi-sc.gphd.local ping statistics ---
2000 packets transmitted, 1998 received, 0% packet loss, time 2018ms
rtt min/avg/max/mdev = 0.018/0.020/0.085/0.004 ms
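Because the percentage in the summary line is truncated, it can help to compute the dropped count and exact loss rate from the transmitted and received counts. The following is a minimal sketch, assuming the summary line has the same layout as in the two tests above; the helper name loss_stats is hypothetical.

```shell
# Hypothetical helper: print the dropped count and exact loss percentage
# from a ping summary line ("N packets transmitted, M received, ...").
loss_stats() {
    # Field 1 is the transmitted count, field 4 is the received count.
    printf '%s\n' "$1" | awk '{ d = $1 - $4; printf "%d dropped, %.2f%% loss\n", d, 100 * d / $1 }'
}

# Summary lines copied from the two tests above.
loss_stats '2000 packets transmitted, 1965 received, 1% packet loss, time 640ms'
loss_stats '2000 packets transmitted, 1998 received, 0% packet loss, time 2018ms'
```

For the two tests above this gives 35 dropped (1.75%) and 2 dropped (0.10%).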

Cause

The following messages appear in /var/log/messages on the target Isilon node:

2014-05-08T06:47:08-07:00 <0.5> isi-cluster1-1(id1) /boot/kernel.amd64/kernel: Limiting icmp ping response from 1067 to 1000 packets/sec
2014-05-08T07:35:49-07:00 <0.5> isi-cluster1-1(id1) /boot/kernel.amd64/kernel: Limiting icmp ping response from 1027 to 1000 packets/sec
2014-05-08T07:37:52-07:00 <0.5> isi-cluster1-1(id1) /boot/kernel.amd64/kernel: Limiting icmp ping response from 1028 to 1000 packets/sec
... ...
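To see how often the limit was hit and by how much, the timestamp, observed rate, and limit can be pulled out of these messages. This is a rough sketch that assumes the message layout shown above; the function name icmp_limit_events is hypothetical, and on a real node the input would come from the log file rather than a sample string.

```shell
# Hypothetical helper: read log lines on stdin and print
# "<timestamp> <observed rate> <limit>" for each ICMP rate-limit message.
icmp_limit_events() {
    # The message ends with "... from <observed> to <limit> packets/sec".
    awk '/Limiting icmp ping response/ { print $1, $(NF-3), $(NF-1) }'
}

# Sample line copied from the log excerpt above.
sample='2014-05-08T06:47:08-07:00 <0.5> isi-cluster1-1(id1) /boot/kernel.amd64/kernel: Limiting icmp ping response from 1067 to 1000 packets/sec'
printf '%s\n' "$sample" | icmp_limit_events
```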

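A system parameter on the Isilon node, net.inet.icmp.icmplim, limits the rate at which the node sends ICMP responses. It is set to 1000 packets per second by default. When ping requests arrive faster than that, the excess requests go unanswered, which is why some packets are reported as dropped in the tests above.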

[root@hdc3 ~]# grep net.inet.icmp.icmplim sysctl.conf
net.inet.icmp.icmplim=1000
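As a rough check, the average request rate of each test can be estimated from the packet count and elapsed time in the ping summary and compared with the limit of 1000. This is only a sketch: it gives an average over the whole run, not the per-second rate the kernel measures, and the helper name avg_pps is hypothetical.

```shell
# Hypothetical helper: average packets per second, given a packet count
# and the elapsed time in milliseconds from the ping summary line.
avg_pps() {
    awk -v n="$1" -v ms="$2" 'BEGIN { printf "%d\n", n * 1000 / ms }'
}

avg_pps 2000 640     # flood ping test
avg_pps 2000 2018    # 0.001 second interval test
```

The flood test averages about 3125 packets per second, well above the limit. The interval test averages about 991, close enough to the limit that short bursts could plausibly exceed it, which would be consistent with the logged rates of 1027 to 1067.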

Resolution

Increase the parameter to a value large enough to accommodate the ICMP request rate. There are two ways to make the change:

1. Change it in /etc/sysctl.conf. A system reboot is required for the change to take effect.

2. Run the command "sysctl net.inet.icmp.icmplim=<new value>" as root. The change is lost when the system reboots.
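The two options can be sketched as follows. The value 2000 is only an example, not a recommendation, and the helper set_sysctl_conf is hypothetical. The sketch edits a scratch file rather than the live /etc/sysctl.conf, and shows the runtime command as a comment because it must be run as root on the Isilon node.

```shell
# Hypothetical helper: set KEY=VALUE in a sysctl.conf-style file,
# replacing any existing assignment of KEY or appending a new one.
set_sysctl_conf() {
    file=$1; key=$2; value=$3
    tmp=$(mktemp) || return 1
    # Keep every line that does not start with "KEY=".
    awk -v k="$key" 'index($0, k "=") != 1' "$file" > "$tmp"
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Option 1, demonstrated on a scratch copy of the configuration file.
conf=$(mktemp)
echo 'net.inet.icmp.icmplim=1000' > "$conf"
set_sysctl_conf "$conf" net.inet.icmp.icmplim 2000   # 2000 is an example value
cat "$conf"

# Option 2, run as root on the Isilon node (not persistent across reboots):
#   sysctl net.inet.icmp.icmplim=2000
```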

If the problem persists after net.inet.icmp.icmplim has been raised to an appropriate value, investigate whether poor network performance is the cause.
