Network troubleshooting guide

Environment

Product   Version
OS        RHEL 6.x

Overview

Network packet loss errors come up frequently in troubleshooting, and their meaning is often misconstrued. In this article, we describe some of the standard tools available on most systems for troubleshooting network-related issues. We also explain which layer each tool reports on, so that the user can get a sense of where the fault might be and how to proceed from there. Understanding which network troubleshooting tools are available and when to use them can be a big help with misbehaving distributed applications.

Table of Contents
 Overview
 Bad Network
 Inspecting NIC Firmware and Driver
 Inspecting Kernel and Application
 Other Useful Tips

The following is the basic, high-level life cycle of a network datagram as it travels through the NIC (Network Interface Card) to the application. This obviously skips a lot of detail and abstract concepts, so do not expect a deep dive into how Linux handles the network. Instead, this article should provide guidance on how to troubleshoot it.

  • The packet will first reach the NIC firmware.
  • The packet then gets buffered by the NIC driver and waits for the kernel.
  • When the kernel is ready, the packet gets pulled off of the driver buffer and into the kernel's buffer.
  • Finally, when the application is ready, the packet will be pulled from the kernel buffer.

The simplified flow above shows that a network packet has to pass through many layers before it reaches the application. This is important to understand because packet loss seen on the client can sometimes be directly correlated to an application issue. Distributed applications rely heavily on the network to share and process data; if the application cannot keep up with the incoming data, we may start to see network packets dropped by the kernel. Understanding this basic flow will help you work the problem.

Bad Network

From the perspective of a typical distributed application user, it may not always be obvious whether the problem is related to the external network. It is important to rule out what you can before assuming that it is some mystical networking issue. Before you call in your network expert, work through the things that can be checked first.

Checklist

  • Inspecting at the NIC/Firmware/Driver level

 

ethtool can read information from the NIC driver given an associated interface name.

  • This command shows the NIC driver state. Immediately from this output, we can see whether the interface link is up or down ("Link detected: yes"). In addition, we can confirm that the Speed is 10Gbit and the Duplex is Full.
    • ethtool <interface name>
      [root@node4 ~]# ethtool eth0
      Settings for eth0:
      Supported ports: [ ]
      Supported link modes: Not reported
      Supported pause frame use: No
      Supports auto-negotiation: No
      Advertised link modes: Not reported
      Advertised pause frame use: No
      Advertised auto-negotiation: No
      Speed: 10000Mb/s
      Duplex: Full
      Port: Twisted Pair
      PHYAD: 0
      Transceiver: internal
      Auto-negotiation: off
      MDI-X: Unknown
      Link detected: yes
  • Checking the NIC driver settings. In general, distributed applications send and receive large amounts of data, so having features like generic receive offload (GRO) or large receive offload (LRO) enabled may result in a network bottleneck at the driver level. In the case of GPDB, we transmit UDP datagrams in 8192-byte chunks, which results in IP fragmentation that will put your network interface card to work. If you have these settings enabled and are experiencing network latency related symptoms, a good test is to disable GRO and LRO (see the sketch after this list).
    • ethtool -k <interface name>
      [root@mdw ~]# ethtool -k eth0
      Offload parameters for eth0:
      Cannot get device udp large send offload settings: Operation not supported
      rx-checksumming: on
      tx-checksumming: on
      scatter-gather: on
      tcp segmentation offload: on
      udp fragmentation offload: off
      generic segmentation offload: off
      generic-receive-offload: off
      large-receive-offload: on
  • The capital -S switch dumps all of the firmware/driver counters available. Typically a lot of information is dumped, and the output will vary depending on software revisions. If you find CRC errors, you can assume there is a hardware-related problem with the NIC cable or the NIC card itself. The recommended action is to first try replacing the network cable that plugs into the eth port and ask the network team to rule out hardware on the switch. If all that fails, replace the NIC on the given server.
    • ethtool -S <interface name>
      [root@mdw ~]# ethtool -S eth0 | egrep "crc|error"
           rx_error_bytes: 0
           tx_error_bytes: 0
           tx_mac_errors: 0
           tx_carrier_errors: 0
           rx_crc_errors: 123
           rx_align_errors: 0

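If GRO and LRO look like the bottleneck, here is a minimal sketch for turning them off, assuming the interface is eth0 and that the driver supports these toggles (adjust for your environment):

  # show the current offload settings
  ethtool -k eth0
  # disable generic receive offload and large receive offload; takes effect immediately but does not persist across reboots
  ethtool -K eth0 gro off lro off

On RHEL 6, such settings are typically made persistent with an ETHTOOL_OPTS entry in the interface's ifcfg file under /etc/sysconfig/network-scripts.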
ifconfig and "netstat -i" both give you roughly the same information, namely the tx/rx error counters you see in ethtool -S. Counters in the RX-DRP or RX-OVR columns are typical indicators of packet errors at the driver level. If you see drops or overruns, it could mean that the kernel is not pulling packets off fast enough or that the driver is not able to keep up with the workload. In the case of the driver, this is a good time to reach out to the vendor for support and see if there are any firmware/driver updates or settings that can improve performance here.

  • ifconfig <interface name>
    [root@gpdb-sandbox ~]# ifconfig eth0
    eth0 Link encap:Ethernet HWaddr 00:0C:29:F7:B1:14 
     inet addr:172.16.34.128 Bcast:172.16.34.255 Mask:255.255.255.0
     inet6 addr: fe80::20c:29ff:fef7:b114/64 Scope:Link
     UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
     RX packets:84887 errors:0 dropped:0 overruns:0 frame:0
     TX packets:28074 errors:0 dropped:0 overruns:0 carrier:0
     collisions:0 txqueuelen:1000 
     RX bytes:43123224 (41.1 MiB) TX bytes:5168804 (4.9 MiB)
  • netstat -i <interface name>
    [root@mdw ~]# netstat -i
    Kernel Interface table
    Iface       MTU Met    RX-OK RX-ERR RX-DRP RX-OVR    TX-OK TX-ERR TX-DRP TX-OVR Flg
    eth0       1500   0 618457118      0      0      0 644714020      0      0      0 BMRU
    eth1       1500   0 106690824      0      0      0 13942215      0      0      0 BMRU
    eth4       1500   0 579489994      0      0      0  4526273      0      0      0 BMRU
    eth5       1500   0     5145      0      0      0       11      0      0      0 BMRU
    lo        16436   0  1040195      0      0      0  1040195      0      0      0 LRU
    vmnet1     1500   0        0      0      0      0        6      0      0      0 BMRU
    vmnet8     1500   0  1896834      0      0      0        6      0      0      0 BMRU
  • Inspecting at the kernel and application level

 

tcpdump operates at the kernel level and can be useful for inspecting TCP-based applications. For information on how and when to use this tool, refer to Running TCPDUMP to debug distributed applications.
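As a quick reference, here is a minimal capture sketch, assuming eth0 carries the traffic of interest and that the host and port shown (172.17.0.14 and 5432) are placeholders for your environment:

  # capture full packets for one host/port pair into a file for later analysis
  tcpdump -i eth0 -nn -s 0 -w /tmp/capture.pcap host 172.17.0.14 and port 5432
  # read the capture back without name resolution
  tcpdump -nn -r /tmp/capture.pcap | head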

"netstat -s" will dump out all of the kernel counters and provides detailed information related to the UDP/TCP kernel stack counters. 

  • If the TCP collapsed counter is incrementing, it may indicate that the application is not reading packets off of the kernel buffer fast enough. Although the kernel is the layer reporting this condition, the application could very well be the bottleneck, and we need to determine whether the problem is simply workload-related.
    • netstat -st | egrep -i collapsed
      [root@mdw ~]# netstat -st | egrep -i collapsed
          3190 packets collapsed in receive queue due to low socket buffer
  • If we see a lot of network retransmissions, it usually indicates that the receiver is not acknowledging our packets fast enough, or that the receiver never received the original packet because it was dropped somewhere along the way.
    • netstat -st | egrep -i retrans
      [root@mdw ~]# netstat -st | egrep -i retrans
          81145 segments retransmited
          51597 fast retransmits
          25709 forward retransmits
          1750 retransmits in slow start
          62 sack retransmits failed
  • If UDP is reporting receive errors, it is similar to the TCP collapsed case, where the application is not reading data from the kernel fast enough. You might see this counter go up in cases where there is processing skew in GPDB, because all segments are sending data over UDP to a single segment.
    • netstat -su | egrep error
      [root@mdw ~]# netstat -su | egrep error
          0 packet receive errors
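All of these counters are cumulative since boot, so the useful signal is whether they increase while the problem is occurring. A simple sketch for sampling them, assuming a bash-style shell:

  # take two snapshots ten seconds apart and compare the interesting counters
  netstat -s | egrep -i "collapsed|retrans|errors" > /tmp/netstat.1
  sleep 10
  netstat -s | egrep -i "collapsed|retrans|errors" > /tmp/netstat.2
  diff /tmp/netstat.1 /tmp/netstat.2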

Other Useful Tips

Testing MTU settings with ping can help rule out Jumbo Frame issues. If Jumbo Frames are enabled, ifconfig reports "MTU:9000"; when they are disabled, it reports "MTU:1500". In environments where Jumbo Frames are enabled on the local client but disabled at some endpoint in the network, a simple ping test will always return success even though there is a fault. A good way to detect a Jumbo Frame misconfiguration is to make the ping command send a Jumbo Frame to the remote node and check whether it reports packet loss, as shown in the sketch below.

  • ping -s 8192
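A minimal sketch of that test, assuming a Linux ping and a remote host named node4 (a placeholder); the -M do flag forbids fragmentation, so an undersized MTU somewhere along the path shows up as packet loss instead of being hidden by IP fragmentation:

  # send four jumbo-sized probes that are not allowed to be fragmented
  ping -M do -s 8192 -c 4 node4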

traceroute can be useful in multi-VLAN environments where nodes have to go through the default gateway to reach the remote node. traceroute shows how many hops were taken and which network path the data is flowing through.

gpadmin@hadoop:~$ traceroute node4
traceroute to node4 (172.17.0.14), 30 hops max, 60 byte packets
 1 node4.phd.local (172.17.0.14) 0.065 ms 0.024 ms 0.022 ms

sar, if installed, can provide some high-level historical metrics. By default, sar has a 10-minute polling interval, so the data resolution can be a bit weak. However, checking the network counters here can help provide evidence that there were network issues during a given event.

[gpadmin@gpdb-sandbox ~]$ sar -n DEV 1 1
Linux 2.6.32-573.el6.x86_64 (gpdb-sandbox.localdomain) 04/23/2016 _x86_64_ (2 CPU)

02:08:20 PM IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
02:08:21 PM lo 3.96 3.96 1.64 1.64 0.00 0.00 0.00
02:08:21 PM eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00

Average: IFACE rxpck/s txpck/s rxkB/s txkB/s rxcmp/s txcmp/s rxmcst/s
Average: lo 3.96 3.96 1.64 1.64 0.00 0.00 0.00
Average: eth0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
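sar can also read back the daily data files collected by sysstat, which is helpful when investigating an event after the fact. A sketch, assuming sysstat collection is enabled and a file such as /var/log/sa/sa23 exists for the day in question:

  # network device throughput for that day
  sar -n DEV -f /var/log/sa/sa23
  # network device error and drop counters for the same day
  sar -n EDEV -f /var/log/sa/sa23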

telnet is useful when you want to test whether a firewall is blocking a given port. An example is shown below; let us assume we have applications running on ports 5432 and 1521 on the host host.domain. The example shows that either nothing is listening on port 1521 or a firewall is rejecting connections to it:

[root@gpdb-sandbox ~]# telnet host.domain 5432
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
^CConnection closed by foreign host.

[root@gpdb-sandbox ~]# telnet host.domain 1521
Trying 127.0.0.1...
telnet: connect to address 127.0.0.1: Connection refused
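On hosts where the telnet client is not installed, netcat can perform the same check, assuming the nc package is available:

  # -z only tests for a listener, -v prints the result, -w 5 gives up after 5 seconds
  nc -vz -w 5 host.domain 5432
  nc -vz -w 5 host.domain 1521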

Also, remember to look out for proxy settings in the environment.
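A quick way to spot them, assuming a bash-style shell:

  # list any proxy variables set in the current environment
  env | egrep -i proxy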
