Pivotal Greenplum Database (GPDB)
The end user faces a number of challenges when collecting a TCPDUMP for a distributed application. This article helps the user narrow down the scope of the TCP collection and determine whether a TCPDUMP will be useful when troubleshooting a distributed application.
Some of Pivotal's distributed applications include:
- Pivotal Greenplum
- Pivotal HDB
- Pivotal Hadoop
How to overcome the most common issues
Fill up the data partition
Depending on the workload, tcpdump data can grow extremely fast. While troubleshooting one issue you could fill up the OS root partition and cause a far more serious one. To avoid this, run a rolling TCPDUMP with parameters that control how much data is collected.
Too much data collected
In addition to filling up the data partition, we often see users run a wide-open trace with no filters. If you run a wide-open trace on a 10 Gigabit interface during a production workload, you can end up with gigabytes of unnecessary data. The poor engineer assigned to review the trace then has to spend hours parsing and cutting up the trace file to avoid the infamous Wireshark out-of-memory error.
So please help out the engineer by strategically applying a TCPDUMP filter and limiting the capture length. Say, for example, you need to collect a tcpdump to debug an HDFS permission issue that occurs when a MapReduce job runs. You could run TCPDUMP on every node manager in the cluster with no filter and collect a huge amount of useless data. Or you could apply the filter "tcp port 8020 and host <ip of namenode>" to capture only the HDFS metadata traffic between that node and the NameNode.
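As a minimal sketch of the filter described above, the following shell function assembles the filter string from a NameNode address and a port. The function name hdfs_metadata_filter and the address 10.0.0.10 are hypothetical placeholders; port 8020 is the one used in the example.

```shell
#!/bin/sh
# hdfs_metadata_filter is a hypothetical helper name. It only prints a
# capture filter matching traffic to or from one host on one TCP port.
hdfs_metadata_filter() {
    namenode_ip="$1"      # assumption: IP address of the NameNode
    port="${2:-8020}"     # defaults to the port used in the example above
    printf 'tcp port %s and host %s' "$port" "$namenode_ip"
}

# Placeholder address, for illustration only.
filter=$(hdfs_metadata_filter 10.0.0.10)
echo "$filter"
```

The resulting string is passed to tcpdump as a single quoted argument, in the same position as "tcp port 8020" in the rolling command later in this article.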
Too much Data for TCPDUMP to collect
Some of these distributed applications transfer terabytes of data. Running a TCPDUMP on a highly utilized node may cause the kernel to discard some of the frames before sending them up the stack where TCPDUMP is listening. In this case, the packet reaches the application but TCPDUMP is not able to record it.
If you run into this situation, revisit "When to take a tcpdump", because chances are a dump will not be helpful. If the node is this heavily utilized, that load is most likely the root of the problem and a TCPDUMP is not required.
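One way to check whether the kernel discarded frames is the summary tcpdump prints when it exits, which includes a "packets dropped by kernel" line. The sketch below extracts that count from saved summary text. The helper name dropped_by_kernel and the sample numbers are made up for illustration.

```shell
#!/bin/sh
# dropped_by_kernel is a hypothetical helper. It reads tcpdump's exit
# summary on stdin and prints the "packets dropped by kernel" count.
# It returns non-zero if the summary contains no such line.
dropped_by_kernel() {
    awk '/packets dropped by kernel/ { print $1; found = 1 } END { exit !found }'
}

# Made-up sample summary. In practice tcpdump writes these lines to
# stderr on exit, so they can be saved by redirecting stderr to a file.
summary='1000 packets captured
1200 packets received by filter
200 packets dropped by kernel'

drops=$(printf '%s\n' "$summary" | dropped_by_kernel)
if [ "$drops" -gt 0 ]; then
    echo "capture incomplete: $drops packets dropped by kernel"
fi
```

A non-zero count means the trace has gaps, so missing packets in the capture cannot be taken as evidence that they never arrived.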
When to take a tcpdump
Please ask yourself this question before going down the road of tcpdump collection. In most cases, the application has enough logging to show whether there was a communication issue with the remote node. For example, if your application is GPFDIST, you set a read timeout of 300 seconds, and the segment log says it timed out communicating with GPFDIST, then a tcpdump might not help you. The tcpdump will simply tell you that GPFDIST didn't respond and so the segment timed out, which is something we could already infer from the logs.
Whenever possible, avoid taking a TCPDUMP. The best reasons to run a TCP collection are to rule out network latency or customer firewalls and to debug what data is being sent to the remote node.
How to execute the rolling TCPDUMP which solves all of the issues
First, we need to prepare the node for collection. You have to be root to run the dump, because the kernel will not allow non-superusers to put a network interface into promiscuous mode. However, when tcpdump runs it will by default write out files as the tcpdump user, so we need to make sure the tcpdump user has write permission on the collection directory:
- mkdir -p /data/support/tcpdump
- chmod 777 /data/support/tcpdump
- Verify that the tcpdump user can write to this new directory:
sudo -u tcpdump touch /data/support/tcpdump/test123
Once the above prerequisites are complete, you can start the trace:
tcpdump -C 512 -W 8 -s 256 -w /data/support/tcpdump/gpdbgang_issue -i bond0 "tcp port 8020"
Now let's talk about the options. The above command collects 8 files of up to 512 MB each, for roughly 4 GB of data in total (tcpdump counts the -C value in units of 1,000,000 bytes). TCPDUMP writes the first file through the eighth, and once the eighth file has reached 512 MB it loops back and overwrites the first, resulting in a rolling tcpdump.
The -s option sets the packet capture length. In a typical infrastructure the maximum packet length is 1514 bytes. If you are not interested in capturing the data payload of each packet, it is a good idea to limit this to about 256 bytes so that you are sure to capture the packet headers and some data.
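The disk usage of a rolling capture is bounded by the -C value times the -W value, which can be compared against free space before starting. The following is a rough sketch of that arithmetic. The helper names capture_footprint_bytes and has_room are hypothetical, and the free-space check assumes POSIX df output.

```shell
#!/bin/sh
# capture_footprint_bytes is a hypothetical helper: worst-case size of a
# rolling capture. tcpdump counts -C in units of 1,000,000 bytes.
capture_footprint_bytes() {
    file_size="$1"    # value passed to -C
    file_count="$2"   # value passed to -W
    echo $(( file_size * file_count * 1000000 ))
}

# has_room is a hypothetical helper: succeeds when the filesystem holding
# the directory has at least the worst-case footprint available.
has_room() {
    need_kb=$(( $(capture_footprint_bytes "$2" "$3") / 1024 ))
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$need_kb" ]
}

capture_footprint_bytes 512 8   # prints 4096000000
```

For the command above, a check such as "has_room /data/support/tcpdump 512 8" could be run before starting the capture, and the capture skipped or resized if it fails.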
|Option||Value||Description|
|-C||512||Maximum size of each file, in units of 1,000,000 bytes|
|-W||8||Total number of files to collect|
|-s||256||Capture length for each packet, in bytes|
|-w||/data/support/tcpdump/gpdbgang_issue||Path to save the collection. In this case gpdbgang_issue is used as the prefix for the 8 files collected|
|-i||bond0||Only listen on interface bond0|
|Filter||"tcp port 8020"||Filter on TCP port 8020|