Pivotal Knowledge Base

Greenplum Error: "Interconnect Error Segment lost Contact with Master"

Environment

 Product            Version
 Pivotal Greenplum  4.3.x
 OS                 RHEL 6.x

Symptom

The master log shows the error message "Unexpected internal error (cdbgang.c:1447)", for example:

2014-05-01 18:24:33.948410 PDT,"dbuser","db",p45271,th-1144576256,"xx.xx.xx.xx","12322",2014-05-01 18:07:57 PDT,34925338,con40629,cmd450,seg-1,,dx4430905,x34925338,sx1,"ERROR","XX000","Unexpected internal error (cdbgang.c:1447)",,,,,,"some SQL",0,,"cdbgang.c",1447,"Stack trace:
1    0xa6fdf9 postgres  (elog.c:468)
2    0xa74202 postgres elog_internalerror (elog.c:279)
3    0xb9bb4a postgres allocateGang (cdbgang.c:1519)
4    0x705c6d postgres AssignGangs (execUtils.c:1691)
5    0x6ebceb postgres ExecutorStart (execMain.c:549)
6    0x915ff9 postgres PortalStart (pquery.c:873)
7    0x90c704 postgres  (postgres.c:2451)
8    0x91067d postgres PostgresMain (postgres.c:4928)
9    0x876181 postgres  (postmaster.c:6801)
10   0x87c2c0 postgres PostmasterMain (postmaster.c:2346)
11   0x7811ba postgres main (main.c:212)
12   0x7f28b96edcdd libc.so.6 __libc_start_main (??:0)
13   0x47cae9 postgres  (??:0)
"

The primary segment log shows the error message "Interconnect error segment lost contact with master (recv)", for example:

2014-05-01 13:47:14.163434 PDT,"dbuser","db",p53867,th527841024,"192.168.17.125","39149",2014-05-01 13:46:25 PDT,46549276,con29197,cmd33,seg11,slice6,dx2195689,x46549276,sx1,"ERROR","58M01","Interconnect error segment lost contact with master (recv)",,,,,,"someSQL"

Cause

One possible cause is that the master's address in gp_segment_configuration resolves to a public IP address instead of a private IP address.

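For example, the master's address gp-prd-rpt-master is mapped to 10.111.111.111 in /etc/hosts, not to one of the 192.168.x.x private addresses: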

template1=# select * from gp_segment_configuration where content=-1;
 dbid | content | role | preferred_role | mode | status | port |      hostname      |      address       | replication_port | san_mounts
------+---------+------+----------------+------+--------+------+--------------------+--------------------+------------------+------------
    1 |      -1 | p    | p              | s    | u      | 5432 | gp-prd-rpt-master  | gp-prd-rpt-master  |                  |

$ cat /etc/hosts|grep master
10.111.111.111 gp-prd-rpt-master.xxx.com gp-prd-rpt-master
192.168.16.125 gp-prd-rpt-master-1
192.168.17.125 gp-prd-rpt-master-2

Meanwhile, all of the segments' addresses resolve to private IP addresses:

template1=# select * from gp_segment_configuration where content=11;
 dbid | content | role | preferred_role | mode | status | port  |          hostname          |      address      | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------------------------+-------------------+------------------+------------
   13 |      11 | p    | p              | s    | u      | 40001 | gp-prd-rpt-ss06.xxx.com | gp-prd-rpt-ss06-2 |            42001 |
   41 |      11 | m    | m              | s    | u      | 41001 | gp-prd-rpt-ss08.xxx.com | gp-prd-rpt-ss08-1 |            43001 |
(2 rows)

$ cat /etc/hosts|grep -i gp-prd-rpt-ss06-2
192.168.17.116 gp-prd-rpt-ss06-2
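
The comparison above can be scripted. The following is a minimal sketch, not an official Greenplum tool: lookup_ip is a hypothetical helper that prints the IP a hosts-format file assigns to a name, and the sample data is copied from the example in this article. On a real host you would pass /etc/hosts and the address values returned by gp_segment_configuration.

```shell
# Hypothetical helper: print the first IP that a hosts-format file maps to a name.
lookup_ip() {
    hosts_file=$1
    name=$2
    awk -v n="$name" '
        /^[[:space:]]*#/ { next }   # skip comment lines
        { for (i = 2; i <= NF; i++) if ($i == n) { print $1; exit } }
    ' "$hosts_file"
}

# Sample data from this article; use /etc/hosts on a real host instead.
sample=$(mktemp)
cat > "$sample" <<'EOF'
10.111.111.111 gp-prd-rpt-master.xxx.com gp-prd-rpt-master
192.168.16.125 gp-prd-rpt-master-1
192.168.17.125 gp-prd-rpt-master-2
192.168.17.116 gp-prd-rpt-ss06-2
EOF

master_ip=$(lookup_ip "$sample" gp-prd-rpt-master)
segment_ip=$(lookup_ip "$sample" gp-prd-rpt-ss06-2)
echo "master address  -> $master_ip"
echo "segment address -> $segment_ip"
rm -f "$sample"
```

If the master address lands on a different network than the segment addresses, as it does with this sample data, the configuration matches the cause described here.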

Resolution

1. In /etc/hosts, map the master's address (gp-prd-rpt-master in this example) to a private IP address. For example:

$ cat /etc/hosts|grep master
10.111.111.111 gp-prd-rpt-master.xxx.com 
192.168.16.125 gp-prd-rpt-master-1 gp-prd-rpt-master
192.168.17.125 gp-prd-rpt-master-2

2. Copy the updated /etc/hosts file to all hosts in the cluster, for example with scp.

3. Restart Greenplum.
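
The edit in step 1 can be sketched as a small script. This is an illustration under assumptions, not a supported procedure: move_alias is a hypothetical helper, it works on a copy of the hosts data and never on /etc/hosts itself, and the choice of 192.168.16.125 as the target simply follows the example above.

```shell
# Hypothetical helper: move an alias to the line owned by a given IP.
# $1 = hosts-format file, $2 = alias to move, $3 = IP that should own the alias
move_alias() {
    awk -v name="$2" -v ip="$3" '
        /^[[:space:]]*#/ || NF == 0 { print; next }   # keep comments and blank lines
        {
            line = $1
            # rebuild the line without the alias
            for (i = 2; i <= NF; i++) if ($i != name) line = line " " $i
            # append the alias to the line of the target IP
            if ($1 == ip) line = line " " name
            # drop entries that are left with no names at all
            if (line != $1) print line
        }
    ' "$1"
}

# Work on sample data from this article, not on the live /etc/hosts.
work=$(mktemp)
cat > "$work" <<'EOF'
10.111.111.111 gp-prd-rpt-master.xxx.com gp-prd-rpt-master
192.168.16.125 gp-prd-rpt-master-1
192.168.17.125 gp-prd-rpt-master-2
EOF

fixed=$(move_alias "$work" gp-prd-rpt-master 192.168.16.125)
echo "$fixed"
rm -f "$work"
```

Review the result before installing it as /etc/hosts, which normally requires root. For steps 2 and 3, Greenplum's gpscp and gpstop -r utilities are commonly used to copy files to all hosts and to restart the database, but confirm the exact options against the documentation for your version.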

 
