Pivotal Knowledge Base

How to do failed Segment analysis and recovery

Goal

This article outlines several basic steps for self-analysis of failed segments and shows how to perform segment recovery.

Along with the steps, an example is shown in which a mirror segment has failed (its postmaster process was shut down via a SIGTERM signal).

Solution

  1. Identify the failed segment(s).

This can be done using the "gp_segment_configuration" table.

A segment is identified by its "dbid".

test=# -- Search for segments not in "u"/"s" state
test=# select * from gp_segment_configuration where (status, mode) != ('u', 's');
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    2 |       0 | p    | p              | c    | u      | 62500 | mdw      | mdw     |            41000 |
    3 |       0 | m    | m              | s    | d      | 50000 | mdw      | mdw     |            51000 |

Take a snapshot of the "gp_segment_configuration" table:

COPY (select * from gp_segment_configuration order by 1) to <location>;

Alternatively, you can identify the segments that are marked down using the management utility "gpstate -e".

Note: If all the down segments are from a specific host, there could be a filesystem or server issue.
In these scenarios you can verify the server in question as follows (here we assume all the down segments are from sdw2):

- ping sdw2
- ssh sdw2

If ssh works fine, check the filesystem using the command below. Here we assume the data partitions are data1 and data2:

- du -sh /data1 /data2

If any issue is noticed, please engage the EMC DCA team.
  2. Identify the event and the time of the failure.

Use the "gp_configuration_history" table and search by "dbid", looking for entries for both the primary and the mirror of the respective content.

Note the time the failure happened.

test=# -- Search for events about dbid 2 and 3 for the last hour
test=# select * from gp_configuration_history where dbid in (2,3) and time > now() - interval '1 hour' order by 1;
             time              | dbid |                                desc
-------------------------------+------+---------------------------------------------------------------------
 2014-01-28 17:09:38.359024-05 |    2 | FTS: content 0 fault marking status UP mode: change-tracking role p
 2014-01-28 17:09:38.359024-05 |    3 | FTS: content 0 fault marking status DOWN role m

Take a snapshot of the "gp_configuration_history" table.

COPY (select * from gp_configuration_history order by 1) to <location>;
  3. Go to the master segment log file and look for FTS log entries related to the segment down event (use the timing from gp_configuration_history as a guideline).

Location of logs is $MASTER_DATA_DIRECTORY/pg_log (normally /data/master/gpseg-1/pg_log).
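For example, you can narrow down the FTS entries around the failure time with grep (a minimal sketch; the log file name is illustrative and assumes the default csv log naming):

cd $MASTER_DATA_DIRECTORY/pg_log
grep "FTS" gpdb-2014-01-28_*.csv

The relevant master log entries for this example: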

2014-01-28 17:09:38.356767 EST,,,p26526,th49608608,,,,0,con3,,seg-1,,,,,"LOG","00000","FTS: primary (dbid=2) reported mirroring fault with mirror (dbid=3), mirror considered to be down.",,,,,,,0,,"ftsfilerep.c",358,
2014-01-28 17:09:38.359375 EST,"gpadmin","postgres",p23510,th49608608,"[local]",,2014-01-28 17:09:38 EST,1100119376,,cmd2,seg-1,,,x1100119376,sx1,"LOG","00000","statement: insert into gp_configuration_history values (now(), 2, 'FTS: content 0 fault marking status UP mode: change-tracking role p')",,,,,,"insert into gp_configuration_history values (now(), 2, 'FTS: content 0 fault marking status UP mode: change-tracking role p')",0,,"postgres.c",1542,
2014-01-28 17:09:38.538611 EST,"gpadmin","postgres",p23510,th49608608,"[local]",,2014-01-28 17:09:38 EST,1100119376,,cmd6,seg-1,,,x1100119376,sx1,"LOG","00000","statement: insert into gp_configuration_history values (now(), 3, 'FTS: content 0 fault marking status DOWN role m')",,,,,,"insert into gp_configuration_history values (now(), 3, 'FTS: content 0 fault marking status DOWN role m')",0,,"postgres.c",1542,
2014-01-28 17:09:38.572991 EST,,,p26526,th49608608,,,,0,con3,,seg-1,,,,,"LOG","00000","FTS: primary (dbid=2) on mdw:62500 transitioning to change-tracking mode, mirror marked as down.",,,,,,,0,,"ftsfilerep.c",498,

Note: If the primary was down instead of the mirror, the mirror will assume the primary role, and the log file would instead include something similar to:

2014-01-31 15:48:54.250637 EST,,,p24962,th-915353792,,,,0,con3,,seg-1,,,,,"LOG","00000","FTS: mirror (dbid=3) on mdw:50000 taking over as primary in change-tracking mode.",,,,,,,0,,"ftsfilerep.c",492,
2014-01-31 15:48:54.299174 EST,"gpadmin","test",p25116,th-915353792,"[local]",,2014-01-31 15:48:44 EST,1100121506,con8,cmd3,seg-1,,dx4,x1100121506,sx1,"LOG","00000","FTS: reconfiguration is in progress",,,,,,"begin;",0,,"cdbfts.c",1120,
  4. Go to the segment log files and try to find relevant log entries considering the time of the event.

The location of the segment logs can be determined using the query below.

Note: The DBID can be found in the "gp_segment_configuration" table.

warehouse=# select hostname, fselocation from gp_segment_configuration gp, pg_filespace_entry pg where gp.dbid=pg.fsedbid and gp.dbid in (1,2);
 hostname |        fselocation
----------+---------------------------
 sdw1     | /data/gpdb/primary/gpseg0
 sdw2     | /data/gpdb/master/gpseg-1

From the above output, you now know which host and segment directory correspond to the dbid of the failed segment. To go to the log directory:

ssh sdw1
cd /data/gpdb/primary/gpseg0/pg_log/

The primary log contains messages identifying the loss of connection to the mirror, the failing operations, and the transition that takes place:

2014-01-28 17:09:23.485405 EST,,,p26475,th108550048,,,1999-12-31 19:00:00 EST,0,,,seg-1,,,,,"WARNING","XX000","receive close on connection: Success (cdbfilerepconnserver.c:334)",,,,,,,0,,"cdbfilerepconnserver.c",334,
2014-01-28 17:09:38.681707 EST,,,p26474,th108550048,,,,0,,,seg-1,,,,,"WARNING","01000","mirror failure, could not complete mirrored request identifier 'shutdown' ack state 'waiting for ack', failover requested",,"run gprecoverseg to re-establish mirror connectivity",,,"mirroring role 'primary role' mirroring state 'sync' segment state 'in shutdown' process name(pid) 'filerep main process(26474)' filerep state 'not initialized' position ack begin '0x2ada14bfd040' position ack end '0x2ada14c7d040' position ack insert '0x2ada14c5d638' position ack consume '0x2ada14c5d638' position ack wraparound '0x2ada14c7d040' insert count ack '543' consume count ack '544' ",,0,,"cdbfilerepprimaryack.c",898,
2014-01-28 17:09:38.681770 EST,,,p26474,th108550048,,,,0,,,seg-1,,,,,"WARNING","01000","mirror failure, could not complete operation on mirror, failover requested","identifier 'shutdown' operation 'shutdown' relation type 'control message' message count '-1'","run gprecoverseg to re-establish mirror connectivity",,,"mirroring role 'primary role' mirroring state 'sync' segment state 'in shutdown' process name(pid) 'filerep main process(26474)' filerep state 'not initialized' position ack begin '0x2ada14bfd040' position ack end '0x2ada14c7d040' position ack insert '0x2ada14c5d638' position ack consume '0x2ada14c5d638' position ack wraparound '0x2ada14c7d040' insert count ack '543' consume count ack '544' position begin '0x2ada143fc040' position end '0x2ada14bfc100' position insert '0x2ada147c2460' position consume '0x2ada147c0748' position wraparound '0x2ada14bfc100' insert count '1339' consume count '1339' ",,0,,"cdbfilerepprimary.c",1290,
2014-01-28 17:09:38.791906 EST,,,p26409,th108550048,,,,0,,,seg-1,,,,,"LOG","00000","filerep main process (PID 26474) exited with exit code 0",,,,,,,0,,"postmaster.c",5609,
2014-01-28 17:09:38.797445 EST,,,p23514,th108550048,,,,0,,,seg-1,,,,,"LOG","00000","mirror transition, primary address(port) 'mdw(41000)' mirror address(port) 'mdw(51000)'",,,,,"mirroring role 'primary role' mirroring state 'change tracking' segment state 'in shutdown' process name(pid) 'filerep main process(23514)' filerep state 'not initialized' ",,0,,"cdbfilerep.c",3371,
2014-01-28 17:09:38.804996 EST,,,p23515,th108550048,,,,0,,,seg-1,,,,,"LOG","00000","CHANGETRACKING: ChangeTracking_RetrieveIsTransitionToInsync() found insync_transition_completed:'true' full resync:'false'",,,,,,,0,,"cdbresynchronizechangetracking.c",2522,
2014-01-28 17:09:38.805275 EST,,,p23515,th108550048,,,,0,,,seg-1,,,,,"LOG","00000","searching for last checkpoint location for creating the initial resynchronize changetracking",,,,,,,0,,"xlog.c",10959,
2014-01-28 17:09:38.805484 EST,,,p23515,th108550048,,,,0,,,seg-1,,,,,"LOG","00000","scanned through 1 initial xlog records since last checkpoint for writing into the resynchronize change log",,,,,,,0,,"cdbresynchronizechangetracking.c",206,

The mirror segment log contains messages identifying the shutdown request (kill <postmaster_pid>) and the shutdown of the filerep process.

The location of the mirror log can be identified using the query shared above in this article.
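For example, for the mirror in this walkthrough (dbid 3):

test=# select hostname, fselocation from gp_segment_configuration gp, pg_filespace_entry pg where gp.dbid=pg.fsedbid and gp.dbid=3;

The mirror log excerpt for this example: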

2014-01-28 17:09:23.484983 EST,,,p26411,th-1702717536,,,,0,,,seg-1,,,,,"LOG","00000","received smart shutdown request",,,,,,,0,,"postmaster.c",4186,
2014-01-28 17:09:23.532979 EST,,,p26411,th-1702717536,,,,0,,,seg-1,,,,,"LOG","00000","filerep main process (PID 26485) exited with exit code 0",,,,,,,0,,"postmaster.c",5609,

If more information is necessary, please open a service request with Support and provide the following (a sketch for collecting the logs follows this list):

  • "gp_segment_configuration" table output
  • "gp_configuration_history" table output
  • master segment log file
  • primary/mirror segment log files
  • if the error is resource/OS/hardware related, the respective OS log files and outputs (/var/log/messages, sar historical logs, etc.)
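A minimal sketch for collecting the master log files for the service request (the file names are illustrative; repeat on each affected segment host for the primary/mirror log directories):

cd $MASTER_DATA_DIRECTORY/pg_log
tar czf /tmp/master_pg_log.tar.gz gpdb-2014-01-28*.csv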
  5. Recover the segment.

Run recovery only after you have identified why the segment went down in the first place.

Make sure that the servers are accessible and the OS and hardware are running properly.

When you have identified the reason the segment went down and want to recover it, issue the "gprecoverseg" command. This initiates an incremental recovery.
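For example (a minimal sketch; "gprecoverseg" prompts for confirmation before proceeding, and the -a option skips the prompt):

gpadmin:Fullrack@mdw $ gprecoverseg -a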

Steps in the recovery:

  • The segments (both primary and mirror) will be transitioned into recovery (resync) mode ("u"/"r").
test=# select * from gp_segment_configuration where dbid in (2,3);
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    2 |       0 | p    | p              | r    | u      | 62500 | mdw      | mdw     |            41000 |
    3 |       0 | m    | m              | r    | u      | 50000 | mdw      | mdw     |            51000 |

test=# select * from gp_configuration_history where dbid in (2,3) and time > now() - interval '1 hour' order by 1;
             time              | dbid |                                              desc
-------------------------------+------+-----------------------------------------------------------------------------------------------
 2014-01-28 17:09:38.359024-05 |    2 | FTS: content 0 fault marking status UP mode: change-tracking role p
 2014-01-28 17:09:38.359024-05 |    3 | FTS: content 0 fault marking status DOWN role m
 2014-01-28 17:28:40.615605-05 |    3 | gprecoverseg: segment config for resync: segment mode and status
 2014-01-28 17:28:43.795785-05 |    2 | gprecoverseg: segment resync marking mirrors up and primaries resync: segment mode and status
 2014-01-28 17:28:43.795785-05 |    3 | gprecoverseg: segment resync marking mirrors up and primaries resync: segment mode and status
  • The mirror segment will be recovered from the primary using the change-tracking information. You can use "gpstate -e" to monitor the recovery progress:
gpadmin:Fullrack@mdw $ gpstate -e
20140128:17:29:38:000858 gpstate:mdw:gpadmin-[INFO]:-Starting gpstate with args: -e
20140128:17:29:38:000858 gpstate:mdw:gpadmin-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.2.3.1 build 2'
20140128:17:29:38:000858 gpstate:mdw:gpadmin-[INFO]:-master Greenplum Version: 'PostgreSQL 8.2.15 (Greenplum Database 4.2.3.1 build 2) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.4.2 compiled on Dec 13 2012 20:07:11'
20140128:17:29:38:000858 gpstate:mdw:gpadmin-[INFO]:-Obtaining Segment details from master...
20140128:17:29:38:000858 gpstate:mdw:gpadmin-[INFO]:-Gathering data from segments...
....
20140128:17:29:42:000858 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20140128:17:29:42:000858 gpstate:mdw:gpadmin-[INFO]:-Segment Mirroring Status Report
20140128:17:29:42:000858 gpstate:mdw:gpadmin-[INFO]:-----------------------------------------------------
20140128:17:29:42:000858 gpstate:mdw:gpadmin-[INFO]:-Segment Pairs in Resynchronization
20140128:17:29:42:000858 gpstate:mdw:gpadmin-[INFO]:- Current Primary Port Resync mode Est. resync progress Data synced Est. total to sync Est. resync end time Change tracking size Mirror Port
20140128:17:29:42:000858 gpstate:mdw:gpadmin-[INFO]:- mdw 62500 Incremental 100% 288 kB Sync complete; awaiting config change 102 MB mdw 50000
  • The segments will be transitioned into normal (sync) mode ("u"/"s").
test=# select * from gp_segment_configuration where dbid in (2,3);
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
    2 |       0 | p    | p              | s    | u      | 62500 | mdw      | mdw     |            41000 |
    3 |       0 | m    | m              | s    | u      | 50000 | mdw      | mdw     |            51000 |

test=# select * from gp_configuration_history where dbid in (2,3) and time > now() - interval '1 hour' order by 1;
             time              | dbid |                                              desc
-------------------------------+------+-----------------------------------------------------------------------------------------------
 2014-01-28 17:09:38.359024-05 |    2 | FTS: content 0 fault marking status UP mode: change-tracking role p
 2014-01-28 17:09:38.359024-05 |    3 | FTS: content 0 fault marking status DOWN role m
 2014-01-28 17:28:40.615605-05 |    3 | gprecoverseg: segment config for resync: segment mode and status
 2014-01-28 17:28:43.795785-05 |    2 | gprecoverseg: segment resync marking mirrors up and primaries resync: segment mode and status
 2014-01-28 17:28:43.795785-05 |    3 | gprecoverseg: segment resync marking mirrors up and primaries resync: segment mode and status
 2014-01-28 17:29:38.599875-05 |    2 | FTS: changed segment to insync from resync.
 2014-01-28 17:29:38.599875-05 |    3 | FTS: changed segment to insync from resync.

The master log file also contains information about these transitions:

2014-01-28 17:29:38.597679 EST,,,p26526,th49608608,,,,0,con3,,seg-1,,,,,"LOG","00000","FTS: updating segment to in-sync.",,,,,,,0,,"fts.c",1125,
2014-01-28 17:29:38.602814 EST,"gpadmin","postgres",p907,th49608608,"[local]",,2014-01-28 17:29:38 EST,1100119417,,cmd6,seg-1,,,x1100119417,sx1,"LOG","00000","statement: insert into gp_configuration_history values (now(), 2, 'FTS: changed segment to insync from resync.')",,,,,,"insert into gp_configuration_history values (now(), 2, 'FTS: changed segment to insync from resync.')",0,,"postgres.c",1542,
2014-01-28 17:29:38.603178 EST,"gpadmin","postgres",p907,th49608608,"[local]",,2014-01-28 17:29:38 EST,1100119417,,cmd8,seg-1,,,x1100119417,sx1,"LOG","00000","statement: insert into gp_configuration_history values (now(), 3, 'FTS: changed segment to insync from resync.')",,,,,,"insert into gp_configuration_history values (now(), 3, 'FTS: changed segment to insync from resync.')",0,,"postgres.c",1542,
2014-01-28 17:29:38.603334 EST,,,p26526,th49608608,,,,0,con3,,seg-1,,,,,"LOG","00000","FTS: resynchronization of mirror (dbid=3, content=0) on mdw:50000 has completed.",,,,,,,0,,"fts.c",1183,
  • Rebalance the segments (only required when roles have switched).

If the roles have switched, i.e., the preferred_role of a segment was primary ('p') but it is now acting as mirror ('m'), a rebalance operation using gprecoverseg is required to return the segments to their preferred roles; alternatively, you can restart the database.

Note: The rebalance operation causes the affected segments to restart and cancels any running queries, so please plan it during a maintenance window.
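Before rebalancing, you can check for active queries that would be cancelled (a minimal sketch using the standard pg_stat_activity view; on GPDB 4.x the query text column is current_query):

test=# select datname, usename, current_query from pg_stat_activity where current_query not like '<IDLE>%';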

You can verify whether the segment roles have switched using either of the two methods below:

-- Query gp_segment_configuration

warehouse=# select * from gp_segment_configuration where role!=preferred_role;
 dbid | content | role | preferred_role | mode | status | port  | hostname  | address   | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+-----------+-----------+------------------+------------
    2 |       0 | m    | p              | r    | u      | 40000 | localhost | localhost |            41000 |
    3 |       0 | p    | m              | r    | u      | 50000 | localhost | localhost |            51000 |

-- Verify using "gpstate -e". Look for "Segments with Primary and Mirror Roles Switched" in the output.

20150115:10:57:07:002364 gpstate:master:bhuvnesh-[INFO]:-----------------------------------------------------
20150115:10:57:07:002364 gpstate:master:bhuvnesh-[INFO]:-Segment Mirroring Status Report
20150115:10:57:07:002364 gpstate:master:bhuvnesh-[INFO]:-----------------------------------------------------
20150115:10:57:07:002364 gpstate:master:bhuvnesh-[INFO]:-Segments with Primary and Mirror Roles Switched
20150115:10:57:07:002364 gpstate:master:bhuvnesh-[INFO]:- Current Primary Port Mirror Port
20150115:10:57:07:002364 gpstate:master:bhuvnesh-[INFO]:- localhost 50000 localhost 40000
20150115:10:57:07:002364 gpstate:master:bhuvnesh-[INFO]:-----------------------------------------------------
  • Commands to perform the rebalance operation:
gpadmin:Fullrack@mdw $ gprecoverseg -r
or
gpadmin:Fullrack@mdw $ gpstop -raf

Please NOTE:

  1. If all the down segments are from a specific host, there could be a filesystem/switch/server problem. Engage the EMC DCA team to fix any hardware issue found.
  2. In certain cases (missing files on the mirror, etc.) the incremental recovery will fail. The reason will be found in the primary/mirror log files. If the problem is on the mirror, full recovery (gprecoverseg -F) can be used; see the sketch after this list. Full recovery transfers the entire contents of the primary segment data directory to the mirror.
  3. Usual reasons for a segment to fail:
  • Mirror cannot keep up with the workload - a log entry warning about timeout expiration will be recorded before the transition (WARNING: "threshold '75' percent of 'gp_segment_connect_timeout=' is reached mirror may not be able to keep up with primary, primary may transition to change tracking", with the hint "increase guc 'gp_segment_connect_timeout' by 'gpconfig' and 'gpstop -u'"); see the sketch after this list.
  • Segment crash - a log entry will be written with the type of the crash and usually the stack of the crashed process.
  • Segment out of memory / postmaster reset - if a segment runs out of memory in a critical location in the code, a postmaster reset is performed to protect the instance and the data.
  • Filesystem problems on the primary or mirror segment - the error message returned by the kernel will be recorded in the log file.
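Hedged examples of the commands referenced in the notes above (the timeout value of 600 is illustrative; choose one appropriate for your workload):

# Full recovery of a failed mirror (copies the entire primary data directory):
gpadmin:Fullrack@mdw $ gprecoverseg -F

# Raise gp_segment_connect_timeout cluster-wide, then reload the configuration:
gpadmin:Fullrack@mdw $ gpconfig -c gp_segment_connect_timeout -v 600
gpadmin:Fullrack@mdw $ gpstop -u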
