Pivotal Knowledge Base

Follow

Locks held by orphan process without PID or Session ID

Environment

Product Version
 Pivotal Greenplum (GPDB)  4.3.x
 OS  RHEL 6.x

Problem

  • One of the tables are locked and users are unable to query that table only. Other sessions should not face issue if not trying to access the locked table.
  • The pg_locks identifies there is a process holding the relation in AccessShareLock, but the "pid" and other process information are not shown for that transaction.

Here is a code during the time of the issue.

-- The oid of the table.

bi_prod=# select oid,relname ,relnamespace from pg_class where relname='dim_marketing_channels_stg';
    oid    |          relname           | relnamespace
-----------+----------------------------+--------------
 134578649 | dim_marketing_channels_stg |         2200
  25772518 | dim_marketing_channels_stg |     25770500
(2 rows)

Time: 2.130 ms

-- There is a lock on segment 50 ( content 50 ), but there is no PID / mppsessionid to identify the process information.

bi_prod=# select * from pg_locks where relation=25772518;
 locktype | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction | pid |      mode       | granted | mppsessionid | mppiswriter | gp_seg
ment_id
----------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-----+-----------------+---------+--------------+-------------+-------
--------
 relation | 19830152 | 25772518 |      |       |               |         |       |          |   257476400 |     | AccessShareLock | t       |            0 | f           |
     50
(1 row)

-- The current primary for the content 50.

gpadmin=# select * from gp_segment_configuration where content=50;
 dbid | content | role | preferred_role | mode | status | port  | hostname | address | replication_port | san_mounts
------+---------+------+----------------+------+--------+-------+----------+---------+------------------+------------
  101 |      50 | p    | p              | s    | u      | 40002 | sdw9     | sdw9-1  |            41002 |
  144 |      50 | m    | m              | s    | u      | 50003 | sdw12    | sdw12-2 |            51003 |
(2 rows)

Time: 2.098 ms

Cause

One of the main reason is when the process information is missing on pg_locks and when the query was under prepared transaction on one of the segments and the process crashes.

So in the above mentioned issue, we witness the following:

-- Login into the content 50 server.

[gpadmin@mdw ~]$ ssh sdw9
Last login: Thu Jun 26 03:18:11 2014 from mdw

-- Logged in via PGOPTIONS='-c gp_session_role=utility' psql -p 40002

-- Checking the prepared transaction view on the segment 50 , shows there is a open transaction 257476400 which matches the transaction column of pg_locks

bi_prod=# select * from pg_prepared_xacts ;
 transaction |          gid          |           prepared            |       owner       | database
-------------+-----------------------+-------------------------------+-------------------+-----------
   257476966 | 1401009074-0007049016 | 2014-06-24 09:08:33.450082-04 | gpmon             | gpperfmon
   257476400 | 1401009074-0007048086 | 2014-06-24 09:08:53.542825-04 | gp_etl_engine_dwh | bi_prod
(2 rows)

-- Master log file shows that master lost connection to this segment half through the transaction and this is the reason for the above

2014-06-24 09:08:34.120709 EDT,"gp_etl_engine_dwh","bi_prod",p13130,th-2095704672,"127.0.0.1","11640",2014-06-24 09:08:34 EDT,72048277,con17463,cmd3491,seg-1,,dx6397135,x72048277,sx1,"LOG","00000","DTM error (gathered 96 results from cmd 'Distributed Commit Prepared')","Lost connection to one or more segments - fault detector checking for segment failures. (seg50 sdw9:40002 pid=23693)",,,,,,0,,"cdbtm.c",1952,
2014-06-24 09:08:34.120742 EDT,"gp_etl_engine_dwh","bi_prod",p13130,th-2095704672,"127.0.0.1","11640",2014-06-24 09:08:34 EDT,72048277,con17463,cmd3491,seg-1,,dx6397135,x72048277,sx1,"WARNING","01000","The distributed transaction 'Commit Prepared' broadcast failed to one or more segments for gid = 1401009074-0007048086.",,,,,,,0,,"cdbtm.c",696,
2014-06-24 09:08:34.121868 EDT,"gp_etl_engine_dwh","bi_prod",p13130,th-2095704672,"127.0.0.1","11640",2014-06-24 09:08:34 EDT,72048277,con17562,,seg-1,,dx6397135,x72048277,sx1,"LOG","00000","The previous session was reset because its gang was disconnected (session id = 17463). The new session id = 17562",,,,,,,0,,"cdbgang.c",2772,

Resolution

Restart the database and that should clear the orphaned lock on the segment. Please do not rollback the transaction. 

Check whether lock is released after restart:

bi_prod=#  select * from pg_locks where relation=25772518;
 locktype | database | relation | page | tuple | transactionid | classid | objid | objsubid | transaction | pid | mode | granted | mppsessionid | mppiswriter | gp_segment_id
----------+----------+----------+------+-------+---------------+---------+-------+----------+-------------+-----+------+---------+--------------+-------------+---------------
(0 rows)

bi_prod=# select * from pg_prepared_xacts ;
 transaction |          gid          |           prepared            | owner | database
-------------+-----------------------+-------------------------------+-------+-----------
(0 rows)

Comments

Powered by Zendesk