Pivotal Knowledge Base

Follow

User logon Error "FATAL", "(57014)", "canceling statement due to statement timeout" seen in Pivotal Greenplum

Environment 

Product Version
Pivotal Greenplum (GPDB) 4.3.x
OS RHEL 6.x

Symptom 

Users are unable to log on to the database other than when using local host. When using external clients, all users, superuser or nonsuperuser, fail to log on. 

When users try to log on via external clients, it hangs for some time and errors out with the following errors in the master logs:

2016-08-06 12:51:39.878278 EDT,"pidsolardb1p","postgres",p40428,th-1642248416,"172.16.46.236","61277",2016-08-06 12:50:39 EDT,64569559,con455,,seg-1,,,x64569559,sx1,"LOG","00000","Issuing cancel signal (SIGINT) to my self (pid = 40428) for statement timeout.",,,,,,,0,,"proc.c",1551,
2016-08-06 12:51:39.878471 EDT,"pidsolardb1p","postgres",p40428,th-1642248416,"172.16.46.236","61277",2016-08-06 12:50:39 EDT,64569559,con455,,seg-1,,,x64569559,sx1,"LOG","00000","Process interrupt for 'query cancel pending'.",,,,,,,0,,"postgres.c",3523,
2016-08-06 12:51:39.878509 EDT,"pidsolardb1p","postgres",p40428,th-1642248416,"172.16.46.236","61277",2016-08-06 12:50:39 EDT,64569559,con455,,seg-1,,,x64569559,sx1,"FATAL","57014","canceling statement due to statement timeout",,,,,,,0,,"postgres.c",3544,

Resolution

All logon from outside clients was logged out irrespective of whether it was a superuser or nonsuperuser. There were no changes to the customer configuration.

The following was done to troubleshoot the issue:

  • Increased the log_min_messages to DEBUG1 from ERROR to login more details:
[gpadmin@mdw pg_log]$ gpconfig -s log_min_messages
Values on all segments are consistent
GUC          : log_min_messages
Master  value: DEBUG1
Segment value: DEBUG1
  • The master log logged in more details, as that master tried to acquire a session and hit the timeout : 

2016-08-06 13:47:17.264294 EDT,"test_sup","postgres",p53213,th-1642248416,"172.16.138.36","59400",2016-08-06 13:47:17 EDT,0,,,seg-1,,,,,"DEBUG1","00000","InitProcess(): gp_session_id -1, gp_is_callback 0",,,,,,,0,,"proc.c",407,

2016-08-06 13:47:17.297347 EDT,"test_sup","postgres",p53213,th-1642248416,"172.16.138.36","59400",2016-08-06 13:47:17 EDT,0,con1470,,seg-1,,,,,"DEBUG1","00000","SessionState_Acquire: pinCount: 1, activeProcessCount: 0",,,,,,,0,,"session_state.c",122,"Stack trace:

2016-08-06 13:48:17.298859 EDT,"test_sup","postgres",p53213,th-1642248416,"172.16.138.36","59400",2016-08-06 13:47:17 EDT,64571278,con1470,,seg-1,,,x64571278,sx1,"LOG","00000","Issuing cancel signal (SIGINT) to my self (pid = 53213) for statement timeout.",,,,,,,0,,"proc.c",1551,

  • pstack on the login pid, indicated that the connection was waiting on ldap:

[root@mdw ~]# pstack 53213

#0  0x000000364c80eaa0 in __connect_nocancel () from /lib64/libpthread.so.0

#1  0x00007f05a0de099e in ldap_pvt_connect () from /usr/local/greenplum-db/./lib/libldap-2.3.so.0

#2  0x00007f05a0de0e89 in ldap_connect_to_host () from /usr/local/greenplum-db/./lib/libldap-2.3.so.0

#3  0x00007f05a0dc8f41 in ldap_int_open_connection () from /usr/local/greenplum-db/./lib/libldap-2.3.so.0

#4  0x00007f05a0dddd83 in ldap_new_connection () from /usr/local/greenplum-db/./lib/libldap-2.3.so.0

#5  0x00007f05a0dc8929 in ldap_open_defconn () from /usr/local/greenplum-db/./lib/libldap-2.3.so.0

#6  0x00007f05a0ddd565 in ldap_send_initial_request () from /usr/local/greenplum-db/./lib/libldap-2.3.so.0

#7  0x00007f05a0dd0b2d in ldap_sasl_bind () from /usr/local/greenplum-db/./lib/libldap-2.3.so.0

#8  0x00007f05a0dd0c34 in ldap_sasl_bind_s () from /usr/local/greenplum-db/./lib/libldap-2.3.so.0

#9  0x00007f05a0dd15d5 in ldap_simple_bind_s () from /usr/local/greenplum-db/./lib/libldap-2.3.so.0

#10 0x000000000079b092 in CheckLDAPAuth ()

#11 0x000000000079bf90 in ClientAuthentication ()

#12 0x0000000000aef2f0 in InitPostgres ()

#13 0x0000000000969511 in PostgresMain ()

#14 0x00000000008c731e in ServerLoop ()

#15 0x00000000008ca0c0 in PostmasterMain ()

#16 0x00000000007c870f in main ()

  • strace all indicated that master was waiting on a response from an IP and hit the timeout. The IP pointed to customers ldap server:

[root@mdw ~]# strace -p 53213

Process 53213 attached - interrupt to quit

connect(5, {sa_family=AF_INET, sin_port=htons(389), sin_addr=inet_addr("172.16.94.52")}, 16) = ? ERESTARTSYS (To be restarted)

--- SIGALRM (Alarm clock) @ 0 (0) ---

select(3, NULL, [2], NULL, {1, 0})      = 1 (out [2], left {0, 999996})

write(2, "\0\0\0\0\304\0\0\0\335\317\0\0 G\35\236 G\35\236\0\0\0\0tcf\0\0\2\0\0"..., 244) = 244

Hence the customer was requested to check their LDAP server which was apparently hung. Once customer rebooted the LDAP server, connections completed successfully.

 

 

Comments

Powered by Zendesk