Pivotal Knowledge Base

Follow

Scripts to Terminate all idle connections in Pivotal Greenplum database cluster

Environment

Product Version
Pivotal Greenplum (GPDB) All versions

Purpose

How to kill all idle connections in Pivotal Greenplum Database cluster?

Cause

Users of Greenplum Database may open connections to the database through clients such as PSQL, and then leave those connections open and idle for long periods of time without logging off. Depending on the number of users, this could consume system resources and claim most or all of the available connections.

For example, in a system using the default setting for the server parameter max_connections (25 connections on the master), 24 idle connections would leave only one available connection before reaching the maximum.

Note:

Please verify the script mentioned in the documents on a test cluster before running it on production or important clusters/database, as those scripts are for education purpose only.

Resolution

When client sessions terminates / exit, Greenplum session cleanup will happen automatically.

Copy the following script, onto a file name idleconn.sh, to terminate / kill idle connections on the Greenplum Database master instance.

#!/bin/sh
conn=0
TIME=2700
HELP=0
################################################################################################
# Objective  : To identify the CONNECTIONS  IDLE for more than 45 minutes
# Summary    : This will identify the pid's and will issues a kill for then:
# Logging    : All logs will be stored in $HOME/gpAdminLogs/gp_idle.log
# Consequences  : Connections killed will get the below error messages and they would need to reconnect
#            FATAL:  terminating connection due to administrator command
#            server closed the connection unexpectedly
#                This probably means the server terminated abnormally
#                before or while processing the request.
#            The connection to the server was lost. Attempting reset: Succeeded.
################################################################################################
help() {
echo '
    -h    : Display help
    -c    : kill idle connections
    -t    : age of query and connection [default: 2700 seconds ]
    idleconn.sh -c  -t 3000 [default: 2700 values in seconds]
    idleconn.sh -c            [default: 2700 values in seconds]
'
}
source $GPHOME/greenplum_path.sh
LOGFILE=${HOME}/gpAdminLogs/gp_idle_$(date '+%Y%m%d').log

#
# Function to log the idle connections onthe system
#
logging (){
   # Logging all the connections which were idle for more than 2700s

   # If query_start is null then the user connected but ran no query. So check the backend_start time.
   psql -t -c "select * from pg_stat_activity where query_start is null and now()-backend_start > '${TIME}s'" template1 >> $LOGFILE

   # if query_start is not null then the user ran a query at some stage. So check the query_start time rather than the backend_start.
   psql -t -c "select * from pg_stat_activity where query_start is not null and now()-query_start > '${TIME}s'" template1 >> $LOGFILE
}

#
# Function to kill the idle connections
#
idle_conn () {
   # Generating pid's for connections opened for more then 45 minutes :

   # If query_start is null then the user connected but ran no query. So check the backend_start time.
   psql -A -t -c "SELECT 'kill '||procpid from pg_stat_activity where query_start is null and now()-backend_start > '${TIME}s' and current_query='' " template1 | bash

   # if query_start is not null then the user ran a query at some stage. So check the query_start time rather than the backend_start.
   psql -A -t -c "SELECT 'kill '||procpid from pg_stat_activity where query_start is not null and now()-query_start > '${TIME}s' and current_query='' " template1 | bash
}


#
# Main program
#
if [ $# -ne 0 ] ; then
   while getopts "ct:h" opts ; do
      case  "$opts" in
         c)
            conn=1
            ;;
         t)
            TIME=$OPTARG;
            ;;
         h)
            HELP=1
            ;;
      esac
   done
fi

echo $(date)" -- Start" >> $LOGFILE

if [ $HELP -eq 1 ] || [ $conn -eq 0 ];then
   help
fi
if [ $conn -eq 1 ];then
   logging
   idle_conn
fi
echo $(date)" -- End" >> $LOGFILE


#Remove any old log file...
find ${HOME}/gpAdminLogs -name gp_idle\*.log -mtime +60 -exec rm {} \;

To execute the script, use

/bin/sh idleconn.sh -c

To execute script to terminate IDLE connection running longer than 3000 sec use

/bin/sh idleconn.sh -c -t 3000

By default the -t is set to 2700 seconds.

Note : In Greenplum 4.x, Greenplum introduced GUC "gp_vmem_idle_resource_timeout" to free up idle session resources ( only segment connection, not master session) For more details reference Greenplum Database 4.x Administrator's Guide from pivotal documentation web site.

Comments

  • Avatar
    Christian

    nice article, very helpful, but if you need to kill it "live" I use that command ;P

    select pg_terminate_backend(procpid) from pg_stat_activity where current_query='<IDLE>';

  • Avatar
    Joe Manning

    I modified the script to check the query_start if it is not NULL. But if it is NULL, then check the backend_start.
    I think it is better to check the query_start time rather than the time the client connected (backend_start).

  • Avatar
    Danilo Fortunato

    I would modify the script in this way:
    - terminate the idle sessions using pg_terminate_backend function, instead of killing the processes at the operating system level, as it seems much safer
    - replace current_query='' with current_query='<IDLE>', as this is the value I get from pg_stat_activity

    Edited by Danilo Fortunato
Powered by Zendesk