Pivotal Knowledge Base

HowTo - Enable core generation on a server

Goal

This article describes the steps required to enable core file generation. Core files are very useful for debugging an application crash, so we recommend enabling core generation on the master and segment servers for HAWQ and Greenplum Database.

Solution

Before proceeding, identify the current core file setting on the server. The command below shows whether core generation is enabled. In this example the core file size limit is 0, which means core file generation is effectively disabled (no core file will be saved).

gpadmin$ ulimit -a
core file size (blocks, -c) 0 
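
As a small sketch of how this check could be scripted, the snippet below interprets the value reported by ulimit -c. The function name core_limit_status is a hypothetical helper introduced here for illustration; it is not part of any Greenplum or HAWQ tooling.

```shell
# Hypothetical helper: describe a core file size limit as reported by "ulimit -c".
core_limit_status() {
  case "$1" in
    0)         echo "disabled" ;;
    unlimited) echo "unlimited" ;;
    *)         echo "limited to $1 blocks" ;;
  esac
}

# Report the soft limit of the current shell.
core_limit_status "$(ulimit -S -c)"
```

A result of "disabled" corresponds to the 0 shown in the ulimit -a output above.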

Follow these steps to enable core file generation:

1. Make the core file size change persistent

As root, open the file /etc/security/limits.d/corefiles.conf (create it if it does not exist) and add the following:

# Core file size set to unlimited
gpadmin - core unlimited

Save the file, log in as the gpadmin user, and verify that both the soft and hard limits are set to unlimited:

[root@hdp1 ~]# su - gpadmin
[gpadmin@hdp1 ~]$ ulimit -S -c
unlimited
[gpadmin@hdp1 ~]$ ulimit -H -c
unlimited
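
If you prefer to create the limits file from a script rather than an editor, something along these lines could work. This is only a sketch: write_core_limit is a hypothetical function name, and the target path and user are passed in as arguments so the function can be tried against a scratch file first.

```shell
# Hypothetical helper: write the core limit entry for a user to a limits file.
#   $1 = target file, $2 = user name
# Note: this overwrites the target file.
write_core_limit() {
  printf '# Core file size set to unlimited\n%s - core unlimited\n' "$2" > "$1"
}

# Usage (as root):
#   write_core_limit /etc/security/limits.d/corefiles.conf gpadmin
```

The "-" in the entry sets both the soft and the hard limit, which is why both ulimit checks above report unlimited.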

2. Define naming convention and location for core files

As root, open the file /etc/sysctl.d/cores_sysctl.conf and add the lines below. Parameter names depend on the operating system of your server; the example below is for RHEL servers.

kernel.core_uses_pid = 1
kernel.core_pattern = /<directory>/core-%e-%s-%u-%g-%p-%t

Replace <directory> with the directory where you want the core files placed. Core files may be several GB in size, so choose a location with enough free space.

where:
kernel.core_uses_pid = 1 - Appends the PID of the dumping process to the core file name. When kernel.core_pattern already contains %p, as in this example, the PID appears where %p is placed and this setting has no additional effect.
kernel.core_pattern = /<directory>/core-%e-%s-%u-%g-%p-%t - Controls the exact location and name of the core file. When the application terminates abnormally, a core file should appear in the directory chosen above. The file name template can contain % specifiers, which are substituted with the following values when a core file is created:
%% - A single % character
%p - PID of dumped process
%u - real UID of dumped process
%g - real GID of dumped process
%s - number of signal causing dump
%t - time of dump (seconds since 0:00h, 1 Jan 1970)
%h - hostname (same as ’nodename’ returned by uname(2))
%e - executable filename
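
To make the substitution concrete, the sketch below imitates it in the shell for the specifiers used in this article's pattern. It is an illustration only: expand_core_pattern is a hypothetical function, the real expansion is done by the kernel, and this sketch copies any specifier it does not know (such as %h) through unchanged.

```shell
# Hypothetical illustration of core_pattern expansion (the kernel does the real work).
#   $1 = pattern, $2 = executable, $3 = signal, $4 = UID, $5 = GID, $6 = PID, $7 = time
expand_core_pattern() {
  rest=$1; exe=$2; sig=$3; uid=$4; gid=$5; pid=$6; ts=$7
  out=""
  while [ -n "$rest" ]; do
    case "$rest" in
      %%*) out="$out%";    rest=${rest#??} ;;
      %e*) out="$out$exe"; rest=${rest#??} ;;
      %s*) out="$out$sig"; rest=${rest#??} ;;
      %u*) out="$out$uid"; rest=${rest#??} ;;
      %g*) out="$out$gid"; rest=${rest#??} ;;
      %p*) out="$out$pid"; rest=${rest#??} ;;
      %t*) out="$out$ts";  rest=${rest#??} ;;
      # Any other character is copied through one at a time.
      *)   out="$out${rest%"${rest#?}"}"; rest=${rest#?} ;;
    esac
  done
  printf '%s\n' "$out"
}

# Example: a "sleep" process (PID 13941, UID/GID 500) killed by signal 6 (SIGABRT).
expand_core_pattern '/var/core/core-%e-%s-%u-%g-%p-%t' sleep 6 500 500 13941 1444594022
```

With these inputs the result has the form /var/core/core-sleep-6-500-500-13941-1444594022, that is, core-<executable>-<signal>-<UID>-<GID>-<PID>-<time>.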

Make sure the directory you choose has permissions 1777 (drwxrwxrwt). Otherwise, gpadmin will not be able to write core files there.
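
A minimal sketch of preparing such a directory is shown below. The function name prepare_core_dir is hypothetical, and the directory is passed as an argument so the same steps can be tried on a scratch location before touching the real one.

```shell
# Hypothetical helper: create a core dump directory with mode 1777
# (world-writable with the sticky bit set, like /tmp).
prepare_core_dir() {
  mkdir -p "$1" && chmod 1777 "$1"
}

# Usage (as root), using the directory from this article's sysctl example:
#   prepare_core_dir /var/crash/user
#   ls -ld /var/crash/user    # the mode column should read drwxrwxrwt
```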

3. Apply changes at runtime

Load the configuration from /etc/sysctl.d/cores_sysctl.conf using the following command:

[root@hdp1 ~]# sysctl -p /etc/sysctl.d/cores_sysctl.conf
kernel.core_uses_pid = 1
kernel.core_pattern = /var/crash/user/core-%e-%s-%u-%g-%p-%t
[root@hdp1 ~]#

4. Verification

Verify the runtime values using the following commands:

[root@hdp1 ~]# sysctl kernel.core_pattern
kernel.core_pattern = /var/crash/user/core-%e-%s-%u-%g-%p-%t
[root@hdp1 ~]# sysctl kernel.core_uses_pid
kernel.core_uses_pid = 1
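
When checking the runtime pattern, it can also help to confirm where the kernel will actually send the dump. The sketch below classifies a core_pattern value; core_pattern_target is a hypothetical helper name. It relies on two documented behaviours of core_pattern: a value starting with "|" is piped to a program instead of being written to a file, and a value that is not an absolute path is resolved relative to the crashing process's working directory.

```shell
# Hypothetical helper: report where a given core_pattern value sends core dumps.
core_pattern_target() {
  case "$1" in
    '|'*) echo "pipe" ;;        # handed to a helper program, not written directly
    /*)   dirname "$1" ;;       # absolute path: print the target directory
    *)    echo "cwd" ;;         # relative to the crashing process's working directory
  esac
}

# Usage:
#   core_pattern_target "$(sysctl -n kernel.core_pattern)"
```

For the pattern used in this article the helper prints the directory part, which is the directory that must exist and be writable by gpadmin.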

Note: The Greenplum / HDB database must be restarted for these changes to take effect.

Note 2: Log out of your gpadmin session and log back in so that it picks up the new limits.

After restarting Greenplum / HDB, the following command helps verify the limits in effect for the running Greenplum / HDB processes:

cat /proc/$(pgrep -f silent)/limits
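
If pgrep matches more than one process (for example on a host running several segments), the single cat command above will not resolve to a valid path. A possible variant that loops over every match and prints only the core file size line is sketched below; core_limit_line and report_core_limits are hypothetical helper names.

```shell
# Hypothetical helper: print the core file size line from a /proc/<pid>/limits file.
core_limit_line() {
  grep '^Max core file size' "$1"
}

# Hypothetical helper: print the core limit of every process whose
# command line matches the pattern in $1.
report_core_limits() {
  for pid in $(pgrep -f "$1"); do
    printf '%s: ' "$pid"
    core_limit_line "/proc/$pid/limits"
  done
}

# Usage, with the same match string as the command above:
#   report_core_limits silent
```

Each output line shows the soft and hard core file size limits for one process; both should read unlimited once the changes are in effect.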

Comments

  • Ignacio Elizaga

    Just a quick comment on this: the gpadmin user needs to be able to write to the directory named in kernel.core_pattern. So we need to make sure that this directory is either owned by the gpadmin user or has drwxrwxrwt permissions. Otherwise, the root user will be able to generate core files but gpadmin will not.

  • Gowri Kothandaraman

    Yes Ignacio, you are correct. The root user will be able to generate core files but gpadmin will not. Sample repro:

    Root:-

    [root@gpdb_singlenode var]# pwd
    /var

    [root@gpdb_singlenode var]# ls -lrtha core
    total 8.0K
    drwxr-xr-x. 21 root root 4.0K Oct 11 12:59 ..
    drwxr-xr-x 2 root root 4.0K Oct 11 12:59 .

    [root@gpdb_singlenode var]# id
    uid=0(root) gid=0(root) groups=0(root)

    [root@gpdb_singlenode var]# sleep 60 &
    [1] 13729
    [root@gpdb_singlenode var]# kill -SIGABRT 13729

    [root@gpdb_singlenode var]# cd core/
    [1]+ Aborted (core dumped) sleep 60 (wd: /var)
    (wd now: /var/core)

    [root@gpdb_singlenode core]# pwd
    /var/core

    [root@gpdb_singlenode core]# ls -lrtha
    total 116K
    drwxr-xr-x. 21 root root 4.0K Oct 11 12:59 ..
    -rw------- 1 root root 316K Oct 11 13:00 core-sleep-6-0-0-13729-1444593639
    drwxr-xr-x 2 root root 4.0K Oct 11 13:00 .

    GPADMIN:-

    -bash-4.1$ sleep 60 &
    [1] 13785

    -bash-4.1$ kill -SIGABRT 13785

    -bash-4.1$ cd core/
    [1]+ Aborted sleep 60 (wd: /var)
    (wd now: /var/core)

    -bash-4.1$ ls -lrtha
    total 116K
    drwxr-xr-x. 21 root root 4.0K Oct 11 12:59 ..
    -rw------- 1 root root 316K Oct 11 13:00 core-sleep-6-0-0-13729-1444593639
    drwxr-xr-x 2 root root 4.0K Oct 11 13:00 .

    I then changed the permissions of the core directory to 777:

    [root@gpdb_singlenode var]# chmod 777 core
    [root@gpdb_singlenode var]# stat core
    File: `core'
    Size: 4096 Blocks: 8 IO Block: 4096 directory
    Device: fd00h/64768d Inode: 1051472 Links: 2
    Access: (0777/drwxrwxrwx) Uid: ( 0/ root) Gid: ( 0/ root)
    Access: 2015-10-11 13:05:56.643808523 -0700
    Modify: 2015-10-11 13:00:39.205808517 -0700
    Change: 2015-10-11 13:06:10.517808539 -0700

    ====
    -bash-4.1$ id
    uid=500(gpadmin) gid=500(gpadmin) groups=500(gpadmin)

    -bash-4.1$ kill -SIGABRT 13941

    -bash-4.1$ ls -lrtha
    total 224K
    drwxr-xr-x. 21 root root 4.0K Oct 11 12:59 ..
    -rw------- 1 root root 316K Oct 11 13:00 core-sleep-6-0-0-13729-1444593639
    -rw------- 1 gpadmin gpadmin 316K Oct 11 13:07 core-sleep-6-500-500-13941-1444594022
    drwxrwxrwx 2 root root 4.0K Oct 11 13:07 .
    [1]+ Aborted (core dumped) sleep 60

    ====

    After changing the permissions of the core directory, gpadmin is now able to generate a core file (PID 13941).
