Segment server goes down due to "Thread overran stack, or stack corrupted"


  • DCA
  • Red Hat Enterprise Linux 5.7 or later
  • XFS file system


The kernel panics with the instruction pointer at print_context_stack and the message "Thread overran stack, or stack corrupted". This leads to:

  1. Segments keep going down
  2. Kernel Panic

The following error is reported in /var/crash/kernel/vmcore-dmesg:

<4>CPU 16
<4>Modules linked in: autofs4 ipmi_devintf fuse cpufreq_ondemand acpi_cpufreq freq_table mperf bonding 8021q garp stp llc ipv6 xfs exportfs iTCO_wdt iTCO_vendor_support microcode bna(U) sb_edac edac_core i2c_i801 lpc_ich mfd_core ioatdma sg igb dca i2c_algo_bit i2c_core ptp pps_core ext3 jbd mbcache sd_mod crc_t10dif bfa(U) scsi_transport_fc scsi_tgt megaraid_sas wmi dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>Pid: 197433, comm: python Not tainted 2.6.32-431.23.3.el6.x86_64 #1 EMC S2600GZ/S2600GZ
<4>RIP: 0010:[<ffffffff8101114d>] [<ffffffff8101114d>] print_context_stack+0xad/0x140
<4>RSP: 0018:ffff8800459038a8 EFLAGS: 00010002
<4>RAX: 00000000000250ac RBX: ffff880045903da0 RCX: 0000000000001a07
<4>RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000046
<4>Process python (pid: 197433, threadinfo ffff88091af7e000, task ffff88085fd4f500)
<4> ffff880000000018 ffff88091af7fff8 ffff880045903dc8 ffff880045901fc0
<4><d> ffffffff817a1c8a ffffffff810153a3 ffffffff817e99cb ffff880045903d98
<4><d> 000000000000cbe0 ffffffff816004a0 ffffffff817a1c8a ffff880045903fc0
<4>Call Trace:
<4> <IRQ>
<4> [<ffffffff810153a3>] ? native_sched_clock+0x13/0x80
<1>BUG: unable to handle kernel paging request at 0000000000025a94
<1>IP: [<ffffffff8101114d>] print_context_stack+0xad/0x140
<4>PGD 101ba5a067 PUD 93e88f067 PMD 0
<0>Thread overran stack, or stack corrupted
<4>Oops: 0000 [#3] SMP
<4>last sysfs file: /sys/devices/system/cpu/online
<0>Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: ffffffff8128bafe
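To check whether a saved dmesg file shows the same symptom, the log can be searched for the messages seen above. The following is a minimal sketch, not a supported diagnostic tool: the function names find_signatures and xfs_loaded are hypothetical, and the signature strings are copied from this excerpt only, so other variants of the panic may not match.

```python
# Signature strings copied from the vmcore-dmesg excerpt in this article.
SIGNATURES = (
    "Thread overran stack, or stack corrupted",
    "print_context_stack",
    "stack-protector: Kernel stack is corrupted",
)


def find_signatures(dmesg_text):
    """Return the known signature strings that occur in the dmesg text."""
    return [sig for sig in SIGNATURES if sig in dmesg_text]


def xfs_loaded(dmesg_text):
    """Return True if the first 'Modules linked in:' line lists xfs."""
    marker = "Modules linked in:"
    for line in dmesg_text.splitlines():
        if marker in line:
            modules = line.split(marker, 1)[1].split()
            return "xfs" in modules
    return False
```

A possible use is to read /var/crash/kernel/vmcore-dmesg into a string and pass it to both functions. If all three signatures are found and xfs is loaded, the crash is likely the one described here.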


This is caused by a Red Hat kernel bug. The Red Hat article on this bug describes XFS overrunning the kernel stack.


The stack overrun has been addressed in kernel-2.6.32-504.el6. To avoid the crash, upgrade to kernel-2.6.32-504.el6 or later. On a DCA, upgrade to a DCA software release that includes this fix.
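To see whether a host already runs a kernel at or above the fixed build, the kernel release string can be compared with 2.6.32-504. The sketch below assumes a RHEL 6 style release string such as the one in the log above. The helper names kernel_tuple and has_fix are hypothetical, and only the first number after the dash is compared, so minor errata numbers such as the .23.3 in 431.23.3 are ignored.

```python
import re

# First fixed build according to this article: kernel-2.6.32-504.el6
FIXED = (2, 6, 32, 504)


def kernel_tuple(release):
    """Parse a release such as '2.6.32-431.23.3.el6.x86_64' into (2, 6, 32, 431)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def has_fix(release):
    """Return True if the release is at or above the fixed build."""
    return kernel_tuple(release) >= FIXED
```

On a live host the running release is available from platform.release() in the Python standard library, so has_fix(platform.release()) gives a quick answer. The kernel from the log above, 2.6.32-431.23.3.el6.x86_64, is below the fixed build.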


