北肙

When you can no longer have something, the only thing you can do is make sure you do not forget it.

Enabling Kernel Dump and Analyzing the File for RHEL-based Linux


The Kdump Procedure

  1. The normal kernel is booted with the option crashkernel=<size>M, reserving <size> MB of memory for the kdump kernel. This memory is unavailable to the normal kernel during regular operation.
  2. The kernel panics.
  3. The kdump kernel is booted using kexec and runs in the <size> MB of memory that was reserved.
  4. The normal kernel's memory is captured into a vmcore file.
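Step 1 can be checked on a running system by looking for the crashkernel option on the kernel command line. The following is a minimal sketch; the helper name crashkernel_arg is made up for this example and is not part of kdump.

```shell
#!/bin/sh
# crashkernel_arg (hypothetical helper): print the value of crashkernel=
# from a kernel command line string, or print nothing if it is absent.
crashkernel_arg() {
    printf '%s\n' "$1" | tr ' ' '\n' | sed -n 's/^crashkernel=//p'
}

# Inspect the running kernel when /proc/cmdline is readable.
if [ -r /proc/cmdline ]; then
    value=$(crashkernel_arg "$(cat /proc/cmdline)")
    echo "crashkernel on the running kernel: ${value:-not set}"
fi
```

If the value is reported as not set, the running kernel was booted without a reservation and step 3 cannot happen.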

1. Enabling Kdump Service

The Memory Usage

Memory reserved for the kdump kernel is taken away from the normal kernel for as long as the system is booted.

Minimum memory requirements: 128MB, plus 64MB for each TB of RAM
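As a worked example of the rule above, the sketch below computes the minimum reservation for a given amount of RAM. The function name min_crashkernel_mb is hypothetical, and it assumes RAM is given in whole TB.

```shell
#!/bin/sh
# min_crashkernel_mb (hypothetical helper): 128 MB base plus 64 MB per TB of RAM.
# Argument: RAM size in whole TB.
min_crashkernel_mb() {
    ram_tb=$1
    echo $((128 + 64 * ram_tb))
}

min_crashkernel_mb 1   # prints 192
min_crashkernel_mb 4   # prints 384
```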

The value of crashkernel

crashkernel=128M
# or
crashkernel=auto

See Error 3 in section 5 for the recommended values.

1.1 For RHEL6

Important
It is highly recommended to disable IOMMU on Intel-based machines, because Intel IOMMU sometimes interferes with the kdump service.

Warning
kdump is not completely compatible with HP Smart Array devices and HP system boards. It is highly recommended to configure kdump to store the crash dump on a remote server via NFS or SSH.

1.1.1 The Configuration

Add crashkernel=<size>M or crashkernel=auto to the kernel options in the file /boot/grub/grub.conf.

# vi /boot/grub/grub.conf

or

grubby --update-kernel=ALL --args="crashkernel=auto"

/etc/grub.conf is a symbolic link to /boot/grub/grub.conf
cp -av /etc/grub.conf /boot/grub/

1.1.2 Enabling the kdump service

chkconfig kdump on
service kdump start

1.2 For RHEL7

1.2.1 Add grub kernel option

grubby --update-kernel=ALL --args="crashkernel=auto"

1.2.2 Enabling the kdump service

systemctl enable --now kdump
systemctl status kdump

1.3 Enabling kdump for a specified kernel

1.3.1 Listing the installed kernels

ls -l /boot/vmlinuz-*

1.3.2 Add kernel option for a specified version

grubby --update-kernel=vmlinuz-a.bc.d-xyz-el7.x86_64 --args="crashkernel=auto"

1.3.3 Enabling the service

systemctl enable --now kdump

or
chkconfig kdump on
service kdump start

1.4 Disabling kdump

redhat.com: 16.3. Disabling the kdump service

2. The Configuration of Kdump

The configuration file

/etc/kdump.conf

2.1 Configuring the Target Type

2.1.1 Storing the core dump in a local file system (default)

path /var/crash

2.1.2 Storing the files on a partition (optional)

ext4 /dev/sda3

or

ext4 LABEL=/boot

or

ext4 UUID=03138356-5e61-4ab3-b58e-27507ac41937

2.1.3 Storing the files on a raw device (optional)

raw /dev/sda5

2.1.4 Storing the files on an NFS server (optional)

net my.server.com:/export/tmp

2.1.5 Storing the files over SSH (optional)

net [email protected]

2.2 Configuring the Core Collector

Core Collector

An application used to reduce the size of the vmcore file, by compressing it or by excluding selected data.

Default Value

core_collector makedumpfile -c --message-level 1 -d 31

2.2.1 Enabling file compression

core_collector makedumpfile -c

2.2.2 Excluding pages

core_collector makedumpfile -c -d <value>
Supported values

Option   Description
1        Zero pages
2        Cache pages
4        Cache private
8        User pages
16       Free pages
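The values in the table are bit flags, so the number passed to -d is their sum; the default of 31 excludes all five page types. The sketch below shows the arithmetic. The variable and function names (DL_ZERO, dump_level, and so on) are made up for this example.

```shell
#!/bin/sh
# Page types that makedumpfile -d can exclude, as listed in the table above.
# These variable names are illustrative only.
DL_ZERO=1
DL_CACHE=2
DL_CACHE_PRIVATE=4
DL_USER=8
DL_FREE=16

# dump_level (hypothetical helper): combine the given flags into one -d value.
dump_level() {
    level=0
    for flag in "$@"; do
        level=$((level | flag))
    done
    echo "$level"
}

dump_level $DL_ZERO $DL_CACHE $DL_CACHE_PRIVATE $DL_USER $DL_FREE   # prints 31
dump_level $DL_ZERO $DL_FREE                                        # prints 17
```

For instance, a value of 17 would drop only zero pages and free pages, and keep cache and user pages in the vmcore.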

2.3 Changing the Default Action

The action taken when kdump fails to save the core dump to the configured target

Supported actions

Option                Description
reboot                Reboot the system.
halt                  Halt the system.
poweroff              Power off the system.
shell                 Run msh so that the user can record the core manually.
mount_root_run_init   Legacy behavior, for RHEL 6.2 and earlier.

2.3.1 Default value

default shell

3. Make the Kernel Crash (for testing)

3.1 Method 1: Sending a SysRq (system request)

echo 1 > /proc/sys/kernel/sysrq
echo c > /proc/sysrq-trigger
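Before triggering a test panic, it is worth confirming that the kdump kernel is actually loaded; otherwise the machine crashes without producing a vmcore. The sketch below reads /sys/kernel/kexec_crash_loaded, which contains 1 when a crash kernel is loaded. The function name crash_kernel_loaded is hypothetical, and the script only reports the state; it does not trigger the crash.

```shell
#!/bin/sh
# crash_kernel_loaded (hypothetical helper): succeed if the given state file
# (default: /sys/kernel/kexec_crash_loaded) contains "1".
crash_kernel_loaded() {
    state_file=${1:-/sys/kernel/kexec_crash_loaded}
    [ -r "$state_file" ] && [ "$(cat "$state_file")" = "1" ]
}

if crash_kernel_loaded; then
    echo "kdump kernel is loaded; a test panic should produce a vmcore"
else
    echo "kdump kernel is NOT loaded; fix the kdump service before testing"
fi
```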

3.2 Method 2: Insert a NULL Pointer Kernel Module

Installing the packages

yum install -y make gcc

Creating a temporary working folder

mkdir /tmp/src && cd /tmp/src

Writing files named kernelpanic.c and Makefile

kernelpanic.c

#include <linux/kernel.h>
#include <linux/module.h>
MODULE_LICENSE("GPL"); // required by the compiler

int init_module(void){
    panic("Kernel Panic test");
    return 0;
}

Makefile

obj-m += kernelpanic.o
all:
# the recipe line below must start with a tab character, not spaces
	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

Building the kernel module

cd /tmp/src && make
[root@localhost]# make
make -C /lib/modules/2.6.32-431.el6.x86_64/build M=/tmp/null_pointer modules
make[1]: Entering directory `/usr/src/kernels/2.6.32-431.el6.x86_64'
  CC [M]  /tmp/null_pointer/kernelpanic.o
  Building modules, stage 2.
  MODPOST 1 modules
  CC      /tmp/null_pointer/kernelpanic.mod.o
  LD [M]  /tmp/null_pointer/kernelpanic.ko.unsigned
  NO SIGN [M] /tmp/null_pointer/kernelpanic.ko
make[1]: Leaving directory `/usr/src/kernels/2.6.32-431.el6.x86_64'

Insert the kernel module kernelpanic.ko to make the kernel panic

insmod kernelpanic.ko

4. Analyzing the Core Dump

4.1 Installing the utilities

  • Required Packages

    1. crash
    2. kernel-debuginfo
  • install crash

    yum install -y crash
  • kernel-debuginfo download URL

    Centos Debug Info

4.2 Running the Crash Utility

  • file

    /var/crash/.../vmcore
    /var/crash/.../vmcore-dmesg.txt
  • command

    vi vmcore-dmesg.txt
    crash /usr/lib/debug/lib/modules/$(uname -r)/vmlinux /path/to/vmcore
  • crash command

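    • ps - processes that were running at the time of the crash
    • vm - virtual memory information of the current context at the time of the crash
    • files - files that were open at the time of the crash
    • log - the kernel message buffer, usually the same content as vmcore-dmesg.txt
    • bt - kernel stack backtrace
    • dis -l - disassemble, with source line numbers
    • struct -o vm_struct - a structure definition with member offsets
    • sys - system information
    • kmem -i - memory usage summary
    • p sysctl_tcp_rmem - print the value of a kernel variable
    • l *0xffffffff80280fa7 - list the source code at an address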
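The commands above can also be run in one batch instead of typing them at the crash prompt, since crash accepts a command file through its -i option. The sketch below only writes such a file; the helper name write_crash_commands and the report file name are made up, and the vmlinux and vmcore paths in the final comment are placeholders.

```shell
#!/bin/sh
# write_crash_commands (hypothetical helper): write a small batch of crash
# commands to the given file. The final "exit" makes crash quit when done.
write_crash_commands() {
    cat > "$1" <<'EOF'
sys
bt
ps
kmem -i
exit
EOF
}

cmds=$(mktemp)
write_crash_commands "$cmds"
cat "$cmds"

# On a machine with crash and kernel-debuginfo installed (placeholder paths):
# crash -i "$cmds" /path/to/vmlinux /path/to/vmcore > crash-report.txt
```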

5. Troubleshooting

Error 1
kdump: No crashkernel parameter specified for running kernel

Verify the amount of RAM present and check that the crashkernel option appears in /proc/cmdline. If this is a virtual machine, make sure it has enough memory for kdump.

Error 2
Detected change(s) the following file(s):

  /etc/kdump.conf
Rebuilding /boot/initrd-2.6.32-431.el6.x86_64kdump.img
action will be preformed. is not a valid default option
Failed to run mkdumprd
Starting kdump:                                            [FAILED]

Check the file /etc/kdump.conf and make sure no extra lines have been uncommented.
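One quick way to spot a stray uncommented line is to print only the active lines of the file, skipping comments and blank lines. This is a minimal sketch; the function name active_lines is hypothetical.

```shell
#!/bin/sh
# active_lines (hypothetical helper): print every line of a config file that
# is neither blank nor a comment.
active_lines() {
    grep -Ev '^[[:space:]]*(#|$)' "$1"
}

conf=/etc/kdump.conf
if [ -r "$conf" ]; then
    active_lines "$conf" || echo "no active directives in $conf"
fi
```

Every line it prints should be a directive you intended to enable, such as path, core_collector, or default.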

Error 3
Your running kernel is using more than 70% of the amount of space you reserved for 
kdump, you should consider increasing your crashkernel reservation
  1. The Slab value from /proc/meminfo is the size of the in-kernel data structure cache. It depends on the total amount of RAM present in the system. It is not constant and can change during operation of the server.
    awk '/Slab:.*/ {print $2}' /proc/meminfo
    # the options of the running kernel
    cat /proc/cmdline
    # the amount of memory reserved for the crash kernel, in bytes
    cat /sys/kernel/kexec_crash_size
  2. If the Slab value is bigger than 70% of the memory that was reserved with the crashkernel parameter, the warning is printed.
  3. Recommended values for crashkernel:
    crashkernel=0M-2G:128M,2G-6G:256M,6G-8G:512M,8G-:768M
  4. RHEL recommends the following mappings of RAM size to crashkernel value:

RAM size   crashkernel parameter   RAM/crashkernel factor
>0GB       128MB                   15
>2GB       256MB                   23
>6GB       512MB                   15
>8GB       768MB                   31
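The mapping of RAM size to crashkernel value can be expressed as a small lookup. The sketch below follows the thresholds listed for Error 3 (more than 2GB, 6GB, and 8GB); the function name recommended_crashkernel_mb is hypothetical, and it assumes RAM is given in whole GB.

```shell
#!/bin/sh
# recommended_crashkernel_mb (hypothetical helper): map RAM size in whole GB
# to the recommended crashkernel reservation in MB.
recommended_crashkernel_mb() {
    ram_gb=$1
    if   [ "$ram_gb" -gt 8 ]; then echo 768
    elif [ "$ram_gb" -gt 6 ]; then echo 512
    elif [ "$ram_gb" -gt 2 ]; then echo 256
    else                           echo 128
    fi
}

recommended_crashkernel_mb 4    # prints 256
recommended_crashkernel_mb 16   # prints 768
```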

Note:

Maximum size of crashkernel is 896M.

Usually, crashkernel=auto is sufficient for most situations.
If it is not, reserve more memory with the syntax crashkernel=XM, where X is the amount of memory to reserve in megabytes (X+64MB is the memory actually reserved for the kdump kernel).

Error 4

No crashkernel parameter was specified or crashkernel memory reservation failed

Check the Slab value from /proc/meminfo to see whether it is larger than 70% of the crashkernel value.
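The 70% comparison can be done by hand with a little arithmetic. Note the units: Slab in /proc/meminfo is reported in kB, while /sys/kernel/kexec_crash_size is in bytes. The following is a minimal sketch; the function name slab_exceeds_70pct is hypothetical.

```shell
#!/bin/sh
# slab_exceeds_70pct (hypothetical helper): succeed if the Slab size (in kB)
# is larger than 70% of the crashkernel reservation (in bytes).
slab_exceeds_70pct() {
    slab_kb=$1
    reserved_bytes=$2
    [ $((slab_kb * 1024 * 100)) -gt $((reserved_bytes * 70)) ]
}

if [ -r /proc/meminfo ] && [ -r /sys/kernel/kexec_crash_size ]; then
    slab=$(awk '/^Slab:/ {print $2}' /proc/meminfo)
    reserved=$(cat /sys/kernel/kexec_crash_size)
    if [ "${reserved:-0}" -eq 0 ]; then
        echo "no memory is reserved for the crash kernel"
    elif slab_exceeds_70pct "${slab:-0}" "$reserved"; then
        echo "Slab is above 70% of the reservation; consider a larger crashkernel value"
    else
        echo "Slab is within 70% of the reservation"
    fi
fi
```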
