An ESXi host Purple Screen of Death (PSOD) occurs when the VMkernel experiences a critical failure, for example because of a hardware fault or a driver problem. During the PSOD, the ESXi hypervisor captures a core dump to help diagnose the cause of the failure. Here’s what happens during this process:
After a PSOD, ESXi captures a core dump, which includes a snapshot of the hypervisor memory and the state of the virtual machines. The core dump is stored according to the host configuration (core dump partition, file, or network), and it helps diagnose the cause of the critical failure by showing the state of the system at the time of the crash. The core dump is crucial for troubleshooting and resolving the issues that led to the PSOD. In ESXi 6.7, the core dump was stored in a partition by default, but since ESXi 7 it is stored in a file.
For vSphere design, I would like to know the typical core dump file size so that I can allocate the optimal storage space for core dumps. Of course, the size of the core dump file depends on multiple factors, but the main factor should be the memory used by the VMkernel.
ESXi host memory usage is split into three buckets:
- VMkernel memory usage (core hypervisor)
- Other memory usage
  - BusyBox Console, including
    - Core BusyBox Utilities (e.g., ls, cp, mv, ps, top, etc.)
    - Networking and Storage Tools (ifconfig, esxcfg-nics, esxcfg-vswitch, esxcli, etc.)
    - Direct Console User Interface (DCUI)
  - Management Agents and Daemons (hostd, vpxa, network daemons like SSH, DNS, NTP, and network file copy aka NFC)
- Free memory
Here are data from three different ESXi hosts I have access to.
ESXi, 8.0.3 (24022510) with 128 GB (131 008 MB) physical RAM
- VMkernel memory usage: 747 MB
- Other memory usage: 20 264 MB
- Free memory: 109 997 MB
ESXi, 8.0.3 (24022510) with 256 GB (262 034 MB) physical RAM
- VMkernel memory usage: 1 544 MB
- Other memory usage: 21 498 MB
- Free memory: 238 991 MB
[root@dp-esx02:~] esxcli system coredump file list
Path Active Configured Size
------------------------------------------------------------------------------------------------------- ------ ---------- ----------
/vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile true true 3882876928
The dump file is configured and active.
[root@dp-esx02:~] esxcli system coredump file get
Active: /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile
Configured: /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile
[root@dp-esx02:~] ls -lah /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile
-rw------- 1 root root 3.6G Oct 29 13:07 /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile
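The Size column of `esxcli system coredump file list` is in bytes, while `ls -lah` rounds to 3.6G. A minimal sketch of the conversion in plain shell arithmetic, using the value shown above; the helper name `bytes_to_mib` is my own and not part of any ESXi tooling.

```shell
#!/bin/sh
# Convert a byte count (as printed in the Size column) to whole MiB.
# bytes_to_mib is a hypothetical helper name used only for this example.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# Size reported for the dump file on dp-esx02.
dump_mib=$(bytes_to_mib 3882876928)
echo "dump file: ${dump_mib} MiB"
```

On this host with 256 GB of physical RAM, the pre-allocated dump file is therefore 3703 MiB.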
The following command deliberately triggers a PSOD for testing:
vsish -e set /reliability/crashMe/Panic 1
[root@dp-esx02:~] esxcfg-dumppart --file --copy --devname /vmfs/volumes/66d993b7-e9cd83a8-b129-0025b5ea0e15/vmkdump/00000000-00E0-0000-0000-000000000008.dumpfile --zdumpname /vmfs/volumes/DP-STRG02-Datastore01/zdump-coredump.dp-esx02
Created file /vmfs/volumes/DP-STRG02-Datastore01/zdump-coredump.dp-esx02.1
[root@dp-esx02:~] ls -lah /vmfs/volumes/DP-STRG02-Datastore01/zdump-coredump.dp-esx02.1
-rw-r--r-- 1 root root 443.9M Oct 29 13:07 /vmfs/volumes/DP-STRG02-Datastore01/zdump-coredump.dp-esx02.1
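To relate the extracted zdump to the space reserved for it, the two sizes can be compared directly. This is a rough sketch based on a single test PSOD: it takes the 443.9M reported by `ls -lah` and the 3703 MiB dump file (3882876928 bytes) from this host, and the helper name `pct_of` is an assumption of mine.

```shell
#!/bin/sh
# Express one size as a percentage of another, to one decimal place.
# pct_of is a hypothetical helper name used only for this example.
pct_of() {
    awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", part / whole * 100 }'
}

# Extracted zdump (MiB, rounded by ls) versus the pre-allocated dump file (MiB).
zdump_pct=$(pct_of 443.9 3703)
echo "zdump uses ${zdump_pct}% of the dump file"
```

In this one sample the extracted dump needed only a small part of the pre-allocated file, but a real crash under a different load may produce a larger dump.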
ESXi, 7.0.3 (23794027) with 512 GB (524 178 MB) physical RAM
- VMkernel memory usage: 3 261 MB
- Other memory usage: 369 029 MB
- Free memory: 151 888 MB
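To put the three samples side by side, VMkernel memory usage can be expressed as a share of physical RAM. This is a minimal sketch using only the figures listed above; the function name `vmk_share` is my own, and three hosts are far too few to derive a sizing rule.

```shell
#!/bin/sh
# Percentage of physical RAM used by the VMkernel, to two decimal places.
# vmk_share is a hypothetical helper name used only for this example.
vmk_share() {
    awk -v vmk="$1" -v phys="$2" 'BEGIN { printf "%.2f\n", vmk / phys * 100 }'
}

# Arguments in MB, as listed for each host: VMkernel usage, physical RAM.
share_128=$(vmk_share 747 131008)
share_256=$(vmk_share 1544 262034)
share_512=$(vmk_share 3261 524178)

echo "128 GB host: ${share_128}%"
echo "256 GB host: ${share_256}%"
echo "512 GB host: ${share_512}%"
```

On these three hosts the VMkernel share stays close to 0.6% of physical RAM, regardless of how much memory the other components use.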