My Personal IT Infrastructure Knowledge Base: Read IO spike every 5 minutes for 7.0U1 (84220)

Thursday, September 16, 2021

Read IO spike every 5 minutes for 7.0U1 (84220)

Symptoms

The storage array experiences a read I/O spike every 5 minutes, along with high CPU utilization.
Other storage operations, such as deduplication and compression, can be delayed or stalled.
In our case it was a large environment (200-300 hosts) connected to a Pure Storage array.

Purpose

This article explains the cause and provides a workaround.

Cause

A change was made in 7.0U1:

hostd now makes an API call every 5 minutes.
In VMFS, a new, lighter API was added to get the required stats.

Impact / Risks

Storage overutilization in environments with a large number of hosts and a large number of datastores.
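
To get a feel for why the host count matters, here is a rough back-of-the-envelope sketch. It assumes that every host refreshes a given shared datastore once per interval, which is a simplification and not something the KB states in these terms. The function name refreshes_per_day is hypothetical. The inputs are the 300-host upper bound from the Symptoms section, the 300-second default interval, and the 43200-second value from the Workaround section.

```shell
#!/bin/sh
# Hypothetical estimate: refreshes hitting one shared datastore per day,
# assuming every host refreshes it exactly once per interval.
# Usage: refreshes_per_day HOSTS INTERVAL_SECS
refreshes_per_day() {
    hosts=$1
    interval_secs=$2
    # 86400 seconds in a day
    echo $(( $hosts * 86400 / $interval_secs ))
}

refreshes_per_day 300 300     # default 5-minute interval: prints 86400
refreshes_per_day 300 43200   # 12-hour interval: prints 600
```

Under this assumption the load also scales with the number of datastores, since each one is refreshed separately.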

Resolution

Not available yet

Workaround

Change /etc/vmware/hostd/config.xml on each host.
We recommend trying an interval of 12 hours, i.e. vmfsStatsIntervalInSecs=43200.

A one-liner to perform this task (as written it sets the interval to 21600 seconds, i.e. 6 hours; substitute 43200 for 12 hours):

sed -i -e 's/<vmfsStatsIntervalInSecs>.*>/<vmfsStatsIntervalInSecs>21600<\/vmfsStatsIntervalInSecs>/g' /etc/vmware/hostd/config.xml;/etc/init.d/hostd restart
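
The one-liner above edits the file in place without a backup and restarts hostd even if the edit did not succeed. A slightly more defensive variant could look like the sketch below. The function name set_vmfs_stats_interval is hypothetical (it is not an ESXi tool), and the sketch assumes the element sits on a single line, as in the default file.

```shell
#!/bin/sh
# Hypothetical helper (not part of ESXi): set vmfsStatsIntervalInSecs
# in a hostd config file, keeping a backup of the original.
# Usage: set_vmfs_stats_interval FILE SECONDS
set_vmfs_stats_interval() {
    cfg=$1
    secs=$2

    # Accept only a non-empty string of digits.
    case $secs in
        ''|*[!0-9]*) echo "invalid interval: $secs" >&2; return 1 ;;
    esac

    # Refuse to touch a file that does not contain the element.
    grep -q '<vmfsStatsIntervalInSecs>' "$cfg" || {
        echo "vmfsStatsIntervalInSecs not found in $cfg" >&2
        return 1
    }

    # Keep a copy of the original, then rewrite the file from that copy.
    cp -p "$cfg" "$cfg.bak" || return 1
    sed -e "s|<vmfsStatsIntervalInSecs>[^<]*</vmfsStatsIntervalInSecs>|<vmfsStatsIntervalInSecs>${secs}</vmfsStatsIntervalInSecs>|g" \
        "$cfg.bak" > "$cfg"
}

# Example on an ESXi host (restart hostd only if the edit succeeded):
#   set_vmfs_stats_interval /etc/vmware/hostd/config.xml 43200 && /etc/init.d/hostd restart
```

Matching [^<]* instead of .* keeps the substitution confined to the element's own value, and writing from the backup copy means the original content is still available in config.xml.bak if the result needs to be reverted.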

Related Information

30 mins = vmfsStatsIntervalInSecs=1800
1 hour = vmfsStatsIntervalInSecs=3600
3 hours = vmfsStatsIntervalInSecs=10800
6 hours = vmfsStatsIntervalInSecs=21600
12 hours = vmfsStatsIntervalInSecs=43200
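
For intervals not listed above, the value is simply the duration in seconds. A minimal sketch of the conversion (the helper names are hypothetical):

```shell
#!/bin/sh
# Hypothetical helpers: convert whole minutes or hours to seconds
# for use as a vmfsStatsIntervalInSecs value.
minutes_to_secs() {
    echo $(( $1 * 60 ))
}

hours_to_secs() {
    echo $(( $1 * 3600 ))
}

minutes_to_secs 30   # prints 1800
hours_to_secs 12     # prints 43200
```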

Default setting in /etc/vmware/hostd/config.xml

<vmfsStatsIntervalInSecs> 300 </vmfsStatsIntervalInSecs>
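
Before and after changing the setting, it can help to confirm what a host is actually configured with. The sketch below prints the current value with surrounding whitespace stripped. The function name get_vmfs_stats_interval is hypothetical, and the sketch assumes the element is on a single line as shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the configured vmfsStatsIntervalInSecs value.
# Prints nothing if the element is missing or its value is not numeric.
# Usage: get_vmfs_stats_interval FILE
get_vmfs_stats_interval() {
    sed -n 's|.*<vmfsStatsIntervalInSecs>[[:space:]]*\([0-9][0-9]*\)[[:space:]]*</vmfsStatsIntervalInSecs>.*|\1|p' "$1"
}

# Example on an ESXi host:
#   get_vmfs_stats_interval /etc/vmware/hostd/config.xml
```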

Confidential or Internal Information

The change made in 7.0U1: https://bugzilla.eng.vmware.com/show_bug.cgi?id=2580232

The relevant PR for this KB: https://bugzilla.eng.vmware.com/show_bug.cgi?id=2788282

- Note: the hostd datastore refresh invokes the VMFS datastore refresh
(Vol3GetAttributesVMFS6 -> Res3StatVMFS6), which can end up reading a lot of
VMFS metadata.

- The amount of VMFS metadata read is proportional to both the size of each
VMFS datastore and the number of VMFS datastores on the ESXi server.
