There are a number of times in a virtual machine’s life when it needs to be power-cycled (graceful OS shutdown, VM powered off, and then powered on again). For example:
- Remediations for CPU vulnerabilities like Spectre, Meltdown, L1TF, and MDS all require a customer to power a VM off and then back on to pick up CPU instruction updates (MDCLEAR, etc.).
- EVC changes, where a customer wants to alter cluster EVC settings but doing so would require a large-scale effort and/or downtime, which is untenable.
- EVC changes, where a customer wishes to make a VM able to migrate seamlessly between discrete vSphere installations and/or VMware Cloud on AWS locations.
- Changed-Block Tracking (CBT) enablement on VMs, where VMs need to be power-cycled to start CBT as part of a backup system install (Veeam, Rubrik, Cohesity, et al. all require this).
For most customers this is the hardest part of any of these tasks because our products don’t make it easy to do. To get it done the customer needs to do it manually or automate it themselves (difficult for many), and then schedule & coordinate it outside of other maintenance windows, which is almost impossible for many of our customers.
Many customers do have regular maintenance windows, though, where patching of guest OSes occurs. However, guest OS patching only reboots the OS; it does not change the power state of the virtual machine/virtual machine monitor itself.
The scheduled VM hardware upgrade shows us that there’s already something in vSphere that can do this. That hardware upgrade process WILL power-cycle a VM when the guest OS is rebooted, and the customer, when scheduling the upgrade, has the choice to only do it on graceful shutdowns. That’s wonderful because it can then be seamlessly worked into regular OS patching cycles and it's low risk.
What if that power-cycle-on-shutdown functionality were exposed more generally to customers, as something they could ask vSphere to do for them at any time, for whatever reason the customer might have? It would certainly solve the four huge examples above, as well as enable what Mr. Blair Fritz dubbed “lazy EVC changes,” which would make EVC more flexible and improve its use. VAC shows 21% EVC usage, which is staggeringly low considering how powerful a tool EVC is for expansion, migration, and vulnerability mitigation.
Let’s make EVC changes, CBT enablement, and all these CPU vulnerabilities – present and future – be frictionless for our customers and their millions of VMs!
Just a note: James Yarbrough did the engineering to add this to 6.7U3.
This PowerCLI snippet should do it for you:
Get-VM | New-AdvancedSetting -Name "vmx.reboot.powerCycle" -Value $true
It will be included in upcoming releases of 6.5 and 6.0 patches as well.
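The one-liner above applies the setting to every VM the session can see. For a more gradual rollout, here is a minimal sketch that limits the change to one cluster and skips VMs that already carry the setting. It assumes PowerCLI is installed and a session is already open (for example via Connect-VIServer). The cluster name is hypothetical, so substitute your own.

```powershell
# Assumption: VMware PowerCLI is loaded and Connect-VIServer has already been run.
# 'Prod-Cluster-01' is a hypothetical cluster name used only for illustration.
$clusterName = 'Prod-Cluster-01'
$settingName = 'vmx.reboot.powerCycle'

# Limit the change to the VMs in a single cluster.
$vms = Get-Cluster -Name $clusterName | Get-VM

foreach ($vm in $vms) {
    # Check whether the VM already has the advanced setting.
    $existing = Get-AdvancedSetting -Entity $vm -Name $settingName

    if ($existing) {
        Write-Host "$($vm.Name): $settingName already present (value: $($existing.Value))"
    }
    else {
        # -Confirm:$false suppresses the per-VM confirmation prompt.
        New-AdvancedSetting -Entity $vm -Name $settingName -Value $true -Confirm:$false | Out-Null
        Write-Host "$($vm.Name): $settingName set"
    }
}
```

Running this just ahead of a guest OS patching window means the reboot that patching already triggers is the one that power-cycles the VM.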