When using a libvirt Compute Resource, the power options are “On” and “Off”. The “On” option always works as intended, but the “Off” option does not. It fails when no OS is up and running, for example during PXE boot or after a kernel panic. The hammer host reboot sub-command does not work in that state either.
This may also affect other compute resources.
Background
It looks like virsh shutdown (a graceful shutdown, probably by sending an ACPI event to the OS) is used, which does not seem to work if there is no OS running.
Using virsh destroy (hard stop) would always work, but there might be situations where this is not wanted (potential data loss).
However, having a broken VM which can’t be power cycled is not optimal.
Mitigation
At the moment you need to hard stop and then start (power cycle) the host directly through libvirt.
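As a rough illustration of that workaround, the sketch below builds the two virsh invocations for a hard power cycle. The helper name, the domain name example-vm and the qemu:///system connection URI are assumptions for illustration; only virsh destroy and virsh start come from libvirt itself. The helper only constructs the commands and does not run anything.

```ruby
# Hypothetical helper: builds the virsh command lines for a hard power cycle.
# It only constructs argument arrays; nothing is executed here.
def hard_power_cycle_commands(domain, uri: "qemu:///system")
  base = ["virsh", "--connect", uri] # URI is an assumption; adjust to your hypervisor
  [
    base + ["destroy", domain], # hard stop, comparable to pulling the power plug
    base + ["start", domain]    # boot the domain again
  ]
end

# Print the commands for review ("example-vm" is a made-up domain name).
hard_power_cycle_commands("example-vm").each { |cmd| puts cmd.join(" ") }
```

On the libvirt host, each array could be passed to system(*cmd) once you are sure a hard stop is acceptable for that VM.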
Possible Solution
We could introduce a “Force Off” option which allows the user to hard power off a host.
This does not require any logic, would be easy to implement, and leaves the decision up to the user. In addition, this could also be implemented for other compute resources.
@evgeni recently found out that there is a difference between the old and the new detail page. The old page has multiple actions (power on, power off, power cycle, reboot, etc.), whereas the new page only has, as you’ve noticed, On and Off.
So there is some work to be done. My recommendation on the design would be twofold.
First, the frontend needs to be updated to dynamically query the REST API for the supported actions instead of hardcoding them in JavaScript. Looking at the API code I don’t see a way to do that today, but GET /hosts/:id/power is probably the best place to add that.
Then, backend-wise, you probably need to enhance PowerManager::Virt. In the PowerManager::Bmc controller there is already a poweroff action that in the libvirt world would probably map to virsh destroy.
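To make the two parts of that design a bit more concrete, here is a minimal, self-contained sketch. It is not Foreman's code: SketchVirtPower, FakeDomain, the action names and the payload shape are all hypothetical. It only shows the idea of a graceful "off" next to a hard "poweroff", with the list of supported actions exposed so a frontend could read it from the API instead of hardcoding it.

```ruby
require "json"

# Hypothetical stand-in for a libvirt domain; it just records the calls it receives.
class FakeDomain
  attr_reader :calls

  def initialize
    @calls = []
  end

  def shutdown
    @calls << :shutdown # graceful stop, comparable to `virsh shutdown`
  end

  def destroy
    @calls << :destroy # hard stop, comparable to `virsh destroy`
  end
end

# Hypothetical power manager, NOT Foreman's PowerManager::Virt.
class SketchVirtPower
  # Assumed mapping: "off" stays graceful, "poweroff" forces the stop.
  ACTION_MAP = {
    "off" => :shutdown,
    "poweroff" => :destroy
  }.freeze

  def initialize(domain)
    @domain = domain
  end

  def supported_actions
    ACTION_MAP.keys
  end

  def perform(action)
    method_name = ACTION_MAP.fetch(action) do
      raise ArgumentError, "unsupported power action: #{action}"
    end
    @domain.public_send(method_name)
  end

  # Assumed shape of what a GET /hosts/:id/power response could include.
  def as_payload
    { "supported_actions" => supported_actions }
  end
end

domain = FakeDomain.new
manager = SketchVirtPower.new(domain)
manager.perform("poweroff")
puts JSON.generate(manager.as_payload)
```

Keeping the mapping in one table means the list the frontend receives and the actions the backend accepts cannot drift apart, which is the point of querying the API rather than hardcoding the buttons.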