RFC: Comparison of compute attributes for compute object update during host creation

During host creation, the compute orchestration (foreman/app/models/concerns/orchestration/compute.rb) calls compute_update_required, which compares the host compute attributes with the VM compute attributes and, if they differ, tries to update the compute object (the VM). This approach sometimes updates the VM while the host is still provisioning, and that update can fail: the VM compute attributes are fetched directly from the compute resource and can contain default values that were added after the compute object was created, whereas the host compute attributes don’t have those defaults.
One example is foreman_salt: it adds salt_autosign_key to the host attributes, which also triggers the check for whether the compute object needs to be updated. This fails for Proxmox because the VM has already been created, so the Proxmox server has added a volid to the hard disk and set the boot order. These values are not yet present in the host compute attributes, which still hold the values from the new host, so the update of the compute object fails:

update VM 103: -virtio local:30,cache=none
Formatting '/var/lib/vz/images/103/vm-103-disk-1.raw', fmt=raw size=32212254720 preallocation=off
virtio: successfully created disk 'local:103/vm-103-disk-1.raw,cache=none,size=30G'
TASK ERROR: 400 Parameter verification failed. virtio0: hotplug problem - can't unplug bootdisk 'virtio1'

This error might also occur for VMware: for example, VMware 8 doesn’t allow hot-plugging of hard disks, and in that case the volid value would differ between vm_compute_attributes and the host compute_attributes.
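
To make the mismatch concrete, here is a minimal sketch in plain Ruby. It is not the actual Foreman code: the hash layout, the boot value, and the method name naive_update_required? are assumptions made for illustration. Only the volid string is taken from the log above.

```ruby
# Hypothetical, simplified attribute hashes. The real structures in Foreman
# and in the Proxmox plugin are nested differently.

# What the host carries from the "new host" form: no server-side defaults yet.
host_compute_attributes = {
  "cores" => 2,
  "volumes" => [
    { "storage" => "local", "size" => 30, "cache" => "none" }
  ]
}

# What the compute resource reports once the VM exists: the same settings,
# plus values filled in on the server side (volid, boot order).
vm_compute_attributes = {
  "cores" => 2,
  "boot" => "order=virtio0", # assumed value, for illustration only
  "volumes" => [
    { "storage" => "local", "size" => 30, "cache" => "none",
      "volid" => "local:103/vm-103-disk-1.raw" }
  ]
}

# Naive check: any difference at all counts as "update required".
def naive_update_required?(host_attrs, vm_attrs)
  host_attrs != vm_attrs
end

puts naive_update_required?(host_compute_attributes, vm_compute_attributes)
```

Nothing was changed by the user here, yet the check still reports that an update is required, because the server-added keys are missing on the host side.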

compute_update_required compares the latest VM attributes with the host compute attributes that were set during host creation, so the two will not match in many cases.
Is there a better approach to check whether the compute object needs an update?

Thanks a lot for the detailed explanation of the problem!

I was wondering if it might also be worth considering a fail-safe at the beginning of compute_update_required. For example, what if we temporarily disable it while the host is still being built?

That way, we could prevent plugins from updating attributes too early and avoid running into the issue you’ve described.
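
A rough sketch of what I mean, in plain Ruby rather than real Foreman code. SketchHost, its build flag, and guarded_update_required? are hypothetical names for illustration. How "still being built" is best detected in Foreman is an assumption that would need checking.

```ruby
# Hypothetical stand-in for a host; only the fields this sketch needs.
SketchHost = Struct.new(:build, :compute_attributes) do
  # True while the host is still being built (assumed flag).
  def build?
    build ? true : false
  end
end

# Fail-safe variant of the check: skip the comparison entirely while the
# host is still being built, and only compare attributes afterwards.
def guarded_update_required?(host, vm_attrs)
  return false if host.build?

  host.compute_attributes != vm_attrs
end

vm_attrs = { "cores" => 2, "volid" => "local:103/vm-103-disk-1.raw" }

building = SketchHost.new(true, { "cores" => 2 })
finished = SketchHost.new(false, { "cores" => 2 })

puts guarded_update_required?(building, vm_attrs) # guard short-circuits
puts guarded_update_required?(finished, vm_attrs) # attributes are compared
```

The guard only postpones the comparison. Once the build has finished, server-added defaults such as volid would still show up as a difference, so this would avoid the early update but not the underlying mismatch.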