I’ve found an edge case where the fix for Bug #19623 (“Changes to vmware vm gives ‘Could not find network X on VMWare compute resource’”) isn’t working. This may also relate to Bug #11106 (“Foreman 1.8.2 with VMware errors with: Could not find virtual machine network interface matching x.x.x.x”). The error message is a closer fit to #11106, but the root cause looks like a variation on #19623.
As far as I can tell, the issue is caused by a collision between the id/key and ref attributes of two vCenter network objects. Our environment matches the conditions in the original bug: we’re post-vCenter migration, with legacy ID refs on the distributed virtual switches. It looks like we have one object whose ref matches another object’s key.
The select_nic function in FogExtensions::Vsphere::Server (app/models/concerns/fog_extensions/vsphere/server.rb) first calls all_networks.detect, which looks like it matches nic_attrs['network'] against network._ref, i.e.:
vm_network = all_networks.detect { |network| network._ref == nic_attrs['network'] }
Only if that matches nothing does the fallback check below it, which compares against network.name and network.key, run:
vm_network ||= all_networks.detect { |network| nic_attrs['network'] && [network.name, network.try(:key)].include?(nic_attrs['network']) }
In our environment, this produces a false-positive match because one object has a ref attribute that matches the desired object’s key. I’m pretty sure the check a few lines later, which compares vm_network.name and vm_network.key, then nils out selected_nic, so nil is ultimately returned to match_macs_to_nics in Orchestration::Compute, which fails with “Orchestration::Compute: Could not match network interface.”
In my environment, I was able to fix the issue by swapping the order of the two lookups, so that the search by name and key runs before the search by _ref, i.e.:
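To make the collision concrete, here is a minimal standalone Ruby sketch. FakeNetwork and all of the names and ref/key values are made up for illustration and only stand in for the objects returned by service.raw_networks. The two detect calls mirror the Foreman code, except that try(:key) is replaced by a plain key call so the snippet runs without Rails.

```ruby
# Hypothetical stand-in for the network objects returned by service.raw_networks.
FakeNetwork = Struct.new(:name, :_ref, :key)

# Made-up values: the legacy network's _ref collides with the wanted network's key.
wanted = FakeNetwork.new('dvPortGroup-prod', 'dvportgroup-2001', 'dvportgroup-1001')
legacy = FakeNetwork.new('dvPortGroup-old',  'dvportgroup-1001', 'dvportgroup-0042')
all_networks = [wanted, legacy]

# The NIC stores the key of the wanted network.
nic_attrs = { 'network' => 'dvportgroup-1001' }

# Current lookup order: _ref first, then name/key only as a fallback.
vm_network = all_networks.detect { |network| network._ref == nic_attrs['network'] }
vm_network ||= all_networks.detect { |network| nic_attrs['network'] && [network.name, network.key].include?(nic_attrs['network']) }

puts vm_network.name # prints "dvPortGroup-old", the wrong network
```

Because the _ref lookup already returns the legacy object, the name/key fallback never runs for this value.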
--- app/models/concerns/fog_extensions/vsphere/server.rb-orig 2018-06-08 11:13:50.534459137 -0400
+++ app/models/concerns/fog_extensions/vsphere/server.rb 2018-06-07 16:48:16.890547789 -0400
@@ -38,8 +38,8 @@
       def select_nic(fog_nics, nic)
         nic_attrs = nic.compute_attributes
         all_networks = service.raw_networks(datacenter)
-        vm_network = all_networks.detect { |network| network._ref == nic_attrs['network'] }
-        vm_network ||= all_networks.detect { |network| nic_attrs['network'] && [network.name, network.try(:key)].include?(nic_attrs['network']) }
+        vm_network = all_networks.detect { |network| nic_attrs['network'] && [network.name, network.try(:key)].include?(nic_attrs['network']) }
+        vm_network ||= all_networks.detect { |network| network._ref == nic_attrs['network'] }
         unless vm_network
           Rails.logger.info "Could not find Vsphere network for #{nic_attrs.inspect}"
           return
I think this should be a safe change in the other cases where #19623 applies, but I’m nowhere near familiar enough with the Foreman codebase to be certain of that.
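As a rough check of the reordered lookup, here is a small standalone Ruby sketch. find_network and FakeNetwork are hypothetical helpers, not part of Foreman, the ref/key values are invented, and try(:key) is again replaced by a plain key call so it runs without Rails. It only illustrates the lookup order and is not a test against a real vCenter.

```ruby
# Hypothetical stand-in for the network objects returned by service.raw_networks.
FakeNetwork = Struct.new(:name, :_ref, :key)

# Hypothetical helper mirroring the patched order: name/key first, _ref as the fallback.
def find_network(all_networks, wanted)
  found = all_networks.detect { |network| wanted && [network.name, network.key].include?(wanted) }
  found || all_networks.detect { |network| network._ref == wanted }
end

# Made-up values: the legacy network's _ref collides with the prod network's key.
prod   = FakeNetwork.new('dvPortGroup-prod', 'dvportgroup-2001', 'dvportgroup-1001')
legacy = FakeNetwork.new('dvPortGroup-old',  'dvportgroup-1001', 'dvportgroup-0042')
networks = [prod, legacy]

puts find_network(networks, 'dvportgroup-1001').name    # colliding value resolves by key: dvPortGroup-prod
puts find_network(networks, 'dvportgroup-2001').name    # a plain _ref still resolves: dvPortGroup-prod
puts find_network(networks, 'dvportgroup-9999').inspect # no match at all: nil
```

One trade-off visible here: a stored value that was meant as a _ref but happens to equal another network’s key would now resolve to the key match, so the reordering moves the ambiguity rather than removing it.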
Does this look like a reasonable change? If so, I’ll gladly submit a PR. I’m not sure whether I should re-open #19623 or open a new bug for this issue.
Best regards,
Zac Bedell