VMware "Could not match network interface" related to issue #19623

I’ve found an edge case where the fix for Bug #19623 (Changes to vmware vm gives ‘Could not find network X on VMWare compute resource’) isn’t working. This may also relate to Bug #11106 (Foreman 1.8.2 with VMware errors with: Could not find virtual machine network interface matching x.x.x.x). The error message is a closer fit to #11106, but the root cause looks like a variation on #19623.

As far as I can tell, the issue is caused by a collision between the id/key and ref attributes of two vCenter network objects. Our environment matches the conditions in the original bug: we’re post-vCenter migration, with legacy refs on the distributed virtual switches. It looks like we have one object whose ref matches another object’s id/key.

The select_nic function in FogExtensions::Vsphere::Server (app/models/concerns/fog_extensions/vsphere/server.rb) first calls all_networks.detect to compare the network value from nic_attrs against network._ref, i.e.:
vm_network = all_networks.detect { |network| network._ref == nic_attrs['network'] }
Only if that matches nothing does the next line run, which checks network.name and network.key:
vm_network ||= all_networks.detect { |network| nic_attrs['network'] && [network.name, network.try(:key)].include?(nic_attrs['network']) }

In our environment, this produces a false positive because one object has a ref attribute that matches the desired object’s key. I’m pretty sure the check a few lines later that compares vm_network.name and vm_network.key then nils out selected_nic, so select_nic ultimately returns nil to match_macs_to_nics in Orchestration::Compute, which fails with “Orchestration::Compute: Could not match network interface.”
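The false positive can be reproduced outside Foreman with a minimal sketch. Everything here is an assumption made for illustration: the Network struct is a hypothetical stand-in for the objects returned by service.raw_networks, the names and identifiers are invented, and key is read directly instead of through try(:key) so that the snippet runs in plain Ruby.

```ruby
# Hypothetical stand-in for the objects returned by service.raw_networks.
# All names and identifiers below are invented for illustration.
Network = Struct.new(:name, :_ref, :key)

all_networks = [
  # Legacy object whose _ref happens to equal the key of the network we want.
  Network.new('legacy-net', 'dvportgroup-100', 'dvportgroup-7'),
  Network.new('wanted-net', 'dvportgroup-200', 'dvportgroup-100')
]

# The NIC refers to the desired network by its key.
nic_attrs = { 'network' => 'dvportgroup-100' }

# Original order: match on _ref first, fall back to name/key only if nothing matched.
vm_network = all_networks.detect { |network| network._ref == nic_attrs['network'] }
vm_network ||= all_networks.detect do |network|
  nic_attrs['network'] && [network.name, network.key].include?(nic_attrs['network'])
end

puts vm_network.name # prints "legacy-net", not the intended "wanted-net"
```

Because the _ref comparison succeeds on the legacy object, the name/key fallback never runs, which is consistent with the behavior described above.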

In my environment, I was able to fix the issue by swapping the order of the two lookups, so that the search by name and key runs before the search by _ref, i.e.:

--- app/models/concerns/fog_extensions/vsphere/server.rb-orig	2018-06-08 11:13:50.534459137 -0400
+++ app/models/concerns/fog_extensions/vsphere/server.rb	2018-06-07 16:48:16.890547789 -0400
@@ -38,8 +38,8 @@
       def select_nic(fog_nics, nic)
         nic_attrs = nic.compute_attributes
         all_networks = service.raw_networks(datacenter)
-        vm_network = all_networks.detect { |network| network._ref == nic_attrs['network'] }
-        vm_network ||= all_networks.detect { |network| nic_attrs['network'] && [network.name, network.try(:key)].include?(nic_attrs['network']) }
+        vm_network = all_networks.detect { |network| nic_attrs['network'] && [network.name, network.try(:key)].include?(nic_attrs['network']) }
+        vm_network ||= all_networks.detect { |network| network._ref == nic_attrs['network'] }
         unless vm_network
           Rails.logger.info "Could not find Vsphere network for #{nic_attrs.inspect}"
           return

I think this should be a safe change in the other cases where #19623 applies, but I’m nowhere near familiar enough with the Foreman codebase to be certain of that.
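As a rough check of that assumption, the reordered lookup can be sketched in plain Ruby. The Network struct, the find_network helper, and all identifiers are hypothetical and exist only for this illustration; the real code lives in select_nic and uses try(:key).

```ruby
# Hypothetical stand-in for the objects returned by service.raw_networks.
Network = Struct.new(:name, :_ref, :key)

# Hypothetical helper mirroring the reordered lookup from the diff:
# name/key first, then _ref as the fallback.
def find_network(all_networks, wanted)
  match = all_networks.detect { |n| wanted && [n.name, n.key].include?(wanted) }
  match || all_networks.detect { |n| n._ref == wanted }
end

networks = [
  # Legacy object whose _ref collides with the key of the desired network.
  Network.new('legacy-net', 'dvportgroup-100', 'dvportgroup-7'),
  Network.new('wanted-net', 'dvportgroup-200', 'dvportgroup-100')
]

puts find_network(networks, 'dvportgroup-100').name # key match wins: "wanted-net"
puts find_network(networks, 'dvportgroup-200').name # _ref fallback still works: "wanted-net"
puts find_network(networks, 'wanted-net').name      # name match: "wanted-net"
```

In this sketch a value that is only a _ref still resolves through the fallback, so callers that pass a ref keep working unless that ref also happens to equal some other network’s name or key.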

Does this look like a reasonable change? I’ll gladly submit a PR if it looks reasonable. I’m not sure if I should re-open #19623 or open a new bug for this issue.

Best regards,
Zac Bedell

Pinging the @vmware team to look into this on the VMware side.
Regarding reopening or creating a new issue: please create a new one. We use issues to tell which release shipped certain code, so further problems with code that has already been merged should be treated as a new issue. The one exception is when code has been merged to develop but hasn’t shipped in any release yet, in which case you can open a new PR using “Refs #1111…” as the commit message to add another commit to the same issue.

@tbrisker I am assigning the issue to myself. I will look into it more tomorrow during NA hours.

I created a new issue for this:

http://projects.theforeman.org/issues/23909

@pendor Thanks for creating the issue. I will start looking at it. Sorry for the delay; I have been working on the other 2 issues that came in and trying to catch up after being on PTO and at a conference.

It looks like this issue regressed in commit 5150a1d. That commit switched back to checking by network[:id] and network[:_ref] rather than network[:key]. The result is that the non-unique _ref attribute left over from vCenter migration is getting falsely selected in vm_network, and we get the same “could not find virtual machine network interface matching…” error as before.

I don’t know enough to understand why 5150a1d made that change. Can anyone more familiar with that code assist in finding a way that satisfies the intent of this new commit without causing bug #23909 to regress?

Well, fog-vsphere isn’t assigning the key anymore, so I guess the key should always be nil.
@pendor could you share where you’re assigning the name?

There is a PR of mine, which should select the network first by ID:

Maybe working just with the ID would solve the issue?
The only place where the name is really needed is hammer, but we could expose an API-only parameter, network_name, through which the user sets the name, and it would be translated to the _ref.
Would that work for you?

I don’t think I’m assigning the name anywhere that I can share? I’m trying to create a VMware backed machine in Foreman and select the desired network from the drop-down in the host’s Interfaces settings. When I click Submit to create the host, it fails as described above. The patch I submitted last year to search by network id before _ref fixes the issue and has worked for a year. That change was reverted, and the network search fails the same way it did previously. I have another patch & will PR it today.

Yeah, please do :slight_smile: I can’t see the key being present in the networks anymore. So I would like to see how you’ve fixed this now.

Ok, I can see the problem now :slight_smile:
I left a comment in case you want to clean it up a bit more. Thanks for taking care of it! :+1:

I have this issue too, on Foreman 3.7. Could I please get some assistance?