Problem: I’d like to preface this by saying that I am 99% sure this issue is caused by something stupid I’m doing — just looking to see if someone can help me better my ways!
So, somehow in Foreman the “Primary”, “Provisioning”, and “Managed” settings are getting assigned to an interface that is not the first NIC (which is also the NIC the host originally provisioned on). This means that on the next PXE boot, the system still tries to PXE off the first NIC, but because that NIC is no longer set for provisioning, it communicates with Foreman (via iPXE) using a MAC address that is “unknown” to Foreman, resulting in the FDI being loaded.
For example, let’s say a system has 4 network interface ports:
2: enp6s0f0: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether e0:4f:43:cc:68:cc brd ff:ff:ff:ff:ff:ff
3: enp6s0f1: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether e0:4f:43:cc:68:cd brd ff:ff:ff:ff:ff:ff
4: enp6s0f2: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether e0:4f:43:cc:68:ce brd ff:ff:ff:ff:ff:ff
5: enp6s0f3: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000
link/ether e0:4f:43:cc:68:cf brd ff:ff:ff:ff:ff:ff
I have a script that pulls host information out of an “inventory” of sorts and creates hosts in Foreman via hammer. This inventory only has the MAC address of the first NIC, so it imports the host, defining the interface without an identifier, but with the MAC, IPv4 address, and FQDN. This appears to be enough to get the system provisioning.
This is the hammer command I use to do that initial import:
hammer \
  --verify-ssl false \
  -s SERVER \
  -u USERNAME \
  -p PASSWORD \
  host create \
  --name name \
  --hostgroup-title hostgroup \
  --lifecycle-environment lc_environment \
  --content-view cv \
  --kickstart-repository ks_repo \
  --mac mac \
  --ip ip \
  --model-id model \
  --location-title myLoc \
  --organization-title myOrg
Maybe part of my problem is that I should be setting the interface to be used for provisioning here?
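For the record, the variation I was considering (untested sketch, and the identifier `enp6s0f0` plus the example MAC/IP are assumptions) would be to use hammer's `--interface` option, which takes comma-separated key=value pairs, instead of the bare `--mac`/`--ip` options:

```shell
# Untested idea: declare the first NIC explicitly as the managed
# primary/provision interface at create time. The identifier, MAC,
# and IP below are placeholders from my example host.
hammer \
  --verify-ssl false \
  -s SERVER \
  -u USERNAME \
  -p PASSWORD \
  host create \
  --name name \
  --hostgroup-title hostgroup \
  --lifecycle-environment lc_environment \
  --content-view cv \
  --kickstart-repository ks_repo \
  --interface "identifier=enp6s0f0,mac=e0:4f:43:cc:68:cc,ip=10.36.10.175,primary=true,provision=true,managed=true" \
  --model-id model \
  --location-title myLoc \
  --organization-title myOrg
```

I don’t know whether pre-setting the identifier this way would prevent the later fact upload from moving the flags, though.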
After provisioning, the system reboots and boots from the hard disk (which tells me everything so far is OK regarding the interface configuration in Foreman, as the FDI hasn’t loaded). An Ansible job is then automatically triggered via an AWX callback, which configures NIC bonding. However, after all is said and done, sometimes a host will end up with its interfaces configured like so in Foreman:
-----|---------------|---------------------------|-------------------|--------------|-------------------------
ID | IDENTIFIER | TYPE | MAC ADDRESS | IP ADDRESS | DNS NAME
-----|---------------|---------------------------|-------------------|--------------|-------------------------
2254 | enp6s0f0 | interface | e0:4f:43:cc:68:cc | |
2225 | enp6s0f2 | bond (primary, provision) | e0:4f:43:cc:68:ce | | server.example.net
2255 | enp6s0f3 | interface | e0:4f:43:cc:68:cf | |
2256 | enp0s20f0u1u6 | interface | 7e:d3:0a:5b:78:ff | |
2257 | bond0 | bond | e0:4f:43:cc:68:cd | 10.36.10.175 |
2258 | enp6s0f1 | interface | e0:4f:43:cc:68:cd | |
-----|---------------|---------------------------|-------------------|--------------|-------------------------
As you can see, enp6s0f2 is configured for provisioning; however, I would expect it to be enp6s0f0, which is the original NIC it provisioned on. Because of this, it seems the next reboot will load iPXE on the first NIC, which would result in the FDI being loaded.
I tried to fix this using hammer, but no dice:
# hammer host interface update --host server.example.net --identifier enp6s0f0 --provision --primary --mac e0:4f:43:cc:68:cc
Could not update the interface:
Identifier has already been taken
Interfaces some interfaces are invalid
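One thing I have not tried yet (untested sketch; the numeric IDs are the ones from the table above) is targeting the existing enp6s0f0 record by its numeric ID rather than re-supplying `--identifier`, on the hunch that re-sending the identifier is what trips the “has already been taken” validation:

```shell
# Untested idea: address the interface record by its numeric ID (2254
# in my table above) so hammer doesn't try to (re)set the identifier,
# which appears to collide with the existing record.
hammer host interface update --host server.example.net --id 2254 \
  --provision --primary
```

I’m not sure whether Foreman will automatically clear the primary/provision flags from the bond record (ID 2225) when they are set elsewhere, or whether that record has to be updated first.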
Unfortunately, this happens completely at random, so I’m having a hard time reproducing it or learning more about it. I also don’t see any difference in networking/bonding configuration between a host with this issue and one without it. I’m trying to better understand how Foreman gathers this interface configuration and how I can control/modify it properly.
I also don’t know if it matters, but I thought I should share that DHCP is not being managed by Foreman in my environment.
Expected outcome: The first network interface continues to operate as the provisioning interface. Is there anything I can try to do to prevent this from happening or at least fix it after the fact?
Foreman and Proxy versions: 2.2.3
Foreman and Proxy plugin versions: N/A
Distribution and version: CentOS 7
Other relevant data: See above.