Detection of changes in host network configuration by foreman

salt.json.tar.gz (1.8 KB)
foreman.log.tar.gz (1.5 KB)

See attached foreman log and text file with the json output of salt. I’m not sure if that matches with what you need, but please let me know if you need more.

So nothing really useful there, unit tests won’t run so can’t test the JSON. This needs someone who is aware of how this works. @stbenjam or @mmoll - I believe the problem is during extracting grain data, NIC parser gets incorrect info

I discovered that it is actually the ‘foreman-node’ command from ‘smart_proxy_salt’ which retrieves the salt grains and transforms these into foreman facts.

The data is passed to the plainify method here: https://github.com/theforeman/smart_proxy_salt/blob/master/bin/foreman-node#L102

The following data goes into that method at a certain point:

{"eth1"=>[], "lo"=>["127.0.0.1", "::1"], "bond0"=>["192.168.128.3", "fe80::5054:ff:fe26:8058"], "eth0"=>[]}

And the following comes out, note the empty arrays at the beginning and end"

[[], [{"ip_interfaces::lo::0"=>"127.0.0.1"}, {"ip_interfaces::lo::1"=>"::1"}], [{"ip_interfaces::bond0::0"=>"192.168.128.3"}, {"ip_interfaces::bond0::1"=>"fe80::5054:ff:fe26:8058"}], []]

That information is then transformed into json and posted to the foreman api at /api/hosts/facts.

It seems the knowledge of the interfaces eth0 and eth1 is lost here.

The question is, what would be the correct behavior for foreman to acknowledge that the interface doesn’t have an IP address anymore?

2 Likes

I’m not sure if my previous comment is even that relevant. It seems that there are related scenarios that are also troublesome.

E.g. if I completely remove a previously existing bond0 interface, the foreman interface will still show the interface with all of its last known properties. That is probably not fix-able without some kind of cleanup mechanism at the foreman_salt plugin side.

I suspect that data is sourced from the python script at https://github.com/theforeman/smart_proxy_salt/blob/master/bin/foreman-node#L64

If you just run that script manually, do you see eth0 in the output? If that’s the case, you would probably still see those interfaces in the output from salt - eg “salt-call grains.items” and you would want to use a salt event to trigger a refresh of your grains. foreman-node may just need to do some scrubbing of entries to better populate that data…

Yes, both salt-call grains.items and the python code embedded in foreman-node contain the interfaces eth0 and eth1.

The following is the part that the python code produces:

    "ip4_interfaces": {
      "eth1": [], 
      "lo": [
        "127.0.0.1"
      ], 
      "bond0": [
        "192.168.128.214"
      ], 
      "eth0": []
    }, 

So the interfaces are still present here, but contain an empty array.

That piece of output is then processed by foreman-node and foreman-node skips those interfaces (specifically in its ‘plainify’ function.

I tried to change foreman-node such that it would pass the empty interfaces definition to foreman, however that didn’t work as expected, as foreman then had the same interface twice in its administration.
Part of the cause could be that the mac address of the interfaces changes when creating a bond, such that foreman isn’t able to identify the interfaces anymore.

Some cleaning up mechanism might also help to rectify that. But at this point, I’m not sufficiently aware of what foreman needs to get it right.

1 Like

I added some debug statements to investigate further. There seems to be a cascading effect when considering the full picture.

To paint the full picture, I started my scenario again with the bonding+vlan setup.

Problem 1.
First of all the hosts is deployed with two network interfaces, say eth0 and eth1. Afterwards salt changes the configuration to a bond+vlan setup, say bond0 and bond0.128.

In that case both the bond and the vlan obtain the mac-address of one of the slave interfaces in the bond. What is important is that the vlan interface also obtains the same IP address that was previously assigned to eth0.

The new interface information for bond0 and bond0.128 will be sent to foreman. The information is handled in the foreman core set_interface method. That method replaces the interface eth0 with the information of bond0.128. Ergo, eth0 dissappears.

Debugging showed that the variables in that method hold the following,

iface.identifier = bond0.128
iface.identifier_was = eth0

Now that could be just fine iff the interface information of eth0 is properly updated by foreman_salt. But as we saw in the previous posts, that is not the case.

Question 1: Is it correct behavior that in foreman core the physical interface is removed in favor of a virtual interface?

Problem 2.
If problem 1 is correct behavior, fixing the missing interface definitions in the facts in foreman_salt for eth0 and eth1 (as described in the preceding posts, could actually fix (or work around) the issue. However when the information of eth0 and eth1 is properly sent to foreman, they are added as new interfaces, resulting in duplicate entries.

Question 2: Is it correct behavior that if an interface is updated in foreman, while the identifier remains the same, but the mac-address and the IP address has changed, that it is added as a new interface instead of merged with the existing one with the same identifier?


I’d love to hear your thoughts on this.

I’m also interested to known how this process works with puppet, is it really working without issues? Or could it be that bond+vlan issue is never discovered because it seems to require a very specific scenario (matching IP address and mac-address of vlan with an interface). Furthermore it is difficult to notice, unless subsequent salt/puppet runs are depending on the that specific information from foreman.

Excellent debugging work @dannys, thanks! Generally, I’d say our support for bonds is weaker than we’d like - I think @Marek_Hulan would be the best person to comment on this, but he’s away at the moment. I’ve bookmarked the thread and will check back next week to make sure it’s not forgotten :slight_smile:

1 Like

That code is complicated and does not work well with fact importing because it’s built on assumption that XXX_Y or XXX.Y is always a bond or bridge. That’s not the real-world case, people or drivers may decide to name the interfaces with underscores or dots and then Foreman import code errors out.

Yes, I agree that the assumption based on NIC names is tricky. That probably requires some significant refactoring to fix that.

That may also be part of my problem, as I have a vlan interface named ‘XXX.Y’ (which I think is common for vlan interfaces), but that one is actually not recognized as a virtual interface. There doesn’t seem to be anything that could detect that it is a virtual interface, neither based on the XXX.Y naming scheme or other attributes.

I think you are hitting this problem:

Thanks! That seems part of the problem.

I’m saying ‘part of’ because it seems that the real magic for that change occurs in the FactParser::set_additional_attributes, which is called by FactParser::interfaces. Unfortunately the latter seems to be overruled by foreman salt, which doesn’t call FactParser::set_additional_attributes anymore. I’m trying to add that to foreman_salt as well and to see if that helps.

Adding FactParser::set_additional_attributes to FactParser::interfaces in foreman_salt actually helps (together with the fix from github). That solves at least my bonding + vlan issue with a disappearing physical interface. So that may be a possible solution.

Unfortunately, the more I dig into the code, the more issue I run into. Mostly in the foreman_salt plugin I think. I think it would be better to start creating individual issues for that.

Be aware that @Shimon_Shtein and I are working on the Salt importer right now - this just got merged (which fixes some issues for me, testing from you would be appreciated!)

1 Like

That is a nice reduction of code. What exactly should that fix?

Does it depend on a change in foreman core? Or is this all there is to it?

I was hitting an issue with 1.17.1 / salt 2017.6 where I couldn’t import hosts at all, the importer was bombing out - this was due to changes in core that hadn’t been reflected in the plugin.

Right, I didn’t upgrade to 1.17.1 yet, that is probably the reason why I didn’t have the same issue. I probably need to upgrade in order to test your fix properly.

Most of the issues I noticed seem to originate from the foreman_salt FactParser. Looking at the code in the master/development branch, these issues are probably still present.

I’ll give a short overview of the issues that I noticed:

  1. foreman_salt FactParser doesn’t add additional attributes based on VIRTUAL, BRIDGES, BONDS regex
    It seems that adding a call to ‘set_additional_attributes’ in the FactParser can fix that

  2. foreman_salt does not update / clear IP addresses of existing NICs if the NIC has no IP address anymore
    This is what we discussed earlier in this topic. That seems to originate from foreman_proxy_salt / foreman-node command.

  3. foreman_salt FactParser creates interfaces like bond0.1 for the IPv6 ip address definitions.
    That is strange and would problably create a separate interface with an IPv6 instead of adding it to the existing interface. However, foreman doesn’t get that far as it seems it tries to parse the IPv6 address as an IPv4 address, which fails with the messages:

        [app] [W]  Ip is too long (maximum is 15 characters)
        [app] [W]  Ip is invalid
    

    The cause seems to be this line: https://github.com/theforeman/foreman_salt/blob/master/app/services/foreman_salt/fact_parser.rb#L64

1 Like

I created the following bugreport: Bug #24139: foreman_salt does not recognize vlan interfaces (and bonds/bridges) as virtual interfaces - Salt - Foreman
and a pull request to propose a fix: https://github.com/theforeman/foreman_salt/pull/80

I also found this existing bugreport: Bug #12399: Error saving eth0.1 NIC for host failed - Salt - Foreman
which seems to be one of the issues that I discoved as well. I created a pull request for that also: https://github.com/theforeman/foreman_salt/pull/80

I didn’t add nor change tests though, as wasn’t able to get the tests up and running for the foreman salt plugin…

Any thoughts are welcome.

Small correction, it should be this pull request: https://github.com/theforeman/foreman_salt/pull/81

1 Like

Thanks @dannys, great work. I’ve found a small issue with one of the PRs for your attention, and then we can get this packaged up and released as 10.1.0 (I think)