Provisioning discovered hosts: hostgroup settings not applied

Problem:

In my setup I have created a lot of Host Groups using Ansible, configuring the LCE, OS settings and so on (‘the works’). This used to work fine when deploying new discovered hosts. My workflow was:

  1. Boot new lab VM
  2. Wait for it to show in Foreman
  3. Click ‘Provision’
  4. Select Hostgroup, click ‘Customize’
  5. Set hostname
  6. Click ‘Submit’

And then Foreman would go its merry way and install an OS on the system.
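For context, the hostgroup tasks in my role look roughly like the sketch below (parameter names are the ones from the theforeman.foreman.hostgroup module; the connection details and all values are placeholders, and the real role sets quite a few more options):

# Rough sketch of the kind of hostgroup task my role runs
# (theforeman.foreman.hostgroup from the foreman-ansible-modules collection);
# all values below are placeholders rather than my real configuration.
- name: Create a hostgroup with LCE, CV and OS settings
  theforeman.foreman.hostgroup:
    server_url: "https://foreman.example.com"
    username: "admin"
    password: "changeme"
    name: "RedHat8-Base-Infra"
    organization: "Lab-Inc"   # as far as I know this scopes the Katello lookups
    lifecycle_environment: "Infra"
    content_view: "COV RedHat8-Base"
    content_source: "sat.rh.lab"
    operatingsystem: "RedHat 8"
    architecture: "x86_64"
    ptable: "Kickstart default first disk only"
    pxe_loader: "PXELinux BIOS"
    subnet: "Infra"
    domain: "rh.lab"
    organizations:
      - "Lab-Inc"
    locations:
      - "DC1"
    parameters:
      - name: kt_activation_keys
        value: "RedHat8-Base-Infra"
    state: present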

However, something changed (I don’t know what, as I can’t find any useful errors in the logs):

1. Hostgroup settings are not fully applied on the host customization page

a. The LCE and CV are not set
b. The OS tab settings are shown, but not actually set

When opening the discovered host’s detail page, I noticed that the LCE and CV fields are empty, while the OS settings are shown as configured on the host group. However, when clicking ‘Submit’ after setting the hostname, I receive an error on the OS page fields, as they are now magically empty. Below are the full steps I take:

  1. Select Host, click ‘Provision’
  2. Set the Host Group and click ‘Customize’
    (screenshot omitted)
  3. The LCE and CV fields are now empty (while the Host Group has them defined)
  4. Blissfully ignoring that, moving on to OS settings, I see all settings are present as seen on the Host Group. Click ‘Submit’
  5. See the error and changed settings on the same page
  6. Go back to the first tab, LCE and CV are suddenly present…

2. This ends in 404

This procedure has mostly the same results as the pictures above, so I didn’t take screenshots again :slight_smile:

  1. Select Host, click ‘Provision’
  2. Do not set a Host Group, just click ‘Customize’
  3. Set Host group, check settings (both Host and OS), all in order
  4. Click ‘Submit’ → 404
  5. Check /var/log/foreman/production.log:
2021-11-14T23:04:16 [I|app|09b5af1b] Started POST "/api/v2/discovered_hosts/facts" for 192.168.255.154 at 2021-11-14 23:04:16 +0100
2021-11-14T23:04:16 [I|app|09b5af1b] Processing by Api::V2::DiscoveredHostsController#facts as JSON
2021-11-14T23:04:16 [I|app|09b5af1b]   Parameters: {"facts"=>"[FILTERED]", "apiv"=>"v2", "discovered_host"=>{"facts"=>"[FILTERED]"}}
2021-11-14T23:04:16 [I|app|09b5af1b] Import facts for 'macb274901d1d1d' completed. Added: 0, Updated: 0, Deleted 0 facts
2021-11-14T23:04:16 [I|app|09b5af1b] Detected IPv4 subnet: Beheer with taxonomy ["HTM"]/["DC-A"]
2021-11-14T23:04:16 [I|app|09b5af1b] Assigned location: DC-A
2021-11-14T23:04:16 [I|app|09b5af1b] Assigned organization: HTM
2021-11-14T23:04:16 [I|app|09b5af1b] Completed 201 Created in 147ms (Views: 0.9ms | ActiveRecord: 35.0ms | Allocations: 44472)
2021-11-14T23:04:23 [I|app|cb29cbce] Started PATCH "/hosts/4" for 192.168.255.1 at 2021-11-14 23:04:23 +0100
2021-11-14T23:04:23 [I|app|cb29cbce] Processing by HostsController#update as */*
2021-11-14T23:04:23 [I|app|cb29cbce]   Parameters: {"utf8"=>"✓", "authenticity_token"=>"dFOTYcDiBxzb/pKZwBjaCcckJveAjBgkeogEdq6RmlaG6iNxE+olHtwVhS55EiWEzK0kOim5RjyEbP1cTpzWPg==", "host"=>{"name"=>"macb274901d1d1d", "hostgroup_id"=>"1", "content_facet_attributes"=>{"lifecycle_environment_id"=>"2", "content_view_id"=>"12", "content_source_id"=>"1", "kickstart_repository_id"=>"53"}, "puppet_attributes"=>{"environment_id"=>""}, "managed"=>"true", "progress_report_id"=>"[FILTERED]", "type"=>"Host::Managed", "interfaces_attributes"=>{"0"=>{"_destroy"=>"0", "mac"=>"b2:74:90:1d:1d:1d", "identifier"=>"eth0", "name"=>"macb274901d1d1d", "domain_id"=>"1", "subnet_id"=>"1", "ip"=>"192.168.255.154", "ip6"=>"", "managed"=>"1", "primary"=>"1", "provision"=>"1", "execution"=>"1", "tag"=>"", "attached_to"=>"", "id"=>"4"}}, "architecture_id"=>"1", "operatingsystem_id"=>"2", "build"=>"1", "medium_id"=>"", "ptable_id"=>"167", "pxe_loader"=>"None", "disk"=>"", "root_pass"=>"[FILTERED]", "is_owned_by"=>"4-Users", "enabled"=>"1", "model_id"=>"1", "comment"=>"", "overwrite"=>"false"}, "media_selector"=>"synced_content", "id"=>"4"}
2021-11-14T23:04:23 [I|app|cb29cbce]   Rendering common/404.html.erb within layouts/application
2021-11-14T23:04:23 [I|app|cb29cbce]   Rendered common/404.html.erb within layouts/application (Duration: 3.5ms | Allocations: 5823)
2021-11-14T23:04:23 [I|app|cb29cbce]   Rendered layouts/_application_content.html.erb (Duration: 3.0ms | Allocations: 5608)
2021-11-14T23:04:23 [I|app|cb29cbce]   Rendering layouts/base.html.erb
2021-11-14T23:04:23 [I|app|cb29cbce]   Rendered layouts/base.html.erb (Duration: 5.7ms | Allocations: 7609)
2021-11-14T23:04:23 [I|app|cb29cbce] Completed 404 Not Found in 28ms (Views: 16.1ms | ActiveRecord: 2.4ms | Allocations: 28478)

Expected outcome: Provision the host without complaining (or requiring the user to completely redo all settings that are already set on the host group)

Foreman and Proxy versions: 3.0.1

Foreman and Proxy plugin versions: Katello 4.2

Distribution and version: Rocky 8.4

Other relevant data:

ping @katello

Do you get the same when you leave the hostname untouched?

The missing media can simply be a consequence; the form handling is buggy. What is actually causing the validation error, can you tell? Is it something in the NICs?

Katello has nothing to do with this, I mean, the discovery process is actually an edit of an existing host. It’s a hack. Hopefully the new host form will be designed with discovery in mind.

I gave it a shot (boot VM → Customize → Submit), but the same thing happened. So even if the form is presented to me with all options set correctly, it doesn’t process them.

When I manually set the OS and media fields mentioned in the screenshot from my previous post, it does process them.

Looking at the logs, I can’t really find anything, but I did compare the logged form parameters of the two submissions:

The first one (which failed):

"host"=>{"name"=>"macb274901d1d1d",
"organization_id"=>"1",
"location_id"=>"3",
"hostgroup_id"=>"4",
"content_facet_attributes"=>{"lifecycle_environment_id"=>"",
"content_view_id"=>"",
"content_source_id"=>"1",
"kickstart_repository_id"=>"172"},
"managed"=>"true",
"progress_report_id"=>"[FILTERED]",
"type"=>"Host::Managed",
"interfaces_attributes"=>{"0"=>{"_destroy"=>"0",
"mac"=>"b2:74:90:1d:1d:1d",
"identifier"=>"eth0",
"name"=>"macb274901d1d1d",
"domain_id"=>"1",
"subnet_id"=>"1",
"ip"=>"192.168.255.154",
"ip6"=>"",
"managed"=>"1",
"primary"=>"1",
"provision"=>"1",
"execution"=>"1",
"tag"=>"",
"attached_to"=>"",
"id"=>"8"}},
"architecture_id"=>"1",
"operatingsystem_id"=>"2",
"build"=>"1",
"medium_id"=>"",
"ptable_id"=>"167",
"pxe_loader"=>"None",
"disk"=>"",
"is_owned_by"=>"",
"enabled"=>"1",
"model_id"=>"1",
"comment"=>"",
"overwrite"=>"false"},
"media_selector"=>"synced_content",
"id"=>"8"}

The second one, which succeeded:

"host"=>{"name"=>"macb274901d1d1d",
"organization_id"=>"1",
"location_id"=>"3",
"hostgroup_id"=>"4",
"content_facet_attributes"=>{"lifecycle_environment_id"=>"2",
"content_view_id"=>"15",
"content_source_id"=>"1",
"kickstart_repository_id"=>"172"},
"managed"=>"true",
"progress_report_id"=>"[FILTERED]",
"type"=>"Host::Managed",
"interfaces_attributes"=>{"0"=>{"_destroy"=>"0",
"mac"=>"b2:74:90:1d:1d:1d",
"identifier"=>"eth0",
"name"=>"macb274901d1d1d",
"domain_id"=>"1",
"subnet_id"=>"1",
"ip"=>"192.168.255.151",
"ip6"=>"",
"managed"=>"1",
"primary"=>"1",
"provision"=>"1",
"execution"=>"1",
"tag"=>"",
"attached_to"=>"",
"id"=>"8"}},
"architecture_id"=>"1",
"operatingsystem_id"=>"2",
"build"=>"1",
"medium_id"=>"",
"ptable_id"=>"167",
"pxe_loader"=>"PXELinux BIOS",
"disk"=>"",
"root_pass"=>"[FILTERED]",
"is_owned_by"=>"4-Users",
"enabled"=>"1",
"model_id"=>"1",
"comment"=>"",
"overwrite"=>"false"},
"media_selector"=>"synced_content",
"id"=>"8"}

Looking at the diff, it’s also not really spectacular, as it’s basically the same information the validator is complaining about (and I forgot to fix the boot loader).

< "content_facet_attributes"=>{"lifecycle_environment_id"=>"",
< "content_view_id"=>"",
---
> "content_facet_attributes"=>{"lifecycle_environment_id"=>"2",
> "content_view_id"=>"15",
18c18
< "ip"=>"192.168.255.154",
---
> "ip"=>"192.168.255.151",
32c32
< "pxe_loader"=>"None",
---
> "pxe_loader"=>"PXELinux BIOS",
34c34,35
< "is_owned_by"=>"",
---
> "root_pass"=>"[FILTERED]",
> "is_owned_by"=>"4-Users",

I’m not sure if this is related, but it came to mind that this is a Foreman Katello server without Puppet installed.

Looks like the content view and lifecycle environment do not carry over when you press Submit (which is essentially an edit of the host followed by an update). I need to reproduce this, not sure what is wrong.


If you need anything, please let me know. We could also set up a session where we poke around in my server (on Fridays I have the time to do this :slight_smile: as I’m not tied up in client work then).

If you want frank advice, change your workflow to avoid ‘Customize host’. That is really something I would love to kill with passion; discovery rules will work reliably, and creating a host from a hostgroup without customization will work. This though, it is full of bugs, woes and regressions.

Will try to reproduce but no promises. :frowning:


I think killing Customize host sounds like a good plan. In my experience, the only things you reasonably need to be able to do are:

  1. Set hostname (though there are plenty options to do that later as well)
  2. Set the hostgroup (which contains activation keys etc)

The rest I could happily do without :slight_smile:

The only reason I was diving into Customize host was to set a hostname; if that were possible in the small pop-up, I’d probably never dive into the Customize menus. But due to the bugs you mention, I need to correct a whole lot of other things…

I will test my workflow with just assigning a host group and setting the hostname later (I have actually never tested changing the hostname of a system that is already registered, so it will be interesting to see whether Foreman is updated with the new hostname :slight_smile: )

I agree that the hostname would need to be available there before we could remove it. Let’s wait until the Edit Host form is redesigned and then we can either remove it or perhaps integrate it in a better way.


My new VM workstation came in yesterday and I have re-deployed the whole enchilada again, but I found something odd.

  • I use Ansible to configure Host Groups with all required details
  • When I boot a blank VM it’s discovered, but I can’t install it.
    • Manually clicking ‘provision’ and setting a host group prompts for a missing installation source.
    • Discovery rules provide the following error:
2022-01-13T11:26:38 [I|app|12ae8218] Detected IPv4 subnet: Beheer with taxonomy ["HTM"]/["DC-A"]
2022-01-13T11:26:38 [I|app|12ae8218] Assigned location: DC-A
2022-01-13T11:26:38 [I|app|12ae8218] Assigned organization: HTM
2022-01-13T11:26:38 [I|app|12ae8218] Match found for host macaab6aa1c2acb (5) rule FreeIPA (2)
2022-01-13T11:26:38 [W|app|12ae8218] Could not find a provider for macaab6aa1c2acb. Providers returned {"Katello::ManagedContentMediumProvider"=>["Kickstart repository was not set for host 'macaab6aa1c2acb'", "Content source was not set for host 'macaab6aa1c2acb'"], "MediumProviders::Default"=>["Operating system was not set for host 'macaab6aa1c2acb'", " medium was not set for host 'macaab6aa1c2acb'", "Invalid medium '' for ''", "Invalid architecture '' for ''"]}

This strikes me as odd, as the Ansible module calls contain all the required parameters, and when I inspect the created Host Group it does show the contents as I intended them to be.

So just for the sake of it, I opened a Host Group, didn’t change a thing, clicked save and voila, it works.

But I don’t understand why.

Any idea what Ansible is doing differently than my browser?

P.S. Discovery rules are really interesting, thanks for the tip!

Do you use nested hostgroups? That is also a rabbit hole, try with a flat one :slight_smile:

Also check org/loc, there can be issues with that too.

To wrap it up:

  • Avoid host customizations in discovery
  • Avoid nested hostgroups
  • Rename hosts before provisioning (can be done via CLI or mass UI action)

Sorry, no, plain Host Groups :confused:

I did do some tests with renaming machines; even if you’re too late with the rename (so the install already started before you made the change), it’s still very trivial to update it afterwards :slight_smile: DNS will update immediately (if you have it set up that way :wink: )

When you create a new (fake) host with that hostgroup, does it work that way?

Discovery provisioning is nothing but editing an existing host, converting it from discovered to managed type and saving it with new parameters.

I tried reproducing it, but ever since I manually saved that one Host Group I can no longer reproduce the error.

Or I have been doing something wrong.

However, as you mentioned, ‘Customize Host’ is still not working, but that’s fine, as I have adapted my workflow to set the hostname immediately after creating the host.

:tada: yay, it’s still broken (that’s not good, but that means I can reproduce it :slight_smile: )

I made a lot of different host groups (for the different types of machines in my lab setup). And then I applied the ‘open and save it’ workaround, presto.

The full log is in a Pastebin paste titled “2022-01-20T00:26:37 [I|app|b4a15647] Completed 200 OK in 4ms (Views: 0.1ms | Act…”.
The clicks I did:

  1. Open the host in Discovered hosts
  2. Click Provision, assign Host Group (Host group ID 5 = Rocky8-Kubernetes-Beheer)
  3. Get prompted for the media
  4. Go to the Host group, open it, save it (no changes made)
  5. Repeat from 1
  6. Success

Oh gosh, rebuilding the edit host form is on this year’s agenda. We will solve this once and for all. I do not want to dive into this code; discovery overrides a ton of stuff and it’s been a PAIN to maintain. Unless you crack it.


@lzap it took me some time, but I found a clue!

This is a hostgroup right after it has been created by Ansible:

Id:                    4
Name:                  RedHat8-Base-Infra
Title:                 RedHat8-Base-Infra
Description:           
  Managed by Ansible, your changes will be lost
Network:               
    Subnet ipv4: Infra
    Domain:      rh.lab
Operating system:      
    Architecture:     x86_64
    Operating System: RedHat 8
    Partition Table:  Kickstart default first disk only
    PXE Loader:       None
Puppetclasses:         

Parameters:            
    autopart_options => --nohome
    kt_activation_keys => RedHat8-Base-Infra
    remote_execution_create_user => true
    remote_execution_effective_user_method => sudo
    remote_execution_ssh_keys => ['a bunch of ssh keys']
    remote_execution_ssh_user => ansible
Locations:             
    DC1
Organizations:         
    Lab-Inc
OpenSCAP Proxy:        
Content View:          
    Id:   15
    Name: COV RedHat8-Base
Lifecycle Environment: 
    Id:   2
    Name: Infra
Content Source:        
    Id:   1
    Name: sat.rh.lab
Kickstart Repository:  
    Id:

And this is after applying the workaround I described in my previous post:

Id:                    4
Name:                  RedHat8-Base-Infra
Title:                 RedHat8-Base-Infra
Description:           
  Managed by Ansible, your changes will be lost
Network:               
    Subnet ipv4: Infra
    Domain:      rh.lab
Operating system:      
    Architecture:     x86_64
    Operating System: RedHat 8
    Partition Table:  Kickstart default first disk only
    PXE Loader:       None
Puppetclasses:         

Parameters:            
    autopart_options => --nohome
    kt_activation_keys => RedHat8-Base-Infra
    remote_execution_create_user => true
    remote_execution_effective_user_method => sudo
    remote_execution_ssh_keys => ['a bunch of ssh keys']
    remote_execution_ssh_user => ansible
Locations:             
    DC1
Organizations:         
    Lab-Inc
OpenSCAP Proxy:        
Content View:          
    Id:   15
    Name: COV RedHat8-Base
Lifecycle Environment: 
    Id:   2
    Name: Infra
Content Source:        
    Id:   1
    Name: sat.rh.lab
Kickstart Repository:  
    Id: 144

So for some reason the Kickstart Repository ID isn’t properly saved, but the weird thing is that when I open the hostgroup in the WebUI it is visible :thinking:
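(Side note: to double-check what is actually stored, rather than what the WebUI renders, something like the sketch below should work. I’m assuming here that theforeman.foreman.resource_info with full_details returns the kickstart_repository_id field; connection details and the hostgroup name are placeholders.)

# Sketch: read the hostgroup back through the API to see what is really stored.
# Assumes resource_info's full details include kickstart_repository_id.
- name: Read back the hostgroup as Foreman has it stored
  theforeman.foreman.resource_info:
    server_url: "https://foreman.example.com"
    username: "admin"
    password: "changeme"
    resource: hostgroups
    search: 'name = "RedHat8-Base-Infra"'
    full_details: true
  register: hg_info

- name: Show the stored kickstart repository id
  ansible.builtin.debug:
    msg: "{{ hg_info.resources[0].kickstart_repository_id | default('not set') }}"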

Does this help in finding out what causes this? :blush:

Something in the UI that Katello overrides does not play well with what Discovery overrides. Sounds so bad, I know.

@lzap o/

Well… I have a confession to make: my Ansible role made a boo-boo when configuring the hostgroups.

So even though the forms may be a bit plagued (not all issues described earlier are fixed), this was a problem of my own doing. When looking at my code again I noticed that I didn’t add the kickstart_repository argument (I did add medium…). :man_facepalming:

Which totally explains the behaviour and why the workaround actually works. Opening the hostgroup edit form tries to autocomplete some fields (including the kickstart repo) when the Content View has been properly defined (which it is). Saving the hostgroup then stores whatever kickstart repo the form’s auto-resolve logic has found.
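For reference, the fix on my side boils down to passing the kickstart repository explicitly in the hostgroup task, along these lines (a sketch; the repository name and connection details are placeholders, the parameter name is the one from the theforeman.foreman.hostgroup module):

# Sketch of the fix: set the kickstart repository explicitly instead of
# relying on the edit form's auto-resolve. The repository name below is a
# placeholder for whichever synced kickstart repo belongs to the CV/LCE.
- name: Ensure the hostgroup has a kickstart repository set
  theforeman.foreman.hostgroup:
    server_url: "https://foreman.example.com"
    username: "admin"
    password: "changeme"
    name: "RedHat8-Base-Infra"
    organization: "Lab-Inc"
    lifecycle_environment: "Infra"
    content_view: "COV RedHat8-Base"
    kickstart_repository: "Red Hat Enterprise Linux 8 for x86_64 - BaseOS Kickstart 8.4"
    state: present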

It still doesn’t fix the ‘Customize host’ workflow, but in our current situation that’s perfectly fine; the workaround you suggested of just setting the name later is solid.

Especially as we’re also going to focus more on using discovery rules combined with Ansible (see https://github.com/theforeman/foreman-ansible-modules/pull/1431 for a WIP module that can create them), which eliminates the need to customize a host altogether.
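Just to sketch the direction: once that module lands, a rule could be defined in the same playbooks, something like the example below. Since the module is still a WIP, the parameter names here are my assumption based on the Foreman discovery rule API, and all values are placeholders.

# Sketch of a discovery rule managed from Ansible. The module was still a WIP
# at the time of writing, so the parameter names are assumptions based on the
# Foreman discovery rule API; all values are placeholders.
- name: Auto-provision discovered lab VMs into a hostgroup
  theforeman.foreman.discovery_rule:
    server_url: "https://foreman.example.com"
    username: "admin"
    password: "changeme"
    name: "FreeIPA"
    search: "facts.productname = KVM"
    hostgroup: "RedHat8-Base-Infra"
    hostname: "ipa<%= rand(99) %>"
    max_count: 1
    priority: 10
    enabled: true
    organizations:
      - "Lab-Inc"
    locations:
      - "DC1"
    state: present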

So sorry for creating the wild goose chase with regards to the Ansible modules, but thank you a lot for all the input you’ve given! :partying_face: :rocket:
