Discovered hosts have no provisioning token on initial build

Problem:
New hosts provisioned via the discovery image/plugin do not have their provisioning token added to the Foreman URL in the initial PXE file.

Expected outcome:
Clicking Provision on a discovered host starts the build process and the system completes the build.

Foreman and Proxy versions:
Foreman v2.0.1
Foreman Proxy v2.0.1

Foreman and Proxy plugin versions:
Discovery v16.0.1

Distribution and version:
CentOS v7.8.2003 (Core)

Other relevant data:
PXE file gets generated with this URL: http://foreman.internal.net:8000/unattended/provision
instead of http://foreman.internal.net:8000/unattended/provision/

If I cancel the build and then put the host back in build mode, Foreman adds the token correctly. This only seems to happen when provisioning a newly discovered host.

Hello and welcome here!

That’s new. Can you attach or pastebin the complete log from this transaction using the foreman-tail tool, preferably with Foreman (Ruby on Rails) running at debug log level (settings.yaml)?

Pastebin foreman-tail during process:
https://paste.centos.org/view/55fee4bd

I noticed two things.

First, I see a hook defined via [D|app|3e204c12] custom hook after_build on mac525400b790f8.internal.orion will be executed if defined. The foreman_hooks plugin can actually cancel saving in some cases, so try again without the hook.

Second, I see the host has “build” set to false. In your case this flag comes from the hostgroup, where it is named “Build when created” or something similar (I actually don’t remember :slight_smile:). Make sure it is checked for your hostgroup.

I take the first point back: that is a new debug message in core, not a hook. However, do find that build flag and check it, because I see that the JSON payload has it set to “false”.

I have verified that when using the “Customize Host” button after discovery that the Build flag is checked, but I get the same result, no token on the provisioning URL.

Hi @lewie67

Did you manage to get this sorted or is it still a problem?

Looks like it’s still an issue, sorry I had to prep a slide deck yesterday.

The paste is 404 already, can you reupload? I need to take a look once again. This is really weird.

Yea, it’s still an issue but I got tied up yesterday dealing with some broken hardware.
Here’s a new paste: https://paste.centos.org/view/39411798

Agree with the weirdness :confused:

Thanks!

Dude… :wink:

Ugh, the centos ones only let you do 24 hours…
https://pastebin.com/zLJ0dr0N

This one should last “forever” :wink:

So it starts normally:

2020-07-16T07:28:28 [I|app|aa1d4e9c]   Parameters: {"utf8"=>"✓", "authenticity_token"=>"[FILTERED]", "host"=>{"run_list"=>{"0"=>{"type"=>"role", "name"=>"default"}}, "override_chef_attributes"=>"true", "name"=>"mac525400b790f8", "organization_id"=>"1", "location_id"=>"2", "hostgroup_id"=>"2", "content_facet_attributes"=>{"lifecycle_environment_id"=>"", "content_view_id"=>"", "content_source_id"=>""}, "chef_proxy_id"=>"", "chef_environment_id"=>"", "ansible_role_ids"=>[""], "managed"=>"true", "progress_report_id"=>"[FILTERED]", "type"=>"Host::Managed", "interfaces_attributes"=>{"0"=>{"_destroy"=>"0", "mac"=>"52:54:00:b7:90:f8", "identifier"=>"eth0", "name"=>"mac525400b790f8", "domain_id"=>"1", "subnet_id"=>"1", "ip"=>"192.168.99.102", "ip6"=>"", "managed"=>"1", "primary"=>"1", "provision"=>"1", "execution"=>"1", "tag"=>"", "attached_to"=>"", "id"=>"15"}}, "architecture_id"=>"1", "operatingsystem_id"=>"2", "build"=>"1", "medium_id"=>"11", "ptable_id"=>"179", "pxe_loader"=>"PXELinux BIOS", "disk"=>"", "is_owned_by"=>"", "enabled"=>"1", "model_id"=>"1", "expired_on(1i)"=>"", "expired_on(2i)"=>"", "expired_on(3i)"=>"", "comment"=>"", "overwrite"=>"false"}, "media_selector"=>"install_media", "id"=>"15"}

The build flag is set. Everything seems to be normal, except:

192.168.99.253 - - [16/Jul/2020:07:29:17 -0400] "GET /unattended/provision?url=http%3A%2F%2Fforeman.internal.orion%3A8000 HTTP/1.1" 405 - "-" "Ruby"

You appear to be using the Smart Proxy Templates module, which forwards the query to Foreman over HTTPS, but your Apache replies with 405. Our default configuration passes this to Passenger or Puma, depending on the version. Have you played around with that configuration?

Show me your rendered PXE template. It should contain the original URL.

Also double check that token_duration setting is NOT set to zero or some short period of time. Show me. :slight_smile:
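As a quick check, the request line from the access log can be unpacked to see which query parameters actually reached the server. This is a minimal Ruby sketch using only the standard library; the request line is copied from the log excerpt above and nothing else about the setup is assumed.

```ruby
require 'uri'

# Request line copied from the Apache access log quoted above.
request_line = 'GET /unattended/provision?url=http%3A%2F%2Fforeman.internal.orion%3A8000 HTTP/1.1'

# The request target is the second whitespace-separated field.
target = request_line.split(' ')[1]
uri = URI.parse(target)

# Decode the percent-encoded query string into a name => value hash.
params = URI.decode_www_form(uri.query || '').to_h

puts "path:   #{uri.path}"
puts "params: #{params.inspect}"
puts "token present? #{params.key?('token')}"
```

The only parameter in that request is url, so the token is already missing by the time the request is made.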

Ok, so I’ve verified token_duration and that the rendered template is correct.
Do you think it’s the templates plugin? I am using it just to make sure that if I booger something up I can revert, but it’s not necessary…

Verification of token_duration :slight_smile:

Then show me debug output of proxy.log running on that instance. It should have passed this into Foreman, there is no magic in there.

Here you go: https://pastebin.com/0majYNbc

Thanks!

I do not see the request in that log.

You have attached a screenshot of the rendered template where I can clearly read url=http://foreman.internal.orion:8000/unattended/provision?token=XYZ. That is correct. Port 8000 indicates this should be the Smart Proxy’s HTTP endpoint.

This by the way is in direct conflict with your original statement:

Are we talking apples and oranges now? :slight_smile:

Anyway, this is still weird:

What I’d expect here is a 200 result, of course. I want to know why the token got lost. Normally your host reads the PXE config file over TFTP, which correctly has the URL and token, and the Smart Proxy forwards the request to Foreman.

Now, on host foreman.internal.orion show me proxy.log with a line that receives the above call. You can find it by searching for /unattended/provision.

This is all weird, because this works out of the box. It has to be a misconfiguration. Someone help me here, I am taking a break now.

It’s possible that I’m just not doing a good job of explaining the problem :confused:
When I provision a discovered host, the PXE file gets written with all the correct information except for the provisioning token on the URL. If I cancel the build and then put the host back in build mode, Foreman writes out the PXE file correctly. Does that clear it up at all?
The 405 error is getting thrown because the client is attempting to request http://foreman.internal.orion:8000/unattended/provision which doesn’t exist.

Here’s what the pxelinux.cfg looks like on a discovered then provisioned host: https://pastebin.com/gDkHaR27
It’s like whatever process writes that out during the transition from discovery to new host doesn’t add the “?token=” part. Is there another debug flag I can turn on, or another log I can review?
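To compare the two PXE files mechanically, a small script can list every unattended URL in a config and report whether it carries a token parameter. This is only a sketch: the two sample lines below are hypothetical stand-ins, not copied from the pastebin, and the helper name provision_urls is made up for illustration.

```ruby
require 'uri'

# Returns [url, has_token] pairs for every /unattended/ URL found in the text.
def provision_urls(pxe_text)
  pxe_text.scan(%r{https?://\S+/unattended/\S+}).map do |raw|
    query = URI.parse(raw).query
    params = URI.decode_www_form(query || '').to_h
    [raw, params.key?('token')]
  end
end

# Hypothetical excerpts (assumptions), one per situation described above.
after_discovery = 'append initrd=boot/initrd.img ks=http://foreman.internal.orion:8000/unattended/provision'
after_rebuild   = 'append initrd=boot/initrd.img ks=http://foreman.internal.orion:8000/unattended/provision?token=XYZ'

[after_discovery, after_rebuild].each do |text|
  provision_urls(text).each do |url, has_token|
    puts "#{has_token ? 'token ok     ' : 'token MISSING'} #{url}"
  end
end
```

Against a real file, the same helper could be fed File.read of the per-MAC file in the TFTP directory (the exact path depends on the installation).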

Weird, can you debug this a bit? I haven’t heard of this one. Token creation is hardcoded in our codebase: the moment a host is created, it has a token generated automatically. See HostTokenTest and the HostExt::Token concern. The unattended and token_duration settings must be set. Specifically, put some debug lines in refresh_token_on_build to find out which if statement causes the token refresh to be skipped.

Gotcha. Can I get a bit of guidance on how to debug refresh_token_on_build?

Thanks!
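One way to approach the debugging asked about above is to log every input to the guard conditions before they are evaluated, then log which branch was taken. The sketch below shows that pattern on a stand-in object so it runs on its own. It is NOT Foreman’s code: FakeHost, its attributes, and the guard conditions are assumptions made up for illustration, and the real refresh_token_on_build may test different things. In an actual Foreman install the equivalent lines would presumably go through Rails.logger.debug inside the method and show up in the production log.

```ruby
require 'logger'

# Stand-in for a host record. NOT Foreman's Host::Managed; the attributes
# and guard conditions here are assumptions for illustration only.
FakeHost = Struct.new(:name, :new_record, :build, :was_build, :token_duration, :token) do
  def refresh_token_on_build(log)
    # Log every value the guards depend on before evaluating them.
    log.debug("#{name}: new_record=#{new_record} build=#{build} " \
              "was_build=#{was_build.inspect} token_duration=#{token_duration}")

    if token_duration.zero?
      log.debug("#{name}: skipped, token_duration is 0")
    elsif new_record && build
      log.debug("#{name}: new record in build mode, setting token")
      self.token = 'generated'
    elsif !was_build.nil? && build != was_build
      log.debug("#{name}: build flag changed, #{build ? 'setting' : 'expiring'} token")
      self.token = build ? 'generated' : nil
    else
      log.debug("#{name}: no branch matched, token left as #{token.inspect}")
    end
    token
  end
end

log = Logger.new($stdout)
log.level = Logger::DEBUG

# A brand-new record in build mode versus an existing record whose
# build flag did not change between saves.
fresh     = FakeHost.new('fresh', true, true, nil, 360, nil)
unchanged = FakeHost.new('unchanged', false, true, true, 360, nil)
fresh.refresh_token_on_build(log)
unchanged.refresh_token_on_build(log)
```

In this toy model the second host ends up without a token because no branch matches. Whether the discovered host takes a similar path in the real method is exactly what the debug lines are meant to establish.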