PuppetCA Orchestration: The future of autosigning

Hi, fellow devs,

TL;DR: Autosigning needs rework. If we follow the RFC approach, we will potentially lose some features. I’d like to get your opinions on that.

A while ago, Shimon opened an RFC to rework the way we currently handle PuppetCA in core.

PuppetCA is required to issue certificates for hosts so they can get a puppet catalog and do a puppet run. Right now, we just write the hostname to a file via the PuppetCA smart proxy to allow PuppetCA to issue a certificate for that host. After the host leaves the build state, we remove that entry. To limit the window in which autosigning is possible, we only create the autosign entry in the unattended controller when a host retrieves the provisioning template.
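
For context, the file being managed here is Puppet’s plain autosign whitelist. Assuming a standard AIO layout, it looks something like this (hostnames invented):

```
# /etc/puppetlabs/puppet/autosign.conf: one certname or glob per line,
# PuppetCA automatically signs CSRs whose certname matches an entry
builder01.example.com
*.lab.example.com
```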

While this generally works well, we have had a lot of hard-to-debug problems with this approach.

  1. Hosts don’t get their provisioning template when a call to the PuppetCA smart proxy fails. Some calls are very slow if you rebuild a host a couple of times.

  2. We often see that hosts don’t get a valid certificate and don’t install correctly. This is incredibly hard to debug. And the user doesn’t get notified about this.

As most of you know, our development schedule is mostly pain-driven. So I decided to implement the RFC.
The basic idea is to have PuppetCA do a callback to Foreman. Foreman receives the full CSR and can decide whether the host is allowed to install. When a host enters build mode, a one-time token is generated and placed on the host during provisioning. When the host requests a certificate, the token is incorporated into the CSR, and Foreman can verify and then invalidate the token. This greatly improves security and is not coupled to template rendering. Please see the RFC for more details.
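
To make the token idea concrete, here is a minimal sketch of the verification side, assuming the token travels in the CSR’s challengePassword attribute (the attribute Puppet’s policy-based autosigning documentation suggests populating via csr_attributes.yaml). The Token lookup and expired? check are hypothetical stand-ins for whatever Foreman ends up implementing:

```ruby
require 'openssl'

# Extract the one-time token from a CSR's challengePassword attribute.
def extract_token(csr_pem)
  csr = OpenSSL::X509::Request.new(csr_pem)
  attr = csr.attributes.find { |a| a.oid == 'challengePassword' }
  attr && attr.value.value.first.value
end

# Hypothetical Foreman-side check: verify the token and burn it.
def authorized?(csr_pem)
  token = extract_token(csr_pem)
  return false unless token
  record = Token.find_by(value: token)      # hypothetical model lookup
  return false if record.nil? || record.expired?
  record.destroy                            # one-time use: invalidate it
  true
end
```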

This approach is generally very good in my opinion, but it has some drawbacks we should be aware of.

  1. We’d have to remove the “list autosign entries” feature on the smart proxies page. We could replace it with a new page that lists all hosts that are allowed to autosign, which in my opinion would greatly improve the user experience. For better visibility, we’d also like to contribute a feature that tracks failed/rejected autosign attempts at a later point (maybe via a UI notification, not sure yet).

  2. We’d have to remove the “add autosign entry” feature on the smart proxies page. This means that a user cannot add a new host to Foreman via fact upload and use naive autosigning; the certificate would have to be signed manually on the PuppetCA (which should still be possible). We could also add the ability to define custom autosign entries in Foreman. Foreman would then also accept certificate requests for hosts that don’t have a valid token but are whitelisted. But I don’t think this is required in the first place.

Any comments? Is somebody against removing the features? Can we safely proceed with this effort?

Timo

tl;dr: good proposal, but let’s replace the autosign file with a smart-proxy service that PuppetCA calls, and add more metadata to our own autosign file.

We can remove the sudo rules to let foreman-proxy modify the autosign file (a good thing).

It does mean we have another callback from the proxy to Foreman. Traditionally we have always tried to avoid this, but we already do it in the templates plugin. Originally we viewed the smart proxy as a standalone service with just a REST API. I’d like that to remain possible, so some administration inside the proxy (a cache) could be the middle ground, though that could also weaken the whole implementation.

Ideally we would write the generated template + autosign entry to the proxy any time something in core changes, so the proxy never has to call back and provisioning is truly independent of core. To limit the scope, I’m not going into templating now, but I do think it makes sense to move it from core to the proxy.

My counter proposal would be (a rough sketch in code follows the list):

  • Foreman host enters build state
    • Foreman calls the smart-proxy with metadata about the host (build token, token expiration time)
    • Smart-proxy stores this data
    • PuppetCA asks the smart-proxy whether it is allowed to sign the certificate
  • Foreman host leaves build state
    • Foreman calls the smart proxy
    • Smart proxy updates its data
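
The sketch I had in mind, roughly (smart-proxy is Sinatra-based, but the routes, the in-memory store, and the expiry handling below are all made up for illustration):

```ruby
require 'sinatra'
require 'json'
require 'time'

# Toy in-memory store; real persistence is a separate question.
HOSTS = {}

# Foreman pushes metadata when a host enters build state.
put '/puppetca_token/:certname' do
  HOSTS[params[:certname]] = JSON.parse(request.body.read)
  status 200
end

# Foreman removes the entry when the host leaves build state.
delete '/puppetca_token/:certname' do
  HOSTS.delete(params[:certname])
  status 200
end

# PuppetCA (via its policy script) asks whether it may sign.
get '/puppetca_token/:certname/valid' do
  entry = HOSTS[params[:certname]]
  halt 404 unless entry && Time.now < Time.parse(entry['expires_at'])
  status 200
end
```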

The benefits are:

  • Signing is not dependent on a callback to foreman (= higher reliability)
  • Updating the metadata is a simple REST call and you would see during host creation if it fails
  • You could still implement listing autosign entries
  • You could still implement the manual add autosign entry

Thanks for the counter-proposal. Actually, I am really fond of the callback to Foreman. What I’d like to see in core in the future is the build history of a host.

Something like this:

  • Host entered build state.
  • Host powered on.
  • Host requested PXELinux template.
  • Host requested provision template.
  • Host requested puppet certificate.
  • Provisioning was finished.
  • Host booted into OS.
  • Host issued ready callback.

That is going to be hard to implement without any callbacks.
In addition, we already have callbacks for e.g. fact uploads or ENC data. Without Foreman, puppet won’t work at all.

Currently, we’d have to generate the PuppetCA token in Foreman so we can inject it into the provisioning template. Where would you store it on the smart proxy? A YAML file? An SQLite database?
How would you handle token invalidation? When the smart proxy verifies a CSR, the token also has to be invalidated in Foreman.
How would you signal back to Foreman that a rogue host requested a certificate, so the user can be made aware of it?
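
Whatever the storage answer is, the proxy side would need at least something like this sketch of a file-backed store with expiry and one-time consumption (path, format, and names all invented):

```ruby
require 'json'
require 'time'

# Hypothetical file-backed token store on the smart proxy.
class TokenStore
  PATH = '/var/lib/foreman-proxy/puppetca_tokens.json'  # invented path

  def self.load
    File.exist?(PATH) ? JSON.parse(File.read(PATH)) : {}
  end

  # Returns true at most once per token, then invalidates it locally.
  # Note this says nothing about invalidating the token in Foreman.
  def self.consume(certname, token)
    data = load
    entry = data[certname]
    return false unless entry && entry['token'] == token
    return false if Time.parse(entry['expires_at']) < Time.now
    data.delete(certname)
    File.write(PATH, JSON.generate(data))
    true
  end
end
```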

> You could still implement listing autosign entries
> You could still implement the manual add autosign entry

My proposal gets rid of the whole autosign concept, and I think that is actually a good idea to make things easier for users. You just need to pass the PuppetCA token to the host and then it can get a certificate. If that fails for whatever reason, or you have a host in a reboot loop, the user will get notified about it as soon as possible. But if we want to support naive autosigning, we can still do that, even based on other criteria, such as whether the host is registered in Katello, the domain is known in Foreman, or the VM is known on a compute resource. We can’t do any of that without a callback.

You’re right, but they don’t have to be in the critical path. If we consider it logging, then you could ship these logs asynchronously, with timestamps, back to Foreman and merge them later. It will complicate things, but we already have the logging plugin in the proxy that buffers these kinds of logs for similar reasons.
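
As a sketch of that buffering idea (paths and the Foreman endpoint are invented; the real logging plugin differs in the details):

```ruby
require 'json'
require 'time'
require 'net/http'
require 'uri'

# Invented sketch: buffer build events locally, ship them to Foreman
# asynchronously, and let Foreman merge them back by timestamp.
class EventBuffer
  PATH = '/var/lib/foreman-proxy/build_events.log'  # invented path

  def self.record(host, event)
    line = JSON.generate(host: host, event: event, at: Time.now.utc.iso8601)
    File.open(PATH, 'a') { |f| f.puts(line) }
  end

  # Runs from a background thread/timer, outside the critical path.
  def self.flush(foreman_url)
    return unless File.exist?(PATH)
    body = "[#{File.readlines(PATH).join(',')}]"
    uri  = URI("#{foreman_url}/api/v2/build_events")  # hypothetical endpoint
    Net::HTTP.post(uri, body, 'Content-Type' => 'application/json')
    File.delete(PATH)
  end
end
```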

Though these are standalone scripts and not part of the proxy itself, you are correct.

We also have a build token in Foreman. That’s stored somewhere as well I assume. Could it use the same build token?

These points argue for a separate token (but could have equivalent logic). Perhaps even a more generic host token concept where build and puppetca are just 2 implementations.

I have heard of people in lab setups who use * in the autosign file to easily add a lot of hosts. This would complicate that use case.

I was arguing for a path without a callback, but that doesn’t mean I’m 100% against a callback. See it as a cache miss in a proxy: then it calls back to the upstream server.

I think that having an option to add * to autosign.conf is important for many users. But at the same time, I don’t think people necessarily do this via the Foreman UI. So as long as they can still override autosign.conf manually, I think we’re good; I don’t think many users would manage autosign entries manually via the UI. Since Foreman is required to be up during host build anyway (template rendering, host built callback), I don’t think this makes things worse. So :+1: from me.

You are right, this is probably the best approach. Maybe we could even use a message queue (well, thanks to Katello we actually already have one) for this. The smart-proxy would create messages that Foreman can consume asynchronously and turn into status events in its database. But luckily, this is out of scope for the PuppetCA topic, I believe.

The PuppetCA policy script would also be standalone.

No, because we’d need to invalidate the PuppetCA token after use. The build token would still need to be valid, e.g. for the unattended/built callback.

Actually, that’s exactly my plan. The current implementation makes Token an STI model, so you end up with Token::Build and Token::Puppetca classes.
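
For readers following along, the STI shape is roughly this (the scope and column names are illustrative, not the actual migration):

```ruby
# The `type` column drives Rails single-table inheritance.
class Token < ApplicationRecord  # ActiveRecord::Base on older Rails
  belongs_to :host
  scope :expired, -> { where('expires_at < ?', Time.zone.now) }
end

class Token::Build < Token; end     # provisioning/build token
class Token::Puppetca < Token; end  # one-time PuppetCA token
```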

Yeah, it makes sense to keep that. We could either implement this in the policy script and skip the callback to Foreman, or add a setting to Foreman that makes it sign all certificates without any (token) checking.
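
For reference, the policy script contract is small: Puppet runs the configured executable with the certname as its first argument and the PEM-encoded CSR on stdin, and exit status 0 means “sign”. So a naive-autosign escape hatch could live right in the script; a sketch (the proxy URL, route, and marker file are invented):

```ruby
#!/usr/bin/env ruby
# Standalone autosign policy script, e.g. referenced from puppet.conf as
#   autosign = /etc/puppetlabs/puppet/autosign_policy.rb
require 'net/http'
require 'uri'

certname = ARGV[0]
csr_pem  = $stdin.read

# Invented "sign everything" switch for lab setups, replacing a bare
# "*" in autosign.conf.
exit 0 if File.exist?('/etc/puppetlabs/puppet/autosign_all')

# Otherwise delegate the decision to the smart proxy (invented route).
uri = URI("https://localhost:8443/puppet/ca/policy/#{certname}")
res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) do |http|
  http.post(uri.path, csr_pem, 'Content-Type' => 'text/plain')
end
exit(res.is_a?(Net::HTTPSuccess) ? 0 : 1)
```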

Your comments got me thinking about whether it’s wise to implement most of the logic in the smart-proxy. The smart-proxy would receive the CSR, process it, and do a callback to Foreman to check & invalidate the token. So we’d have a Policy Script -> Smart Proxy -> Foreman chain. When (and if) we store templates on the smart-proxy, we could then rewrite the PuppetCA handling to store the tokens on the proxy. What do you think, would that be a middle ground that’s acceptable for you?

Timo

That’s very much in line with what I was thinking even though I didn’t explicitly write it down.

On a related note: we should consider alternative names and at least prepare the allow process to handle subjectAltName validation, even if Foreman initially can’t send any alt names.
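
The alt names live inside the CSR’s extension-request attribute, so whichever component validates the CSR would need to dig them out. A rough sketch (attribute naming can vary between OpenSSL bindings, so treat this as illustrative):

```ruby
require 'openssl'

# Pull subjectAltName entries out of a CSR so they can be validated
# alongside the certname.
def alt_names(csr_pem)
  csr = OpenSSL::X509::Request.new(csr_pem)
  ext_req = csr.attributes.find do |a|
    ['extReq', '1.2.840.113549.1.9.14'].include?(a.oid)
  end
  return [] unless ext_req
  exts = ext_req.value.value.first.value.map do |seq|
    OpenSSL::X509::Extension.new(seq.to_der)
  end
  san = exts.find { |e| e.oid == 'subjectAltName' }
  san ? san.value.split(', ') : []  # e.g. ["DNS:host.example.com"]
end
```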

Hey,
I’m working together with @TimoGoebel to implement these new features, and we have a question we’d like your opinion on.
In which project should we put the policy script that calls the smart proxy and is referenced in puppet.conf?
It would make sense to have it in the smart-proxy project, as that is where the rest of the logic for policy-based autosigning lives. At the same time, it will be executed by Puppet (so we might have to set permissions manually) and requires a config file that is set by the Foreman Puppet module.
An alternative would be to put it in the foreman-proxy Puppet module. The main disadvantage is that using mismatched versions of the Puppet module and the smart proxy might break a lot of things; also, if you don’t use the foreman-proxy Puppet module to set up your smart proxy, you would have to copy-paste the script from there, making the smart-proxy documentation rather confusing.
What would you prefer or do you have any other ideas?

Julian

Until now we’ve stored puppet integration scripts in https://github.com/theforeman/puppet-foreman/tree/master/files and deployed them via foreman::puppetmaster which is called by puppet::server::config in our installer.

That said, we’ve talked here about the script talking to foreman-proxy instead of foreman directly. That means puppet-foreman_proxy could be a better place.

We’ve had this discussion for years about the best place to store this and could never reach a conclusion on a better way. A separate package could be nice. This has always been hard because we used to have templates with inline configuration. Luckily we’ve factored that out and now they’re static files (a huge improvement). We still support non-AIO installs (notably on Debian, which has unbundled Puppet 4), so we still have multiple paths to support.