The Road to Making Puppet Optional

Isn’t this already the case?
I’m asking because we might want to gather facts or reports from Compute Resources, e.g. for a VM import, and it would be good to do the heavy lifting in core, as the proxy doesn’t know anything about compute resources.

Good point. Gathering facts from CRs is definitely a great idea! The intention is that the heavy lifting (i.e. processing) is still done in core (or perhaps in a background task); the proxy is there to handle communication between Foreman and external services. I know that right now CRs all communicate directly with Foreman, but perhaps we should ask ourselves whether that is correct.
One way I can see this is a “Facts and Reports” module in a proxy that defines which sources it accepts and passes them along to the Foreman server, so you don’t have to have your external service authenticated directly with Foreman for it to work.

Currently the Puppet integration scripts (ENC, reports) call Foreman directly. My idea was to provide a Proxy module that Puppet can call instead. So currently we have:

Foreman -> Proxy -> Puppet
Puppet -> Foreman

The Proxy already has a setting that points to where the Foreman server is located, and it has the credentials to call back. Then we only need to ensure Puppet can talk to the Proxy. Another argument is that I thought it’d be easier to use separate certificates for the Proxy <-> Puppet connection. Perhaps set up a minimal Proxy instance using the Puppet certificates that knows how to securely call back to Foreman.
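
To make the proposed flow concrete, here is a minimal sketch of the relay step, written as Python pseudo-implementation (the real proxy is a Ruby application; the file paths are illustrative and the payload shape is simplified, while the /api/config_reports endpoint itself is Foreman’s API v2 one):

    import requests

    # Illustrative values: the real proxy reads its Foreman URL and SSL
    # credentials from /etc/foreman-proxy/settings.yml.
    FOREMAN_URL = "https://foreman.example.com"
    PROXY_CERT = ("/etc/foreman-proxy/ssl_cert.pem", "/etc/foreman-proxy/ssl_key.pem")
    PROXY_CA = "/etc/foreman-proxy/ssl_ca.pem"

    def forward_report(report):
        """Runs on the proxy: relay a report received from Puppet to Foreman,
        using the proxy's own credentials rather than Puppet's."""
        response = requests.post(
            f"{FOREMAN_URL}/api/config_reports",
            json={"config_report": report},
            cert=PROXY_CERT,
            verify=PROXY_CA,
        )
        response.raise_for_status()
        return response.json()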

I haven’t thought it through too much so I’d like some feedback on this.

I think this actually makes sense. The enc/report scripts currently hijack the credentials from the smart-proxy. It would be cleaner if they sent their data to smart-proxy.
How would you handle the case where the smart-proxy is up and running and receiving data, but cannot reach Foreman? Would you make this a synchronous action? Would Foreman poll the smart-proxy for data?

That was why I was thinking about a very small service that can actually be multi-process/multi-threaded. A blocking API is much easier to deal with. If it runs as the foreman-proxy user it should still have the credentials and config.

Am I reading this concept correctly: client Puppet agents would talk to the smart-proxy instead of Foreman? If yes, is this effectively using the smart-proxy as a reverse proxy?

In a way, yes. However, I’m not entirely sure how to do the auth. In the Katello setup you would have Foreman and Foreman Proxy both using the Katello default CA. Then if you have a Puppetserver, you’d ideally use the Puppet CA certificates, because that allows using the built-in HTTP report processor instead of our custom one.

One possible way is to let Foreman Proxy bind on an additional port and serve the Puppet CA there. Another is to use Server Name Indication with separate DNS names. A third, which I’m not sure will work, is to make sure the Puppet CA itself is signed by the Katello root so it’s part of the same hierarchy.
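
For the Server Name Indication option, here is a rough sketch of the idea using Python’s ssl module (hostnames and certificate paths are invented for illustration; the actual proxy would do the equivalent in its own web stack):

    import ssl

    # Default context presenting the Katello-signed certificate.
    katello_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    katello_ctx.load_cert_chain("/etc/foreman-proxy/katello.crt", "/etc/foreman-proxy/katello.key")

    # Context presenting the Puppet CA-signed certificate.
    puppet_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    puppet_ctx.load_cert_chain("/etc/foreman-proxy/puppet.crt", "/etc/foreman-proxy/puppet.key")

    def pick_certificate(ssl_socket, server_name, default_ctx):
        # Clients connecting via the Puppet-facing DNS name get the Puppet CA
        # certificate; everyone else keeps the Katello one.
        if server_name == "puppet-proxy.example.com":
            ssl_socket.context = puppet_ctx

    katello_ctx.sni_callback = pick_certificate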

Why is that? Isn’t the Katello CA part of the trusted roots of a host? The proxy would then only need to trust the PuppetCA (client auth) and could forward the data to Foreman with whatever CA.

I believe that depends on your point of view. Currently the smart-proxy has to be installed on the same host as the puppetserver so the scripts can use the credentials of the smart-proxy to send data to Foreman. If we send the data via smart-proxy we could decouple this.
Correct: if we don’t need to do any processing or normalization, it’s a reverse proxy. Would we want to do any post-processing on the proxy?

The current issue is that the interface defined in FactParser doesn’t define well what values are expected and how they create objects. Take operating systems: it’s not clear which values parsers should derive from facts; in the past, some facter versions created RHEL as “RedHat” and some as “RHEL”. The same applies to value formats, e.g. RAM: we have a comment saying we expect the value in MB, but I think it would be good to have proper documentation for the full parser interface.


This is what I would like to tackle in the fact parser rewrite: finding a common model. Basically, follow Facter 3 with some changes (Facter does some things incorrectly, e.g. reporting uptime instead of boot time, which would not change with every single fact upload). The CFM (common fact model) would not only define which facts should be reported but also the contents of those facts. I’d definitely kill all those “123 MiB” crazy formats, which may be human-readable but are totally wrong. Values should be reported in bytes, machine-readable; it’s our UI’s job to format them for human beings! :)
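
As an illustration of the kind of normalization such a model could enforce (my own sketch, not existing parser code), memory-style facts could be converted to plain byte counts at import time:

    import re

    # Facter has historically used binary multiples even when labelling values
    # "MB"/"GB", so both spellings are treated as powers of 1024 here.
    _UNITS = {"B": 1, "KB": 1024, "KIB": 1024,
              "MB": 1024**2, "MIB": 1024**2,
              "GB": 1024**3, "GIB": 1024**3,
              "TB": 1024**4, "TIB": 1024**4}

    def memory_to_bytes(value):
        """Turn '123 MiB', '15.67 GB' or a plain number into an integer byte count."""
        if isinstance(value, (int, float)):
            return int(value)
        match = re.fullmatch(r"\s*([\d.]+)\s*([KMGT]?i?B)\s*", value, re.IGNORECASE)
        if not match:
            raise ValueError(f"unrecognized memory value: {value!r}")
        number, unit = match.groups()
        return int(float(number) * _UNITS[unit.upper()])

    memory_to_bytes("123 MiB")  # => 128974848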


I wanted to see if I understood the outcome of this correctly. I’ll phrase things as a series of questions:

  • If I have Foreman without the Puppet plugin, can I still “register” hosts to Foreman via Puppet?
    • If yes, does this assume I have a smart proxy present with the Puppet CA feature?
    • If yes, is the same true for other host sources (or does it require their plugins)?
      • Ansible
      • subscription-manager
      • Chef
      • Salt

This has never been true. You only need the Puppet CA feature if you want to provision and deprovision.

Registration happens via fact upload. Currently we use the Puppet feature for authentication and authorization. The Puppetmaster does a POST to the facts endpoint using its certificate. The certificate contains the common name of the smart-proxy; Foreman checks whether this smart proxy exists and has the Puppet feature, and then allows uploading facts. Reports happen using the same authentication.

I believe foreman_ansible uploads facts and reports in the same way and abuses the presence of the Puppet feature to authenticate.

In theory you don’t even need the Puppet feature for that, since you can use a username/password with sufficient permissions instead.
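
As a rough sketch of the fact upload described above (the /api/hosts/facts endpoint and payload shape follow Foreman’s API v2, while the hostnames, certificate paths and the service account are invented), either authentication mode looks roughly like this from the sender’s side:

    import requests

    FOREMAN = "https://foreman.example.com"
    payload = {
        "name": "client.example.com",
        "facts": {"operatingsystem": "RedHat", "operatingsystemrelease": "9.3"},
    }

    # Option 1: client-certificate auth. Foreman maps the certificate's CN to a
    # registered smart proxy and checks that it carries an authorized feature.
    requests.post(f"{FOREMAN}/api/hosts/facts", json=payload,
                  cert=("/etc/puppetlabs/puppet/ssl/certs/proxy.example.com.pem",
                        "/etc/puppetlabs/puppet/ssl/private_keys/proxy.example.com.pem"),
                  verify="/etc/puppetlabs/puppet/ssl/certs/ca.pem")

    # Option 2: username/password auth with a (hypothetical) user that has
    # permission to upload facts; no Puppet feature required.
    requests.post(f"{FOREMAN}/api/hosts/facts", json=payload,
                  auth=("facts-uploader", "changeme"))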

No abuse is needed here. We already have the concept of FactImporters. They can register a smart proxy feature that is allowed to send facts.

Does this still work when the Puppet master and smart-proxy are on separate hosts?

Do I have these “registration” workflows correct?

So for Puppet (requires smart-proxy with Puppet feature):

Puppet agent -> Puppet master (impersonating smart proxy) -> Foreman

For Ansible (requires foreman_ansible + smart-proxy with Puppet feature):

Ansible on host -> Smart Proxy -> Foreman

For subscription-manager (requires Katello):

sub-man on host -> Foreman

This depends on how you configure it. The installer sets it up for REX:


I don’t know how many other users have set up the callback on non-smart proxies.

Correct, but I’m not sure if it’s implemented today; that must be verified.

Not with our current implementation. This could be fixed (if it needs fixing) by proxying the requests via smart-proxy as we’ve already discussed.

It works; I’ve used it for foreman_omaha. If you want to verify it yourself:

Hi everyone,
I have been working on Hammer and the V2 API of the new Foreman Puppet plugin, and I am looking for a more concrete direction to take:

Users of Puppet should not see a deterioration in their workflows - ideally, some workflows would even see an improvement.

Taking this quote from @tbrisker, I would assume that the main objective is to keep the new Puppet Plugin API as similar to the old one as possible. Ideally, it has exactly the same interface and the user will never notice a difference.

Advantages:

  1. No changes to the Hammer-related interface
  2. No changes for FAM
  3. No changes for Foreman users that use the API directly.

Disadvantages:

  1. Implementation is a little bit messy for Foreman 2.X, as functionality from Foreman core needs to be overridden by the plugin.
  2. We forgo API changes that might bring advantages for the new hammer-cli-foreman-puppet plugin.
  3. We might miss a chance to clean up the API (in case this is necessary).

As of now, the implementation aims for something in between by prefixing the current API routes with foreman_puppet, e.g. foreman_puppet/api/.... This is done in other Foreman plugins as well. As a consequence, people using the API directly are the ones who eventually “suffer”.

Please correct me if anything I wrote is not correct.

So the remaining question is:

  1. Continue with the current implementation?
  2. Move towards a complete copy of the current Foreman core API?
  3. Add any other changes to the new API?

Thank you for your thoughts!


Thanks for raising the concern @nadjaheitmann !
Currently, the new plugin still supports the old API endpoints as well, only logging a deprecation warning when users use them instead of the namespaced ones:

This means that users will have time to update any scripts or tools to use the new endpoints instead of the old ones and will not have a breaking change until we actually drop support for the non-namespaced endpoints.
We can certainly consider making changes to the API in the future if we feel it is needed, but I would suggest avoiding any breaking changes right now, as the move to the plugin is a big enough change already.
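
To make that concrete, here is an illustrative comparison using Puppet environments as an example resource (exact paths may differ; the plugin’s API documentation is authoritative):

    import requests

    FOREMAN = "https://foreman.example.com"
    auth = ("admin", "changeme")  # example credentials

    # Old core route: still answered by the plugin, but logs a deprecation warning.
    requests.get(f"{FOREMAN}/api/environments", auth=auth)

    # New namespaced route provided by foreman_puppet.
    requests.get(f"{FOREMAN}/foreman_puppet/api/environments", auth=auth)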


I was already hitting issues with these smart variables and them overriding anything in Hiera. Is the hostgroup way of selecting classes staying?

We have started to move away from setting hostgroup variables via Foreman and redesigned our Hiera a little to have an extra layer for hostgroups.

Here is an extract; it uses the hostgroup fact. Will this still be around in the newer versions?

" - name: “Hostgroup level settings/overrides YAML”
data_hash: yaml_data
paths:
- “environment/%{::environment}/roles/%{::hostgroup}.yaml”

We put this just before any node override and after the env and default levels.

I test and build Foreman a lot with my dev platform; we also use it in prod rather extensively.

steve