Infrastructure roles

aruzicka · January 15, 2021, 11:47am

Intro

Historically, Foreman has treated itself, Smart Proxies and Hosts as completely unrelated objects, even though Foreman and Smart Proxy machines are usually registered as Hosts in Foreman. So far we have managed without knowing the relationship between Hosts, Smart Proxies and Foreman, but lately use cases that would be rather hard to implement without this link have started popping up. Most of them originate in remote execution, but I’m sure that once we establish this link, it will find its use in other places as well.

Use cases

Because Foreman treated itself and Smart Proxies the same way as any other Host, it was possible to run remote execution jobs against Foreman itself or against its Smart Proxies. What wasn’t really possible, however, was requiring a special permission for doing so. Users could add such a permission by hand, but they then had to keep the permission’s filters in sync with their Hosts and Smart Proxies.

In Satellite land we have two Ansible playbooks which are meant to be run against the Satellite infrastructure itself. One sets up the cloud.redhat.com connector and is meant to be run against Foreman; the other is meant to run against Smart Proxies and upgrades them. Again, because Foreman doesn’t know the relationships, we cannot offer only the relevant hosts when triggering these jobs, and keeping track of which host does what is left to the user.

Even in Foreman land, we could think of many places where this would make our life easier. E.g. we could have a job to install a plugin, perform a backup, clean up ArfReport storage on OpenSCAP enabled smart proxies and so on.

Proposal

To address this, I propose we establish a link between the Host and Smart Proxy objects. Because only a small number of the Hosts will actually be linked to Smart Proxies, this relationship will be tracked by an InfrastructureFacet, which will be created on demand. The association between Host, InfrastructureFacet and SmartProxy would be as follows:

Host 1 -- 0..1 InfrastructureFacet 0..1 -- 0..1 SmartProxy

Since there is no Foreman object, there will be a field in the facet, marking the Host as Foreman. This will allow us to have permissions based on whether a Host is Foreman and/or Smart Proxy and filter Hosts by the same criteria.
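A minimal sketch of that association, using plain Ruby structs rather than real ActiveRecord models. The attribute names (`foreman`, `smart_proxy`, `infrastructure_facet`) and the `infrastructure_roles` helper are assumptions made for illustration, not the schema from the POC.

```ruby
# Plain-Ruby stand-ins for the three objects; attribute names are hypothetical.
SmartProxy = Struct.new(:name, :instance_id)

# Created on demand, only for Hosts that are Foreman and/or a Smart Proxy.
InfrastructureFacet = Struct.new(:foreman, :smart_proxy)

Host = Struct.new(:name, :infrastructure_facet) do
  # Roles derived from the facet; a Host without a facet has none.
  def infrastructure_roles
    facet = infrastructure_facet
    return [] if facet.nil?

    roles = []
    roles << :foreman if facet.foreman
    roles << :smart_proxy unless facet.smart_proxy.nil?
    roles
  end
end

proxy = SmartProxy.new("proxy1.example.com", "9c1f6a1e-2f0b-4c59-9d0a-6a1d2b3c4d5e")
hosts = [
  Host.new("web1.example.com", nil),
  Host.new("foreman.example.com", InfrastructureFacet.new(true, nil)),
  Host.new("proxy1.example.com", InfrastructureFacet.new(false, proxy))
]

# Filter Hosts by role, e.g. to offer only relevant targets for a job.
proxy_hosts = hosts.select { |host| host.infrastructure_roles.include?(:smart_proxy) }
puts proxy_hosts.map(&:name).inspect
```

The same role list could back both the permission check and the host search filter.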

Implementation details

To be able to link Hosts against other objects reliably, we need to have a piece of information that will be available on both sides, so we can use it as a key. On the Foreman side, we already have instance_id and as part of my POC I introduced the same instance identifier to Smart Proxy as well.

Deployment

When you run the installer to install Foreman, it will generate a new UUID, store it in the database and then create a custom fact out of it. This way we will be able to compare the UUID from Host’s facts against the actual instance_id. I can imagine three cases there:

There is no fact and therefore the Host is not a Foreman.
There is a fact and it matches Foreman’s instance_id, meaning the Host is this Foreman.
There is a fact, but it differs from Foreman’s instance_id, meaning the Host is a different Foreman instance.
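The three cases can be sketched as a small classification function. The fact name `foreman_instance_id` is a placeholder I made up for this example; the real custom fact may be called something else.

```ruby
# Hypothetical name of the custom fact written by the installer.
FOREMAN_FACT = "foreman_instance_id"

# Compare the UUID reported in a Host's facts with this Foreman's instance_id.
def classify_foreman_host(facts, local_instance_id)
  reported = facts[FOREMAN_FACT]
  return :not_foreman if reported.nil? || reported.empty?
  return :this_foreman if reported.downcase == local_instance_id.downcase

  :other_foreman
end

local_id = "5d3b9f0a-7c1e-4a2b-8f6d-1e2f3a4b5c6d"
puts classify_foreman_host({}, local_id)
puts classify_foreman_host({ FOREMAN_FACT => local_id }, local_id)
puts classify_foreman_host({ FOREMAN_FACT => "00000000-0000-4000-8000-000000000000" }, local_id)
```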

For Smart Proxy the process is similar. The installer generates a UUID, puts it into Smart Proxy’s settings.yml, creates a custom fact out of it and also sends it over to Foreman when registering the Smart Proxy. Smart Proxy’s instance_id will also be available through the /versions API, so that clicking Refresh in Smart Proxy details updates its instance_id.

Each time Foreman receives facts, it will look for the custom facts, creating or editing InfrastructureFacet accordingly.
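A rough sketch of the Smart Proxy side of that fact import, assuming a hypothetical fact name `smart_proxy_instance_id` and a plain hash standing in for the facet table. It records the reported UUID and links the Host only to a proxy that is already registered; it never creates a proxy.

```ruby
# Hypothetical fact name; the facet is modelled as a plain hash per host name.
PROXY_FACT = "smart_proxy_instance_id"

RegisteredProxy = Struct.new(:name, :instance_id)

# Create or update the facet for host_name based on the uploaded facts.
# Hosts that do not report the fact never get a facet.
def import_proxy_fact(facets, proxies, host_name, facts)
  reported = facts[PROXY_FACT]
  return facets if reported.nil?

  facet = (facets[host_name] ||= {})
  facet[:smart_proxy_uuid] = reported
  # Link only against an already registered proxy; unknown UUIDs stay unlinked.
  facet[:smart_proxy] = proxies.find { |proxy| proxy.instance_id == reported }
  facets
end

proxies = [RegisteredProxy.new("proxy1.example.com", "9c1f6a1e-2f0b-4c59-9d0a-6a1d2b3c4d5e")]
facets = {}
import_proxy_fact(facets, proxies, "web1.example.com", {})
import_proxy_fact(facets, proxies, "proxy1.example.com", { PROXY_FACT => proxies.first.instance_id })
puts facets.keys.inspect
```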

In situations where we’re not installing a new Foreman instance, the way we’d deploy this stays the same for Smart Proxy, but is slightly different for Foreman. Foreman already has its instance_id, but it is generated when Foreman first starts and then kept in the database. To be able to perform this flow, we will need to get the value out of the database so the installer can work with it; more details in Add 1.

Edge cases

Of course, this proposal is not perfect and does not address every single possible eventuality under the sun, such as:

multiple Smart Proxies with different instance_ids on a single Host
multiple Hosts reporting the same Smart Proxy instance id, although this could be partially addressed by one-to-many association between SmartProxy and InfrastructureFacet
multiple Hosts reporting different Smart Proxy ids, but being behind a load balancer. If users put Smart Proxies behind a load balancer, they should make sure all the hosts report the same id, which reduces this issue to the previous one

POC PRs

Add 1 - Upgrade considerations

I can see three paths we could take here, but I wouldn’t call any of them nice.

Don’t do anything special and behave the same way as if a clean installation was being done. This would result in Foreman’s instance_id getting changed, which may not be an issue for some users, but it would break things for users using find-it-fix-it from cloud.redhat.com as it relies on the instance_id.
Adding a note to the release notes, saying that if a user needs to keep the same instance_id, they should retrieve it manually and then pass it as an argument to the installer.
An installer migration/hook, which would essentially do 2) behind the scenes.
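To make options 2) and 3) more concrete, here is a sketch of how the installer could pick the instance_id. The `resolve_instance_id` helper and its `existing_id` / `override` parameters are hypothetical; how the existing value is actually read from the database is left out.

```ruby
require "securerandom"

UUID_FORMAT = /\A\h{8}-\h{4}-\h{4}-\h{4}-\h{12}\z/

# existing_id: value already stored in the database (nil on a clean install).
# override:    value passed by the user as an installer argument, if any.
def resolve_instance_id(existing_id: nil, override: nil)
  chosen = override || existing_id || SecureRandom.uuid
  raise ArgumentError, "not a UUID: #{chosen}" unless chosen.match?(UUID_FORMAT)

  chosen
end

# Clean install: a fresh UUID is generated.
puts resolve_instance_id
# Upgrade: the existing id is kept, so anything relying on it keeps working.
puts resolve_instance_id(existing_id: "5d3b9f0a-7c1e-4a2b-8f6d-1e2f3a4b5c6d")
```

Option 1) would correspond to ignoring `existing_id` and always generating a new value.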

ekohl · January 15, 2021, 12:20pm

Having an instance UUID on the Smart Proxy feels very complicated. Why is that needed? It adds a lot of complexity and I’m not sure it’s really needed for the constraint.

Here’s another thing that I thought about in another context. Sometimes you need to link to yourself from Smart Proxy code. However, there is no external servername (i.e., smartproxy.example.com) variable. That means you can’t construct https://smartproxy.example.com:8443 other than guessing. This could also be useful to identify multiple Smart Proxies behind a loadbalancer (where the system hostname is not the same as the service name). Would this be a better alternative to UUIDs?

What I’m concerned about is matching via facts. If a user has root on the system, they can impersonate any Smart Proxy as long as they know the UUID. We have always kept these relations as a separate registration process for security.

Overall I’m not very happy with the additional complexity. This is really a lot for IMHO an edge case. The relationship can make sense, but the implementation doesn’t feel right.

aruzicka · January 15, 2021, 1:07pm

We need the same piece of data on both sides (Smart Proxy and Host) to be able to establish the relationship. Without it we could make an educated guess at best. It doesn’t have to be a uuid per se, but since Foreman already uses a uuid as its instance id, I went with it for the smart proxy as well. Additionally, I’d say having the shared piece of data be random makes it harder for someone to guess it.

Are you suggesting we teach smart proxy its own external name and then match host’s fqdn against the proxy’s external name?

If a user has root on the system and they spoof the custom fact, then either

There’s no proxy with the spoofed uuid and nothing happens apart from us storing another uuid in the db
There’s a proxy with the spoofed uuid and it gets linked against the host. This means you need higher permissions in Foreman to manipulate that host. It could be an issue if we kept the facet-proxy relationship 0..1 to 0..1, because this new host would “shadow” the old one. But if we changed the relationship so that a proxy has many facets, it should have no negative effect.

In any case, it doesn’t allow the host to do anything more than it can do now, quite the contrary. The host can imitate a smart proxy, but it won’t really gain anything by doing that.
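The difference between the two cardinalities can be shown with a toy example. Both helpers below are illustrative only; the links are kept in a hash keyed by the proxy uuid.

```ruby
# 0..1 to 0..1: a proxy points at a single host, so a later claim replaces it.
def link_one_to_one(links, proxy_uuid, host)
  links.merge(proxy_uuid => host)
end

# has-many: every claiming host is kept, so the original host is not shadowed.
def link_has_many(links, proxy_uuid, host)
  links.merge(proxy_uuid => (links.fetch(proxy_uuid, []) + [host]).uniq)
end

uuid = "9c1f6a1e-2f0b-4c59-9d0a-6a1d2b3c4d5e"

single = link_one_to_one({}, uuid, "proxy1.example.com")
single = link_one_to_one(single, uuid, "userhost1.example.com")
puts single[uuid].inspect

many = link_has_many({}, uuid, "proxy1.example.com")
many = link_has_many(many, uuid, "userhost1.example.com")
puts many[uuid].inspect
```

With has-many, both hosts end up requiring the higher permissions, and an admin can at least see that two hosts claim the same proxy.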

ehelms · January 15, 2021, 2:54pm

Is this a kind of “dumb” matching where if there is a host and smart-proxy with the same FQDN within Foreman we assume they are the same entity and link them up? I am trying to understand the data structure and workflow compared to the UUID proposal. Would this be:

I register a host, then I register a smart-proxy, if smart-proxy reports same FQDN as an existing host, link them
I register a smart-proxy, then I register a host, host checks if smart-proxy with same FQDN exists, link them if so
If I do either of #1 or #2, and they have the same FQDN as the Foreman server itself, link and mark as Foreman?

Are there edge cases or mismatches that can occur here?

ekohl · January 15, 2021, 3:25pm

When the Smart Proxy connects to Foreman via an authenticated channel, it presents a certificate with a common name. Foreman then searches its database for a Smart Proxy with this common name. Technically this certificate is optional, but in practice it’s always present. That means there is already a name for a Smart Proxy as Foreman would know it. I’m wondering if this could be reused.
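As an illustration of that lookup, the sketch below pulls the common name out of a certificate subject with Ruby’s OpenSSL bindings and searches a list of proxies for it. The `KnownProxy` struct and its `hostname` attribute are stand-ins; how Foreman actually stores and matches the name is not shown here.

```ruby
require "openssl"

KnownProxy = Struct.new(:name, :hostname)

# Return the common name from a certificate subject, or nil if there is none.
# With a real certificate the subject would come from cert.subject.
def common_name(subject)
  entry = subject.to_a.find { |field, _value, _type| field == "CN" }
  entry && entry[1]
end

# Find the proxy whose (hypothetical) hostname attribute equals the CN.
def find_proxy_by_subject(proxies, subject)
  cn = common_name(subject)
  proxies.find { |proxy| proxy.hostname == cn }
end

proxies = [KnownProxy.new("Proxy One", "proxy1.example.com")]
subject = OpenSSL::X509::Name.parse("/O=Example/CN=proxy1.example.com")
found = find_proxy_by_subject(proxies, subject)
puts found ? found.name : "no proxy for this certificate"
```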

I think that essentially the goal we’re trying to achieve here is a full topology of Foreman and its services. Another feeling I’m getting is this: in the past, Foreman stated for all its hosts what should run on those machines, and configuration management (typically Puppet) then made this happen. Now it seems we’re trying to reverse the relationship, so that the machine reports what it’s running and Foreman figures it out.

Just spitting out thoughts, but this feels similar to how we handle (DHCP) subnets. On the one hand you can define them in Foreman. Then configuration management can query Foreman and realize it on the actual Smart Proxies, like configure ISC DHCP. Another approach (also implemented) is to scan subnets on the Smart Proxy and import them to Foreman.

That’s also similar to how we deal with interfaces on hosts. You can let facts report it and update the host representation. The other way is to run configuration management on hosts to align the configuration with what Foreman thinks it should be. Again, both are implemented in Foreman.

This is yet another instance of where the data can flow one way or the other. We’ve never answered it for our users and let them choose.

lzap · January 15, 2021, 3:32pm

This was exactly my initial thought. The thing is, when a host presents a valid X509 certificate that also has a Common Name (hostname), it is guaranteed to be the host in possession of the private key. If someone (an installer, an operator) then registers a proxy with the same name, we know for a fact it is the proxy host and can do the association. In practice this could be some kind of ActiveRecord callback on the proxy.

If you still want to be explicit (you mention a UUID, which assumes you want something to explicitly pair the hosts), then we can encode extra information into the certificate. This assumes we have finalized our own certificate management utility that would be able to issue such a certificate. Since the certificate is signed and therefore trusted, it does not have to be a UUID; just the information whether the host is Foreman, a proxy or both should be enough. Something like:

generate-cert --type regular_host xyz
generate-cert --type foreman xyz
generate-cert --type smart_proxy xyz
generate-cert --type foreman_with_smart_proxy xyz

This will make sure that even on infrastructure that has incorrect DNS, or whatever, when such a host checks in the association is automatically created correctly on fact upload (or any kind of action that is done via the secure channel).
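On the Foreman side, the type read from a verified certificate could then be translated into roles. This is only a sketch; the type names follow the hypothetical generate-cert invocations above, and extracting the type from the certificate is not shown.

```ruby
# Maps the hypothetical certificate type to the infrastructure roles it grants.
CERT_TYPE_ROLES = {
  "regular_host" => [],
  "foreman" => [:foreman],
  "smart_proxy" => [:smart_proxy],
  "foreman_with_smart_proxy" => [:foreman, :smart_proxy]
}.freeze

# Unknown types are rejected rather than silently treated as regular hosts.
def roles_for_cert_type(type)
  CERT_TYPE_ROLES.fetch(type) do
    raise ArgumentError, "unknown certificate type: #{type}"
  end
end

puts roles_for_cert_type("foreman_with_smart_proxy").inspect
```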

aruzicka · January 15, 2021, 4:47pm

In theory you can have as many certs with the same CN as you want, right? If that is true, then it is guaranteed that it is a host in possession of the private key, but it may or may not be the same host.

Let’s say we go with certificates. How would we use that to answer “is the host a foreman?” question? It could probably work if we “baked-in” some additional information into the certs as @lzap suggested, but what about deployments where foreman and a smart proxy are not colocated on the same box? Foreman would never call to itself so it would never report itself. I know that in typical deployment the two are colocated, but it is not a strict requirement.

ehelms · January 15, 2021, 9:05pm

Could Foreman, on application start up, seeding, somewhere in the process of getting spun up, create a host entry itself and ensure that it exists and has the right information? When we talk about wanting to manage our own infrastructure objects, I find it strange that we have to wait for something else to create the host object so that we can then link them up rather than having a first class object that represents our infrastructure objects by the sheer existence (you could extend this to smart-proxy as well).

Marek_Hulan · January 18, 2021, 10:35am

I don’t think this is the best identifier. The hostname or the domain can change together with the certificate, while the instance is still the same. I don’t mind what authenticated channel we use to get the identity of the proxy, but ideally it’s not tied to one. Or we’d need to officially say we no longer support HTTP-only proxies and keep the HTTP option only for features that require it (provisioning to get templates). Regardless of the transmitting channel, IMHO we should create a new identifier. Proxy should also report it in the capabilities API.

Although today, we don’t have a way to deploy a Smart Proxy from Foreman. We can only manually inform Foreman about its existence. I think that’s not a bad flow. You either auto-discover or in cases where it’s not possible, you define manually.

I think UUID is sane for consistency with the Foreman. Also, there can be two proxy.example.com on one Foreman and it’s perfectly valid setup. FQDN is not a unique identifier. At the same time I doubt it really works with our taxonomies, but that’s another story.

So do we no longer support pure HTTP proxies? I’m fine with that, but then I think we really need what you suggested: storing this identity in the certificate, which requires the certificate management work to be done. That probably does not prevent us from introducing the UUID first and doing the certificate change later.

ekohl · January 18, 2021, 11:06am

We do, and have had one for the better part of a decade: provision one with Puppet. That’s a pattern that I see in a few open RFCs: we’re reinventing configuration management. It feels to me that Red Hat Satellite never really understood or embraced Puppet. Now it has Ansible and it’s finding out all the things Puppet is used for.

A long time ago I wrote up a proposal to import the installer post installation and use that to manage it. It’s even still up on Foreman :: Contribute. Post-installation import idea · theforeman/theforeman.org@e64b7c5 · GitHub dates back almost 7 years now and this feels like a similar initiative but with Ansible behind it.

ekohl · January 18, 2021, 1:51pm

Having thought about this more, I can define 3 separate areas that we can talk about. Each area can also be divided into Foreman and Smart Proxy.

Database modeling

Smart Proxy

We want a way to store this relation in the database. In the opening RFC there is a model.

From what I know about deployments, in most cases there is 1 host which runs the Smart Proxy. There are some cases where they are load balanced. In some cases it doesn’t run on a (managed) host.

The implication is that a host can have 0 or 1 Smart Proxy (through the InfrastructureFacet). From that it also follows that a Smart Proxy has 0 to n Hosts.

To me this sounds correct. In my experience it matches with how those are treated in practice. I am assuming that all these values will just be foreign keys.

Foreman

In practice Foreman typically runs on a single host, but my feeling is that in the community it’s more common to load balance a Foreman instance than a Smart Proxy instance. That means you can have n hosts run a Foreman instance. Of course these hosts don’t have to be self managed so it can also be 0 hosts.

This effectively means that you can have 0 to n hosts which are a Foreman host. Those may or may not have a Smart Proxy as well.

The proposal is to store this in an InfrastructureFacet on the host. However, it doesn’t specify how the host will be marked. Will it be a boolean (marked implies that) or will it store the Foreman instance UUID?

Management of relations

For both relations you can choose how to manage them. There are several options:

Manage by hand. Arguably the most correct but also the most tedious.
Manage via fact imports
Manage via some other way.

Note that there may be multiple ways of managing it.

Automatically managing Smart Proxy to Host relationships

The proposal is to add an instance ID. I think this adds a lot of complexity while there already is something to identify the Smart Proxy (common name on the client certificate).

Another thing that came to my mind is systemd’s machine-id. Perhaps that’s easier to correlate with facts.

Both have their flaws as well so at this point I’m not sure what’s the most reliable way of doing this.

Automatically managing Foreman relationships

Implementation detail: I saw that the instance ID was written as a database setting while the fact was written out statically. This means they can easily go out of sync. It also means that if you don’t run Puppet continuously, the admin can go into settings and change the instance UUID, and a later installer run will then reset it to the old instance UUID.

I think it would be better if it was implemented as a dynamic fact. Reading something from Foreman is usually slow if you need to initialize rails so it may be better to read out a file somewhere.

To prevent the admin from changing the instance ID, it can be written to settings.yaml. This makes values read-only from the UI/API.

Combining these 2 (implement fact by reading settings.yaml) might be an issue with file permissions though.
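To illustrate the “read out a file” idea: the helper below reads the instance ID from a small YAML file without loading Rails, and returns nil when the file is not readable, which is where the permission problem would show up. The key name `instance_id` and the file layout are assumptions; this is not how the POC does it.

```ruby
require "yaml"
require "tempfile"

# Read the instance ID from a YAML file instead of booting Rails.
# Returns nil if the file is missing, unreadable or lacks the key.
def read_instance_id(path)
  return nil unless File.readable?(path)

  data = YAML.safe_load(File.read(path))
  data.is_a?(Hash) ? data["instance_id"] : nil
end

# Demonstration with a temporary file standing in for the real settings file.
Tempfile.create("instance-settings") do |file|
  file.write(%(instance_id: "5d3b9f0a-7c1e-4a2b-8f6d-1e2f3a4b5c6d"\n))
  file.flush
  puts read_instance_id(file.path)
end
```

A dynamic fact could call such a helper on every run, so the fact would follow the file instead of a value frozen at install time.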

Other notes

Something that I haven’t seen in this discussion is how to clean up entries. If a host report comes in without the fact, does it remove the relationship? What if you run Ansible and it doesn’t report the right facts but also have Puppet running which does?

Ondrej_Prazak · January 18, 2021, 2:00pm

We have something similar in openscap - proxy sends information about itself with a report so that we can identify the source proxy if it is behind load balancer.

github.com

theforeman/puppet-foreman_proxy/blob/master/manifests/plugin/openscap.pp#L23


      
          #                               So we will not request the XML from Foreman each time
          #
          # $reportsdir::                 Directory where OpenSCAP report XML are stored
          #                               So Foreman can request arf xml reports
          #
          # $failed_dir::                 Directory where OpenSCAP report XML are stored
          #                               In case sending to Foreman succeeded, yet failed to save to reportsdir
          #
          # $corrupted_dir::              Directory where corrupted OpenSCAP report XML are stored
          #
          # $proxy_name::                 Proxy name to send to Foreman with parsed report
          #                               Foreman matches it against names of registered proxies to find the report source
          #
          # $timeout::                    Timeout for sending ARF reports to foreman
          #
          # $ansible_module::             Ensure the Ansible module
          #
          # $puppet_module::              Ensure the Puppet module. This only makes sense if Puppetserver runs on the same machine.
          #
          # === Advanced parameters:
          #

github.com

theforeman/foreman_openscap/blob/master/app/controllers/api/v2/compliance/arf_reports_controller.rb#L107-L117


      
          if !params[:openscap_proxy_url] && !params[:openscap_proxy_name] && !@asset.host.openscap_proxy
            msg = _('Failed to upload Arf Report, OpenSCAP proxy name or url not found in params when uploading for %s and host is missing openscap_proxy') % @asset.host.name
            upload_fail(msg)
            return
          elsif !params[:openscap_proxy_url] && !params[:openscap_proxy_name] && @asset.host.openscap_proxy
            logger.debug 'No proxy params found when uploading arf report, falling back to asset.host.openscap_proxy'
            @smart_proxy = @asset.host.openscap_proxy
          else
            @smart_proxy = SmartProxy.unscoped.find_by :name => params[:openscap_proxy_name]
            @smart_proxy ||= SmartProxy.unscoped.find_by :url => params[:openscap_proxy_url]
          end

Marek_Hulan · January 18, 2021, 2:11pm

What I mean by that is a one-click experience for creating such a proxy. Not a “configure provisioning, import puppet module to your puppet server, import all to Foreman and set the smart class params, assign this to the host and provision it” kind of procedure. I think the use case is linking the Host to the Smart Proxy, so we can run certain operations against such a host. Deploying a proxy through Puppet is one way that this linking should be compatible with. As you described well in your second post, there may be a need for other mechanisms, such as manual linking.

IIUC Puppet can still be used in this RFC. What I like about a custom fact is that it’s universal. All cfgmgmt systems we have can easily add custom facts. In fact the RFC relies on a Puppet custom fact, I believe. It just addresses the detection, not the deployment part. You can still use Puppet to automate the deployment.

aruzicka · January 18, 2021, 2:14pm

Although I haven’t mentioned it explicitly in the rfc, in the POC PRs I store the UUID in the facet for both foreman and the smart proxy. This allows us to break the association if for example proxy’s uuid changes. In foreman’s case it allows us to distinguish if a host is this foreman, another foreman or not foreman at all.

I’m not sure if relying on something outside of our control is a good idea. Especially since you can run smart proxy on various non-systemd platforms, such as windows.

Originally foreman’s instance id was generated on first start and stored in the db. In this proposal and poc prs I tried to make as few changes as I could, leading to the possibility of the fact and the setting getting out of sync. If we decide matching using facts is the way to go, then it would make sense to make the setting more static.

Currently it does not. Should it? My thinking was that once a machine reports itself as foreman/smart proxy, it keeps being foreman/smart proxy until reprovisioned.

Marek_Hulan · January 18, 2021, 2:22pm

I mentioned earlier, FQDN is not a great identifier for local networks and example.com-like domains. I’d prefer some other identifier, the machine-id sounds cool. I only wonder if we can rely on that in future e.g. inside a container. If yes, then plus one, if not, randomly generated UUID sounds more appealing to me.

I know that this was required earlier to be changeable. Therefore the DB was made the source of truth and we let users modify this easily. I’d be OK with this getting out of sync and letting the user manually change the relationship, since moving Foreman between hosts seems quite rare, but I know it can happen during an upgrade of the underlying OS, for example.

Good question, I’d like to echo it. My assumption is that we’d not make any changes for a missing fact, but we could clear the link when the fact key exists with a nil value?
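That rule could look roughly like this. The fact name is a placeholder and the function only computes the value to store; it is a sketch of the idea, not a proposal for the actual importer code.

```ruby
# Hypothetical name of the custom fact carrying the proxy UUID.
PROXY_UUID_FACT = "smart_proxy_instance_id"

# current: UUID stored in the facet (or nil). Returns the value to store next.
def next_proxy_uuid(current, facts)
  # Fact key missing entirely (e.g. an Ansible run without it): keep the link.
  return current unless facts.key?(PROXY_UUID_FACT)

  # Key present: a nil value clears the link, anything else replaces it.
  facts[PROXY_UUID_FACT]
end

stored = "9c1f6a1e-2f0b-4c59-9d0a-6a1d2b3c4d5e"
puts next_proxy_uuid(stored, {}).inspect
puts next_proxy_uuid(stored, { PROXY_UUID_FACT => nil }).inspect
```

This way a fact source that does not know about the fact cannot accidentally break a link created by another one.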

ekohl · January 18, 2021, 2:28pm

UUID was also what I think would be the correct implementation. This allows for a use case where there is a Foreman instance, but not the one currently used (like managing a Foreman instance in a lab under management of another Foreman).

Good point.

Maybe not automatically, but if it was linked we should provide a way to break the relationship. This allows correcting mistakenly linked hosts or after a migration. For example when a Foreman is cloned for a lab but then starts to live its own life. Then it should not be linked. Being able to manage this via the UI/API is probably sufficient.

lzap · January 18, 2021, 9:28pm

If two hosts check in via HTTPS with the same X509 client certificate (the same CN), Foreman will only keep a single inventory record, ultimately leading to those hosts overwriting each other’s facts (and thus the UUID) over and over again.

What I am not comfortable with is the ability for any host with a valid certificate to upload a UUID and pretend it’s a smart proxy. I am talking security. I am assuming that any UUID checked in via facts would “upgrade” a host to a smart proxy. If I misunderstand your proposal, then fill me in.

My thinking is, if there was information in the cert itself “this is a smart proxy”, then this could be verified on fact upload. There must be a human involved in the process confirming the association; in my idea this happens before a host can check in for the first time.

Now that I am thinking about it, I see that a human could confirm which host is the correct one in Foreman UI in a more comfortable way. That would work too as long as it is mandatory.

aruzicka · January 19, 2021, 12:41pm

I believe I mostly answered it here.

To rephrase, it only matches a host against an existing smart proxy and creates a link if the uuids match. It does not create a new proxy, it doesn’t grant the host any new privileges or capabilities and it doesn’t alter the proxy in any other way. If a host gets linked against a smart proxy, then users might need additional permissions to interact with that host in foreman. What is the attack vector here?

evgeni · January 20, 2021, 7:41am

I can think of the following attack vector:

create a normal host in Foreman (that’s what most users are allowed to do)
install a smart proxy on the host, but don’t really run it
inject the instance_id fact of an existing proxy and thus let Foreman think that userhost1.example.com is proxy1.example.com
wait
the admin updates proxy1.example.com using the proposed “upgrade proxy” playbook
as proxy1.example.com is really userhost1.example.com, THAT gets upgraded (and as it has a proxy installed that works)
the user now owns all credentials (oauth keys, certs, blah) of proxy1.example.com as the upgrade process made sure those are refreshed during the upgrade
do whatever you want with the permissions proxy1 has

This is a rather long-running attack, but I think we’ve all seen that those are the ones that are most interesting.

PS: Of course, I did not verify that the upgrade playbooks (should they exist) refresh any credentials or anything, but just because they don’t today doesn’t mean someone won’t add that tomorrow, not knowing that the proxy can be impersonated (it really shouldn’t be).

aruzicka · January 20, 2021, 9:09am

This is indeed possible. I always assumed that even if using those shortcuts the user would still go through the remote execution form, where they could spot that they’re trying to update proxy1.example.com on suspicious-host.somewhere.else as a last line of defense.
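That manual check could in principle be automated. The sketch below flags a target whose host name differs from the host part of the linked proxy’s URL; the `suspicious_target?` helper is hypothetical, and it would produce false positives for proxies behind a load balancer, where the two names legitimately differ.

```ruby
require "uri"

# True when the job target's name does not match the proxy's URL host.
def suspicious_target?(host_name, proxy_url)
  proxy_host = URI.parse(proxy_url).host
  proxy_host.nil? || proxy_host.downcase != host_name.downcase
end

puts suspicious_target?("proxy1.example.com", "https://proxy1.example.com:8443")
puts suspicious_target?("userhost1.example.com", "https://proxy1.example.com:8443")
```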