RFC: Simple & automatic host registration WF

lzap · July 16, 2020, 10:22am

I don’t mind tools or languages - I have never stuck to a single language or stack in my whole career, don’t worry.

I just feel that if we want to have facts during registration, the most natural and logical tool is, my preference aside, Puppetlabs Facter, because we have so much code in Foreman to parse these, granted it needs a LOT of improvements. But at the same time, I would like Foreman users to have the opportunity to live without both subscription-manager and facter. I am heavily focused on provisioning and I think it’s the strongest Foreman selling point (bare-metal provisioning), therefore I am not that much interested in either content or configuration management. That’s why I am trying so hard to find a good solution for users like me.

If we end up maintaining a shell script which uploads core facts in Puppetlabs Facter JSON format, I am good, and I will be the first one implementing the Linux version because that allows me to get rid of the original facter in my workflows.

Or maybe the answer is that we need Puppetlabs Facter and there is no way without it. I am ready to accept that, however I will likely try to find some time to finish uFacter and use it instead of the original for my own use.

illumino · July 16, 2020, 10:40am

Ditto - that is why I’m investing time in understanding Foreman for myself. And I agree, for provisioning you are unlikely to be provisioning “older” OSes (for some definition of older, and newer hardware is not as diverse as it was).

It was my understanding from reading the thread that the scope was a little wider, hence my extended discussion on a fallback mechanism (and possible development strategies to handle the better fact providers when available). But I likely (ok, definitely) got more than a little carried away.

Also thanks for the write up of the minimum facts and "nice to have"s, that is useful to know.

lzap · July 16, 2020, 11:16am

Yeah, I feel like this discussion should have started with a description of what we are trying to achieve with such registration. This is quite a broad term and it means different things for different parts of Foreman: Foreman configuration management host, provisioning host, Katello content host + various other plugins.

Whatever we do, it needs to be a flexible and extensible process.

Marek_Hulan · July 16, 2020, 11:41am

One of the requirements should be: if I want to register 1000 hosts, I shouldn’t need to go to Foreman 1000 times. Perhaps just once, to get the registration command that can then be reused on multiple machines.

If I understand correctly, you suggest a JWT token that only allows fetching the global template, usable many times, perhaps with optional expiration. If that’s so, I think we can use it. The JWT token would probably still be user specific, so all other bootstrapping activities can be done only by a user who has permissions for them. We can store the user id in it, right?

TimoGoebel · July 17, 2020, 10:53am

Yep, we can store the user id in it and it supports expiration out of the box. Maybe we want something like this, where the user can choose whether they want a command that allows registering just a single host or one that allows registering multiple hosts.
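To make the "user id plus expiration" idea concrete, here is a minimal sketch of an HS256-signed JWT built with only the Python standard library. The claim name `user_id`, the function names and the shared secret are assumptions for illustration; this is not how Foreman issues its tokens.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data: bytes) -> str:
    """Base64url without padding, as used by JWT."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def issue_token(secret: bytes, user_id: int, ttl_seconds: int, now=None) -> str:
    """Sign a token carrying a (hypothetical) user_id claim and an exp claim."""
    now = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT"}
    payload = {"user_id": user_id, "exp": now + ttl_seconds}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + _b64url(signature)


def verify_token(secret: bytes, token: str, now=None):
    """Return the payload if the signature and expiry check out, else None."""
    now = int(time.time()) if now is None else now
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
        expected = hmac.new(secret, signing_input, hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
            return None
        payload = json.loads(_b64url_decode(payload_b64))
    except ValueError:
        # Wrong number of segments, bad base64, non-ASCII input or bad JSON.
        return None
    if payload.get("exp", 0) <= now:
        return None
    return payload
```

Note that nothing here limits how often a valid token is used before it expires, which is what makes the same command reusable on many hosts; a single-use command would need extra server-side state.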

This is how Office 365 does that when you want to create a “share this document” link:

Marek_Hulan · July 20, 2020, 1:25pm

Now thinking about this more, this does not work. If it’s part of the response generated on the server side, the MITM can just give you the script with another fingerprint (or skip the check entirely). It either has to be a local wrapper around curl that would do the fingerprint verification locally, or we’ll go with the TOFU model when we call curl with ?insecure=true. And by default we’ll rely on the user to install Foreman’s CA prior to the registration.

lzap · July 21, 2020, 8:28am

Leos, I see redmine feature tickets being created. Can you post a summary about what is the planned solution?

I’d like to avoid creating a third unattended rendering endpoint at all costs, assuming it would also work unauthenticated. If the new controller is authenticated only, then we are good.

TimoGoebel · July 21, 2020, 9:25am

@Marek_Hulan: I meant something like this, leveraging curl’s certificate pinning (https://ec.haxx.se/usingcurl/usingcurl-tls#certificate-pinning):

curl --pinnedpubkey "sha256//83d34tasd3rt..." https://foreman.example.com/register/host?token=123 | bash

The actual hash of the cert would be rendered by Foreman.
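For reference, the value curl expects after `sha256//` is the base64-encoded SHA-256 digest of the DER-encoded public key. A rough sketch of how a server could render the command is below; the function names are made up for illustration, and extracting the public key from the certificate is left out (it would need something like the openssl CLI).

```python
import base64
import hashlib
import shlex


def pin_for_public_key(spki_der: bytes) -> str:
    """Build a curl --pinnedpubkey value from a DER-encoded public key."""
    digest = hashlib.sha256(spki_der).digest()
    return "sha256//" + base64.b64encode(digest).decode("ascii")


def registration_command(spki_der: bytes, url: str) -> str:
    """Render the one-liner a user would copy (hypothetical helper)."""
    pin = pin_for_public_key(spki_der)
    # shlex.quote keeps the '?' and '=' in the URL from being touched by the shell.
    return f"curl --pinnedpubkey {shlex.quote(pin)} {shlex.quote(url)} | bash"
```

Calling `registration_command(key_bytes, "https://foreman.example.com/register/host?token=123")` yields a command of the same shape as the one above, with the URL quoted.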

Marek_Hulan · July 21, 2020, 11:45am

Sounds great, I wasn’t aware of this curl arg. This will work well for registering to Foreman directly, we’ll need to think more about registering through smart proxy. It’s technically doable, but it may be a good idea to start caching the proxy fingerprint, when we refresh its features.

Thanks!

lstejska · July 21, 2020, 12:14pm

We are not going to create a new unattended endpoint. We need to authenticate the user and we want to use JWT, so the final endpoint is going to be under the API scope.

Yeah, I split the work into smaller tasks so I do not end up with one big PR.

Summary of the solution:
(More details in the tasks)

Feature #30441 API endpoint for Global Registration Template
Feature #30442 JWT for Global Registration Template endpoint
Feature #30443 Global Registration Template - Content
Feature #30444 Add host registration template to Global registration template
Feature #30445 Host registration template - REX and Insights
Feature #30446 GRT parameters - validations & default values
Feature #30447 GRT - Secure communication
Feature #30459 GRT - Capsule callback support

ekohl · July 21, 2020, 12:24pm

Sadly, this can’t be used everywhere. From the man page:

              PEM/DER support:
                7.39.0: OpenSSL, GnuTLS and GSKit

Then:

# curl --version
curl 7.29.0 (x86_64-redhat-linux-gnu) libcurl/7.29.0 NSS/3.36 zlib/1.2.7 libidn/1.28 libssh2/1.4.3
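A generated script could check this up front before relying on pinning. The sketch below only encodes the man page excerpt quoted above (7.39.0 with OpenSSL, GnuTLS or GSKit); it assumes the TLS backend shows up as a `Name/version` token on the first line of `curl --version`, and newer curl releases may support more backends than listed here.

```python
import re

# Taken from the man page excerpt quoted above, not an exhaustive list.
PINNING_MIN_VERSION = (7, 39, 0)
PINNING_BACKENDS = ("OpenSSL", "GnuTLS", "GSKit")


def supports_pinnedpubkey(version_output: str) -> bool:
    """Guess from `curl --version` output whether --pinnedpubkey is usable."""
    lines = version_output.splitlines()
    first_line = lines[0] if lines else ""
    match = re.match(r"curl (\d+)\.(\d+)\.(\d+)", first_line)
    if not match:
        return False
    version = tuple(int(part) for part in match.groups())
    # Tokens look like "NSS/3.36"; keep only the library name.
    libraries = {token.split("/")[0].lower() for token in first_line.split()}
    has_backend = any(name.lower() in libraries for name in PINNING_BACKENDS)
    return version >= PINNING_MIN_VERSION and has_backend
```

Fed the EL7 output shown above (7.29.0 built against NSS), this returns `False`, so the script would have to fall back to another trust mechanism.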

Another concern is that Foreman has no setting for its own CA. Apache can serve with a different CA than the client certs so the only thing it can do is call out to the URL under foreman_url and get it from there.

We have the CA certificate configured on Foreman in the ssl_ca_file setting. I wonder if that could be used somehow.

Sadly DANE never took off but that would have been a possible way to establish the trust on an infrastructure level.

lzap · September 7, 2020, 8:53am

Generally, this is not the first time we need a “global” template. For example, the iPXE bootstrap and the user-data cloud-init bootstrap templates are both cases where we introduced a new “global” endpoint. I know it is a little bit late, but I am wondering if we should name the new endpoint simply “/global” and this registration would be “/global/register”, so we can reuse the same controller and render stack for other global templates.

Basically what I am proposing is a slight change in naming and HTTP path.

ekohl · September 15, 2020, 4:23pm

So there’s a lot of discussion spread over a lot of PRs. It turns out that at least I misunderstood subscription-manager and the proposed workflow. Perhaps @TimoGoebel as well.

This is how I propose it should be implemented, and it does differ from the original design.

Roughly speaking, we have 4 flows. The first 2 are both situations where the client talks directly to Foreman and there is no Smart Proxy.

Direct connection to Foreman

Vanilla Foreman

I used Mermaid JS to draw some sequence diagrams with how I think it should work.

sequenceDiagram
    autonumber
    Participant Foreman
    Participant Client

    Client->>Foreman: GET /register
    activate Foreman
    Foreman-->>Client: Global Registration Template
    deactivate Foreman
    activate Client
    Client->>Foreman: POST /register
    activate Foreman
    Foreman-->>Client: Host Registration Template
    deactivate Foreman
    Client->>Foreman: POST foreman_url('built')
    deactivate Client

This renders to:

In step 1 the client uses curl to retrieve a global registration template. This template is rendered and returned to the client (step 2), which then executes it. Effectively the client runs curl https://foreman.example.com/register | sh.

Then execution starts. Within the template there’s another curl request (step 3). The goal of this is to create a host entry within Foreman. That’s POST /register. If this is successful, the host object is created (in the state building). With that data, a Host Registration Template can be rendered and returned (step 4). This is then executed by the client. This execution is part of the original Global Registration Template and the user doesn’t have to do anything. After everything is completed, a POST to the built URL is sent to mark the host as built (step 5).

Authentication-wise, the user is responsible for providing credentials for the initial GET /register, for example by passing --user to curl. In the returned Global Registration Template a token is returned that allows the POST /register to happen. This token has an expiration time; currently a JWT can’t be revoked, so there is a risk of replay attacks if the token is intercepted and used multiple times. If everything is retrieved over HTTPS, this should be sufficiently mitigated as long as the user doesn’t store the rendered template insecurely (in /tmp with bad permissions, for example).

I believe the HRT also generates a token for the built URL update, but I’m uncertain about the details.
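To double check that I describe the order of the steps consistently, here is a toy in-memory model of the five steps. Nothing in it is Foreman code: the class, the method names and the opaque random token (standing in for the JWT) are all made up for illustration.

```python
import secrets


class FakeForeman:
    """In-memory stand-in for the three requests in the diagram."""

    def __init__(self):
        self.tokens = set()
        self.hosts = {}  # hostname -> "building" or "built"

    def get_register(self, authenticated: bool) -> dict:
        # Steps 1-2: an authenticated user fetches the Global Registration Template.
        if not authenticated:
            raise PermissionError("GET /register requires user credentials")
        token = secrets.token_hex(16)
        self.tokens.add(token)
        return {"template": "global-registration", "token": token}

    def post_register(self, token: str, hostname: str) -> dict:
        # Steps 3-4: the token embedded in the GRT allows creating the host,
        # which starts in the building state; the HRT is the response.
        if token not in self.tokens:
            raise PermissionError("unknown or expired token")
        self.hosts[hostname] = "building"
        return {"template": "host-registration", "host": hostname}

    def post_built(self, hostname: str) -> None:
        # Step 5: the client reports that it finished running the HRT.
        if self.hosts.get(hostname) != "building":
            raise ValueError(f"{hostname} is not in the building state")
        self.hosts[hostname] = "built"


def register(foreman: FakeForeman, hostname: str) -> str:
    """Run the whole flow the way the piped shell script would."""
    grt = foreman.get_register(authenticated=True)
    hrt = foreman.post_register(grt["token"], hostname)
    foreman.post_built(hrt["host"])
    return foreman.hosts[hostname]
```

The token is deliberately not consumed in `post_register`, which mirrors the replay concern above: anyone holding it can create hosts until it expires.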

Subscription Manager (Katello)

Subscription Manager (subman) works differently because there are some additional steps required. That means the workflow is only comparable at a very high level, but the implementations are very different.

This implementation is only relevant for Red Hat-based workflows, at least for now.

Again, providing the diagram:

sequenceDiagram
    autonumber
    Participant Foreman
    Participant Client
    Participant SubMan

    Client->>Foreman: GET /register
    activate Foreman
    Foreman-->>Client: Global Registration Template
    deactivate Foreman
    activate Client
    Client->>SubMan: subscription-manager register
    activate SubMan
    SubMan->>Foreman: POST to RHSM API
    activate Foreman
    Foreman-->>SubMan: certificates
    deactivate Foreman
    deactivate SubMan
    Client->>Foreman: GET /templates/hrt
    activate Foreman
    Foreman-->>Client: Host Registration Template
    deactivate Foreman
    Client->>Foreman: POST foreman_url('built')
    deactivate Client

The first step is still the same: the user runs curl on /register. The GRT has code to detect that subman should be used and runs step 3. I’ve drawn SubMan as a separate actor, but it happens on the Client machine. Perhaps Client should be read as shell, but that’s also an implementation detail.

During step 3, subman needs to register itself. It collects facts and prepares an API request (step 4) to the RHSM API (implemented by Katello which proxies it to Candlepin). Based on the data, Foreman ends up creating the Host object. Not drawn, but Candlepin also creates client certificates which are returned to subman (step 5). It will probably not be in status building, unless some custom fact is implemented for this. I’m not too clear on the details so I’ll invite others to correct me.

Since the HRT can’t be returned via the normal way, another way must be devised. I’m suggesting a dedicated endpoint (step 6). The exact URL is not that important.

Subman authentication (step 3) happens via Activation Keys (AKs), which is already built into subman. These keys already exist today and can be reused. It should be noted that they are secrets and users should treat them as such.

The HRT template endpoint (step 6) should accept client certificates and let clients identify themselves. This avoids the need for yet another token and we know exactly which host it is due to the properties on the presented certificate.

Communication with a Smart Proxy in between

I have ideas about how this should happen, but they’re based on the previous 2 proposals. That’s why I’d suggest we first agree on those and then expand on the other case.

TimoGoebel · September 15, 2020, 4:50pm

I’m not too sure if this is actually the case. Why can’t we use the POST /register call (#3 in your first diagram) as well? The registration via subscription-manager would just optionally happen in between (between steps 2 and 3 in your diagram).

Something like this:

sequenceDiagram
    autonumber
    Participant Foreman
    Participant Client
    Participant SubMan

    Client->>Foreman: GET /register
    activate Foreman
    Foreman-->>Client: Global Registration Template
    deactivate Foreman
    activate SubMan
    Client->>SubMan: subscription-manager register
    SubMan->>Foreman: POST to RHSM API
    deactivate SubMan
    Client->>Foreman: POST /register
    activate Client
    activate Foreman
    Foreman-->>Client: Host Registration Template
    deactivate Foreman
    Client->>Foreman: POST foreman_url('built')
    deactivate Client

ekohl · September 15, 2020, 5:05pm

IMHO it’s odd to do POST since that’s supposed to create an entity. The entity already exists.

It would be important that you actually get the host identity correct so you don’t end up with 2 host entries that differ. For example, if hostname --fqdn returns something different than subscription-manager does. One example where that can happen is having a hostname set up in /etc/hostname and a different reverse DNS. An identified request with the right credentials avoids that.

(related to that - I didn’t check if built was POST or PUT)

Marek_Hulan · September 15, 2020, 6:20pm

Looking at case 1 for now. Can you please highlight, what are the pros of the new endpoint POST /register compared to the current design? And how does the picture change if we also want to run $facter in GRT and use that information during HRT rendering?

If I understand that correctly, this is the only difference compared to the current implementation in the PRs, which splits POST /register into two requests: one for host creation, the second for rendering the HRT (both existing API endpoints). I can draw the diagram tomorrow if that helps.

ekohl · September 15, 2020, 6:47pm

The advantage is that you don’t use the POST /hosts API call that returns JSON. The current PR “parses” JSON using grep to then perform another HTTP request. That’s fragile since JSON isn’t just strings. By using a dedicated endpoint that returns the HRT directly, you avoid one HTTP request and manual JSON “parsing”. It also avoids the need for a token on the second HTTP request (since that never happens).
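To illustrate what I mean by fragile, here is a small sketch comparing pattern matching on the raw text with real parsing. The pattern and the sample responses are made up for the example; they are not taken from the PR or from the actual API output.

```python
import json
import re


def host_id_by_pattern(body: str):
    """Mimic a grep-style extraction: the first '"id":<digits>' in the raw text."""
    match = re.search(r'"id":(\d+)', body)
    return int(match.group(1)) if match else None


def host_id_by_parsing(body: str) -> int:
    """Parse the document and read the top-level id."""
    return json.loads(body)["id"]


# Three encodings of a response describing the same host (hypothetical data).
compact = '{"id":42,"name":"client.example.com"}'
spaced = '{"id": 42, "name": "client.example.com"}'
nested_first = '{"owner":{"id":7},"id":42,"name":"client.example.com"}'
```

The pattern only gets `compact` right: it finds nothing in `spaced` and picks up the owner’s id in `nested_first`, while parsing gives 42 for all three. Any change in serialization on the server side can silently break the shell script.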

Later when we add a Smart Proxy, it also avoids the need to proxy /api/hosts. That is a good thing, but something I kept out of the scope in my post.

Right now data is POSTed to /api/hosts and I imagine later on it would include facts. By making POST /register special, you could even accept a hash of facts and let the fact parser sort it out for you. I didn’t think about the exact format, but a dedicated endpoint gives you this flexibility.

Turning it around: if you want to add facts to the current method, how would you do so?

Yes, if we ignore subman and only focus on case 1 that is accurate.

TimoGoebel · September 16, 2020, 6:50am

That’s true. But we can’t know for sure if the entity exists or not. The host might have been created importing the VM, the OS might not have been linked to Foreman/Katello though and a user might just want to do that last step.

The call to /register should contain all fact/metadata as part of the payload imho.

Marek_Hulan · September 16, 2020, 7:01am

That would use the existing endpoint for uploading facts and replace the POST /api/hosts. Facts upload would create the host for us. The whole point was, the host creation part is customizable. The creation could be done by pure API call, subscription-manager, facts upload or whatever else people want to use. Handling it by explicit POST /register IMHO takes the flexibility away. Also all plugins will need to extend it with additional params, POST /api/hosts is a well known and already extended endpoint.

On the flip side, what I like about this new endpoint, we could modify the parameters per need. We may not want to allow setting e.g. compute attributes of host during the registration.

@lstejska is this acceptable for case 1, meaning vanilla Foreman with no proxy involved?

Marek_Hulan · September 16, 2020, 7:21am

Now for the second case - Subscription Manager (Katello)

What is Red Hat specific here? This seems like the generic RHEL workflow when used with Foreman+Katello.

There’s one thing to be mentioned here as well: we deploy katello-ca-consumer.rpm, which configures rhsm.conf and deploys certificates between steps 2 and 3. That is necessary for step 4 to work. The delivery mechanism may change in the future, but we still need to deliver those certs and configure rhsm.conf before step 3.

Why a dedicated endpoint? This is exactly what we already have today, we can ask for a HRT for a given host since “fixes #26925 - support host registration” by timogoebel (Pull Request #6813, theforeman/foreman on GitHub).

The existing endpoint already supports JWT authentication. I may sound like a broken record, but I’ll repeat it once again. If we rely on the host certificate, we can identify and authenticate the host but not the user who performs the registration.

The HRT may access additional resources, such as subnet, domain, and various parameters. We need to make sure the user does not use resources he or she does not have access to. We should not use any system certificates here. We need something that authenticates the user.

Also, while not that important given the above, certificates make through-proxy registration harder, while JWT makes it much simpler. The only difference I see compared to x509 is that a specific JWT can’t be revoked; we could only revoke all of the user’s tokens or deactivate his or her account completely. But that’s why we have a short expiration window. Everything goes through SSL and we use a similar mechanism for the initial GRT request. I see no problem here.