RFC: Host registration and Load balancers

Global registration is missing an option to use a load balancer when registering and configuring new hosts. Some users [0] consider this a blocker for moving to global registration from other client tools that are going to be deprecated in upcoming releases.

Implementation

Foreman
In the Smart Proxy field, list the load balancers associated with each Smart Proxy. The selected LB URL will then be used in the registration command and in the templates for subscription-manager configuration. The list of load balancers for a Smart Proxy could be loaded the same way we load features right now.

For users who haven’t updated their smart proxy configuration, or don’t have a version that supports it yet, we can display a simple text field where they can enter the load-balancer URL they want.

Smart Proxy
Store the list of load balancers in a YAML config file. It can be either in settings.yml or in a new file.

Foreman installer
New option for adding / removing load balancers

User scenario

  • Add load balancer(s) to the Smart Proxy
  • Refresh Smart Proxy in Foreman
  • Generate registration command with LB
  • Register host with it
  • ???
  • Profit

Covering this scenario would help us improve the user experience and move us closer to deprecating the other registration clients. Having information about load balancers in Foreman could also be useful to other components, for example provisioning.

[0] 2105995 – Need Proper Registration method in Load-balancing capsule setup for a clients
[1] Satellite docs - Configuring load balancers

I’m not sure I’m following. Could you provide some diagram which would show how things are deployed in this scenario?

There is a diagram from the Satellite documentation:

The deployment that is described in the docs feels more like a workaround than anything else. There have been discussions in the past about how to properly support HA proxies [1,2]. Honestly, I’m not fond of adding more functionality to the workaround.

Does the proxy need to know which load balancer it is running behind for any reason other than reporting it to Foreman?

[1] - Supporting HA Foreman Proxies
[2] - Highly Available Smart Proxies (part 2)

For content at least, the documented load-balanced proxy setup is supported and working: it has existed for many releases and has been implemented by a number of users across Katello and Satellite footprints. The goal of the RFC, as I understand it, is to bring global registration to parity with our other forms of registration and to make the user experience with this setup less painful.

As of today and this RFC? No. However, when using global registration this information is important for users. When planning and performing upgrades of the proxy this information is also useful for the user.

At least with the Smart Proxy Templates module there is the option template_url, which is the external URL which clients should use. In a load balanced setup, the user is expected to use --foreman-proxy-template-url $URL. That can point to a load balancer. The registration module reuses this setting. Similarly, the Pulp plugin has a setting for the RHSM URL, but that’s automatically set to the common name on the certificate (I think --certs-node-fqdn $LB_HOSTNAME).

It is at best a workaround because in the UI you still select just a single Smart Proxy, but it should allow you to move forward even today without any code changes.

Many years ago we already talked about adding an entity in Foreman which has 1 or more Smart Proxy instances so Foreman would understand the topology. It depends on the Smart Proxy module whether this is needed.

With that knowledge, I’m :-1: on your proposal.

This is basically rewriting the entire design. Today we have the Smart Proxy which knows its external URLs. Everything is auto discovered and you can just register a new Smart Proxy. Your proposal rewrites that. Suddenly you have to modify everything where the Smart Proxy settings are used and apply logic to change those settings.

Can you elaborate how your solution improves the situation over what we have today?

Can you elaborate what use this is?

Are any of these reported to Foreman and stored for reference?

Yes. The template_url predates the capabilities framework, so it’s exposed as a REST endpoint and usable in Foreman here:

Then it’s used here:

Nowadays it can be patched to use it via the settings without a live round trip.

proxy.setting('Templates', 'template_url')

Then for the RHSM URL we have to look in Katello where it is defined here:

And just below it is the Pulp content URL to be used by hosts. This is an extension of the Smart Proxy model, so proxy.rhsm_url and smart_proxy.pulp_content_url should work.

Thanks for the information.

Here is my attempt to summarize some takeaways. The way content proxies are configured today, there is no indicator of a load balancer. The rhsm_url is configured based on the FQDN of the host running the smart-proxy and, today, cannot be relied upon to carry the correct value in a load-balanced setup. The load-balancer documentation does not mention configuring template-url either.

First, a reminder that the current configuration string documented for installing a proxy with content behind a load-balancer is (taken from Satellite docs):

# satellite-installer --scenario capsule \
--foreman-proxy-register-in-foreman "true" \
--foreman-proxy-foreman-base-url "https://satellite.example.com" \
--foreman-proxy-trusted-hosts "satellite.example.com" \
--foreman-proxy-trusted-hosts "capsule.example.com" \
--foreman-proxy-oauth-consumer-key "oauth key" \
--foreman-proxy-oauth-consumer-secret "oauth secret" \
--certs-tar-file "capsule.example.com-certs.tar" \
--certs-cname "loadbalancer.example.com"
  1. The load-balancer documentation should be updated to include --foreman-proxy-template-url whenever a load balancer is used, i.e. pass --foreman-proxy-template-url https://loadbalancer.example.com when configuring the proxy (see Configuring Capsules with a Load Balancer, Red Hat Satellite 6.11 | Red Hat Customer Portal).

  2. The rhsm_url within puppet-foreman_proxy_content should be updated so that it is calculated from certs::apache::cname if that value exists, falling back to certs::apache::hostname as it does today.
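
To make the fallback in item 2 concrete, here is a minimal sketch in plain Ruby rather than Puppet. The helper names rhsm_hostname and rhsm_url are hypothetical, the /rhsm path is an assumption, and taking the first CNAME when several are configured is only one possible choice, not something the module does today.

```ruby
# Hypothetical helper: picks the hostname clients should use for RHSM.
# cnames   - value of certs::apache::cname (nil, a String, or an Array)
# hostname - value of certs::apache::hostname (the proxy's own FQDN)
def rhsm_hostname(cnames, hostname)
  # Array() turns nil into [] and a single String into a one-element Array.
  candidates = Array(cnames).map { |name| name.to_s.strip }.reject(&:empty?)
  # Assumption: when several CNAMEs exist, the first one is the load balancer.
  candidates.first || hostname
end

# Assumption: the RHSM endpoint lives under /rhsm on the chosen host.
def rhsm_url(cnames, hostname, path = '/rhsm')
  "https://#{rhsm_hostname(cnames, hostname)}#{path}"
end

puts rhsm_url(['loadbalancer.example.com'], 'capsule.example.com')
puts rhsm_url(nil, 'capsule.example.com')
```

With a CNAME set, the URL points at the load balancer; without one, it falls back to the proxy's own hostname, which matches today's behaviour.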

The idea being: if these values are configured correctly, the registration page should continue to work as is, without modification, while still calculating all the right values?

By loading a proxy’s load balancers, we can offer them in the registration form.

The information about load balancers would be stored in the smart proxy, like this:

# settings/load_balancers.yml
load_balancers:
  - balancer1.example.com
  - balancer2.example.com
  - balancer3.example.com
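
As a rough illustration of how such a file could be read, here is a minimal Ruby sketch. The function name parse_load_balancers is hypothetical and this is not existing smart-proxy code; it only assumes the file layout shown above.

```ruby
require 'yaml'

# Hypothetical loader for the proposed settings/load_balancers.yml content.
# Returns an Array of hostnames; a missing or empty list yields [].
def parse_load_balancers(yaml_text)
  # safe_load returns nil or false for an empty document, depending on version.
  data = YAML.safe_load(yaml_text) || {}
  raise ArgumentError, 'expected a mapping at the top level' unless data.is_a?(Hash)

  entries = data.fetch('load_balancers', []) || []
  raise ArgumentError, 'load_balancers must be a list' unless entries.is_a?(Array)

  # Normalize entries: strip whitespace, drop blanks and duplicates.
  entries.map { |entry| entry.to_s.strip }.reject(&:empty?).uniq
end

example = <<~CONFIG
  load_balancers:
    - balancer1.example.com
    - balancer2.example.com
    - balancer3.example.com
CONFIG

p parse_load_balancers(example)
```

Foreman could then receive this list when the Smart Proxy is refreshed, but how it would be exposed is left open here.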

I’m thinking about a different solution, a much simpler change that doesn’t affect many components:

We can just add one new field custom_url to the form, with basic validation, where users can put whatever they want and use that URL in the registration and configuration of the host.

No changes in smart-proxy or foreman-installer required. The downside of that is that users have to enter the URL manually every single time they create the command.
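
The “basic validation” for custom_url could look roughly like the sketch below. The function name validate_custom_url is hypothetical, and the rule it applies (only absolute http or https URLs with a host are accepted) is an assumption, not an agreed requirement.

```ruby
require 'uri'

# Hypothetical validator for the proposed custom_url form field.
# Returns the normalized URL String, or nil when the input is not acceptable.
def validate_custom_url(input)
  uri = URI.parse(input.to_s.strip)
  # Assumption: only absolute http(s) URLs with a host are accepted.
  # URI::HTTPS is a subclass of URI::HTTP, so both schemes pass this check.
  return nil unless uri.is_a?(URI::HTTP) && uri.host && !uri.host.empty?

  # Drop a trailing slash so later path concatenation stays predictable.
  uri.to_s.chomp('/')
rescue URI::InvalidURIError
  nil
end

p validate_custom_url('https://loadbalancer.example.com:8000')
p validate_custom_url('loadbalancer.example.com')
```

A bare hostname is rejected on purpose: the field is meant to hold the full URL that ends up in the registration command.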

There are 2 ways to go about this. We’ve been having this discussion with various people for years, so I’ll give a short introduction.

The first is to architect everything so the whole load balancing is transparent to Foreman. This means Foreman also talks to the Load Balancer to talk to the Smart Proxy. There is only a single Smart Proxy entity in Foreman. All URLs/certificate names on the Smart Proxy are configured to use the load balanced hostname.

In practice this means you have something like smartproxy.lab.example.com as the service hostname. This is backed by smartproxy01.lab.example.com and smartproxy02.lab.example.com.

From an architectural perspective this is very clean: no modifications need to be made in Foreman and end-user applications. The only downside is that you need to make sure both hosts are configured exactly the same. Also, there are stateful services.

For example, Pulp has state. This needs to be addressed somehow. One solution is to use shared storage (loading /var/lib/pulp from NFS, replicating it using Ceph, etc.) and a common database. Then you have failover capabilities.

Additionally, some services (like ISC DHCP) need much more complex handling. In that case it may not be feasible at all to run the Smart Proxy in a load balanced setup.

The other approach is to essentially build a Smart Proxy cluster mode. It means you make Foreman aware of the topology (essentially what you’re proposing now). There are many issues to consider here.

For example, does Foreman keep both Smart Proxies in sync? If so, how? When we discussed this in the past we came to the conclusion that it depends on each Smart Proxy module and often even provider how that should be done.

Concrete example: with the DNS module there are multiple providers. Some just talk to a network API, and that’s trivial to load balance. nsupdate is a special case. By default we install a local ISC BIND server and then connect to localhost. That’s not going to work as expected if you deploy it twice.

Similar concerns are there for Pulp. If you effectively have multiple Pulp instances they can go out of sync with each other. If there’s a load balancer in front of it, results can be unexpected.

So to make it concrete: for a load balanced Smart Proxy setup you must very carefully consider which Smart Proxy modules you enable. Which solution you can choose depends on those modules.

Your proposal barely touches the surface.

It refers to the Smart Proxy HTTP interface, so I think it should include port 8000: --foreman-proxy-template-url https://loadbalancer.example.com:8000.

CNAME is an array. Do you take the first value?

I think technically the Apache certificate only needs the load balanced hostname and doesn’t need the exact Smart Proxy hostname on it so that’s also something to consider. Though I may be missing something.

Yes, that’s how it was designed.

Note that this is exactly the same problem you would run into when you would dual home the Smart Proxy. For example, you could have a VLAN for Foreman <-> Foreman Proxy communication and a VLAN for Foreman Proxy <-> Clients. Then Foreman would also use a different hostname than what clients use.

Bother, that is right. That makes it a bit tougher to configure the right value and would have me lean towards a dedicated parameter to configure and thus indicate the correct endpoint if it’s set. Something akin to --foreman-proxy-content-rhsm-loadbalancer-host or --foreman-proxy-content-rhsm-host.

That is my biggest concern – the user experience is rather ugly and as @ekohl has pointed out there are mechanisms in place to handle this if we configure things correctly.

I also considered an explicit parameter. Perhaps it should drive certs::apache::hostname given that’s also the hostname that ends up on Pulp:

Note that for Pulp we don’t have any CNAME support. Perhaps that’s also a bug, but likely one you’ll run into if you go the CNAME route.