User Survey: Supporting HA Smart Proxies

We're looking to support Highly Available Smart Proxies in a future release of Foreman and its plugins. We have come up with what we think is a good proposal, but would like input from you, the users, before we go ahead with it.

Please express any questions or concerns below!

The Problem
Today, Hosts & Subnets are assigned a Smart Proxy for every feature; for example, you select a Smart Proxy for Puppet on a Host and a DHCP Smart Proxy on a Subnet. This means that when you create a host, Foreman communicates with a single Smart Proxy for each feature and host/subnet combination.

The Scope
They say a picture speaks 1000 words…

This just shows 2 features, but we would be supporting more. The features generally fit into 2 categories:
  • Yellow: Foreman must do something on 1 of the Smart Proxies; it would try one and, if that fails, try the next.
  • Purple: Foreman must do something on all Smart Proxies in a Cluster.

Yellow would be features like DNS, Realm, Puppet CA, etc.
Purple would be features like Content (Katello), TFTP, etc.
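
The two categories boil down to two dispatch strategies. A minimal sketch in Python (names and error handling are illustrative, not the actual Foreman implementation):

```python
def call_any(proxies, action):
    """Yellow features (DNS, Realm, Puppet CA): run the action on one
    Smart Proxy in the pool, falling back to the next on failure."""
    failures = []
    for proxy in proxies:
        try:
            return action(proxy)
        except Exception as err:
            failures.append((proxy, err))
    raise RuntimeError(f"all proxies in the pool failed: {failures}")


def call_all(proxies, action):
    """Purple features (Content, TFTP): the action must run on every
    Smart Proxy in the pool."""
    return [action(proxy) for proxy in proxies]
```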

The Proposal
Create a new Smart Proxy Pool object. This would hold two attributes: a Name (used for reference, like other objects within Foreman) and a Hostname (used for client communication). It would also be assignable to Smart Proxies and to Hosts/Subnets, so when you create a new Host or Subnet, you would select a Smart Proxy Pool for each Feature instead of the current Smart Proxy selection.
There would be no limit on the number of Pools a Smart Proxy can be part of.
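
To make the shape of this concrete, creating a Pool via the API might look roughly like the following (a sketch only - the attribute names are guesses based on this proposal, not a committed API):

```python
# Hypothetical request payload for creating a Smart Proxy Pool.
pool = {
    "smart_proxy_pool": {
        "name": "content-emea",                # Name: used for reference within Foreman
        "hostname": "content.lb.example.com",  # Hostname: what clients connect to
        "smart_proxy_ids": [1, 2],             # no limit on how many Pools a Proxy joins
    }
}
```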

The current use-case still works by using a Smart Proxy Pool containing just 1 Smart Proxy, with the Smart Proxy URL and the Smart Proxy Pool Hostname set to the same value. **

A new use-case also becomes possible, where Foreman connects via one interface (or hostname/URL) and clients connect via another: use a Smart Proxy Pool containing just 1 Smart Proxy, with the Smart Proxy URL set to one and the Smart Proxy Pool Hostname set to the other. You could also create a new Smart Proxy Pool per network (or interface) the Smart Proxy is serving.

You can make your Smart Proxies active/active by using a Smart Proxy Pool with 2 (or maybe more) Smart Proxies and the Smart Proxy Pool Hostname set to your load balancer.

Some real world examples:

  • When a Smart Proxy Pool is selected for DNS and a Host created,
    • With 1 Smart Proxy assigned:
      Foreman would attempt to create the DNS record using that 1 Smart Proxy; host building would fail if that doesn't work.
    • With 2 Smart Proxies assigned:
      Foreman would attempt to create the DNS record using 1 of the Smart Proxies, it would then try the other if that fails.
  • When a Smart Proxy Pool is selected for TFTP and a Host created,
    • With 1 Smart Proxy assigned:
      Foreman would copy the TFTP Content to 1 Smart Proxy.
    • With 2 Smart Proxies assigned:
      Foreman would copy the TFTP Content to both Smart Proxies; when the client boots, it would use the Smart Proxy Pool Hostname to grab content via TFTP (this should be set to the load balancer you are using).
  • When assigning 2 Smart Proxies to a Smart Proxy Pool with Katello’s Content feature we would verify the Smart Proxies are in the same organizations, locations & lifecycle environments.
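
The verification in the last example could amount to a straightforward set comparison. A sketch (the dict keys are illustrative; the real check would live in Foreman/Katello's model layer):

```python
def compatible_for_content(proxies):
    """True when every Smart Proxy in a pool shares the same
    organizations, locations and lifecycle environments."""
    if not proxies:
        return True
    first = proxies[0]
    return all(
        set(p["organizations"]) == set(first["organizations"])
        and set(p["locations"]) == set(first["locations"])
        and set(p["lifecycle_environments"]) == set(first["lifecycle_environments"])
        for p in proxies[1:]
    )
```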

As part of the upgrade, we would create a Smart Proxy Pool for every Smart Proxy, and Host/Subnet Feature associations would be moved to Smart Proxy Pools (see ** above). Obviously there are more features where this would be very useful, especially ones provided by plugins; the ones used above are just examples :slight_smile:

How does the use of Smart Proxy Pools sound to you? Do you have any concerns?


This is interesting, I especially like the ability to have an internal network and also client-facing network, as well as the ability to break out and handle singleton services.

@evgeni and I are working on forklift #684, which gives an example of smartproxy load balancing on Foreman 1.17 and Katello 3.6. It does not split out the interfaces, and also only splits out the puppet cert signing onto one server (in this case proxy01). We ran into a few things that your solution handles. For example, the user is responsible for making sure all proxies are in the same org/loc and have the same content.

In our example, Pulp regenerates repo metadata on each smartproxy after syncing, and the repo metadata filenames are checksums, so it's possible to request repomd.xml from proxy01, then fetch a (repodata) file from proxy02, and get a 404. We changed the LB mechanism so it routes requests from the same client to the same server for 443. It would be interesting if identical metadata could land on each smartproxy.
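
The routing change amounts to source-address affinity: hash the client address to pick a backend, so one client always sees one proxy's self-consistent metadata. A rough sketch of the idea (the real work is done by the load balancer, e.g. HAProxy's `balance source`):

```python
import hashlib

def pick_backend(client_ip, backends):
    """Deterministically map a client IP to one backend, so repomd.xml
    and the repodata files it references come from the same proxy."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:4], "big") % len(backends)]
```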

In your proposal, how do puppet signing requests get routed? We ended up making port 8141 on the load balancer a specially designated port that routes to 8140 on proxy01, and proxy01 handles the signing requests. We had some difficulty inspecting the certificate request without requiring SSL termination (which loses some important info from the original request before it gets sent to puppetserver), so it was easier to just pipe 8141 to 8140 via TCP load balancing.
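
For reference, the 8141-to-8140 passthrough is plain TCP proxying to a single backend; in HAProxy it looks roughly like this (a fragment only, hostnames illustrative):

```
# TCP passthrough, no SSL termination, so the original certificate
# request reaches puppetserver on proxy01 intact.
listen puppet-ca
    bind *:8141
    mode tcp
    server proxy01 proxy01.example.com:8140 check
```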

Thanks for your feedback!

If Pulp does (or will in the future) support that, we could configure it.

There are a couple of different options here; I think TCP passthrough to one Smart Proxy (like you have done) is the simplest approach. You could also point the Puppet CA option on the client to another machine, or use SRV records.

Though I suspect you could create an active/active Puppet CA using GFS2 (or something similar) to hold the certificates; I would be super interested in trying that out.

One concern I wanted to highlight is the API change. If my understanding is correct, this means that everywhere in the API where smart_proxy_id was used, we'll need to start using smart_proxy_pool_id or some similar attribute. The same will apply to hammer. This is not optional and will affect every user, including those who don't want/need HA setups.
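
To illustrate the scale of the change (the attribute names on the "proposed" side are hypothetical, extrapolated from this thread; today's `dns_id`/`tftp_id` are real subnet attributes):

```python
# Today: a subnet references a Smart Proxy per feature directly.
subnet_today = {"subnet": {"dns_id": 3, "tftp_id": 3}}

# Under the proposal: the same associations would point at a Pool instead;
# the upgrade would auto-create a one-proxy Pool to keep existing setups working.
subnet_proposed = {"subnet": {"dns_smart_proxy_pool_id": 7, "tftp_smart_proxy_pool_id": 7}}
```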

The second concern is regarding user experience. Will it become too confusing in non-HA setups if one needs to choose a pool that contains a single proxy, instead of choosing that proxy right away?

Does that concern anyone else or are people ok with migrating to new API attributes?

Thanks for the writeup, very well done.

First, what defines whether a service is "real world example A" or "B" (in your example, DNS or TFTP)? I fail to find a good term, but I mean: must the request be done on all proxies in a pool, or on at least one? It is not very clear whether this should be hardcoded in the Foreman codebase or be some flag.

There is a lot of flexibility in deploying some services; TFTP, say, can technically be kept in sync via Foreman ("apply on all proxies in the pool") or via rsync/SAN/NAS. While I like keeping things simple, and I really appreciate the work behind finding the best possible approach to the problem, I think it might make sense to make this a configurable flag (with some sensible default).

Second, while this is quite a clean design, I think the vast majority of Foreman users will simply use a 1:1 mapping of pools and proxies. We should think from day one about UX simplifications at both the WebUI and CLI level to make things seamless. I can imagine a checkbox/option "also create a pool for this proxy" on the New Smart Proxy screen, or an "also delete the associated pool" setting.

Agreed; in the initial version it will not be configurable. We will code the feature to behave how we believe most users will want it.

Currently, there is no checkbox - we always create a Pool for every Proxy, since a Proxy is mostly useless without a Pool.

That’s a nice idea, should be simple to add - thanks.

@Marek_Hulan As far as I'm concerned this would just be another API change for the better, much like the host_params macro. Honestly, I don't think users who don't want any of the extra functionality will really care, but I'm also interested in hearing from users directly.

IMO there’s a big difference in public, versioned REST API and templates DSL that is a subset of internal objects API. I’ll defer to API users opinions.