Smart Proxy: Future Design, Scaling and Use Cases

Perhaps another approach to consider is to stop looking at a proxy as a mini-monolith that provides many features in one service, and instead look at it as a microservice that provides just one.

That way we could scale out just the services that are needed by spinning up additional proxies providing them. It would also resolve, or at least simplify, the security concern of, for example, a template proxy gaining access to network configuration, as each proxy instance would have its own permissions (and maybe a container encapsulating it?). We could even optimize if needed by switching a specific service to a different language, as long as some basic API structure is maintained (IIRC @ekohl even did a POC once of creating a proxy in Python). Perhaps in the long run this could also be a path to simplifying Foreman itself, by offloading some of the logic into these new services (e.g. template rendering, initial fact parsing etc)?
The downside of a microservices approach is it would require a bit more work to make sure all services are running and set up properly, and in some cases it would probably also need some changes in the way some of the services work internally (e.g. to enable scale-out by additional nodes).
This path could be done gradually though - for example, have most services run in one proxy, but a few that require scale/optimization/lower security would each be run as separate instances. I believe in some cases users are already running some services in a separate proxy rather than go the all-in-one route, so if we agree about it, we just need to double down on this approach as the recommended path.

This would help a lot with running proxy in a container.

We could also get rid of smart_proxy_dynflow_core and instead deploy a rex-only smart proxy.

To aid me in thinking on this, do you envision this as a 1:1 mapping between service and port? For example, REX smart-proxy on 9000, template smart-proxy on 9001, registration smart-proxy on 9002. Or is this microservices behind the scenes with a single web interface and a single port?

Making a note that another RFC just opened plays into some of these design considerations and discussions happening there: Infrastructure roles

I think that could be an implementation detail, depending on how we want to proceed - it could be multiple containers running on one host exposing different ports, it could be multiple webservers on the same port behind a reverse-proxy with vhosts or even answering to different paths, it could be completely different machines. It might also depend on how it makes sense to scale different services.
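To make the "different paths behind a reverse proxy" variant concrete, here is a minimal Python sketch of mapping a URL path prefix to a backing service. The prefixes and upstream addresses are illustrative assumptions (the ports reuse the 9000/9001/9002 example from the question above); none of this is an existing Foreman or smart-proxy configuration.

```python
from typing import Optional

# Hypothetical routing table: path prefix -> upstream address.
# Ports follow the 9000/9001/9002 example raised earlier in the thread.
ROUTES = {
    "/rex": "127.0.0.1:9000",
    "/templates": "127.0.0.1:9001",
    "/register": "127.0.0.1:9002",
}


def resolve_upstream(path: str, routes: dict = ROUTES) -> Optional[str]:
    """Return the upstream for the longest matching path prefix, or None."""
    best = None
    for prefix in routes:
        # Match whole path segments only, so "/rexfoo" does not hit "/rex".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return routes[best] if best is not None else None
```

The port-per-service variant is the same table without the router in front: clients and Foreman would talk to each upstream address directly.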

Another thought that came from a discussion with @Marek_Hulan just now, is that maybe we can have some predefined proxy “bundles”, e.g. “provisioning proxy” that contains DNS, DHCP and TFTP, “content proxy” that only contains a content proxy with pulp preconfigured to work with it, “REX proxy” that contains dynflow (and maybe a mqtt broker in the future?) and so on.
Whether we go with a proxy per one feature or set of features, we would need to maintain some sort of stable API with foreman indicating what the proxy can do (and I think capabilities API really enhances our ability to do so).
Different features have different requirements (e.g. content requires a lot of disk space, some DHCP providers require file access for managing leases, OpenSCAP requires heavy report processing and so on) and can thus be scaled/optimized/secured differently without trying to find a “one size fits all” configuration.
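As a rough illustration of the bundle idea, the Python sketch below models bundles as sets of features and lets a caller pick the proxies that advertise a given feature. The bundle contents come from the examples above; the proxy names and the `proxies_for` helper are hypothetical and do not reflect the actual capabilities API.

```python
# Hypothetical bundle definitions, taken from the examples in this post.
BUNDLES = {
    "provisioning": {"dns", "dhcp", "tftp"},
    "content": {"pulp"},
    "rex": {"dynflow"},
}


def proxies_for(feature: str, proxies: dict) -> list:
    """Return names of proxies whose advertised features include `feature`."""
    return sorted(name for name, features in proxies.items() if feature in features)


# Two hypothetical proxies, each built from one or more bundles.
proxies = {
    "proxy-a": BUNDLES["provisioning"],
    "proxy-b": BUNDLES["content"] | BUNDLES["rex"],
}
```

With this shape, Foreman only needs to know which features a proxy advertises, not whether it was deployed as a single-feature instance or as a bundle.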

I indeed wrote a PoC in Python

The commits show the steps you generally take. Then an additional blog post helps you understand the registration.

My initial goal for that was to implement the Smart Proxy registration directly in Pulp 3 as a plugin so that you can deploy content without a Ruby Smart Proxy. (This is why I think RFC: Container Gateway Smart Proxy Plugin (Container registry access for Pulp 3 Smart Proxies) is moving in the wrong direction.)

In Smart Proxy Feature classes I tried to start a similar discussion.

As much as I like the idea of moving toward containers in a non-intrusive way, by not trying to break the RoR app but cherry-picking the features that need to scale up, remember that containers do not contain. But you are right: as long as we stick with SELinux turned on, it will be an improvement.

However, one big advantage of smart-proxy is easy deployment. Linux, Windows, BSD. Small VM. Easy installation. No dependencies. This would only work if we also supported non-cluster installation. Just podman pull or docker pull and run.

Note that it doesn’t necessarily have to run inside a container - it is just one option. It could also be that for Linux-based OSes it runs in a container and for Windows/BSD it runs as a regular process. (TBH, I wonder how many users actually use the proxy on a non-Linux system - we might be spending a lot of effort supporting something that isn’t even used).

I’m not sure if we can integrate with Active Directory (DNS, DHCP) if we don’t run on Windows. In the past we also had users on BSD.

However, we don’t test it, so it’s not guaranteed to work. It would be nice to utilize GitHub Actions to test on Windows if we want to support that.

If I have not missed any option, DHCP is only possible with native_ms, which requires the CLI of a Windows server. DNS is no problem with nsupdate with GSSAPI. I have used the latter quite often; DHCP on Active Directory I have never used in production, and I am only aware of one environment, where a colleague is using it.

The manual installation is off-putting on Windows, so really supporting it would mean not only testing but also packaging it. Perhaps this would also be easier with something other than Ruby, as this option was mentioned.

Anecdotally, we recently had an issue where public traffic was crafting bad packets, which would hard lock the smart proxy service and force a manual restart before it would serve content again. We changed the firewall to only allow requests from the Foreman Server, and never had any issues again.

While the smart proxy was locked, we could not provision machines, since the TFTP server couldn’t be updated with the appropriate configs, etc. The smart proxy was a vulnerable point to a denial of service attack on our provisioning system.

In our case we were not making use of any of the client-facing features, and so blocking requests from anything except Foreman was a reasonable solution to the problem.

Separating concerns to allow (and encourage) only communicating with the smart proxy from the Foreman server sounds like a great design decision.
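The fix described above was a firewall rule. Purely as an illustration of the same policy, the Python sketch below checks whether a caller's address belongs to an allowed set of networks. The address used for the Foreman server is a documentation placeholder and the `is_allowed` helper is hypothetical; it is not part of smart-proxy.

```python
import ipaddress

# Hypothetical address of the Foreman server; replace with real values.
FOREMAN_NETWORKS = [ipaddress.ip_network("192.0.2.10/32")]


def is_allowed(remote_addr: str, allowed=FOREMAN_NETWORKS) -> bool:
    """Return True only if the caller's address is inside an allowed network."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed addresses are rejected rather than raising.
        return False
    return any(addr in net for net in allowed)
```

A check like this only works when no client-facing feature is in use on that instance, which was the case in the setup described here.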

I am a tad late to the party, but is “Smart Proxy” the process running theforeman/smart-proxy or the “whole machine” (including other services)? This seems a bit mixed up in the “features targetting” section, and makes it hard for me to follow along.

The introduction seems to aim at the process, but then I don’t follow how sub-man/global registration fits into this as it’s purely an “Apache proxy” thing.

I agree that “the process” should not have user/client facing things, but it does today:

  • template feature will expose the Foreman templates on the proxy
  • the users can (and do) fetch SSH keys from /ssh/pubkey
    Granted, both are not strictly long-running or high-traffic (but neither is the container-auth redirect in the linked proposal, if I read it correctly).

However, the process can be co-hosted on the same machine that also has client-facing things (like apache), thus giving up a bit of security based on the users preferences.

Global Registration & Smart Proxy

There is no documentation yet (WIP), but the functionality of the module is pretty simple: it forwards GET and POST requests for the /register endpoint to Foreman and adds a url parameter (the proxy’s URL) to the forwarded params.
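A minimal Python sketch of that forwarding step, assuming the only rewriting needed is appending the `url` parameter to the incoming query string. The function name, the hostnames, and the `hostgroup_id` parameter in the usage note are made up for illustration; the real module is written in Ruby and may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_forward_url(foreman_url: str, request_query: str, proxy_url: str) -> str:
    """Build the Foreman /register URL for a request received by the proxy."""
    params = parse_qsl(request_query, keep_blank_values=True)
    # Add the proxy's own URL; "url" is the parameter name mentioned in the post.
    params.append(("url", proxy_url))
    scheme, netloc, _, _, _ = urlsplit(foreman_url)
    return urlunsplit((scheme, netloc, "/register", urlencode(params), ""))
```

For example, `build_forward_url("https://foreman.example.com", "hostgroup_id=1", "https://proxy.example.com:8443")` keeps the original parameter and appends the percent-encoded proxy URL as `url`.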

Is the smart-proxy the right place for client facing actions

For Global Registration it can be useful in cases where host machines don’t have direct access to the Foreman instance.

What architectural changes are needed to the smart-proxy to support increased traffic?

I’m not sure if it is needed, since the GR module is just forwarding requests and responses. Maybe adding some caching of rendered templates could improve R&R times.
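If caching were added, it could look something like the time-limited cache sketched below in Python. The class and its parameters are hypothetical, and whether a rendered registration template is safe to share between requests is not established in this thread, so treat the cache key and TTL as open questions.

```python
import time


class TemplateCache:
    """Tiny in-memory cache with a time-to-live, keyed by request parameters."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so the expiry logic can be tested
        self._entries = {}

    def get_or_render(self, key, render):
        """Return a cached body for `key`, calling `render()` on a miss or expiry."""
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        body = render()
        self._entries[key] = (now, body)
        return body
```

Here `render` would be whatever call forwards the request to Foreman; the cache only avoids repeating that call while an entry is fresh.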

What should be our scaling guidelines for smart-proxy deployments based on deployed features?

Not sure to be honest.

Oh, I see, there is a proxy-like module, much like the templates one I mentioned above.

Then I vote for extracting the Windows-only code into a super tiny Ruby Sinatra “mini proxy” and rewriting the DHCP MS module to call it. Then dropping all Windows code and calling it a day.


I like where this is going quite frankly. Containerizing Foreman is something we can probably only dream about at this point. But having an instance (Capsule aka Smart Proxy Node) with a good container infrastructure could be an answer to our scaling issues.

Thank you all for the great discussions thus far. I am going to attempt to re-cap the highlights and proposals. At the end I am going to try to set the stage for further discussions.

First off, from its original intent and this thread we can take away that we should think of the current smart-proxy as a control plane intended to provide APIs and discovery of services. This imposes a heightened security need on the smart-proxy, which client endpoints put at risk. Smart-proxy traffic should aim to be limited largely to Foreman <-> smart-proxy or, in some cases, smart-proxy <-> service, and a user ought to be able to have a fairly strict firewall setup for the smart-proxy to reduce the attack surface.

The general proposal is that there should exist at least one additional service that is dedicated to client traffic. And possibly a further break out of services into either groups of related services or dedicated services that map 1:1 with functionality supported. A quick recap of client services today or proposed (I may miss one):

  • templates
  • global registration
  • container gateway
  • subscription-manager proxying (today handled by Apache reverse proxy)
  • facts
  • openscap
  • REX
  • SSH keys

I think it is important here, that as we consider this, we look at the software vs concepts and ensure we draw the lines correctly. We have the smart-proxy software, that is a Sinatra based web application serving multiple end points and providing a base set of functionality such as handling SSL, certificates, configuration, and plugins. Concept wise we have the Foreman Proxy, the process that runs on a system and represents an instance of the smart-proxy. And in the UI/API we have Smart Proxies that are registered and managed. Stretch this out to the Katello use case and we end up with what we often call a Foreman Content Proxy that both adds a defined set of services and functionality (Pulp, reverse proxy, Qpid) and is treated conceptually as a single entity. That is, Katello tends to think of managing the entire host as the Content Proxy, not just the Smart Proxy software even though that is how it’s surfaced in Foreman as an object.

Additionally, we have an RFC aimed at enabling Remote Execution against the underlying host that we think of as the conceptual Foreman (Content) Proxy to be able to perform management actions on it from Foreman itself.

Let’s take the easy split to further discuss the various layers of software and concept. Let’s assume we strictly split functionality into what we traditionally think of as a Foreman Proxy (service API and discovery) and new concept, a Client Proxy (for lack of a better term). What would those look like at:

  1. The software level, is this a new project? A creative configuration of the smart-proxy software? How do we ensure at a user and developer level that it is clear what does what, what got deployed and prevent mis-configurations that can lead to some of the security and conceptual concerns?
  2. Conceptually how does this surface inside Foreman? How do I view and manage Foreman Proxies vs Client Proxies?
  3. Should the two be allowed to be co-located? Does this put an additional burden on the user infrastructure wise? Does this make it easier for the user infrastructure wise? Does it give them more choice?
  4. Does this increase or decrease deployment burden?
  5. How would dual purpose features be handled? For example, Katello uses the service discovery nature of the smart-proxy to expose Pulp 3 attributes, but Pulp 3 is client facing.

Non-developers perspective:

I would be totally fine with the same software running in different instances because it is a component I know, can debug, secure, … And with Katello we already have a case where it runs on a different port, so having another one for client connections would not be so confusing. I am not sure whether systemd can already help here with multiple instances or whether the code would need adjustments; this I would leave to the developers.

As I said, I would like Client Proxies to be simply another instance of Smart Proxy with different features and a different port, so no different handling would be needed.

I see co-location as a must, because otherwise some environments would need additional machines, or, even worse, the component would be installed on an unrelated machine to save one because it is so small. Of course, a separate install must also be possible for those who do not mind an additional machine but care about security.

As pulp is already special, it could also be special in this case. But then others would do the same for a feature that is normally not exclusive to a Smart Proxy, so I would say dual purpose would mean running two different routes through two different instances.

Maybe it’s time to discuss a terminology change again? I vote for Foreman Capsule for the node, to standardize with Satellite, then figuring something out for both services. Maybe we should be breaking those (micro)services out, because eventually, if we really go into containers, smart-proxy could break into not just two but dozens of individual small services plus some router component for the endpoints. Thinking out loud:

  • Foreman Capsule - the node and the term used in Foreman UI
  • foreman-proxy - the router (httpd?) service exposing ports to both Foreman and clients
  • smart-proxy - the process(es) handling requests (running either standalone or in containers)
  • smart-proxy-host-reports - an example of a (micro)service handling specific requests

In the first phase we should probably only change deployment. Once the infrastructure is ready (router, containers) we can start adding new projects and probably move some features out of the smart proxy codebase.

A client proxy is just another smart-proxy for Foreman; it still has a REST API, therefore I’d assume we keep the current behavior. Via the capabilities API, proxies could advertise which type of proxy they are, and we should be able to get that information in the code.

For client proxies Foreman should not need to initiate any operation other than refresh features, status, logs etc I suppose. But it should still be a proxy object I think?

I understand the design as giving great flexibility. By default we would still deploy on a single node, but later on, once we containerize and break things up a bit, users would be able to create clusters for proxy services. The question is whether Foreman should be managing those clusters. I think yes, but then my question is how much of a burden and work that is.

Increase. Increase… 🙂

I don’t understand the problem Pulp 3 has, but a wild idea: a feature could be something that is either Foreman-facing or client-facing, not smart proxies themselves. Not sure if this helps in this case or makes things worse though.

I was assuming that two separate proxies (client and Foreman) would mean two different pairs of ports, but now that I am thinking about it, if we had a router (plain httpd can do the work), we could actually forward the traffic depending on the URL. This assumes that we are able to tell whether a request is client-facing or not. Then maybe we can simply keep the whole design of smart proxy in Foreman unchanged.
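A small Python sketch of what "distinguish by URL" could mean in practice. `/register` and `/ssh/pubkey` are paths mentioned earlier in this thread; treating every other path as Foreman-only is an assumption made for the example, and the names `Audience` and `classify` are hypothetical.

```python
from enum import Enum


class Audience(Enum):
    FOREMAN = "foreman"
    CLIENT = "client"


# /register and /ssh/pubkey are paths mentioned in this thread; treating
# everything else as Foreman-only is an assumption for this sketch.
CLIENT_PREFIXES = ("/register", "/ssh/pubkey")


def classify(path: str) -> Audience:
    """Classify a request path as client-facing or Foreman-facing."""
    for prefix in CLIENT_PREFIXES:
        # Match whole path segments so "/registerx" is not treated as "/register".
        if path == prefix or path.startswith(prefix + "/"):
            return Audience.CLIENT
    return Audience.FOREMAN
```

A router could then apply stricter source restrictions to the Foreman-facing group while leaving the client-facing group reachable from hosts.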

Uh, not easy. It’s late.

Even though it is still a dream, initial work has been done to make it a reality, and I think this is a good opportunity to take additional steps in that direction. The biggest challenge at the moment is how to support some of the existing functionality, such as installing additional plugins, when moving to a container infrastructure.

I agree, breaking things up into separate pieces would make things easier and more flexible. Those microservices could initially run as container images under systemd. That way we could decrease the effort needed for packaging.

While having a common router for proxy services may solve some problems, it also brings new problems of its own. Let’s say I’d like to enable Ansible and add a smart-proxy-ansible service to my existing setup; this also means changing the router configuration so that requests for the smart-proxy-ansible service are routed correctly. I am not saying that this is not solvable, it is just another thing that we need to keep in mind.

An approach requiring fewer changes could be having each service registered in Foreman as a separate smart-proxy, which is also in line with:

The number of services could be decreased by the grouping of features as Tomer mentioned.

Please keep in mind that it should still be possible to run a smart proxy on other platforms like Windows.

What about using “Foreman Proxy” instead? A capsule is something you see in the context of a satellite.