Design pattern proposal for connecting services to Foreman

One of the discussions we had at cfgmgmtcamp was about developing a proper design pattern around connecting external services to Foreman.

I will start by writing down my suggestion, based on an example of a service that I need to connect in the near future.

Service description

  • A simple API to access data from the remote server.
  • The data is organized on a per-organization level.
  • The endpoint uses certificate-based authentication per organization.

Architecture

  • We will use a smart proxy plugin to access the data.
  • Existence of a smart proxy relation to a specific org will be used as an on/off switch for the functionality in that organization.
  • The authentication can be configured either as a configuration in the proxy or passed dynamically on each request from the plugin. Specifically for my example, since we want to use the certificate stored in the Katello manifest, we will need to supply it to the proxy on each request.
  • We can use a “handshake” call to initialize the smart proxy in case the proxy needs more configuration.
  • HA and load balancing should be done on the proxy level, including the management of a shared state.
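
To make the first bullets concrete, here is a minimal sketch of the plugin-side logic, assuming the proxy-to-organization relation is available as a plain set of organization ids and that the certificate can be looked up per organization. All names here (`SmartProxy`, `ProxyRequest`, `build_request`, `cert_lookup`, the example URL and path) are hypothetical and are not existing Foreman or Katello APIs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmartProxy:
    name: str
    url: str
    organization_ids: frozenset  # organizations this proxy is related to (assumed shape)


@dataclass(frozen=True)
class ProxyRequest:
    url: str
    payload: dict
    client_cert_pem: str  # certificate supplied dynamically with each request


def proxy_for_org(proxies, org_id):
    """Return the first proxy related to the organization, or None (feature is off)."""
    for proxy in proxies:
        if org_id in proxy.organization_ids:
            return proxy
    return None


def build_request(proxies, org_id, cert_lookup, path, payload):
    """Build a request for the org, or return None when no proxy is related to it."""
    proxy = proxy_for_org(proxies, org_id)
    if proxy is None:
        return None  # no proxy relation: functionality is switched off for this org
    return ProxyRequest(
        url=proxy.url.rstrip("/") + path,
        payload=payload,
        client_cert_pem=cert_lookup(org_id),  # e.g. the cert from the Katello manifest
    )
```

The point of the sketch is only that the on/off decision and the credential lookup both happen per organization on the plugin side, while the proxy itself stays free of per-organization configuration.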

Points to improve

Currently the proxy has to be online for the communication, but there are some use cases where we should implement a “fire and forget” style of communication. This will require more design, since we will need to think about all aspects of queued communication, for example as outlined in Microsoft’s documentation.
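
As a starting point for that design discussion, a “fire and forget” style could be sketched as a small outbox that queues messages and delivers them once the proxy is reachable. This is only an illustration under assumptions: the `Outbox` class and the `send` callable are hypothetical, the queue lives in memory (a real design would need persistence), and delivery is at-least-once at best.

```python
from collections import deque


class Outbox:
    """Queue messages for the proxy and deliver them in order when it is reachable."""

    def __init__(self, send):
        # send(message) is assumed to raise ConnectionError while the proxy is offline
        self._send = send
        self._pending = deque()

    def enqueue(self, message):
        """Accept a message immediately; the caller does not wait for delivery."""
        self._pending.append(message)

    def flush(self):
        """Deliver pending messages in order, stopping at the first failure.

        Returns the number of messages delivered in this attempt.
        """
        delivered = 0
        while self._pending:
            try:
                self._send(self._pending[0])
            except ConnectionError:
                break  # proxy is offline; keep the message for the next attempt
            self._pending.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._pending)
```

A message is removed from the queue only after `send` returns, so a crash between sending and removing could cause a duplicate delivery. That is one of the aspects of queued communication the design would have to address.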

Thanks to @ekohl @iballou and @aruzicka for the great discussion!

I’m sorry if we already talked about this and I just forgot, but where would the cert come from in deployments without katello?

Currently we don’t have a way to manage certs without Katello, but @iballou mentioned the possibility of splitting certificate management out of Katello in the future.

@ehelms I remember you saying there was also work around splitting cert management out of Katello?

I think it would be remiss of us not to include a discussion of why we have multiple patterns, and what we perceive the value of each to be, if our goal is to arrive at a standardized pattern. Bear with me if I misspeak on some of the specifics; I think that points to a good outcome of a discussion like this: ensuring that we have defined this well, along with the reasons behind it and our goals, and that we formally write all of it down at the end.

The Two Patterns

My understanding is that we have historically taken two different approaches to connecting what is being referred to as external services. I’ll try to define the two types of services that are most common.

Service Types

Backend Services

These are services that Foreman or a plugin requires in order to function. Examples include PostgreSQL for Foreman, or Pulp and Candlepin for Katello. These have a singular existence and are typically treated as a system of record.

Integrated Services

These are services that Foreman or a plugin can optionally integrate with based upon a user’s preferences. Examples include SSH or Ansible for remote execution, DHCP and DNS services for Foreman, and Pulp operating as a mirror for Katello. These often exist in multiples, one per external smart proxy deployment.

Connection Patterns

The two connection patterns are direct connect and smart-proxy based.

Direct Connect

The direct connect method involves configuring Foreman/plugin to know where and how to talk to the service. Foreman or the plugin then communicates directly with the service via APIs, and sometimes the service talks back to Foreman. Foreman holds the communication and credential information. Services talking back to Foreman maintain credentials and knowledge of Foreman to communicate back.

This is most common with backend services given they are a hard requirement.

Smart Proxy Connect

The smart-proxy connect method involves the smart-proxy broadcasting the available capabilities to Foreman, and Foreman or the plugin using this information to determine where services are available. Communication with the service is routed through APIs that exist on the smart-proxy itself. If the service needs to communicate with Foreman, the service communicates back through the smart-proxy. The smart-proxy holds the communication and credential information.

This is most common with integrated services.

Why two patterns?

The contention between the two different designs often comes down to two points:

  1. Do we need a middle-man for backend services? Why is that complexity needed?
  2. When deploying the core application, do we want to require a smart-proxy to always be deployed alongside Foreman?

Role of the Smart Proxy

Central to the discussion is to understand the roles the smart-proxy can play.

Provides RESTful API to subsystems

Originally this was to provide RESTful APIs to lower-level systems that do not have web-based APIs. In some cases, this means providing a standardized set of APIs where multiple service implementations can be chosen. This can also mean pass-through APIs where the smart-proxy is the consistent interface and bearer of credentials.

Service Discovery

The smart-proxy surfaces to Foreman the set of capabilities present, along with additional metadata that the application can use. This allows Foreman or plugins to know which smart-proxies are configured for which capabilities. For example, whether a Pulp mirror is present and at what URL, or which remote execution method is deployed.
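
A rough sketch of how the Foreman side could consume this, assuming each smart-proxy reports a mapping of feature name to metadata. The `ProxyAdvertisement` type, the `proxies_with_feature` helper, and the feature names and metadata keys used in the usage example are hypothetical and do not reflect the actual smart-proxy feature format.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyAdvertisement:
    name: str
    url: str
    # feature name -> metadata the proxy reports for that feature (assumed shape)
    features: dict = field(default_factory=dict)


def proxies_with_feature(advertisements, feature):
    """Return (proxy name, metadata) pairs for every proxy advertising the feature."""
    return [
        (ad.name, ad.features[feature])
        for ad in advertisements
        if feature in ad.features
    ]
```

For example, with one proxy advertising a Pulp mirror and another advertising remote execution:

```python
ads = [
    ProxyAdvertisement(
        "proxy-a",
        "https://a.example.com:9090",
        {"pulp_mirror": {"url": "https://a.example.com/pulp"}},
    ),
    ProxyAdvertisement(
        "proxy-b",
        "https://b.example.com:9090",
        {"remote_execution": {"method": "ssh"}},
    ),
]
mirrors = proxies_with_feature(ads, "pulp_mirror")
```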

Authorized Authority

As the smart-proxy establishes a trust relationship with Foreman and has certificate auth, the smart-proxy can be used as a trusted source of information. Services only need to know about the smart-proxy, which provides isolation and deployment flexibility for external services. The services can send information to the smart-proxy and rely on the smart-proxy to relay that in a trusted manner back to Foreman.

Client Host Interface

This is a special case of the authorized authority where hosts interface with their local smart-proxy to achieve isolation and allow the smart-proxy to communicate back to Foreman on behalf of the host.

Network Isolation

This is hinted at by other roles, but it is important to call out. The smart-proxy can be used to provide network isolation by existing at the edge of the network and being the trusted point for ingress and egress.

Smart Proxy vs Foreman Proxy/Foreman Proxy with Content/Capsule

An important distinction to call out in this is the difference between the naming and how we conceptualize the identities of the smart-proxy. I am going to try my best to draw that distinction. I attempted this before in a proposal to rename smart-proxy/foreman-proxy.

Smart Proxy

The Smart Proxy is both a concept and an implementation. The primary, Ruby-based implementation is represented by its GitHub repo. However, the smart-proxy protocol and APIs can be implemented in other forms [INSERT @ekohl blog HERE]. This entity is represented in the Foreman manual and within the UI/API/CLI. Theoretically, as a software component, it is independent of Foreman and can operate with anything implementing the interaction APIs.

Multiple smart-proxies can exist on a single machine, and a single smart-proxy can provide functionality for one or more services. Within Foreman, a smart-proxy is represented by a name (most often the hostname) and a URL. Traditionally, this limits the ability to deploy multiple smart-proxies per host, but it is possible if something other than the hostname is used as the name.
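
The naming constraint can be illustrated with a toy registry, assuming that names must be unique and that proxies on the same host differ by port. The `ProxyRegistry` class is hypothetical and is not Foreman's actual registration code; the uniqueness rule is an assumption made for the sketch.

```python
from urllib.parse import urlparse


class ProxyRegistry:
    """Toy registry where a smart-proxy is identified by a unique name plus a URL."""

    def __init__(self):
        self._by_name = {}

    def register(self, name, url):
        # Assumption for this sketch: two proxies may not share a name.
        if name in self._by_name:
            raise ValueError(f"a smart-proxy named {name!r} is already registered")
        self._by_name[name] = url

    def hosts(self):
        """Map each hostname to the names of the proxies registered on it."""
        result = {}
        for name, url in self._by_name.items():
            result.setdefault(urlparse(url).hostname, []).append(name)
        return result
```

If the hostname is always used as the name, a second proxy on the same host collides with the first. Registering it under a different name (for example a hostname plus a service suffix) lets both coexist on one machine.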

Foreman Proxy

The Foreman distribution of a smart-proxy. This naming manifests itself within packaging and within the installer via the puppet module. Additionally, the process and accompanying assets (such as logs) are all branded as foreman-proxy. Due to how it’s packaged, only a single Foreman Proxy can be installed per host. This is represented as smart-proxies in the UI and API and is not conceptually shown to the user other than at install time.

Foreman Proxy with Content

This takes the Foreman Proxy concept further by adding further definition and expectation. This is primarily surfaced by way of the puppet module. The Foreman Proxy with Content requires the existence of a Pulp server and intends to represent a single smart-proxy on a single host that is dedicated to Foreman workloads. Further, the Foreman Proxy with Content adds the expectation of client host isolation via an Apache reverse proxy.

Satellite Capsules

The Satellite product takes the Foreman Proxy with Content concept further by adding branding and additional specificity about what the expectations are. This takes the branding all the way to the UI and in some cases the APIs for a smart-proxy, turning the smart-proxy UI into a representation of a single host within the infrastructure that has named properties and expectations, further restricting the core definition of the underlying smart-proxy software.

Common Deployment Pattern

There are two common deployment patterns for smart-proxies.

The first is to deploy a smart-proxy (or one of its incarnations) on a server separate from Foreman to manage an external service, or to exist at the edge of a network.

The second is to deploy a smart-proxy onboard the Foreman server to provide an all-in-one experience, or because we have made it required to connect some backend services such as Pulp to Katello. In the Foreman Proxy with Content and Capsule worlds this is known as the internal proxy / capsule. This is because, as noted, the smart-proxy is required in order to deploy the core Katello application and its backend dependencies, and because some foreman-proxy-like capabilities are provided by this deployment (e.g. content serving).

So my takeaway from this is that the Smart Proxy should be a singleton service on a machine, hence we will need a mechanism on the proxy side to authenticate requests per organization.
Another thing that came to mind is the concept of network separation - we cannot expect that the Foreman server will be exposed to the internet. On the other hand, having a smart-proxy living on the edge of the internal network and exposed to the internet that will forward the needed information to the isolated Foreman using a narrow and predefined channel is actually a good architecture choice.
So again, creating a well-defined and narrow communication channel between the Foreman server and other backend services will make our lives easier down the road, when we want to create isolation “boxes” (a.k.a. containers) for each one of the Foreman components.

Why should it be a singleton? I mean yes, having different smart proxies for different organizations colocated on the same machine feels like too big of a hammer, but having multiple smart proxies on the same machine, with each one handling a different service, doesn’t sound all that far-fetched to me?

According to this, it would require changes to the Foreman side of proxy registration if we want to break the assumption of a proxy per host.

I updated this text, because I was recently shown how this was wrong by work @evgeni has been doing with containers. I think there was some question around authentication being properly handled given how Foreman represents a smart-proxy.

Well, then even better: if we are not bound to a proxy per host, it would be really easy to have per-service proxy instances on the same machine, making deployments like containers much easier.