Containerizing the Foreman Ecosystem

This RFC serves as the entry point to a series of RFCs related to containerization and a next-generation platform for the Foreman ecosystem. There are many parts to this RFC, which underscores the overall scope. To that end, a subset of these RFCs will be sent along to Discourse to encourage discussion and refinement by the community. The entire set of design documents and proposals has been submitted as a Forklift PR. I would ask that as much discussion as possible happen on Discourse, but given the breadth, I also ask that folks take the time to read through and comment on the PR, especially on topics not sent directly to Discourse. I will be keeping the two sets of documents in sync to foster as much communication and feedback as possible from the community.

There is a lot to take in, think about and comment on. I appreciate any and all time that folks are able to apply towards this effort.

Containerizing the Foreman Ecosystem

This document aims to capture all design and discussion relating to every aspect of containerization, including architecture, upgrades, plugins, CI/CD, etc. The RFC is broken out into multiple documents given the breadth of each topic. This should keep each document focused and allow for more targeted discussion.

Breakouts

Executive Summary

Also read through Project Management and Community for more on developer and user community considerations and impact.

Objective: Provide a scalable and highly available platform for running a set of management services.

Goals

  • Be agile in delivery of functionality
  • Evolve into a platform that can support the changing needs of the community
  • Increase developer awareness of production deployment
  • Increase ability of developers to deliver functionality with more agility
  • Shorten the build-measure-learn loop

Pros

  • Development environment moves closer to production through image reuse
  • Developers learn containers and Kubernetes (K8s)
  • Container native scaling, high availability, orchestration
  • Plugins can evolve into services easier and deliver functionality more asynchronously
  • There is a large community around Containers, K8s, etc.
  • Scaling means adding more CPU/RAM or an additional cluster node
  • Can reduce deployment pain by decreasing OS and stack support matrix
  • Reduction of packaging burden by using native packaging to build container images

Cons

  • Developers have to learn containers, K8s
  • Uncertainty of how services will behave in a containerized deployment
  • Some Smart Proxy functionality may still need to run outside of containers or container orchestration, requiring a hybrid approach
  • Hybrid approach may require more knowledge straddling host and container deployment of services
  • A stable Foreman installation base and supportability already exist
  • Community must become container and K8s aware
  • Running K8s may require more CPU/RAM, increasing minimum requirements for users for base installs

Devil’s Advocate

A non-containerized approach means continuing to deploy services on a bare metal or VM host and managing through systemd. Scale out for these services involves native scaling of the application itself (e.g. increasing threads) or multiple application nodes with a load balancer in front. The host machine either has to have increased CPU/RAM or work has to be done towards a multi-node setup whereby one or more services can be either moved off to a new box or new instances of the service spun up. This is effectively scale out via additional bare metal or VMs. This scale out will require additional tooling to effectively manage and orchestrate. Additionally, services such as proxies or load balancers will need to be added to deployments.

Pros

  • Build on the base of existing stable Foreman gradually
  • Current installer can mostly be re-used with orchestration from Ansible wrapping it
  • Less knowledge overhead for developers and users as there are no containers involved
  • No investment in new resources and technology to support containers
  • Existing investment in support, debugging, tooling, documentation

Cons

  • Requires adding in new scale services such as proxies and load balancers
  • CPU/RAM requirements go up on the host, or multiple nodes are required for development and testing
  • Foreman developers are less inclined to learn containers
  • Bringing container native services into Foreman may require more work to package and run natively
  • Bringing in future install bases may require building modules and SCL

Why are you opting for K8s instead of Docker in light of the Docker plugin for Foreman? Will the first bullet (“Kubernetes integration”) on the plugin page under “Planned” section see some more development attention if the project moves in this direction?

Thanks for the question James.

This RFC is focused on deploying Foreman as a containerized application. Given the complexity of the entire ecosystem, I chose to focus on Kubernetes as the platform for running and scaling out containers. Docker handles simple deployments easily but does not provide full-fledged orchestration and easy scale-out compared to Kubernetes.

As for the Docker plugin for Foreman, I am not seeing the connection to running the application as a set of containers. Can you expand more on what you are thinking?

Thanks, in general I’m all for it too, but one question arises.

How do we plan to manage the data vs server code separation? Would postgres/mysql run in a separate server or container?
Is the effort to figure out the connections for each service and allow the proxy to use those? Even if it is, sometimes the proxy is actually changing files in the host itself.

Thanks, Eric, for the useful writeup(s). This is how RFCs are supposed to work; great job (everyone).

The first question we need to answer in the community is: do we really want to do this? Is Foreman a good candidate for containerization?

Personally, I do not like arguments of the “the world is shifting towards containers” kind, because the majority is not always making the best decisions; there are lots of examples, from the technical field to politics, that I won’t go into :wink: I don’t see containers as something bad at all; everything has pros and cons. You present a bunch of reasonable, what I’d call “soft”, arguments, except one:

Fair; this confirms the following over-simplification, which is the way I see it:

  • Monolithic applications are easy to develop and debug, difficult to scale and maintain.
  • Microservice* applications are difficult to develop and debug, easy to scale and maintain.

* - thus containerized in our context

Easier scaling is the strongest argument in my eyes in general and also in your RFC as per today. Therefore I need to ask myself: Does Foreman have problems with scaling? Are we hitting some limits?

I’ve seen or worked with deployments of 50k and 80k nodes in the past, and we’ve seen presentations from Paypal/eBay about 100k deployments. These are indeed challenging deployments on beefy hardware with specific tuning configurations. But I have an impression (again, correct me if I am wrong) that Foreman scales well, though there is some room for improvement.

Of course anyone is more than welcome to try new things, explore areas, and do prototypes. A lot of great work has been presented so far. But going “all in” eventually will affect all Foreman users.

What I am trying to say is: let’s not get on board the containerization train just because of better scaling. It comes with costs we can only guess at the moment. This is a request for all: please present more “hard” reasons for doing this.

I think we should not forget about Katello’s extended ecosystem, where there are more services that influence how the system handles scaled environments. One of the benefits I see with this transition is targeting the scaled environment by default, instead of having to treat such deployments as special snowflakes. It basically forces us to stop thinking about the services running on the same machine. I don’t think anything is free, or that moving to k8s will just make Foreman scale magically: I believe the difference is actually the constraints k8s gives you, so that you need to think about this stuff, getting better scaling options as a side effect.

It should also be better prepared for growth in the infrastructure: you can start small and grow the infra as you need it. K8s should give us more elasticity here.

As k8s is becoming (if it hasn’t already become) a de facto standard for next-gen infrastructures, I think it’s the right time to start thinking about a deployment strategy for Foreman & Friends.

I’m also wondering whether this move would make our lives easier when pulling in third-party services (taking Prometheus as an example). If there is a community maintaining the way to deploy a service on k8s, I guess it could be better to pull that in, since we would share a common ecosystem. You could argue that Puppet modules should have solved this issue a long time ago, but the industry has moved on a bit since then, and this time it actually could be done right, including multi-host deployments, dealing with upgrades, etc. (or not, but we should be looking at this aspect of the transition as well).

I’m sure there is complexity hidden inside the nice shiny k8s box, but I agree with Eric that it’s time to explore it more to see how we could employ the work that has been done in the containers world, as well as understand better what the containers world is about, potentially delivering features that could support it better as well.

Yes, I am thinking in the Rails app context; backend services are a subject for containerization, of course. Although the scaling is not “magic”, there’s hard work behind it :slight_smile:

I will just quietly ignore these soft arguments.

It’s a great time indeed; I only have concerns about rewriting stuff and breaking things (again, the Rails app or plugins) into microservices.

Any persistent data would be pushed onto persistent volumes to make the containers as stateless as possible. If you look at the upgrade RFC, there is a notion to keep the databases outside of Kubernetes to begin with, in part to make the transition easier.
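As a rough sketch of that split (the namespace, claim name, and database hostname below are hypothetical, not taken from the RFC), the two pieces could be expressed with the Kubernetes Python client roughly like this:

```python
# Hedged sketch: a persistent volume claim for stateful data, plus an
# ExternalName service that gives in-cluster pods a stable DNS name for a
# database kept outside of Kubernetes. All names are placeholders.
from kubernetes import client, config

config.load_kube_config()
core = client.CoreV1Api()

# Persistent data lives on a claimed volume rather than inside the container.
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="foreman-data"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)
core.create_namespaced_persistent_volume_claim(namespace="foreman", body=pvc)

# The database stays outside the cluster for now; the application resolves
# "postgresql" inside the namespace and reaches the external host.
db_svc = client.V1Service(
    metadata=client.V1ObjectMeta(name="postgresql"),
    spec=client.V1ServiceSpec(type="ExternalName", external_name="db.example.com"),
)
core.create_namespaced_service(namespace="foreman", body=db_svc)
```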

Can you expand on which connections you mean? By proxy, do you mean an HTTP proxy or the smart proxy? By the proxy changing files, I assume you mean the smart proxy modifies files on disk. If yes, this is where the smart proxy itself gets tricky, and we may have to reduce the scope of actions a smart proxy running in Kubernetes can do vs. one running outside of Kubernetes. I have some thoughts around how to handle the smart proxy that I have not shared yet. I will work on getting these available.

Presumably, Foreman will need to orchestrate the deployment of its own containers - for installs, upgrades, or both. I ask about Docker only because Foreman already has some code for managing these containers, but if that isn’t the right tool for this particular job, that’s OK. I’m not partial to either technology. I just wondered if what was already there for Docker might be more expedient.

Ultimately though, I am a huge fan of eating my own dog food. If there needs to be some code infrastructure surrounding containers to make Kubernetes containerization for Foreman happen, it seems like skinning/creating a GUI for that orchestration engine and exposing it to users, by either extending this plugin or creating a new plugin for Foreman, might be in order. Basically, if containers are important to the lifecycle of a machine, it raises some philosophical questions for me regarding Foreman as lifecycle management software, like:

  • If Kubernetes is important to the lifecycle of a machine, why isn’t that integrated into Foreman?
  • If containers aren’t so important, why bother containerizing Foreman?
  • If Foreman will need to orchestrate containers for installation/update, why wouldn’t the project provide those same orchestration capabilities to users to manage other containers?

Ultimately, I have encountered a lot of issues surrounding ruby/rails version issues on my deployments, so I think containerizing those components is definitely the right move even if the whole of Foreman isn’t containerized.

Given that Kubernetes is a widely adopted container orchestration engine and there is a large ecosystem around deploying and maintaining containerized deployments, I don’t want to get into the business of re-creating this concept inside of Foreman. I don’t see the value in taking Foreman down that road. I agree with you that eating our own dog food can be useful, but I think that does not apply here.

Kubernetes is software like anything else that can be deployed, configured and managed by Foreman on a given host. I’m not sold we need to create anything special at present for this. I consider that outside the scope of this RFC.

I’m not sure that the management of containers and the running of Foreman as a containerized application go hand in hand. This RFC is focused on running Foreman the application and its ecosystem via containers for the benefits stated herein with respect to scaling, tracing, and bringing additional services into the deployment.

I touched on this but worth noting again. It is my belief that Foreman will not need to orchestrate containers. I’d prefer that the tools that exist within the Kubernetes ecosystem be utilized to handle this for us rather than creating our own way. To put it another way, I think we should be using Kubernetes native methods to deploy Foreman as a containerized application. Foreman should continue to do what it does best – manage the lifecycle of your infrastructure with the ability to scale to the growing needs of the users. This includes but is not limited to the plugins that exist for Foreman that have evolving compute and backend service needs.


To be clear, I am not suggesting that the wheel be re-invented - Foreman doesn’t need to do what Kubernetes does - I’m merely suggesting that a Kubernetes smart proxy might be in order.

That was what led to part of my original question: Foreman can’t currently manage Kubernetes on a given host… But it can manage Docker containers on the host, which is what led me to wonder about the choice of Kubernetes.

That may be. It kind of depends on if/how Foreman is to interact with the containers during install/upgrade. If that is done via a smart proxy, then while it is probably technically outside of the scope of the RFC, we can at least keep in mind that this could be used by the Foreman UI at a later date with some intentional choices ahead of time.

Perhaps someone could expand on this a bit: my understanding is that under this paradigm, Foreman’s default deployment method would be with containers. What does that look like? Does that mean the current foreman-installer goes in the trash heap and there is a guide I now need to follow to deploy with Kubernetes instead? Is all of the deployment now done with Kubernetes instead? Or would the installer somehow need to interact with either a local or remote Kubernetes instance? Or am I misunderstanding this altogether, and users would still be instructed to deploy a non-containerized version by the Foreman Manual?

I am asking myself the same question: do we want this? In my opinion, Foreman should be the tool to set up a Kubernetes cluster. That is going to be hard when you already need a working cluster to deploy your Foreman instance. How are we going to address this chicken-and-egg problem? Maybe foreman-installer could set up a one-node Kubernetes cluster (via Puppet modules) and deploy Foreman in there? Just a thought.

We decided not to deploy Foreman in our Kubernetes cluster to reduce the blast radius if we have issues with one of the systems. Let’s say we need to redeploy our broken Kubernetes cluster; we can currently still do that because Foreman runs outside of the cluster. If we deployed Foreman inside Kubernetes, we’d be screwed.

Another concern I have is whether the software is actually ready for containers. We tell our developers that their apps need to follow the 12-factor rules. When we looked into deploying Katello on multiple hosts, there were some architectural issues we hit. I think we should address these before moving to containers. Also, Katello relies on a lot of Apache configuration. I think we should talk about moving these to the Rails app to reduce the dependencies on the web server configuration.

To sum up, I’m still pretty reluctant on this topic. Kubernetes is definitely the new industry standard for container deployments. But, in my opinion, work on Foreman itself needs to happen before containers. And I think the app (core + common plugins) is not ready for the move yet.

Happy to hear more thoughts on these topics.


Thanks for sharing your insights. I had some follow up questions to get more details.

That was the primary thought. A dedicated Kubernetes cluster for your Foreman. Treating Kubernetes more like an application server than an organization-wide container orchestrator. That may put some burden on administrators and is worth discussing if so.

How do you redeploy your Foreman if you have to? Or are you saying you treat Kubernetes differently than you do something like Foreman?

To be fair, that is a goal of the effort. However, containers do change the nature of the deployment; some of those current architectural issues go away, and potentially some different challenges arise.

By this, do you mean that Pulp relies on a lot of Apache configuration? Katello itself only cares about serving up a /pub directory with some content available over HTTP.

This is something I think is worth digging into more. I get the idea that Foreman could bootstrap Kubernetes and deploy it. I also see Foreman as the application that is managing the state of your servers, whether they are bare metal, virtual, or cloud, and running traditional applications or containers on orchestrated platforms. We want the management of those tens to hundreds of thousands of machines to be scalable, reliable, highly available, etc. Are the bootstrapping nature of Foreman and the long-term maintenance nature of Foreman at odds in this respect?

Thanks for your questions. I hope I can give some more background with these answers / comments.

Well, ideally it’s just hitting the “redeploy” button in the Foreman UI. That works fine for all stateless servers. Servers with state are still a problem, so we don’t really redeploy Foreman. Our Kubernetes cluster is designed in a way that you can redeploy single VMs without an issue. I played with writing a plugin that can automatically redeploy a hostgroup, so an admin would have the same experience as with a cloud provider with autoscaling (immutable infrastructure).

I believe if we get Foreman and its ecosystem completely containerized, we should not have a lot of issues with a traditional rpm-based deployment. I think @ekohl started a tracker issue somewhere with issues we had when separating the services. Unfortunately, I can’t find the Redmine issue anymore.

Yes, I was thinking of the /pub directory and the reverse proxy.

I think that should be the main purpose of Foreman in a cloud-native world.

Let’s face it: You don’t need Foreman anymore if you are all in the cloud. It’s the legacy thing. But if for some reason you can’t run your app “in the cloud”, IT is all about bare metal again. And that’s the strength of Foreman.

Kubernetes looks like the technology that is going to be the universal language that every cloud provider understands. Docker is the universal packaging format for your app, and the Kubernetes specs are the universal language that tells the cloud provider how to run your containers. If you want to benefit from modern application deployments but don’t want to go to the cloud, Kubernetes on bare metal is there to the rescue. At least that’s how I assess the current situation in the industry.

Just for context: We run Kubernetes (as an enterprise wide cluster) on Container Linux deployed with Foreman. Foreman’s templating engine is used to render the ignition templates to configure the systems. All the components are deployed in containers. Therefore you can easily redeploy the servers (one at a time) as there is no single point of failure and the state is always shared. Because of the awesome update system of CoreOS, the cluster takes care of itself. That is btw something I’m going to miss when it’s time for Fedora CoreOS. I think we need to write a blog post about our Kubernetes deployment at some point.

Do you mind clarifying your question?

https://projects.theforeman.org/issues/20850

I’d like to dig a bit more into your current deployment, combined with a question with an eye towards this concept, given your experience running stand-alone Foreman and a Kube cluster. It’s easier as a list of questions instead of a paragraph:

  • What are your thoughts on Foreman running on a stand-alone Kube, treating it as independent from the full cluster?
  • In the current deployment style, scaling Foreman along with its ecosystem of plugins means spreading those services out to other machines and managing that topology as well as the scaling of the services. If using Kubernetes, one would add a new node, then let Kube start spreading the load out more (see the sketch after this list). Do you consider these equivalent in difficulty? Benefits/downsides of either from your experience?
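To make that comparison concrete, here is a minimal sketch of the Kubernetes side, assuming Foreman runs as a Deployment (the name, namespace, and replica count below are hypothetical): once a node is added, scaling out is a one-field patch and the scheduler places the extra pods wherever capacity exists.

```python
# Hedged sketch: scale an existing Deployment with the Kubernetes Python
# client. The scheduler places the additional pods on whichever nodes have
# capacity, including a freshly added one. Names are placeholders.
from kubernetes import client, config

config.load_kube_config()
apps = client.AppsV1Api()

apps.patch_namespaced_deployment(
    name="foreman",
    namespace="foreman",
    body={"spec": {"replicas": 3}},
)
```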

Personally I see it as a way to manage your hybrid infrastructure (cloud/bare metal) from a single pane of glass.

As far as I can see in their FAQ, that will remain untouched, or am I missing something? https://coreos.fedoraproject.org/faq

Do you mean that bare metal users would deploy something like Minikube (just one node, easy to deploy, no frills)? Would any user have to deploy it on a complicated Kubernetes setup? I don’t know how our current HA deployments, or anything larger than one Foreman host for that matter, would be deployed in this model (e.g. several Foreman nodes or proxies in different networks).

Generally, I like that idea - if maintaining the cluster is not too much effort for us and the Foreman admins out there. We have toyed with the idea of having a dedicated “infrastructure” k8s cluster. The approach you propose would be similar.

Let me think out loud here:

I’m a bit concerned with the admin overhead this might introduce to our users. Managing a Kubernetes cluster should not be underestimated. We currently have 3.5 skilled engineers working full time on our Kubernetes ecosystem, just to throw out a number. I believe Foreman’s audience has more knowledge in classic config management tools (Puppet, Ansible) than in Kubernetes. Our community would probably need to learn these technologies, potentially just to use Foreman. We definitely need to take care of the Kubernetes setup effort, ideally via Puppet. Setting up Foreman needs to be easy. We’re not doing a great job there right now.
I think it’s fair to say I know Foreman pretty well, but even I struggled the last time I set up a demo environment. If we want the Foreman community to grow, we definitely need an easy deployment that just requires an ordinary VM and not a Kubernetes cluster.

I know there are projects like puppetnetes that create their own Kubernetes distribution, but this sounds like a lot of effort to maintain if we do it ourselves; rpm packaging might be a lot easier. Who takes care of Kubernetes upgrades? The foreman-installer? One upside of the Kubernetes effort is that we could potentially unify the Debian and EL packaging efforts.
Do we want to support the Kubernetes-as-a-service options that the most popular cloud providers offer? Amazon’s EKS? Google’s GKE? Ideally we would provide a CloudFormation stack that sets up a demo Foreman environment on a fresh AWS account.

Would just the main Foreman run in the cluster? What about content proxies with a full pulp setup?

From my experience, maintaining an rpm-based, distributed Foreman/Katello cluster is not an easy challenge. This is mostly because of the complex architecture that Katello requires (or actually the whole subscription-manager ecosystem, if you want). This does not go away if we use Kubernetes.
When spreading the components onto different hosts, most of the issues we had were because the existing code assumed that everything ran on the same host. There were some cross-dependencies we had to resolve, e.g. services just working because another service pulled in the same requirement. For load balancing we use haproxy/keepalived configured by Puppet. That works quite well.

If we add containers to the mix, we can’t actually get rid of something else in return. We just add yet another layer of complexity (ingress, load balancing, etc.). How are we going to build the containers? Run foreman-installer in a Dockerfile? With Quay and a Dockerfile per repo? What will we use as a base image? How are we going to configure the applications? Ideally via environment variables. Will we continue running the Rails app in Passenger? (Ideally: no.) How do we make sure the base image is up to date? How do we enforce security best practices in the cluster? How are we doing service discovery? Via DNS? How about an overlay network? Or network policies? How do we make webpack JS code from plugins available to core?

My current gut feeling is that we’re not ready for this now. I’d start with simple Docker containers and orchestrate them via Puppet and systemd (systemd just starts the container, not the native OS service; Puppet rolls out systemd units to start the container, and the config is injected via environment variables defined in the unit). That way we have a goal that is not impossible to reach and we see progress. Divide and conquer. If we have official container images available, let’s then talk about how we orchestrate them. We get the benefits of containers as universal “packages” but not at the cost of complexity. Kubernetes can then help us with scaling and service discovery, but let’s not make this the first step.

Sorry not to just jump in on the effort; I do like Kubernetes and a lot of the ideas in these RFCs. But sometimes taking it slow is the better option.


From my experience, it’s not the right tool for a cloud-native approach. Foreman lacks support for infrastructure as code, immutable infrastructure, self-healing infrastructure… If you run your apps on a cloud provider just as you would in an on-premise datacenter, I agree that Foreman is still a nice tool.
But as most of you know, I love Foreman no matter what.

I read there that you can install packages for debugging purposes. Although its use is discouraged, it means there will be some kind of package manager. And if it’s there, people will use it. But it’s probably best to patiently wait for the first official release before starting to judge. I think the FAQ just does not make it 100% clear what we’ll get. If they keep the current workflow, I’ll be very happy.

On this point, I didn’t share this RFC directly on Discourse, but I did link it above. There is a document targeted at container images here that I can instead send to Discourse if that would be useful. I also didn’t dig into existing work, but the current image sets are present here.

Thanks for the idea; I’ll think on it a bit more.