Containerizing the Foreman Ecosystem

TimoGoebel · September 17, 2018, 12:56pm

Thanks for pointing me to this. Apologies, I had not seen it before.

I have some remarks here. I think the Dockerfiles should live in each corresponding code repo (at least where we control the repo) and ideally be built for every merge to master. Building the container should be part of the CI workflow.
These Dockerfiles are based on RPMs. We should investigate whether we can deploy the apps differently to reduce the required packaging effort.

ehelms · September 17, 2018, 1:20pm

I do talk about why I went the way I did and chose to centralize the images rather than split them out. I can see value in both, and I went for reducing complexity to start with. We’ve also had good success with centralized packaging, which this mirrors to me, though at a much lower scale, which might make centralization less necessary.

Given that we build, and will have to keep building, RPMs, this was an easy way to get started and it provided benefits. Moving to source-based builds is fair and doable. However, until we go fully to containers we can’t avoid packaging, but I agree it provides a road to get there.

iNecas · September 17, 2018, 6:17pm

We’ve just been talking independently with @Marek_Hulan and @ekohl about this, and I think we have some agreement that “getting Foreman up and running” deserves some attention, to make it easier for people to get started as well as to start using more capabilities. We have a lot of features out there; the problem is that many people don’t know about them or don’t get past the level of effort needed to make them help with their use cases. I think this is quite orthogonal to the containerization. Running in k8s will not solve the issue for us. On the other hand, if done right, it might make the containerization effort smoother. Anyway, that’s probably for a separate Discourse thread.

ehelms · September 19, 2018, 2:48pm

First, thanks to everyone who has participated in the conversations thus far. One of the goals of posting these RFCs was to engage the community in discussion to help arrive at a direction that makes the most sense, and this discussion helps. I am going to clarify (I hope) one idea that @TimoGoebel mentions here and that was refined in IRC conversation, as a proposal for some feedback.

One theme heard throughout this discussion is how to get the community comfortable with containers. That is, how do we get comfortable operating, building, debugging and understanding just the container aspect rather than the full-blown container ecosystem? To this end, the question has been raised whether we could focus on building containers and replacing existing processes with them. For example, the default Foreman install currently runs under Passenger within Apache. This could be replaced by a container running on the host, with Apache handling SSL and acting as a reverse proxy. The idea is that this lets us focus on building and running containers without changing the overall installation dramatically. This stepping-stone approach could be propagated across the set of services and, in theory, provide a gateway to then move those containers into Kubernetes. It would not necessarily give us the out-of-the-box HA and load-balancing gains that Kubernetes could, but it would move the community in that direction while aiming to reduce the packaging burden as one benefit.

The reduction in packaging burden would come from building containers from source with native dependency managers (e.g. gem, npm), which would drop the requirement for packaging both Deb and RPM.

This approach would obviously be a slower transition given we’d want to provide incremental value and changes between each release.

One target of this work has been how to handle the migration of Katello to Pulp 3 in a staged approach, moving content types one by one and thus running and supporting both for a time. Given the potential for the two to clash and the RPM effort for Pulp 3, one idea had been to put Pulp 3 directly into Kubernetes to start with and thereby keep the two apart. There is an open question whether the above approach helps with that or poses the same issues that native processes would.

Please consider the above and the original idea, and provide any feedback you can on this approach.

TimoGoebel · September 19, 2018, 3:05pm

I like the ideas you mentioned a lot. I think divide and conquer is key to success here.

Today we started to work on provisioning Puppet masters and smart-proxies on Kubernetes. What I thought would be quite a simple task turned up a lot of challenges while we discussed our desired approach. We’ll continue this effort and I’ll be happy to share what we come up with.

One challenge is definitely providing container images and configuring the running services. In a foreman-installer world, you can easily tweak every setting you need. With containers, this is currently not possible. While the Dockerfiles available in the forklift repo are great to get started, they are very opinionated. I’m wondering if it would be a good approach to experiment with having smart-proxy support configuration through environment variables. That way you can reuse an image more easily.
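As a rough illustration of the environment-variable idea, here is a minimal sketch in Python (smart-proxy itself is Ruby, so this only shows the pattern). The `SMART_PROXY_` prefix, the setting names, their default values and the `load_settings` helper are all hypothetical; none of them are existing smart-proxy options.

```python
import os

# Hypothetical defaults baked into the image; these are illustrative
# names and values, not real smart-proxy settings.
DEFAULTS = {
    "foreman_url": "https://foreman.example.com",
    "https_port": 8443,
    "log_level": "INFO",
}

# Assumed naming convention for overrides, e.g. SMART_PROXY_HTTPS_PORT.
PREFIX = "SMART_PROXY_"


def load_settings(environ=None):
    """Return DEFAULTS overlaid with any PREFIX-ed environment variables."""
    environ = os.environ if environ is None else environ
    settings = dict(DEFAULTS)
    for key, default in DEFAULTS.items():
        raw = environ.get(PREFIX + key.upper())
        if raw is None:
            continue
        # Coerce to the type of the default so that ports stay integers.
        settings[key] = type(default)(raw)
    return settings


if __name__ == "__main__":
    print(load_settings())
```

With something along these lines, the same image could be started with different values per deployment (for instance by passing `-e SMART_PROXY_HTTPS_PORT=9090` to the container runtime) instead of rebuilding it for every configuration.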

That Apache instance could then eventually be replaced by a k8s ingress.

This could also be done by starting the containers via systemd. Don’t get me wrong, I don’t want to torpedo k8s at all, but it’s not a trivial task to maintain a k8s cluster. You have to take care of state, secrets, ingress, making your ingress highly available, authentication and a lot of certificates. As soon as you go multi-host, it gets complicated.
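To sketch what the systemd route might look like, the snippet below renders a unit file that runs a single container under Docker. It is only an illustration under stated assumptions: the Docker binary path, the image name `example/foreman:latest`, the port and the `render_unit` helper are hypothetical and not something the project ships.

```python
# Template for a systemd unit that supervises one container.
# Assumes Docker is the runtime and lives at /usr/bin/docker.
UNIT_TEMPLATE = """\
[Unit]
Description={description}
After=docker.service
Requires=docker.service

[Service]
ExecStartPre=-/usr/bin/docker rm -f {name}
ExecStart=/usr/bin/docker run --rm --name {name} -p 127.0.0.1:{port}:{port} {image}
ExecStop=/usr/bin/docker stop {name}
Restart=always

[Install]
WantedBy=multi-user.target
"""


def render_unit(name, image, port, description=None):
    """Render a unit file for one container, published on localhost only."""
    return UNIT_TEMPLATE.format(
        name=name,
        image=image,
        port=port,
        description=description or "{} container".format(name),
    )


if __name__ == "__main__":
    # Hypothetical image name and port, for illustration only.
    print(render_unit("foreman", "example/foreman:latest", 3000))
```

Publishing the port on 127.0.0.1 only keeps the container reachable just from the host, so a local Apache can still terminate SSL and reverse-proxy to it, and `Restart=always` gives a basic form of self-healing on a single node without a cluster.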

sean797 · September 19, 2018, 3:32pm

I want to take this discussion on a slight tangent if you don’t mind. If you think it warrants a separate discussion feel free to reply in a new thread.

As a user, what are the benefits to me of running Foreman or Katello (and its components) inside a container? I don’t see this being discussed, at least not in detail. Maybe I missed it, but I think it’s really important to discuss why users would (or wouldn’t) want such a change.

Looking at the benefits for running an application inside K8s vs VMs I see a couple of main benefits:

self-healing
automated scaling
easy development
more upgrade options/strategies (e.g blue/green strategy)
higher hardware efficiency

Looking at that list (there are more benefits, but I feel it captures the main ones; shout if you think I should include others), some items, like automated scaling and easy development, are not really applicable to users or to Foreman/Katello. For example, due to the current application architecture we cannot scale smart proxies and most of their features.
So that leaves us with self-healing, more upgrade options/strategies and higher hardware efficiency. While I totally agree they are good things for any application, I question whether those benefits are worth the time and effort from both developers and users to embrace such a change. I wonder if we should tackle the scaling issues with Foreman/Katello before a containerisation effort so users can reap the full benefits.

lzap · September 19, 2018, 3:45pm

I very much like the idea of not having k8s as a hard dependency and providing the standard installation based on as few components as possible (e.g. the systemd approach mentioned). This could be focused on a single-node deployment with no scaling options beyond the single node. Users who like the “traditional” deployment could still run and manage the node, while the technology we would build on (containers) would also enable us to ship an “advanced” and much more scalable type.

ehelms · September 19, 2018, 11:50pm

I do think this is valid for a separate topic. I do agree the two intersect in some ways, but if making the application more scalable is a requirement to reach containerization then this would occur as part of the effort. To that end, I’ll start a separate thread geared towards starting on this scalability discussion.

This is a fair question, and there were some touchpoints on it in the initial post. I think what you touch on are some benefits to users. I do think automated scaling is a user benefit, since it’s a goal of the effort, as I stated above. If solving the ability to run services at scale is required to containerize, I would consider that to be tackled as part of the effort.

Easy development can have indirect user benefits. If ease of development improves issue turnaround, feature development or the ability of developers to add more value to Foreman, then users will inherently reap these benefits. If containers mean less time spent building, packaging and dealing with deployment, those developers are freed up to harden the project and build more functionality, which is an end-user benefit.