Containers: Image Build Strategy

ehelms · October 18, 2018, 2:57pm

Container Images

The basic units of a containerized deployment are the static images that are used to run services. As with artifacts like source archives, RPMs or Debian packages, the images need to be built for several purposes, such as nightlies and Foreman releases. The images must be defined, built and tested. This document covers the strategy for how images are defined, built for different release targets and tested to ensure a stable deployment.

Defining Images
Build Strategy
Deployment Strategy

Defining images

Container images for the Foreman ecosystem will live in an images directory within Grapple, where each image is represented by a directory with the same name as the image. This mirrors the current packaging setup by consolidating images together for ease of re-build orchestration, testing and mindshare. Each image directory will contain a Dockerfile that defines the base image and all instructions required to build the image. Assets that need to be copied into the container will live in a container-assets directory to identify them. For example, an image for Foreman would look like the following:

images/foreman
├── container-assets
│   ├── database.yml
│   ├── entrypoint.sh
│   └── settings.yaml
└── Dockerfile

The various images that will be required to support Foreman deployments will have some inter-dependency between them. Keeping them all co-located should make re-builds easier when orchestration is required.
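
The layout above lends itself to simple discovery by build tooling. The following is a minimal sketch, assuming the images/<name>/Dockerfile layout from the example; the helper name discover_images is hypothetical and not part of any existing Foreman tooling.

```python
from pathlib import Path


def discover_images(images_root):
    """Return the sorted names of image directories that contain a Dockerfile.

    Hypothetical helper: assumes the images/<name>/Dockerfile layout
    described above. Directories without a Dockerfile are skipped.
    """
    root = Path(images_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and (entry / "Dockerfile").is_file()
    )
```

Calling discover_images("images") from the repository root would list every buildable image, which a CI job could then feed into its build step.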

The base image would be CentOS 7.

Source vs. RPM

The initial image builds used RPMs in order to re-use the effort put into building RPMs, to more closely replicate the current deployment model, and to use tested artifacts that had passed nightly pipelines. However, this is not a hard requirement. Switching to source (e.g. git, rubygems, NPM) moves closer to the bits and provides an on-ramp to reducing the packaging effort that occurs within the Foreman ecosystem today. There has been a desire to go directly to source to better understand the nuances and to use it from the outset. Unless there is major opposition, this is the direction builds would go.

Scripting

For simple, 1-2 line statements that need to be wrapped in a script, a bash script should suffice. For more complex orchestrated actions that are run as part of the build or a start-up script, Ansible should be the preferred solution, via playbooks that each perform a targeted set of actions.

Build Strategy

The build strategy reflects how changes to container images are handled to ensure that images build properly, and deployments continue to work. The built images will be stored and referenced from quay.io/organization/foreman.

On PRs to change images/ directory

Run a test build with docker or buildah
Modify any references to the image(s) in the deployment code
Run a deployment
Run smoke tests
If tests pass, mark PR as green

Nightly builds based on source changes

When a PR is merged to a target source repository (e.g. Foreman core), initiate a rebuild of any images tied to that repository
For example, when code is merged to Foreman core, build the Foreman image and then build the Dynflow image
Modify any references to the image(s) in the deployment code
Run a deployment
Run smoke tests
If tests pass, promote new image to latest tag on quay.io
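
Both triggers above reduce to the question of which images a change affects. The following is a minimal sketch under stated assumptions: the images/ path layout comes from the earlier example, while the SOURCE_IMAGES mapping and both function names are hypothetical illustrations rather than existing tooling.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from a source repository to the images built from it,
# listed in build order. The "foreman" entry follows the example above.
SOURCE_IMAGES = {"foreman": ["foreman", "dynflow"]}


def images_touched(changed_paths, images_dir="images"):
    """Return the image names whose directory contains a changed file (PR case)."""
    touched = set()
    for raw in changed_paths:
        parts = PurePosixPath(raw).parts
        # A path such as images/foreman/Dockerfile has at least three parts:
        # the images directory, the image name, and a file below it.
        if len(parts) >= 3 and parts[0] == images_dir:
            touched.add(parts[1])
    return touched


def images_for_merge(repository):
    """Return the images to rebuild after a merge to a source repository."""
    return list(SOURCE_IMAGES.get(repository, []))
```

A PR job would pass the list of files changed in the PR to images_touched, while the nightly job would call images_for_merge with the name of the repository that received the merge.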

Container Build Mechanism

As the container world has evolved, different methods for building container images have emerged. There are methods to build on hosted platforms, and methods to build locally with different tools and push to external registries. A major consideration is how dependent images are handled, given that this involves build orchestration in a prescribed order. There are two options for build mechanisms:

Option A: Build Local, Push Remote

Use docker or buildah to build images locally and push them to quay.io
Requires storage of quay.io robot credentials

Option B: Build Remote

Orchestrate hitting quay.io API to perform builds on quay
Requires storage of quay.io organization “application” credentials to interact with API

Orchestrate tooling with Ansible

Express container build relationships through configuration
Use Ansible to build out tools that are run locally or in CI/CD for building images
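
Expressing build relationships through configuration could look something like the sketch below. It is written in Python rather than Ansible purely for illustration, and the IMAGE_PARENTS mapping is an assumption: it supposes the Dynflow image is built on the Foreman image, in line with the build order given earlier.

```python
from graphlib import TopologicalSorter

# Hypothetical configuration: each image maps to the images it is built on.
IMAGE_PARENTS = {
    "foreman": [],
    "dynflow": ["foreman"],
}


def build_order(image_parents):
    """Return image names ordered so that every parent precedes its children.

    Raises graphlib.CycleError if the configuration contains a cycle.
    """
    return list(TopologicalSorter(image_parents).static_order())
```

With the configuration above, build_order(IMAGE_PARENTS) yields the Foreman image before the Dynflow image, and the same ordering applies whether the builds run locally or are triggered remotely.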

Image testing

The use of base containers means that container builds must be orchestrated in a given order to ensure all child containers are running the same stack.

Changes to the Dockerfiles or container-assets alone do not completely capture when an image should change. Both the build artifacts and the source repository changes need to be considered to determine when a rebuild is needed. Further, image or source changes affect not only whether an image will build but also the runtime behavior of the image. Both aspects should be tested whenever a change is made to ensure stability.

The general strategy for any image change is to kick off a CI job that builds the container (and children) when a change is made to the related source or image for the container, perform a deployment and verify with a set of smoke tests.
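
Working out "the container (and children)" amounts to walking the parent/child relationships between images. The following is a minimal sketch, assuming those relationships are available as a mapping from each image to the images it is built on; the function name and the mapping shape are hypothetical.

```python
def rebuild_set(changed, image_parents):
    """Return the changed image plus every image built on top of it."""
    # Invert the parent links so each image points at its direct children.
    children = {}
    for image, parents in image_parents.items():
        for parent in parents:
            children.setdefault(parent, []).append(image)

    # Walk from the changed image down through all of its descendants.
    affected = set()
    pending = [changed]
    while pending:
        image = pending.pop()
        if image not in affected:
            affected.add(image)
            pending.extend(children.get(image, []))
    return affected
```

For example, if a Dynflow image were built on a Foreman image, a change to Foreman would return both images, while a change to Dynflow alone would return only Dynflow.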

Example:

A pull request (PR) is merged to Foreman’s develop branch
The merge triggers a CI job that re-builds the Foreman container and tags it with test
After building the Foreman container, the Dynflow container is rebuilt and tagged with test
The deployment code is updated to reference the test images
A deployment is initiated
Smoke tests are run against the deployment
If the smoke tests pass, the test images are promoted to latest tags
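
The steps in this example can be sketched as a small gating function. This is only an illustration: the build, deploy, smoke_test and promote callables are hypothetical hooks that a CI job would supply, for instance wrapping docker, buildah or registry calls.

```python
def run_pipeline(images, build, deploy, smoke_test, promote):
    """Build images with the test tag, deploy them, and promote only on success.

    All four callables are hypothetical hooks supplied by the caller:
    build(image, tag) returns an image reference, deploy(refs) and
    smoke_test(refs) take the list of test references, and
    promote(image, from_tag, to_tag) retags a single image.
    """
    # Build in the order given, so parents must come before children.
    test_refs = [build(image, "test") for image in images]
    deploy(test_refs)
    if not smoke_test(test_refs):
        # Leave the latest tags untouched when the deployment is unhealthy.
        return False
    for image in images:
        promote(image, "test", "latest")
    return True
```

Keeping promotion as the last step means a failed smoke test never changes what the latest tag points to.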

TimoGoebel · October 19, 2018, 10:01am

Thanks, some comments below.

While this is true, it would be easier to build containers as part of every PR if the Dockerfiles live in the same repo as the code. Same for advanced integration tests.

Why? There are more lightweight base images out there. What are the benefits?

I like that. Can we package containers as rpms? That way we could maybe resolve the discussion around npm package building.

lzap · October 19, 2018, 11:00am

The major benefit of CentOS 7 as a base is that it has the longest support of all non-commercial Linux distributions. If you stay on a particular Foreman version for whatever reason, having the ability to respin your container with an important bugfix is crucial. https://linuxlifecycle.com

There are voices saying that in the new container era the OS is irrelevant. This is so false, and the word “lightweight” in this context is very misleading. The Foreman application will not levitate inside containers; there are hundreds of megabytes of libraries and support code required in order to run any piece of software. We pull that code from the distribution; these are our low-level dependencies. And those should come from a high-quality Linux distribution with a big enough community and decent know-how.

ehelms · October 22, 2018, 3:52pm

I assume here you mean as an artifact of the PR itself to give reviewers the ability to take the built image and test the code directly in some environment? Rather than the idea of building the container after every PR is merged which I think both concepts support.

I think Lukas answered this well. I would also add that image size has not been a huge concern I’ve heard. Once the image layers exist, they won’t have to be re-downloaded as I understand things. So this would be more of a one time cost for users and affect testing environments more than anything else. I think a base CentOS 7 image is 70-80 MB which isn’t terrible.

I’m sure you can. What is the benefit you see of making a source container then wrapping that in an RPM vs. pulling the container from a registry?

TimoGoebel · October 22, 2018, 5:52pm

Well, my idea was that we could start in small steps. Our users have to learn the container environment and our developers need to learn it as well. Currently, we spend a lot of effort building and maintaining rpm packages for our apps. I believe we can drastically reduce a lot of this effort by just moving the Foreman app to a container. We’re currently talking about putting the app behind Apache as a reverse proxy. In my opinion, this could be the groundwork for putting Foreman in a container. Just Foreman (and puma), built from source.
I assume most users’ Foreman instances do not have direct access to the internet but do have a mechanism for installing rpm packages. That’s why my suggestion is:

Move Foreman to puma behind Apache
Add a Dockerfile to Foreman’s repo
Change the rpm package of Foreman to just ship the container and some systemd start scripts.
Install and configure docker in the installer (puppet modules).
Delete a lot of (packaging-) code.
Ship a better Foreman.

I’d make the Dockerfile to build Foreman part of Foreman’s source. It’s easier for users to find it, users can more easily build their own Foreman (with custom patches) and developers have full control over the artifacts. If we build the container as part of every PR, we have direct feedback if it breaks. We can even add some smoke tests, and therefore guarantee a fast release. Currently, we have to synchronize every core change with packaging. I see that this is necessary for rpms, but I generally consider it an anti-pattern. So let’s stop that if we don’t need to. In a container world, we’ll probably make changes to the container more often.

If we take this approach, we can gain some experience and solve the plugin situation. We can easily mount certificates, configuration etc. inside the container and benefit from configuring the app through our awesome and mature puppet modules.

I’m still very reluctant regarding this change. We currently have one skilled engineer trying to move a Puppetmaster, a Puppet CA and our Smart Proxy (just with the Puppet and PuppetCA modules active) to Kubernetes. After four weeks, full time, we’re still struggling to make it work. That’s just crazy. I want to save the Foreman Community that effort.
Trust me, this is not me trying to prevent change and I do feel like “one of the old guys” (no offense) when I type this, but let’s do this in small steps. I love new technology, but we’re not ready to go all in.

Martin Alfke has a great talk, “Why ops hates containers”. This is a great reality check. I can highly recommend it.

Can we try Alpine Linux as well? Just so we have a comparison? I have had very good experience with Alpine Linux and I believe official ruby containers are based on it. The ruby version shipped with CentOS is just very old. I’m not saying we can’t use CentOS. I’m just saying: let us try both and decide then.

Konstantin_Orekhov · October 22, 2018, 6:15pm

Love this approach!

ehelms · October 22, 2018, 6:29pm

Thanks for laying it out. I’ve not stated this directly, but I have been pushing the pieces of this from behind the scenes as the target set of steps to achieve your idea of a systemd-based container.

I think you make some valid points. The part I struggle with is what to do with the other containers and their Dockerfiles. We need things like Dynflow, qpid, Pulp, Candlepin and Squid if I look across both the Foreman core and Katello ecosystems. Some of these should/could be handled by the communities themselves. In other cases, I’d argue there are not necessarily great images, they may need customization, or the images are split across multiple OSes. Further, so far there have been some useful things to share, such as scripts and build tooling, and it is unclear how to manage those once the Dockerfiles are strewn across multiple repositories.

I’m willing to give this a try but I have my reservations.

If we are building from source, we should see a severe reduction in this given the container would be running bundle install and npm install. This wouldn’t have to care about things like aligning versions.

Reluctant to move to containers or reluctant to move to Kubernetes? I only ask for clarification because your anecdote is about Kubernetes and I think it’s important to split that from the newer systemd focus.

How do we “try it” per se? Two Dockerfiles, one for each distro, with the builds tagged based on the base OS? That could work with a bit of extra overhead. There are SCLs to handle the Ruby versions so I don’t consider that an issue. The OpenShift S2I has proven them out a good bit and I’ve been able to run with them. I worry about Alpine being easy for others to adapt given it has its own packaging system that is not one of the two major ones (RPM or Deb). And to be honest, the moment we commit to Alpine, a decent chunk of the community will have to start maintaining and testing a CentOS version of the Dockerfile and resulting image.

TimoGoebel · October 22, 2018, 6:50pm

Thanks. Much appreciated. Just to clarify, this does not have to be the last step. We can continue from there on.

I feel the same. Let’s try it. If it does not work out, we can change it at a moment’s notice. As you said, we’re going to need to maintain containers for applications where we don’t own the source. The container sources can live in a dedicated repository.
I also believe that when we build containers “for others”, we should rely on whatever artifacts the other communities provide. If they’re still rpm based, we should install them from rpms (but inside a container) and not install them from source. We probably don’t want to swim against the stream.

To Kubernetes. Treating the applications as cloud-native, when they aren’t. I guess that’s cloud-naive then.

You’re right. Partitioning the community should be avoided at all costs. And on second thought, one goal of this effort is to ship a single artifact. If we start having two different artifacts, users have to tell us which artifact they used when filing bug reports etc. That’s not worth it. CentOS it is.

ehelms · October 22, 2018, 11:56pm

On the topic of where the Dockerfile should live and how it affects CI/CD, do the options from the plugins discussion and the ensuing discussion affect this at all? I’m thinking of the case where all plugins are built into a single image, and how we might test/handle gating for those plugins.

lzap · October 23, 2018, 12:53pm

I agree we should take it slow. But I also don’t see anybody rushing things; Eric’s RFCs are a great example that this is being taken seriously and slowly.

Can you elaborate on what kind of good experience this is? Things are nice when they work; the problems you don’t usually see show up when they break. I am just curious.

But we would not use Ruby 2.0 from CentOS; I assume we would continue using SCL Ruby from Red Hat Software Collections (with a 3-year guaranteed support cycle). What is more important is that libc, libpthread, libcrypt, openssl and many others in CentOS 7 have guaranteed support for ten years. Those are used by the Ruby binary; that’s the big deal. Alpine releases are supported for two years on average and this can change in the future.

Also, there is no point in making containers lightweight by using a very different libc implementation, which is a few megabytes smaller and was originally written for the embedded space, when you then pull in 350+ MB of dependencies with Foreman. I don’t want to see anyone from the Foreman community spending a minute hunting a bug that only appears in Ruby + musl libc from Alpine. We have much more to do.

The Foreman community has a lot of Red Hat employees as well. When I encountered some low-level bug in the kernel (kexec) or an OS library (openssl compilation issues, systemtap Ruby problems), I simply made one call and the problem was solved by my colleague. With anything other than Fedora, CentOS or a RHEL-compatible OS, I would not be able to do this. I’ve used this multiple times in the past to be honest.

End of very biased opinion about CentOS vs Alpine

TimoGoebel · October 25, 2018, 5:38am

I agree, I just want to make sure we don’t aim too high in the first iteration and that this effort is actually going to happen.

I was quite skeptical of Alpine Linux, but it has served us well as a base container in several places. As you said, things are nice when they work. Alpine Linux has worked well so far. And without SCL it was pretty easy to use.

It’s fine, I just want to get us talking. I’m totally fine with CentOS. I don’t believe we need 10 years support for our container base image and the additional complexity of SCL’d ruby, but it won’t make much of a difference in the end. We can make the packaging effort easier and the product more stable with basically any base image.

Thanks, guys, I really appreciate the discussion.