Nightly Releases: Where we are and Where we want to be

I want to revisit our nightly testing and elevate how we end-to-end test, and thereby verify, the code that is landing within the ecosystem. This starts with how we think about and design our nightly testing and releases, and then moves to where we want to go as a community: what we want that process to look like and what other testing scenarios we want to enable to reduce churn and provide more value.

This is a bit long, as I wanted to cover how nightly pipelines are designed as of today, the goals they aim to meet, and where I think we want to go based on conversations I’ve had with developers and users.

Nightly Testing Today

This first section covers background on the state of nightly testing and releasing today to inform the conversation of where we want to go and how we are going to get there.

Types of Builds

Generally speaking, there are three types of builds that flow through our nightly pipelines:

Continuously built packages: packages that are rebuilt on nearly every code change committed to the mainline development branch (e.g. foreman, katello, foreman-installer)

Dependencies: packages that Foreman, Proxy, Installer, SELinux, Katello or Plugins require to build or run

Plugins: packages that require Foreman or Foreman Proxy and add functionality to either one

Types of Releases

Continuously Built Packages

Continuously built packages are built on nearly every commit to their respective source repositories. When a PR is merged, a job kicks off to run a series of unit and integration tests to produce a verified source for the project (e.g. a foreman tarball or katello rubygem). The source job kicks off a packaging build job that uses the verified source as input to generate a package into the respective build systems.

Dependencies

Dependencies have PRs created against foreman-packaging and run a series of tests:

  • Did the PR properly bump the version or release
  • Does the package build
  • Are all dependencies of the package satisfied by the update

When the packaging update is merged, the dependency is built into appropriate build system(s) (i.e. RPM, Deb).

Plugins

Plugins have PRs created against foreman-packaging and run a series of tests:

  • Did the PR properly bump the version or release
  • Does the package build
  • Are all dependencies of the package satisfied by the update

When the packaging update is merged, the plugin is built into appropriate build system(s) (i.e. RPM, Deb).

Release Pipelines

Foreman, Proxy, Installer and SELinux

At least once a day, the Foreman nightly pipeline kicks off, generating an RPM and Debian staging repository with any of the continuously built packages and dependencies that have recently been built. The repository is then verified by:

  • Installing the base Foreman scenario for each supported OS
    • Running smoke tests
  • Installing N-2 Foreman and successively upgrading to staging
    • Running smoke tests against final upgrade
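
The upgrade check above can be read as an ordered path: install the release two versions behind, upgrade through each intermediate release, and finish on the staging repository. A minimal sketch of that ordering, assuming releases are kept as a simple oldest-to-newest list (the function name, the "staging" label and the version strings are hypothetical and not taken from the actual pipeline):

```python
def upgrade_path(released, depth=2, target="staging"):
    """Return the install-and-upgrade order for an N-`depth` upgrade test.

    `released` lists stable releases from oldest to newest. The first entry
    of the result is installed fresh; every later entry is an upgrade step.
    """
    if depth < 1 or depth > len(released):
        raise ValueError("depth must be between 1 and the number of releases")
    return released[-depth:] + [target]


# Hypothetical version strings, for illustration only.
print(upgrade_path(["2.0", "2.1", "2.2"]))  # ['2.1', '2.2', 'staging']
```

Smoke tests would then run only once, after the last step in the returned list.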

Katello

At least once a day, or when a Foreman nightly pipeline completes, the Katello nightly pipeline kicks off, generating an RPM staging repository with a new rubygem-katello and dependencies that have recently been built. The repository is then verified by:

  • Installing the base Katello scenario (which includes an external proxy) for each supported OS
    • Running smoke tests
  • Installing N-2 Katello and successively upgrading to staging
    • Running smoke tests against final upgrade

Plugins

At least once a day, the Foreman Plugins nightly pipeline kicks off, generating an RPM and Debian staging repository with any newly built plugins or their dependencies. A test is run that verifies all dependencies for all packages are met, and then the staging repositories are pushed out to the nightly release repositories.
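
To illustrate what "all dependencies for all packages are met" means, here is a rough sketch that treats the staging repository as a plain dictionary. This is an assumption for illustration only: the real check would rely on the package manager's own resolver, would include the base OS repositories, and would honor version constraints, none of which is modeled here. The package names are made up.

```python
def unmet_dependencies(packages):
    """Map each package name to the requirements nothing in the repo provides.

    `packages` maps a package name to a dict with optional "provides" and
    "requires" lists. Version constraints are ignored in this sketch.
    """
    provided = set(packages)
    for meta in packages.values():
        provided.update(meta.get("provides", []))

    unmet = {}
    for name, meta in packages.items():
        missing = sorted(set(meta.get("requires", [])) - provided)
        if missing:
            unmet[name] = missing
    return unmet


# Hypothetical repository contents, for illustration only.
staging = {
    "foreman": {"provides": ["foreman"], "requires": []},
    "plugin-a": {"requires": ["foreman", "rubygem-example"]},
}
print(unmet_dependencies(staging))  # {'plugin-a': ['rubygem-example']}
```

An empty result would be the condition for pushing staging out to the nightly repositories.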

Nightly Releases (Where we want to go)

There are some issues we commonly see with the current approach:

  • Foreman can push a new version of foreman, installer or proxy that breaks plugins or Katello
  • Changes can build up in the staging repository, leading to compound failures that make debugging harder
  • Nightly pipelines are currently built to be continuous integration testing environments; some bugs are identified only in this environment and can take days or weeks to resolve given they are emergent
  • Consumers of nightly plugins or Katello can be blocked for days or weeks if a change to Foreman propagates before the updated plugins or Katello make it into the repositories

Continuous Nightly Releases

Consistent, green nightly releases provide a basis for developers and some classes of users to test functionality, use new features and preview how the next Foreman release will look. There is a growing base of users wanting continuous stable nightlies. I believe this requires us to shift nightlies away from being the output of a continuous integration test and instead treat them as a release that is continuous.

A change to help achieve this is to move the point at which continuously built packages are end-to-end tested to ensure that changes they introduce are tested prior to landing in the final release test bucket. The process would look like:

PR merged -> Test Source -> Build Source -> Build scratch package -> Run end-to-end test -> Build package into build system

Nightly releases would still contain a final end-to-end test of the staging repository but continuously built packages would only arrive after having been vetted previously.
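
The key property of the flow above is that the final step never runs unless every earlier step passed. A minimal sketch of that gating, assuming each stage can be represented as a callable that returns True on success (the `run_gated` helper and the stand-in lambdas are hypothetical, not our actual job definitions):

```python
def run_gated(stages):
    """Run (name, step) pairs in order and stop at the first failing step.

    Each step is a zero-argument callable returning True on success.
    Returns (completed_stage_names, failed_stage_name_or_None).
    """
    completed = []
    for name, step in stages:
        if not step():
            return completed, name
        completed.append(name)
    return completed, None


# Stage names follow the flow above; the lambdas stand in for real jobs.
stages = [
    ("Test source", lambda: True),
    ("Build source", lambda: True),
    ("Build scratch package", lambda: True),
    ("Run end-to-end test", lambda: False),  # simulate a failure
    ("Build package into build system", lambda: True),
]
completed, failed = run_gated(stages)
print(failed)  # Run end-to-end test
print("Build package into build system" in completed)  # False
```

With the simulated end-to-end failure, the package never reaches the build system, so the staging repository is not affected by that change.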

Added Benefits

Building out this workflow would allow interested plugins to have continuously built packages that are vetted against an end-to-end workflow targeted at their use case.

This change can also propagate to versioned releases of Foreman allowing the source code releases to be vetted before committing to a version bump or package build.

Katello and Foreman

Foreman and Katello have always had a special relationship, as Katello is the largest plugin in the ecosystem and the only one to contain its own scenario in the installer. There are myriad architectural changes needed to align the two, as discussed in another thread. From a testing standpoint, the two butt heads more than other plugins, due in part to how connected they are from a release standpoint.

  • Katello is tightly coupled to Foreman code-wise
  • Katello’s installation scenario is tied to releases of foreman-installer
  • Katello and Foreman have divergent levels of testing
    • Katello pipelines test external proxies
    • Foreman just recently gained upgrade testing

As with other plugins, Foreman’s pipeline is gated only on Foreman itself and can therefore push a new nightly release that breaks both the nightly release of Katello and the nightly Katello release pipeline. This often has a compounding effect for Katello developers as they try to deal with both.

Further, the recent survey results show a growing trend of users who are deploying Foreman with Katello in their environments.

Without gating the two on each other, which has long been an undesired approach, what changes can we make to ensure the two do not result in broken releases, or at least to reduce the downtime when breakages happen?

What about End to End PR Tests

Our end-to-end tests are resource and time intensive, which prevents us from running that level of testing at scale on PRs. For some projects (e.g. foreman-installer), the amount of churn and the benefit of testing on PRs would make it a worthwhile trade-off. Even this is not foolproof: for example, puppet module changes do not flow through installer PRs and therefore still need that CI-based end-to-end test.

There’s a lot to unpack here. First of all, I want to be explicit that I consider working nightlies crucial to the project. Any critique I have is about implementation.

On the technical level, I’m afraid our current pipelines wouldn’t handle the concurrency. The way we call jobs is based on kicking off a job and waiting for one to start. If two projects start the same pipeline (foreman + foreman-installer), it’s hard to track which one started which.

Another concern is who picks it up when something fails. There was “Proposal - nightly pipeline monitoring rotation” but it hasn’t been implemented yet. Adding more things that might fail might just make it worse.

My suggestion would be to start small. In the Build Package phase, I’d suggest we add more checks on dependencies. I use a script to wrap update-requirements. My suggestion would be to integrate that into CI: if the RPM spec file differs from what the package requires, the nightly build should fail rather than attempting to build something and possibly generating garbage.
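
As a rough illustration of the kind of check meant here: compare what the packaging declares with what the source actually requires, and fail on any difference. This sketch does not model update-requirements or my wrapper script; it assumes both requirement lists have already been extracted into plain strings, and the function names and requirement names are hypothetical.

```python
def requirement_drift(declared, expected):
    """Compare requirements declared in packaging with those the source needs.

    Returns (missing, stale): `missing` is needed but not declared, `stale`
    is declared but no longer needed. Both inputs are iterables of strings.
    """
    declared, expected = set(declared), set(expected)
    return sorted(expected - declared), sorted(declared - expected)


def check_requirements(declared, expected):
    """Raise if packaging and source disagree, so the build fails early."""
    missing, stale = requirement_drift(declared, expected)
    if missing or stale:
        raise RuntimeError(f"requirements differ: missing={missing}, stale={stale}")


# Hypothetical requirement names, for illustration only.
spec_requires = ["rubygem(rails)", "rubygem(old-dep)"]
gem_requires = ["rubygem(rails)", "rubygem(new-dep)"]
print(requirement_drift(spec_requires, gem_requires))
# (['rubygem(new-dep)'], ['rubygem(old-dep)'])
```

Calling `check_requirements` before the build step would turn any drift into an immediate, easy-to-read failure.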

The benefit is that it improves quality in a way that’s easy to solve. It can allow us to build a process where we investigate the build failures.

I also believe this would catch more issues than actual end-to-end tests. Our coverage is very low (at least in Foreman, Katello is in a better position - which is why it’s red for longer). I’ve tried to increase coverage with smoker but got stuck on UI and received no help.

Regarding gating Foreman on Katello passing, I think it’s too soon for that. I’d actually consider moving the Katello GitHub organization into theforeman and treating it like any plugin. Right now the group of people with access to both is limited, which limits our ability to fix things, even if we know the solution.

tl;dr: good long term vision, but let’s start small and iterate.

Each continuously built package would have an independent pipeline for verification prior to being committed to the package build system. There is no concurrency I can perceive.

This proposal would gate projects much earlier and block them from releasing into nightly. If projects want to avoid looking at these pre-release pipelines then their projects will not get released. I’m not sure how to do better other than block PRs if it’s failing like we did in the past.

I think this provides value, but does not get to the heart of the issues I believe we see today and that commonly lead to unreleased nightlies. Katello’s end-to-end tests, for example, cover the core paths of its functionality, though not in depth. This has proven to catch big issues, which is what we want. My proposal just moves that detection to a place that does not result in entirely blocking releases. The same goes for the installer. Today, we test the default scenario configuration, which is the primary workflow and does catch architecture changes, config changes and other issues we’ve introduced. If we move installer testing earlier and prevent those issues from entering the nightly release pipeline, then nightlies can go out more often and cleaner.

Take a recent example: a change to the installer broke the Katello nightly pipeline because it changed the order of operations within the installer, which broke Katello’s expectations. Katello tests caught this change, but Katello then remained blocked for weeks due to an installer change. So Katello nightly pushes suffered because of a change in the installer. If we prevent the installer from being committed to the packaging system when it fails tests on a scratch build, then we keep code flowing for Katello.

We need this change to our workflow in order to treat Katello like another plugin. We’ve discussed this in the past. But we need a way to verify Katello before committing it to the plugin repository. And I’ve heard other plugins would like to have a similar verification path for nightly.

@ehelms not being as familiar with the inner workings of the pipeline, I’m not exactly sure what the proposed change is. Do you mind giving a summary of what would actually change?

Speaking from a developer perspective regarding the nightly releases, here are some areas for improvement I see:

  • The nightly pipeline is relied on too much to catch problems after they merge, rather than before, especially in the installer and packaging areas. When nightly breaks, it’s often compounded as other issues are introduced in the meantime (as you already mentioned).
  • In general, we have too many repositories and releases to manage and it becomes hard to juggle them all within the larger context of the Foreman ecosystem.
  • When pipelines break, there aren’t clear procedures on how to proceed and who is responsible for what. We don’t often revert breaking changes and the designation of responsibility to investigate breakages isn’t clear.

I can see some challenges when we gate, but some things need a release in both. For example, Foreman Rails 6 will break Katello, but you can’t merge the Katello changes until Foreman is merged. While I get you do this in the package release step, I think you’ll run into the same problem.

How do you plan to deal with these “I know I have to break things” situations?

I’ll try to re-summarize. Effectively, the idea is to run the same end-to-end tests we do today right after we generate a new nightly source (e.g. tarball or gem), and to build that package into our staging repositories only if the end-to-end testing passes.

Without much detail, the two options that come to mind are:

  1. Gating Foreman nightly releases on more checks including plugins
  2. A more semantic versioning scheme for Foreman’s code to allow it to be treated like the “library” that it is to some aspects of our ecosystem. Combined with multiple versions existing in the repository, as with plugins, we could do more staged releases of code without having to worry about known breakage situations.