This thread collects discussion and improvement projects aimed at speeding up parts of our CI, with the goal of giving developers faster feedback across our projects and testing ecosystem. With this thread I want to:
Bring discussions and ideas together in one central place for further discussion and, hopefully, some implementation
Highlight any work that improves the speed of CI tests, since these changes pay for themselves again and again through our developer workflows
We have been adjusting the NPM jobs to show more clearly where time is spent. To do that, we split the dependency generation step (i.e. creating package-lock.json) from the step that downloads the NPM modules, running the following as two separate steps:
npm install --package-lock-only --no-audit
npm ci
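To make the time spent in each of the two steps visible in the job log, a small wrapper along these lines could be used. This is only a sketch: the `timed` helper and its labels are hypothetical and not part of the existing jobs, and it assumes a POSIX-style shell with `date +%s` available.

```shell
# Hypothetical helper (not part of the existing jobs): run a command,
# then print a label, the elapsed wall-clock seconds and the exit code.
timed() {
  label=$1
  shift
  start=$(date +%s)
  status=0
  "$@" || status=$?
  end=$(date +%s)
  echo "${label}: $((end - start))s (exit ${status})"
  return "$status"
}

# Intended use in the job (needs npm on PATH, so left commented out here):
#   timed "generate package-lock.json" npm install --package-lock-only --no-audit
#   timed "download modules" npm ci
```

The helper passes the wrapped command's exit code through, so a failing npm step would still fail the job.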
So far we have opened PRs to do this for Foreman and Katello:
These investigations have focused on Foreman and Katello, since they have our highest-churn and longest-running CI jobs. If any plugin maintainers are reading this, please reach out and point out places in your test suites where we could apply similar logic.
The “purge useless npm deps” PR drops another ~20 minutes from the Katello pipeline. The saving comes mainly from the “assets precompile” step, which only needs build dependencies, but where we were also installing all the test deps (multiplied across foreman, katello, rex and tasks, since katello pulls those in too).
So let me bring that up again – is there a way we can better “categorize” the dependencies we (core and plugins) have? Or can we at least agree on a simple nomenclature given the current package.json constraints (there are only dependencies and devDependencies)? Something like dependencies is what we need for building the assets and devDependencies is what we need for development and tests? (I know, that’s not how Node defines dependencies, but we don’t really have node packages anyways here, we just abuse package.json as a way to express dependencies.)
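If we did agree on that nomenclature, the pipeline could choose what to install per stage. The sketch below is an assumption about how that might look, not how the jobs work today: `npm_args_for_stage` and the stage names `build` and `test` are hypothetical, and `--omit=dev` needs a reasonably recent npm (7 or later; older versions used `--production` for the same purpose).

```shell
# Hypothetical helper: pick npm arguments for a pipeline stage, assuming
# "dependencies" = needed to build assets and "devDependencies" = needed
# for development and tests.
npm_args_for_stage() {
  case "$1" in
    build) echo "ci --omit=dev" ;;  # assets precompile: skip devDependencies
    test)  echo "ci" ;;             # test runs: install everything
    *)     echo "unknown stage: $1" >&2; return 1 ;;
  esac
}

# Intended use (needs npm and a package-lock.json, so commented out here):
#   npm $(npm_args_for_stage build)
```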
I think we should be able to fix that by using project-specific, not project-build-specific, RVM gemsets. That means that e.g. katello-pr-test would always run in the 2.7.0@katello-pr-test gemset, and not create a new 2.7.0@katello-pr-test-$JOB_ID gemset every time.
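As a rough illustration of the naming change, the gemset name would be derived from the Ruby version and the job name only, with no per-build suffix. The `gemset_for_job` helper is hypothetical; `rvm use ... --create` is the usual RVM way to select a gemset and create it if it does not exist yet.

```shell
# Hypothetical helper: build a stable, per-project gemset name.
# Note there is deliberately no per-build suffix such as -$JOB_ID.
gemset_for_job() {
  ruby_version=$1
  job_name=$2
  echo "${ruby_version}@${job_name}"
}

# Intended use on a CI node (needs RVM loaded, so commented out here):
#   rvm use "$(gemset_for_job 2.7.0 katello-pr-test)" --create
```

Reusing the gemset means installed gems persist between builds, so it only helps if the gemset is not emptied or deleted at the end of each run.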
Another thing that I wanted to mention: right now, getting the “numbers” is rather annoying; you need to look at the individual job runs and click a lot. I was thinking we should try something like the Jenkins OpenTelemetry plugin and let it report to honeycomb.io (their free tier should be totally sufficient for us) or similar.
I think the per-build gemset exists because we clean up the gemset at the end and could otherwise face race conditions and clashes. Maybe we consider dropping that altogether? I also wondered about installing to a central spot for caching on the node, e.g. ‘bundle install --path ~/.rubygems’
To get better data about Foreman PR tests, I have started rewriting the Foreman PR tests as pipelines. I am taking a different approach than we have previously and would appreciate feedback from CI maintainers as well as from developers who work in the Foreman code base.
The old foreman job took 35 minutes to run.
The new unit test job took 20 minutes to run.
The new integration test job took 21 minutes to run.
The new Katello job took 22 minutes to run.
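To put those numbers side by side: summed up, the new jobs use more machine time than the old one, but if they run in parallel, feedback arrives once the slowest job finishes. The arithmetic below is a sketch that assumes all three new jobs start at the same time and ignores any queueing for executors.

```shell
# Timings in minutes, taken from the runs listed above.
old=35
unit=20
integration=21
katello=22

# Total machine time if every new job runs to completion.
total=$((unit + integration + katello))

# Wall-clock time when the jobs run in parallel: the slowest job wins.
wall=$unit
for t in "$integration" "$katello"; do
  if [ "$t" -gt "$wall" ]; then
    wall=$t
  fi
done

saved=$((old - wall))
echo "total machine time:     ${total} min"
echo "wall-clock (parallel):  ${wall} min"
echo "faster than old job by: ${saved} min"
```

Under these assumptions the new jobs use 63 minutes of machine time in total, but contributors would see results after 22 minutes, which is 13 minutes sooner than the old 35 minute job.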
Splitting out the unit and integration tests takes more total time, but because the jobs run in parallel, contributors get their results ~15 minutes sooner. Given this, I would like to move forward with dropping the old test jobs:
Not mentioned here (since it’s not a speedup), but this also fixes the long-standing issue where Foreman’s stable branches were checking out Katello master, which made those tests completely pointless. Now it looks up the correct branch.