So @ehelms managed to nerdsnipe me yesterday when he asked why our Debian builds for Foreman take hours (compared to <10 min for RPM on Koji)… Well, take some and sit by the , it’s story time!
When building a package (RPM or Debian), you roughly do the following steps:
- obtain the source (usually a combination of the project source code and some packaging recipe)
- obtain all build dependencies (usually defined in the packaging recipe)
- execute the actual build step(s)
- take the built result and put it into an actual “package” (or multiple); this includes:
  - copying files into a useful structure
  - scanning the files for dependencies (for example, if a file is a bash script, add an automatic dependency on bash)
  - scanning the files for provides (for example, if the package contains /bin/bash, allow other packages to depend on that)
  - creating a list of installed files for later verification
  - compressing the installed files into a package (or multiple)
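The "scanning the files for dependencies" step can be illustrated with a tiny shebang scanner. This is only a sketch of the idea, not the code rpm or debhelper actually run; the `INTERPRETER_PACKAGES` mapping and both function names are made up for illustration.

```python
from pathlib import PurePosixPath
from typing import Dict, Optional, Set

# Illustrative mapping from interpreter name to the package providing it.
# This table is an assumption for the example, not taken from any real tool.
INTERPRETER_PACKAGES = {
    "bash": "bash",
    "ruby": "ruby",
    "node": "nodejs",
}


def interpreter_of(script_text: str) -> Optional[str]:
    """Return the interpreter named in a shebang line, or None if there is none."""
    lines = script_text.splitlines()
    first_line = lines[0] if lines else ""
    if not first_line.startswith("#!"):
        return None
    parts = first_line[2:].split()
    if not parts:
        return None
    # "#!/usr/bin/env bash" names the interpreter in the argument after env.
    if PurePosixPath(parts[0]).name == "env" and len(parts) > 1:
        return parts[1]
    return PurePosixPath(parts[0]).name


def scan_dependencies(files: Dict[str, str]) -> Set[str]:
    """Collect package dependencies implied by the shebangs of the given files."""
    deps = set()
    for text in files.values():
        interpreter = interpreter_of(text)
        if interpreter in INTERPRETER_PACKAGES:
            deps.add(INTERPRETER_PACKAGES[interpreter])
    return deps


if __name__ == "__main__":
    example = {
        "usr/bin/tool": "#!/bin/bash\necho hi\n",
        "usr/bin/other": "#!/usr/bin/env ruby\nputs 1\n",
        "usr/share/doc/README": "just text\n",
    }
    print(sorted(scan_dependencies(example)))
```

Every file in the package has to be looked at for this, which is why the cost of the step grows with the number of files shipped.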
RPM and Debian builds do roughly the same “steps”, so why are Debian builds so much slower?
Obtaining the source and the (packaging) build-dependencies isn’t much different, but now the differences start.
For RPM, we package all Ruby gems and NodeJS modules as RPMs, so when they are needed, we can just install them in binary form and they are present almost immediately. For Debian, however, we only build-depend on Ruby and NodeJS themselves, and all gems/modules are installed from source with bundler/npm at build time (step 3 above). This already takes quite some time (Ruby: ~8min, NodeJS: ~22min).
But then, because we ship these gems/modules inside our Foreman package, step 4 also needs to process all these additional files (and there are many in
- dh_install (step 4.1) takes 40(!) minutes
- dh_shlibdeps (step 4.2) is sometimes slow (the runs I looked at today weren’t, but I’ve seen ~10min in the past)
- dh_makeshlibs (step 4.3) is sometimes slow (the runs I looked at today weren’t, but I’ve seen ~10min in the past)
- dh_md5sums (step 4.4) takes 25(!) minutes
- dh_builddeb (step 4.5) takes 25(!) minutes
This gets amplified by the fact that our Debian build node is running on slower HW than our Koji (no ultra fast NVMe disks).
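To get a feel for why the checksum step (4.4) scales with the number of vendored files and with disk speed, here is a rough sketch of what such a pass has to do: walk the tree and read every file in full. It is not how dh_md5sums is implemented; the function name `md5sums` and its return shape are assumptions made for this example.

```python
import hashlib
import os
from typing import List, Tuple


def md5sums(root: str) -> List[Tuple[str, str]]:
    """Return (hexdigest, relative_path) pairs for every regular file under root."""
    results = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(full_path) or not os.path.isfile(full_path):
                continue
            digest = hashlib.md5()
            with open(full_path, "rb") as handle:
                # Read in 1 MiB chunks so large files do not need to fit in memory.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            results.append((digest.hexdigest(), os.path.relpath(full_path, root)))
    # Sort by path so the output is stable between runs.
    return sorted(results, key=lambda pair: pair[1])
```

Each file costs at least one open and one full read, so a tree with many small files is dominated by per-file overhead, and slow disks make that worse.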
So, uh, what can we do about that?
Obviously, if we didn’t vendor all those Ruby gems and Node modules, we would instantly win, as we wouldn’t need to build them every time or include them in our packages. But packaging them separately is quite a lot of work (also long-term, for updating those packages), and we just don’t have the capacity for that.
The next best thing is to limit the number of vendored packages and make the “put into package” step smarter.
- Today we install all Node dependencies from package.json, but this also includes quite a few CI/Test/Lint packages, which we don’t need to build our assets. When building RPMs, we exclude these, and we should do the same with Debian. I tried this, and it cuts down the npm install time to 2 minutes (from 22 minutes!), dh_install to 40 seconds (from 40 minutes!), and dh_builddeb to 2 minutes (from 25 minutes!).
- Creating checksums for package contents is optional, and while I think we should still provide checksums for most of our core packages, foreman-assets, which includes all the Node modules needed to build assets, could probably be excluded (this package is used for building plugins, but users hardly ever need it installed on their systems). I tried this, and it cuts down the dh_md5sums run to 30 seconds (from 25 minutes!).
- We know that the vendored Ruby/Node packages don’t include any files we want to let other packages depend on, nor do they contain anything we would need to generate dependencies for, so we can exclude these from
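The first item in the list, dropping CI/test/lint packages before running npm, could look roughly like the sketch below. The post does not say which packages are excluded or how, so the prefixes in `UNNEEDED_PREFIXES` and the helper `strip_unneeded` are purely hypothetical.

```python
import json

# Hypothetical name prefixes for CI/test/lint tooling.
# The real exclusion list is not given in the post.
UNNEEDED_PREFIXES = ("eslint", "jest", "prettier")


def strip_unneeded(package: dict, prefixes: tuple = UNNEEDED_PREFIXES) -> dict:
    """Return a copy of a parsed package.json without the unneeded packages."""
    cleaned = dict(package)
    for section in ("dependencies", "devDependencies"):
        if section in package:
            cleaned[section] = {
                name: version
                for name, version in package[section].items()
                # str.startswith accepts a tuple and matches any of its entries.
                if not name.startswith(prefixes)
            }
    return cleaned


if __name__ == "__main__":
    # Made-up manifest, only to show the effect of the filter.
    manifest = {
        "name": "example",
        "dependencies": {"react": "^16.0.0"},
        "devDependencies": {"eslint": "^6.0.0", "jest": "^26.0.0", "webpack": "^3.0.0"},
    }
    print(json.dumps(strip_unneeded(manifest), indent=2))
```

Fewer installed modules means fewer files for every later packaging step to copy, hash, and compress, which is consistent with the savings showing up in more than just the npm install time.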
If you’re curious (you wouldn’t have read this far if you weren’t), you can find my experiments in
Yep, that’s 106 minutes saved on every build. Each build takes only ~30 minutes now.
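The 106 minutes can be reproduced from the before/after timings quoted in the list above. The snippet just redoes that arithmetic; the dictionary layout is an illustration, the numbers are the ones from the post.

```python
# (before, after) durations in minutes, as quoted in the post.
timings = {
    "npm install": (22, 2),
    "dh_install": (40, 40 / 60),  # 40 seconds expressed in minutes
    "dh_builddeb": (25, 2),
    "dh_md5sums": (25, 0.5),      # 30 seconds expressed in minutes
}

# Minutes saved per step, and in total.
saved = {step: before - after for step, (before, after) in timings.items()}
total_saved = sum(saved.values())

if __name__ == "__main__":
    for step, minutes in saved.items():
        print(f"{step}: {minutes:.1f} min saved")
    print(f"total: {total_saved:.1f} min saved")
```

The total comes out just under 107 minutes, which matches the 106 full minutes mentioned above.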
Oh, and this accidentally cuts foreman-assets from ~218MB to ~56MB in size.