ARM builds - what should we do?

tbrisker · June 25, 2018, 11:17am

Hello,

Recently I had a conversation with @Ondrej_Prazak and several other developers regarding our package build process and what takes time, during which he mentioned that ARM builds take several hours to complete compared to minutes for x86 packages. I didn’t give it much thought at the time, until I read @Gwmngilfen’s Community Survey Analysis, which indicated that just about 3% of the community is actually using the ARM architecture. When we realized similar numbers were using Fedora about a year ago we stopped packaging for Fedora (Dropping Fedora 24 packages).

Given the amount of resources that goes into packaging for ARM, the delay it adds to our release pipelines, and the small number of users, I propose we change the way we handle ARM packages. There are, AFAICT, three ways we can go: drop ARM packages completely, thereby asking ARM users to use only source installations; extract ARM packages from the regular pipeline and build them separately after the main pipeline has finished (perhaps even only for stable releases and not nightlies); or keep it as is. If I’m missing some options, please feel free to add them in the comments; otherwise, please vote on your preferred path forward:

Drop ARM packages completely
Package ARM separately after main pipeline
Continue building for ARM as part of the regular pipeline

cc: @packaging and @infra teams

ekohl · June 25, 2018, 11:36am

I like the fact that we support multiple arches, but it shouldn’t hold up the main arches given the low percentage of users. Building it after the main pipeline should satisfy both, I think.

Gwmngilfen · June 25, 2018, 11:43am

This came up a few days ago, and inspired me to try and model this a little bit - since I’m fairly new to statistical work, I’d appreciate critique from any real data scientists we have in our community …

TL;DR - we’re not likely to affect more than ~600 people by changing this.

So the survey replies show this:

amd64   ARM   ARM share
139     4     4/143 => 2.8%

Not everyone filled out the question, so that’s why it’s not out of 164. Since we presented this as a binary choice, we can model this as an unfair coin, getting “heads” (or amd64) with p = 0.972 and “tails” with p = 0.028. This is a pretty extreme unfair coin, but for our rough purposes it’s OK (I think).

This model allows us to ask the question: What % of the population would be affected by dropping ARM builds, and how confident are we of that result?

I think we want to be fairly confident about changes here - the discussions around dropping Fedora back in the day remind me that people do feel strongly about their own choices. Add to that, we have community-donated ARM servers, so it’s nice to be able to put people’s hardware to use. So, I’ll pick a 97.5% confidence level, in which case:

> binom.test(c(139,4),conf.level=.975,alternative='greater')$conf.int
[1] 0.9299306 1.0000000
attr(,"conf.level")
[1] 0.975

The way to read this is: with a 2.5% chance of me being flat-out wrong, the true proportion of people using amd64 is somewhere between 0.9299 and 1.0. Taking the lower bound (93% if we round it) means at most about 7% of people are on ARM. Multiplying that 7% by the best estimate of the number of people in the community (I currently roughly estimate this to be 6-10k) gives us a maximum affected population of between 420 and 700 people.
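For anyone who wants to check the bound without R, here is a minimal Python sketch that uses only the standard library. It assumes the one-sided interval from binom.test is the exact (Clopper-Pearson) one, and finds the lower bound by bisection on the binomial tail probability. The function names are made up for illustration and are not part of any library.

```python
from math import comb


def upper_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


def exact_lower_bound(successes, trials, conf_level):
    """One-sided exact lower bound on the success probability.

    Finds the p at which seeing `successes` or more out of `trials`
    has probability 1 - conf_level. The tail probability increases
    with p, so plain bisection is enough.
    """
    alpha = 1 - conf_level
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if upper_tail(successes, trials, mid) < alpha:
            lo = mid  # tail too unlikely: the bound is higher
        else:
            hi = mid
    return (lo + hi) / 2


# Survey counts from above: 139 amd64 answers out of 143.
lower = exact_lower_bound(139, 143, 0.975)
print(f"amd64 lower bound: {lower:.4f}")
```

If that assumption holds, the printed bound should agree with the 0.9299 reported by R, up to rounding.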

I’m reasonably confident of the 93% figure; I think my maths is right. The absolute-affected value of 420-700 people is considerably more vague, given how I calculated it.
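To make the arithmetic behind the 420-700 range explicit, here is a small Python sketch. It takes the rounded 93% lower bound and the rough 6-10k community estimate straight from this post; the function name is hypothetical.

```python
def affected_range(amd64_lower_bound, community_low, community_high):
    """Worst-case number of ARM users for a range of community sizes.

    The ARM share is at most 1 minus the lower bound on the amd64 share.
    """
    arm_upper = 1 - amd64_lower_bound
    return round(arm_upper * community_low), round(arm_upper * community_high)


# 93% lower bound, community estimated at 6,000 to 10,000 people.
print(affected_range(0.93, 6000, 10000))  # prints (420, 700)
```

The result scales linearly with the community estimate, which is why the absolute figure is only as good as the 6-10k guess.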

dLobatog · June 25, 2018, 12:03pm

Just a comment to distinguish this from when we dropped Fedora. Fedora required us to build a lot of packages outside of the SCL, which meant maintaining RPM specs which were more complex, and any contributor had to be aware of this. We also had to check versions of our packages against Fedora repos and try to use theirs whenever possible. It added significant complexity.

ARM builds are not inherently more complex or anything than amd64. It’s just the Jenkins slaves we use are slower, and since we require our tests and packages to work everywhere, they delay the whole release pipeline. I agree they should not be a blocker. If possible, @packaging could we have an ‘arch’ parameter to the release pipeline? Having 2 release pipelines would mean the majority of our users would get new versions much faster.

I also support dropping it from nightly builds - in my experience ARM builds have not been more problematic than amd64, I cannot recall any problem that exclusively affected ARM, so anything we should fix, we would discover it on amd64.

ehelms · June 25, 2018, 1:51pm

Is there any breakdown of the data to know if users are running the server or smart proxies on ARM? There’s been talk here and there of breaking the smart proxy out into its own repository to support more OSes, since it’s also simpler, and of reducing the server support matrix.

Gwmngilfen · June 25, 2018, 1:56pm

The question was specifically “What hardware do you run Foreman on?”. While this doesn’t preclude people answering it for the proxy, I’d assume it’s more applicable to the Foreman server.

I would expect the proxy percentage to be higher, given its lighter footprint. How much higher is hard to say.

lzap · June 25, 2018, 2:59pm

I would like to know more about the context. Who is actually doing ARM builds, and why? Why would anyone want to run Foreman on ARM? I mean, managing an ARM fleet via Foreman running on Intel makes a lot of sense, but compared to the effort of doing all the packaging, I wonder why this was started. I vaguely remember that we were donated some ARM resources.

As a huge fan of the ARM platform, I think that ARM builds must not delay us on our primary architecture. I appreciate the effort put into ARM builds and would love to see this happening in the future. I am still looking forward to the day I can buy decent ARM servers at a reasonable price. But it’s not coming; maybe 2019 will be the year of ARM servers.

ekohl · June 25, 2018, 3:09pm

Running Foreman (Proxy) on an RPi does make sense, though core Foreman might have become too big because of its memory requirements.

Gwmngilfen · June 25, 2018, 3:18pm

I’m currently running Foreman, a proxy, and a Saltmaster on a 64-bit ARM ROCK64, which has 4GB of RAM and an actual GigE NIC. Works great.

Anurag · June 26, 2018, 6:51am

lzap: “I am still looking forward to the day I can buy decent ARM servers at a reasonable price. But it’s not coming; maybe 2019 will be the year of ARM servers.”

I agree with the suggestion that we should retain ARM builds (run the pipeline separately, at the end). As large data center operators share their reference archs with the public (as Cloudflare did), interest in ARM infrastructure will only grow.

lzap · June 26, 2018, 8:34am

Absolutely. However, you can still run Foreman on Intel while managing thousands of ARM servers.

But let’s keep building ARM, absolutely. We need to figure out how to do this faster.

tbrisker · July 4, 2018, 12:56pm

Looks like the main slowness was on arm32, which we are no longer building for nightlies and newer versions. arm64 seems to be not much slower than amd64, so it might be fine to leave it for now, as extracting it from the pipeline may take a lot of work. I won’t pursue this further for now, unless the people making releases say it has a negative impact on the time it takes to get a release ready.