Koji disk space cleanup

As we need to clean up some disk space on Koji, I am proposing the following changes:

  1. Remove all locally synced Fedora repositories, and remove Fedora 27 and 28 from the external repos list in Koji. Convert Fedora 29 (only built for nightly) to point at an external URL.
  2. When Katello 4.0 is released, drop all locally synced EL5 content.
  3. Remove all locally synced EPEL 6 content and update the external repos to point at the EPEL archive index (/pub/archive/epel).
  4. Convert the Puppetlabs external repositories to point at the official Puppetlabs repositories and delete the local content.
  5. Remove the foreman-rails* external repositories.
  6. Remove the tfm-ror51* external repository.
  7. Remove rhel7-x86_64, as we use centos7.
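
For reference, the koji CLI side of several of these items would look roughly like this. This is only a sketch; the repo names and the archive URL are illustrative, not necessarily the exact ones configured on our hub:

```bash
# Item 1: point the Fedora 29 external repo at the upstream archive
# instead of our locally synced copy ($arch is expanded by Koji per
# build architecture).
koji edit-external-repo --url 'https://archives.fedoraproject.org/pub/archive/fedora/linux/releases/29/Everything/$arch/os/' fedora29

# Items 1, 5, 6, 7: drop external repos we no longer build against
# (names here are examples).
koji remove-external-repo fedora27
koji remove-external-repo fedora28
koji remove-external-repo foreman-rails
koji remove-external-repo tfm-ror51
koji remove-external-repo rhel7-x86_64
```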

There is likely more conversion we can do, but this is my proposed starting list. @evgeni @pcreech please review

Can we make 2 “When Katello 4.0 is released, drop all locally synced EL5 content”?
We need EL5 only for client bits, and Katello is a heavy user of those, so I’d prefer not to block Katello 3.17 fixes while it is a supported target. (It’s quite possible this will never be needed, but if we do wait, we should wait until the right moment.)

Would replacing locally synced content with external repos mean we are saving disk space at the cost of slower builds and higher bandwidth fees (IIRC Koji is on AWS infra)? Do we have any data on how much we use these repos and how much the switch would cost us? In other words, could buying bigger storage be cheaper than the increased network costs, at least for some of the more commonly used repos?

I believe that since the connection is initiated from inside AWS reaching out, there is no network cost to pay, which is part of why @pcreech wanted to move this route. We have been using this strategy for all EL8 builds and have not noticed any issues (knock on wood). For at least the Fedora and CentOS repositories, their Koji instances take the same strategy (that is where we got the idea), so in theory, if we are pointing to the same repositories they are pointing to, we can assume the same level of stability.

We also use mrepo today, which does the job but is a bit old. If we need a cached option, perhaps we could look at a tool that manages and syncs repositories, fetching and caching content on demand.


We did see one issue with external repos: when the repo changes but Koji hasn’t regenerated the local metadata yet, it might end up trying to pull a package from the external repo that isn’t there anymore (packages are replaced in CentOS, not added alongside). But that can easily be fixed by manually regenerating the repo (and probably by telling Koji somewhere to do that itself more often), so it’s not a big deal as such.
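
For reference, the manual fix is a one-liner (the tag name here is just an example):

```bash
# Regenerate the repo for a build tag so its metadata matches the
# current contents of the external repos it pulls in.
koji regen-repo el7-build
```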

I have made an initial pass tackling #1 (minus the switch to external repositories), updating the list of external repositories: Cleanup removed external Koji repositories by ehelms · Pull Request #1581 · theforeman/foreman-infra

We are now down to 701GB used on /mnt/koji. The external repos, /mnt/koji/external-repos/, are now down to 220GB total.

Poking around, we additionally have the following taking up space:

  • 16GB dating back to 2017 – /mnt/koji/backups/postgres
  • 32GB from duplicity – /mnt/koji/backups/ephemeral
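
If we decide to prune those, a possible sketch (paths taken from the listing above; the retention period is an assumption, and the postgres dumps should obviously be copied off-box first):

```bash
# Expire duplicity backup sets older than a year and actually remove them
# (--force is required for the deletion to happen).
duplicity remove-older-than 1Y --force file:///mnt/koji/backups/ephemeral

# After archiving the 2017-era postgres dumps elsewhere, reclaim the space.
rm -rf /mnt/koji/backups/postgres
```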

Looks like there’s a kojira.conf setting we can enable for this:

https://docs.pagure.org/koji/external_repo_server_bootstrap/#regenerating-your-repo

koji doesn’t monitor external repositories for changes by default. Administrators can enable such behaviour by setting check_external_repos = true in kojira.conf (for details see Koji Utilities).
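
That would be a one-line change in kojira’s config, assuming the default location (/etc/kojira/kojira.conf):

```ini
[kojira]
# Have kojira watch external repos and trigger repo regeneration
# when their metadata changes (off by default).
check_external_repos = true
```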

As Eric mentioned, all inbound data transfer into AWS is free. Completely, totally free. It’s traffic that leaves the AWS region that you pay for.

| Data Transfer IN To Amazon EC2 From Internet | Pricing |
| --- | --- |
| All data transfer in | $0.00 per GB |


Will that regenerate all repos/tags that use the external repo? I think all our old tags (like 1.x etc.) also refer to the now-external repos, and regenerating those would be a waste of resources.

Let me raise the broader question: should we prune old tags off of Koji up to a release point?

In the most technical sense, yes. In the practical sense, I’m curious whether we’ll notice or care, especially compared with the time lost to not having an up-to-date repo when changes happen. And if it becomes a noticeable issue, we could always raise the parallel task settings on koji-fedora28-builder or spin up a new builder for createrepo tasks (we need an el8 host for createrepo).

newRepo tasks are in general such a minor part of the load we put on koji.

“load” yes, but they take a slot, and for some reason only run on the fedora builder, which clogs the pipe every now and then.

This is because we can’t run el8 newRepo tasks on the main Koji host, and you can’t (as far as I can tell) route el8 vs. el7 newRepo tasks to different builders.

If we’ve seen it clog before, then it’s probably worth evaluating whether we can increase the number of slots on that builder anyway. It can probably handle more than we have on it right now. And with our increasing need for el8 builds in general, it’s probably past due for us to beef up our capacity anyway.

As a test, I’ve updated the capacity on the fedora28 builder from 2 to 8. Looking at the CPU history over the past week, we’ve had only minor spikes in CPU usage, so my assumption is that we have more headroom. Since the Koji server has 4 cores and a capacity of 16, I chose 8 because koji-fedora28-builder has 2 cores.
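
For the record, that capacity change is a single hub-side command (host name as it appears in our Koji):

```bash
# Allow up to 8 units of concurrent task weight on the builder (was 2).
koji edit-host --capacity 8 koji-fedora28-builder
```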

AFAIK AWS also has CentOS mirrors:
https://blog.centos.org/2020/02/speeding-up-yum-for-centos-ec2-instances/

If we use those, it would effectively be the same as the Koji builders pulling from our main Koji server, right?

If I remember correctly, these are hooks into the CentOS mirrorlist function. The actual server hostname it hands out could change, and if it did we would have to update everything to a different URL, since Koji does not have the ability to consume mirrorlists. I was not given a guarantee that this would be even remotely stable.
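
To illustrate: a mirrorlist endpoint (this is the one that existed for CentOS 7) returns a list of concrete mirror URLs that can differ per request, which is exactly what Koji’s static external-repo URLs can’t consume:

```bash
# Each request may return different (region-local) mirrors, so there
# is no single stable baseurl we could configure as an external repo.
curl 'http://mirrorlist.centos.org/?release=7&arch=x86_64&repo=os'
```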

Aiming for step #4 here, a PR to clean up our use of Puppet repositories: External puppet repos by ehelms · Pull Request #1610 · theforeman/foreman-infra

Once that is in, I will clean up any artifacts on disk.

I have completed cleaning up local Puppet and EL5 repositories. This has given back ~25GB of space.

Fedora 29 cleanup has been checked off. That freed up nearly 100GB.

All tasks have been completed. The only locally synced content left on Koji is the RHEL 6.6 repositories currently used for building EL6 client bits, and we are down to 550GB of used space.