Ohai,
in the past, we had several reports that Foreman (especially in Katello-enabled setups) uses too much memory, leading to OOM kills (see e.g. Katello 4.5/Foreman 3.3 memory leak (in gunicorn)). While it is clear that we should aim to fix all possible memory leaks, investigate possible solutions (and workarounds) for them, and monitor existing setups, we also realized that our default deployments are not optimal, as they effectively oversubscribe the available system resources.
A short example: we assume that each Puma/Rails/Foreman worker process consumes about 1GB of memory, and deploy as many “workers” as fit into system memory (minus a small “reservation”). [Yes, if you look closely at puppet-foreman, you’ll notice it also considers the CPU count, but let’s ignore that for a moment.] Turned around, this means that whatever “reservation” we put into that formula (today it’s 1.5GB) is all the memory that all the other processes are allowed to use, as there will be a Puma worker for every other available gigabyte of system memory.
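To make the napkin math easier to follow, here is the sizing logic described above as a small Python sketch (the function name and the exact factors are taken from the description in this post, not from the actual puppet-foreman code):

```python
import math

def default_puma_workers(cpus: int, ram_gb: float, reservation_gb: float = 1.5) -> int:
    # One worker per ~1GB of RAM beyond the reservation,
    # capped at 1.5 workers per CPU (the CPU influence mentioned above).
    return math.floor(min(cpus * 1.5, ram_gb - reservation_gb))

print(default_puma_workers(16, 32))  # 16C/32G machine -> 24 workers
```

On the 16CPU/32GB machine used as an example below, the CPU term (24) is the smaller one, so we end up with 24 workers.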
In reality, each Puma worker uses less memory and shares quite a bit of it with the main process (as we use app preloading, so the application code exists only once in memory), but it’s still somewhere in the 400-600MB ballpark. On my 16CPU/32GB machine with 24 Puma workers (the 24 is due to the CPU count influencing the formula), that means something like 14GB of “real” usage (and 24GB “allowed”).
If Puma were the only thing running on that machine, that would be perfectly fine.
But we also have to run PostgreSQL, Tomcat, Pulp and a few others here.
Tomcat is happy with 1-2GB (napkin: 32GB system - 24GB puma allowed - 2GB tomcat = 6GB “free”).
PostgreSQL, if not tuned too much, is also happy with 2-4GB (napkin: 6GB “free” above - 4GB postgresql = 2GB “free”).
Pulp consists of 3 distinct parts: content app (very light, just serves bytes to users, can easily fit in 1G), api (that’s the part leaking in the link above) and workers. Both API and workers can consume a lot of memory, but we have (on paper) only 2GB left. And this is exactly where the Linux kernel starts not liking us.
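Putting the napkin math from above into one place (all numbers are the rough estimates from this post, nothing measured):

```python
# Memory budget on the example 32GB box with today's default sizing.
total_gb = 32
puma_allowed_gb = 24   # 24 workers, ~1GB "allowed" each
tomcat_gb = 2          # upper end of the 1-2GB estimate
postgresql_gb = 4      # upper end of the 2-4GB estimate

left_for_pulp_gb = total_gb - puma_allowed_gb - tomcat_gb - postgresql_gb
print(left_for_pulp_gb)  # -> 2 (for Pulp API + workers, plus ~1GB content app)
```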
But Evgeni, why are you writing all that text for a problem we’re already aware about?
Well, I want to fix it. Or at least tame it a bit!
On the one hand, we’ve been trying to make the memory leak in Pulp API less severe (see Api server memory leak? - Support - Pulp Community for some details and progress).
But even if the memory leak didn’t exist, today we deploy most of the services (especially Puma, the Pulp API and the Pulp workers, as those are the most resource-hungry ones) in a configuration that is suited for “one service per VM” deployments, but not for “multiple services per VM”.
The idea is to reduce the “visible system resources” when the individual parts of the stack perform their default sizing calculation. So, given the 16C/32G machine above, when configuring Puma, we’d pretend it’s supposed to configure for 8C/16G (or something similar) instead, thus leaving more room for the other services. Users will still be able to override those decisions (as they are today) with installer options, thus explicitly configuring the number of workers each individual service should run, but the defaults would become more sustainable.
Now, the idea might sound simple, but the implementation most probably won’t be.
Today, most services are configured by individual Puppet modules that know nothing about each other (as they could ultimately even run on different systems). Those modules are tied together by our installer, and this is also where I think I’d try to inject the logic that finds out which services are present (that’s not a static list: some people run PostgreSQL on a dedicated host, some run Foreman without Katello/Pulp/Candlepin, etc.) and which slice of the system to present to each of them.
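To make the installer-side idea more concrete, here is a very rough sketch of such a slicing step. The service names and relative weights are entirely made up for illustration; the real split would need measurement and discussion:

```python
# Hypothetical: the installer knows which services run on this host and
# hands each one a slice of the real system resources.
def slice_resources(cpus, ram_gb, services):
    # Invented relative weights per service, NOT actual installer logic.
    weights = {'foreman': 4, 'pulpcore_api': 2, 'pulpcore_workers': 2,
               'pulpcore_content': 1, 'postgresql': 2, 'tomcat': 1}
    present = {s: weights[s] for s in services}
    total = sum(present.values())
    return {s: {'cpus': max(1, cpus * w // total),
                'ram_gb': max(1, ram_gb * w // total)}
            for s, w in present.items()}

# A host running only Foreman, PostgreSQL and Tomcat leaves
# a bigger slice for Foreman than a full Katello box would.
allocation = slice_resources(16, 32, ['foreman', 'postgresql', 'tomcat'])
```

Each module would then receive only its slice as the “allowed” CPU/RAM values and do its usual sizing math against those.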
Inside the individual modules, it could then look something like this (based on puppet-foreman, with the `*_allowed_*` parameters being calculated by the installer):
```puppet
$_puma_cpu = pick($foreman::foreman_service_allowed_cpus, $facts['processors']['count'])
$_puma_ram = pick($foreman::foreman_service_allowed_ram, $facts['memory']['system']['total_bytes']/(1024 * 1024 * 1024))

$puma_workers = pick(
  $foreman::foreman_service_puma_workers,
  floor(
    min(
      $_puma_cpu * 1.5,
      $_puma_ram - 1.5,
    )
  )
)
```
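Plugging hypothetical halved values into that formula shows the effect (Python used here just for the arithmetic, same min/floor logic as the Puppet snippet):

```python
import math

def puma_workers(allowed_cpus, allowed_ram_gb):
    # Same sizing logic as the Puppet snippet above.
    return math.floor(min(allowed_cpus * 1.5, allowed_ram_gb - 1.5))

print(puma_workers(16, 32))  # today's view of the full machine -> 24 workers
print(puma_workers(8, 16))   # "pretend it's 8C/16G" -> 12 workers
```

That alone would hand roughly 12GB of “allowed” Puma memory back to PostgreSQL, Tomcat and Pulp on the example machine.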
But Evgeni, why are you writing that second wall of text, just go implement this!
I hoped to gather some feedback on whether y’all think this is a valuable idea, or whether you have other ideas for how we could provide better dynamic defaults in our deployments.