Problem:
May 27 03:06:08 foremanprd kernel: Out of memory: Killed process 1994344 (pulpcore-api) total-vm:2199176kB, anon-rss:2144756kB, file-rss:384kB, shmem-rss:0kB, UID:977 pgtables:4340kB oom_score_adj:0
Jun 3 03:06:34 foremanprd kernel: Out of memory: Killed process 1613 (java) total-vm:10333940kB, anon-rss:2184648kB, file-rss:0kB, shmem-rss:0kB, UID:53 pgtables:5144kB oom_score_adj:0
Jun 4 03:50:15 foremanprd kernel: Out of memory: Killed process 3448896 (pulpcore-api) total-vm:7381228kB, anon-rss:7257028kB, file-rss:384kB, shmem-rss:0kB, UID:977 pgtables:14480kB oom_score_adj:0
Jun 5 03:45:28 foremanprd kernel: Out of memory: Killed process 3603741 (pulpcore-api) total-vm:5213120kB, anon-rss:5141352kB, file-rss:256kB, shmem-rss:0kB, UID:977 pgtables:10240kB oom_score_adj:0
We have suffered a few OOM events lately. Is there a memory management tuning recommendation somewhere that could help us better manage memory usage, or do we have something configured wrongly?
Expected outcome:
no OOM events
Foreman and Proxy versions:
foreman-3.13.0-1.el9.noarch
Foreman and Proxy plugin versions:
katello-4.15.0-1.el9.noarch
Distribution and version:
Other relevant data:
As a temporary mitigation, we have added swap to the system; previously the server was running without it…
[root@foremanprd ~]# free -h
              total        used        free      shared  buff/cache   available
Mem:           31Gi        18Gi       577Mi       612Mi        12Gi        12Gi
Swap:          15Gi       9.8Gi       6.2Gi
If you need more data, please let me know; I will supply all I can.
Hello ikinia, sadly I was OOO and have limited information, only those messages from the log and a few words from colleagues. The web service was restarted after the swap was added.
That’s a good guide as a starter; you can do some basic profiling of the box, but the real work is finding out where the memory usage is: is it in Pulp and content, in the Puppet master, in the database, etc.?
There are some built-in metrics you could pull, or you can use system tools like sar, etc.
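For example, a minimal sketch of the kind of spot checks meant here (nothing Foreman-specific; sar needs the sysstat package installed and collecting data):

ps -eo pid,user,rss,comm --sort=-rss | head -n 15   # top resident-memory consumers right now
systemd-cgtop -m                                    # live per-cgroup (per-service) memory view
sar -r                                              # memory utilisation history gathered by sysstat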
The extra load of a daily sync could be what pushes it over the edge, rather than being the underlying problem itself.
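As a quick sanity check on that theory: all four OOM kills in your log land between roughly 03:00 and 03:50, so it is worth lining the kill timestamps up against when your nightly syncs run. A sketch, assuming hammer is configured on the box; the organization name is a placeholder, substitute your own:

grep 'Out of memory' /var/log/messages*                        # every OOM kill the kernel logged
hammer sync-plan list --organization "Default Organization"    # scheduled sync windows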
I’m being a bit general as I don’t know your specific setup, but you need to look at what is using your memory across the core functions of your Katello node(s) (there’s a rough per-account sketch after this list):
serving the interface - normally low
the proxy/capsule function - normally low, but impacted by usage
the Puppet master - linked to proxy load too; the number of threads, the size of those threads, and the duration for which they keep resources locked open
content - serving content via the Pulp processes; demand and content size impact it
the Postgres database - standard patterns of database load and its memory impact
other smaller integration components
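As a rough way of attributing memory to the items above, here is a sketch that sums resident memory per service account; on a typical Katello box those accounts are foreman, pulp, postgres and puppet, but the names can differ on your install, and summing RSS double-counts shared pages, so treat the numbers as a guide only:

ps -eo user:20,rss --no-headers \
  | awk '{ sum[$1] += $2 } END { for (u in sum) printf "%-20s %8.1f MiB\n", u, sum[u]/1024 }' \
  | sort -k2 -rn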
While any of these can trigger the OOM you’re seeing, you need to understand how much of your resources each one is using and why. You may find (example only) that your Puppet server is using far more than its share and is being held open because it’s too busy, which keeps hold of RAM; then, when a task such as a content sync happens (even one that is correctly sized), that sync is what pushes the box over the resource limit.
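If the profiling does point at one component, the usual knobs are exposed through foreman-installer rather than by editing service files directly. The exact option names vary between versions, so treat the following only as a pointer and confirm everything against --full-help on your own install:

foreman-installer --full-help | grep -i -e heap -e tuning   # list the JVM heap and tuning options your version actually exposes
grep JAVA_ARGS /etc/sysconfig/puppetserver                  # current puppetserver heap settings (-Xms/-Xmx) on EL-family installs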