Our /var partition has been growing very rapidly. This is a brand-new install of Katello 4.13.1-1 and Foreman 3.11.2-1 running on Rocky 9. We only have two products, Rocky 8 and Rocky 9, and although we use the “Immediate” download policy rather than “On Demand”, this partition is now at:
root@katello01 /etc/cron.d # df -hP /var
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/vg_sys-lv_var 4.5T 1.7T 2.9T 37% /var
1.7 TB is a lot!
Further, it has been growing rapidly: when I checked about a week ago it was sitting at 27% utilization.
None of our mirroring policies are “Additive”.
We mirror both x86_64 and aarch64 repos from the upstream mirrors.
Expected outcome:
That the /var partition doesn’t grow this fast, and that package deduplication is working. We do have multiple Orgs on this server, each mirroring Rocky 8 and Rocky 9, so hopefully there is a way to verify that de-dupe is functioning correctly.
Foreman and Proxy versions:
foreman-3.11.2-1.el9.noarch
Foreman and Proxy plugin versions:
Distribution and version:
Rocky 9.4 (Blue Onyx)
Other relevant data:
We also have a script that purges old versions of our CVs (content views), so I am not sure what is eating up all this space. Is there a way to tell what is?
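My plan, unless there is a better approach, is to start with a du sweep of the biggest directories under /var (paths assumed from a default Katello layout), something like:
# du -xh --max-depth=2 /var | sort -rh | head -20
# du -xsh /var/lib/pulp /var/lib/pgsql /var/log
And if it turns out to be orphaned Pulp content, my understanding is there is a cleanup task for that (I have not run it yet):
# foreman-rake katello:delete_orphaned_content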
Thank you for any insights or verification methods you may be able to suggest.
Besides trying to figure out what in blazes is causing the Postgres DB to log 472 GB of data, does anyone know if I can safely just remove an old log file, like the one from Thursday, or does Postgres need to be shut down to release the file handle?
Only the current weekday log should be open and in use. You can check with lsof or similar tools:
# lsof /var/lib/pgsql/data/log/postgresql-*.log
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
postmaste 1179 postgres 6w REG 253,0 21304 205529411 /var/lib/pgsql/data/log/postgresql-Sat.log
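That means the other daily log files can be removed while PostgreSQL keeps running. Assuming the default log directory and the postgresql-<Day>.log naming shown above, something like this would drop everything except the current weekday’s file:
# find /var/lib/pgsql/data/log -name 'postgresql-*.log' ! -name "postgresql-$(date +%a).log" -delete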
Without knowing what is being logged that excessively, it’s hard to tell. Check the log files and look for the lines with the most repetitions. It could be a problem with the database schema, long-running queries, locks, etc. The logs are usually quite self-explanatory.
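A rough way to find the most repeated statements, assuming the default stderr log format shown above, is to pull out just the statement type (and the table name, where one follows) and count them:
# grep -hoE 'statement: [A-Z]+( "[^"]+")?' /var/lib/pgsql/data/log/postgresql-*.log | sort | uniq -c | sort -rn | head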
I went ahead and removed the older daily log files (Thursday’s, etc.) and it cleared the space fine. You are correct that I didn’t need to have the application shut down to do this, as Postgres only had the current day’s log open.
I did end up shutting everything down and then starting it back up after clearing space to see if it would fix the excessive logging, but it didn’t.
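(For reference, the full stop/start was just the usual foreman-maintain sequence, roughly:
# foreman-maintain service stop
# foreman-maintain service start
)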
Next, I checked for “paused tasks” with: foreman-maintain health check
and it reported none.
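I also plan to double-check the task list from the CLI, something along the lines of this (assuming the foreman-tasks hammer plugin is available):
# hammer task list --search 'state = paused'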
Then I went into the GUI and looked at the long-running tasks, saw one from about a month ago (a “Discover Hosts” task, I believe), and killed it.
2024-11-16 07:12:35 UTC LOG: duration: 1021.169 ms statement: UPDATE “dynflow_actions” SET “output” = '\x83a97265706f5f75726c7390a7637261776c6564dd0004f4aed928687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616dd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd920687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd920687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd920687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd929687474703a2f2f6d6972726f722e63656e746f732e6f72672f63656e746f732f382d73747265616d2fd920687474703a2f2f6d6972726f72
Any ideas on that? When I was trying to troubleshoot last week by tailing the log file, it was impossible to see anything, as it was a constant stream of text looking like that.
At the end of the statement there should be the execution plan UUID and ID identifying the task. But judging by the effect you are seeing, it’s most likely the task you killed. Can you see the output of that task in the task view?
As the update took more than 1 second, it was logged. If that happens a lot, you’ll get a lot of log output.
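The one-second threshold comes from log_min_duration_statement in postgresql.conf; you can check the current value with, for example:
# sudo -u postgres psql -c 'SHOW log_min_duration_statement;'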
Hi @ekohl, just out of curiosity, is vacuuming something that should be run periodically by the foreman/katello user? Or is it performed automatically by some daemon/cron task?
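(I assume the quickest check for whether the autovacuum daemon is enabled would be something like:
# sudo -u postgres psql -c 'SHOW autovacuum;'
but I am not sure whether Katello tunes it beyond the defaults.)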
To answer your question about disks: yes, this /var partition is on SSD drives. We have noticed we get far better performance from Foreman/Katello when they are on faster disks than SATA.
I will see if I can look in the remaining log files for any more clues. Is that weird string of characters encoded somehow? From the way it appears, I don’t see any mention of trying to sync CentOS 8 Stream.
Then, to echo what @JendaVodka asked: is this vacuuming of the DB something we should be running manually?
I don’t see any entry for it in /etc/cron.d/katello or /etc/cron.d/foreman
It’s a hex-encoded string; it starts with \x to indicate that, and after that it’s all hex digits. The string itself is probably UTF-8 or similarly encoded, but this, for instance: 72 65 70 6f 5f 75 72 6c is repo_url.
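If you want to decode a chunk yourself, strip the \x prefix and feed the hex digits through xxd (if it is installed), e.g.:
# echo '7265706f5f75726c' | xxd -r -p; echo
repo_url
The unreadable bytes in between are just serialization framing; decoding further, the blob in your log excerpt is mostly repeated http://mirror.centos.org/centos/8-stream/ URLs, so it does look like that task was crawling CentOS 8 Stream mirror URLs.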