This started after I accidentally left a new Foreman/Katello VM running for a couple of weeks. Due to a lack of disk space on the hardware running this VM, the virtual disk file was temporarily hosted on an NFS share. During the two weeks I left it running, I had to reboot the server with the NFS share and forgot that I’d left the VM running. I believe this is what caused the problem: postgres INSERT queries hanging and consuming all the cores and most of the RAM. I’ve rebooted the server several times, and it appears to work properly for a few hours before these processes bring the server to a crawl. One process at a time consumes a core until there is one process per core and nearly all the RAM is used.
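For anyone wanting to see what the stuck backends are actually running, something like this works (a minimal sketch; it assumes the default foreman database name and a local postgres superuser):

```bash
# List every non-idle postgres backend with how long its query has been running.
# Assumes the default "foreman" database; adjust -d if yours differs.
sudo -u postgres psql -d foreman -c \
  "SELECT pid, state, now() - query_start AS runtime, left(query, 80) AS query
   FROM pg_stat_activity
   WHERE state <> 'idle'
   ORDER BY query_start;"
```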
I can bring the CPU usage back down to normal by stopping the foreman-proxy service, restarting the postgresql service, and then starting foreman-proxy again, but the runaway processes that never terminate come back one at a time until the services need to be restarted again.
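For reference, the workaround sequence amounts to this (unit names assume a stock systemd-based install; adjust if yours differ):

```bash
# Temporary workaround: bounce the services to kill the stuck backends.
systemctl stop foreman-proxy
systemctl restart postgresql
systemctl start foreman-proxy
```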
I’d like to fix whatever has gone wrong with the database to stop having to restart these services.
I am curious: how on earth would 1k rows in a session table kill a VM that dramatically? We are talking about one thousand records. Was there some huge row or something? Maybe it was some kind of session storage attack.
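If you still have the data around, something like this would show whether any single row was abnormally large (assuming Foreman’s standard sessions table, which stores the payload in a text data column):

```bash
# Show the ten largest session rows by payload size.
# Assumes the default "foreman" database and the standard "sessions" table.
sudo -u postgres psql -d foreman -c \
  "SELECT id, octet_length(data) AS bytes, updated_at
   FROM sessions
   ORDER BY octet_length(data) DESC
   LIMIT 10;"
```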
I don’t think it was the 1k rows themselves. I’m up to 500+ rows now and there’s no problem. I think sessions weren’t being cleaned up (or added correctly) due to some bad juju from the NFS share dropping. I had rows in that table from as far back as 7/6, which was the day the NFS share went down, before clearing that table and vacuuming. Now there is nothing older than a day.
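For completeness, the cleanup amounted to something like this (assuming the default foreman database and the standard sessions table):

```bash
# Clear out the stale sessions, then vacuum to reclaim the space.
sudo -u postgres psql -d foreman -c "DELETE FROM sessions;"
sudo -u postgres psql -d foreman -c "VACUUM ANALYZE sessions;"
```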