Foreman-proxy: open FD of deleted log files - hdd out of space

Problem:
foreman-proxy keeps file descriptors of delete log files in /var/log/foreman-proxy/ . It consumes ~ 1GB per hour until disk is full. Foreman-proxy process probably creates file proxy.log.1, then gzips the file and deletes the original file BUT FD keeps open and consumes disk space

This behavior occurs randomly
Foreman and Proxy versions:
1.20.2

Other relevant data:
logs

sudo lsof +L1
ruby      62733 foreman-proxy    9u   REG  252,0 2593310065     0 1313894 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   15u   REG  252,0 2365372407     0 1318313 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   20u   REG  252,0 2340602901     0 1318154 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   27u   REG  252,0 2390022153     0 1317781 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   32u   REG  252,0 2418658852     0 1314081 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   33u   REG  252,0 2267874282     0 1318979 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   39u   REG  252,0  730826554     0 1319069 /var/log/foreman-proxy/proxy.log.1 (deleted)


/proc/62733/fd# ls -l | grep deleted
lrwx------ 1 foreman-proxy foreman-proxy 64 Jul  4 13:19 15 -> /var/log/foreman-proxy/proxy.log.1 (deleted)
lrwx------ 1 foreman-proxy foreman-proxy 64 Jul  4 13:19 20 -> /var/log/foreman-proxy/proxy.log.1 (deleted)
lrwx------ 1 foreman-proxy foreman-proxy 64 Jul  4 13:19 27 -> /var/log/foreman-proxy/proxy.log.1 (deleted)
lrwx------ 1 foreman-proxy foreman-proxy 64 Jul  4 13:19 32 -> /var/log/foreman-proxy/proxy.log.1 (deleted)
lrwx------ 1 foreman-proxy foreman-proxy 64 Jul  4 13:19 33 -> /var/log/foreman-proxy/proxy.log.1 (deleted)
lrwx------ 1 foreman-proxy foreman-proxy 64 Jul  4 13:19 39 -> /var/log/foreman-proxy/proxy.log.1 (deleted)
lrwx------ 1 foreman-proxy foreman-proxy 64 Jul  4 13:19 9 -> /var/log/foreman-proxy/proxy.log.1 (deleted)

Just curious, what os is this on?

foreman# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 16.04.6 LTS
Release:        16.04
Codename:       xenial

One more information:

size of all the deleted files changes (grows)

Every 2.0s: lsof +L1 | grep foreman-proxy/proxy.log.                                                                                                                          Thu Jul  4 14:19:36 2019

ruby      62733 foreman-proxy    9u   REG  252,0 74002390     0 1313894 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   15u   REG  252,0 75641376     0 1318313 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   20u   REG  252,0 74510136     0 1318154 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   27u   REG  252,0 74369050     0 1317781 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   32u   REG  252,0 74306006     0 1314081 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   33u   REG  252,0 74142990     0 1318979 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   39u   REG  252,0 74002745     0 1319069 /var/log/foreman-proxy/proxy.log.1 (deleted)
Every 2.0s: lsof +L1 | grep foreman-proxy/proxy.log.                                                                                                                          Thu Jul  4 14:22:30 2019

ruby      62733 foreman-proxy    9u   REG  252,0 78634660     0 1313894 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   15u   REG  252,0 80273646     0 1318313 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   20u   REG  252,0 79142406     0 1318154 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   27u   REG  252,0 79001320     0 1317781 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   32u   REG  252,0 78938409     0 1314081 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   33u   REG  252,0 78775393     0 1318979 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62733 foreman-proxy   39u   REG  252,0 78635148     0 1319069 /var/log/foreman-proxy/proxy.log.1 (deleted)

I don’t have a system at hand, but what’s in the logrotate config? Normally it should run systemctl kill --signal=SIGUSR1 --kill-who=main foreman-proxy to notify foreman-proxy to open the new file. Can you verify that actually opens new files? If it doesn’t, then we know the bug is in the proxy. Otherwise it’s in the delivery.

Here’s the logrotate config:

/var/log/foreman-proxy/migrate_settings.log
/var/log/foreman-proxy/proxy.log
{
  missingok
  notifempty
  create 0644 foreman-proxy foreman-proxy
  sharedscripts
  rotate 5
  compress
        daily
  postrotate
    if [ -d /run/systemd/system ] ; then
      /bin/systemctl kill --signal=SIGUSR1 --kill-who=main foreman-proxy >/dev/null 2>&1 || true
    elif [ -e /etc/init.d/foreman-proxy ] ; then
      /etc/init.d/foreman-proxy logrotate >/dev/null 2>&1 || true
    fi
  endscript
}

Here’s output of forced logrotate

foreman# logrotate -vf foreman-proxy
reading config file foreman-proxy

Handling 1 logs

rotating pattern: /var/log/foreman-proxy/migrate_settings.log
/var/log/foreman-proxy/proxy.log
 forced from command line (5 rotations)
empty log files are not rotated, old logs are removed
considering log /var/log/foreman-proxy/migrate_settings.log
  log does not need rotating
considering log /var/log/foreman-proxy/proxy.log
  log needs rotating
rotating log /var/log/foreman-proxy/proxy.log, log->rotateCount is 5
dateext suffix '-20190704'
glob pattern '-[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'
renaming /var/log/foreman-proxy/proxy.log.5.gz to /var/log/foreman-proxy/proxy.log.6.gz (rotatecount 5, logstart 1, i 5),
renaming /var/log/foreman-proxy/proxy.log.4.gz to /var/log/foreman-proxy/proxy.log.5.gz (rotatecount 5, logstart 1, i 4),
renaming /var/log/foreman-proxy/proxy.log.3.gz to /var/log/foreman-proxy/proxy.log.4.gz (rotatecount 5, logstart 1, i 3),
renaming /var/log/foreman-proxy/proxy.log.2.gz to /var/log/foreman-proxy/proxy.log.3.gz (rotatecount 5, logstart 1, i 2),
renaming /var/log/foreman-proxy/proxy.log.1.gz to /var/log/foreman-proxy/proxy.log.2.gz (rotatecount 5, logstart 1, i 1),
renaming /var/log/foreman-proxy/proxy.log.0.gz to /var/log/foreman-proxy/proxy.log.1.gz (rotatecount 5, logstart 1, i 0),
old log /var/log/foreman-proxy/proxy.log.0.gz does not exist
renaming /var/log/foreman-proxy/proxy.log to /var/log/foreman-proxy/proxy.log.1
creating new /var/log/foreman-proxy/proxy.log mode = 0644 uid = 997 gid = 997
running postrotate script
compressing log with: /bin/gzip
gzip: stdin: file size changed while zipping
removing old log /var/log/foreman-proxy/proxy.log.6.gz

and yes, it created new file:

# stat /var/log/foreman-proxy/proxy.log
  File: 'proxy.log'
  Size: 4131552         Blocks: 8072       IO Block: 4096   regular file
Device: fc00h/64512d    Inode: 1314082     Links: 1
Access: (0644/-rw-r--r--)  Uid: (  997/foreman-proxy)   Gid: (  997/foreman-proxy)
Access: 2019-07-04 14:52:04.728760259 +0200
Modify: 2019-07-04 14:54:40.469402622 +0200
Change: 2019-07-04 14:54:40.469402622 +0200
 Birth: -

old one:

#stat /var/log/foreman-proxy/proxy.log
  File: '/var/log/foreman-proxy/proxy.log'
  Size: 39356983        Blocks: 76880      IO Block: 4096   regular file
Device: fc00h/64512d    Inode: 1319035     Links: 1
Access: (0644/-rw-r--r--)  Uid: (  997/foreman-proxy)   Gid: (  997/foreman-proxy)
Access: 2019-07-04 14:22:15.949374716 +0200
Modify: 2019-07-04 14:47:28.023619618 +0200
Change: 2019-07-04 14:47:28.023619618 +0200
 Birth: -

It sounds like the proxy never switched to the new file which meant it stayed empty which meant it was never rotated again. It’s hard to say why this happened because it’s hard to reproduce.

I am not sure if the file was switched but it switches the log file since the service was restarted…
proxy.log file has first line at 7:45

  • 4.7.2019 around 15:30 was the foreman-proxy process restarted (there were 0 FD of deleted files)
  • 5.7.2019 between 7:00 - 8:00 FD of deleted files appeared (logrotate action)
lsof +L1 | grep foreman-proxy                                                                                                                                       
ruby      62829 foreman-proxy    9u   REG  252,0 57003264     0 1314082 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62829 foreman-proxy   15u   REG  252,0   807834     0 1319185 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62829 foreman-proxy   16u   REG  252,0   529033     0 1319249 /var/log/foreman-proxy/proxy.log.1 (deleted)
ruby      62829 foreman-proxy   17u   REG  252,0   434364     0 1313881 /var/log/foreman-proxy/proxy.log.1 (deleted)

These files have constant size and do not grow.
I have checked monitoring for historical data of disk usage and this issue happend probably on 17.4.,25.4.,10.5.,13.5.,28.5.,13.6.,20.6.,3.7. - so there is no regular occurence.

One suspicion I have is this: Proxy and dynflow log file rotation

Here is a small patch, can you apply it, set file_rolling_size to 0, restart and report back please: https://github.com/theforeman/smart-proxy/pull/667

All components from Foreman and even Katello can be configured for syslog logging and we could easily get rid of this rotating madness completely if we were able to configure rsyslog/syslogng to deliver messages to the same filename. But thanks to Puppet, this requires huge amount of work and another two components to manage which @ekohl is not happy with. I wish there was an easy way to solve this, syslog/journald is much more powerful.

You can however tell the installer to configure syslog or even journald and then configure log file destinations yourself easily. All bits are here: https://github.com/lzap/foreman-elasticsearch (just skip the ES steps).

Hi Lukas,
I’ve just deployed the patch. I will post results as I get some. The issue happens unpredictably.

1 Like

Hi Lukas,

there are still FD of deleted files - each day one more. I restart the foreman proxy at least once a week so the issue with no space left on hdd hasn’t occured yet.

Ivo

Thanks, can you try this patch then? That should fix it.

Please report back, this is awful issue and we should probably backport it into 1.22/23.

Hi Lukas, I have applied the patch and no deleted FD today after log rotate … thanks

1 Like

Please give it a week and then drop a comment either here or preferably in the PR.

@ivos multiple other customers are running into the issue, I am wondering if you can do the final confirmation that the patch works as expected. thanks!

Hi Lukas,
I have just checked the monitoring tool for disk free statistics and there wasn’t any issue since the patch had been applied. It wasn’t also necessary to restart the foreman-proxy service. For me, the problem is solved. Thanks.
Ivos

1 Like

I am reopening this old thread because there is a new bugzilla on Satellite 6.9 that describes the same behavior: https://bugzilla.redhat.com/show_bug.cgi?id=1985474

After reviewing our code, logging gem and logrotate configuration, I came to conclusion that logrotating proxy is not configured properly. Our proxy is configured by default to do rotation on its own on a weekly basis. Although our puppet configuration comment says it is also maxed out at 100MB, in the code I see we do not enforce this (size is set to zero). Number of logs to keep is at 6. Here is a config snippet:

#:file_rolling_size: 100 # (this is actually 0)
#:file_rolling_age: weekly
#:file_rolling_keep: 6

Now, in the codebase logging stack use rolling mechanism only and only if size > 0. Therefore, unless the size configuration value is uncommented, no rolling is performed in the proxy. A bit confusing, but so far so good.

Now, logrotate configuration:

/var/log/foreman-proxy/proxy.log
{
  missingok
  notifempty
  create 0644 foreman-proxy foreman-proxy
  sharedscripts
  rotate 5
  compress
        daily
  postrotate
    /bin/systemctl kill --signal=SIGUSR1 foreman-proxy >/dev/null 2>&1 || true
  endscript
}

This is normal daily rotation with compression, which is I believe simple file rename operation. Then USR1 signal is used to reopen log files. Here, I am a bit fuzzy finding out why proxy or logger rubygem would open rotated file since rotation within proxy is turned off (by default)

Anyway, to solve this, I think we need to do copytruncate here just like we do with Ruby on Rails.

/var/log/foreman/*.log {
  daily
  missingok
  rotate 14
  compress
  delaycompress
  notifempty
  copytruncate
  postrotate
    /bin/systemctl kill --signal=SIGUSR1 foreman-proxy >/dev/null 2>&1 || true
  endscript
}

No signal would be needed as the file would never get renamed but it should not hurt. Opinions?