Foreman not creating /var/run/foreman

Problem: after rebooting a fresh install of 2.2 on CentOS 7.8, the directory /var/run/foreman is missing and the symlink to it is dead. Without this directory the service keeps restarting about every 4-6 seconds, so I had to create a workaround just to get Foreman to start up.

Expected outcome: Please fix this bug

Foreman and Proxy versions: 2.2

Foreman and Proxy plugin versions: no plugins

Distribution and version: CentOS 7.8

Other relevant data:

It’s a directory, not a symlink. It should be set up in /usr/lib/tmpfiles.d/foreman.conf.


Umm, this is repeatable. It’s a basic install using foreman-installer on a vanilla CentOS 7: everything goes well until reboot. The symlink is /usr/share/foreman/tmp -> /var/run/foreman. After a reboot the target isn’t re-created; the service just cycles every 3-5 seconds and never creates it. I had to add an rc.local task to fix this on reboot. I don’t know what to tell you, it shouldn’t be like this. I even reproduced it a couple of times with Vagrant. Same issue.
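
Roughly, the rc.local workaround amounts to something like this (just a sketch; the exact commands may differ, and on CentOS 7 /etc/rc.d/rc.local also has to be made executable for it to run at boot):

# appended to /etc/rc.d/rc.local
mkdir -p /var/run/foreman
chown foreman:foreman /var/run/foreman
chmod 0750 /var/run/foreman
systemctl restart foreman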

Is the systemd-tmpfiles-setup service enabled? If so, are there any errors in its log?

Sorry, but this is confusing: What isn’t created? The symlink /usr/share/foreman/tmp or the directory /run/foreman? It’s /run/foreman which is created by tmpfiles. /var/run should be a symlink to ../run.

Do you have the file /usr/lib/tmpfiles.d/foreman.conf?

Will respond later, commuting.

I’m stumbling over this as well. I would never have thought that directories tied to a systemd service would be split out into tmpfiles (knowing full well that tmpfiles is also part of systemd).

My workaround is this:

root@lxforemantstls:[~] #: cat /etc/systemd/system/foreman.service.d/var-run.conf
[Service]
RuntimeDirectory=foreman
root@lxforemantstls:[~] #: cat /etc/systemd/system/httpd.service.d/var-run.conf
[Service]
RuntimeDirectory=httpd
root@lxforemantstls:[~] #: cat /etc/systemd/system/postgresql.service.d/var-run.conf
[Service]
RuntimeDirectory=postgresql
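
After dropping in these files you still need a daemon reload and a restart of the affected services for the setting to take effect:

systemctl daemon-reload
systemctl restart foreman httpd postgresql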

https://www.freedesktop.org/software/systemd/man/systemd.exec.html#RuntimeDirectory=

This is particularly useful for unprivileged daemons that cannot create runtime directories in /run/ due to lack of privileges, and to make sure the runtime directory is cleaned up automatically after use.

When looking up the information above I also stumbled over the sentence right after the one quoted:

For runtime directories that require more complex or different configuration or lifetime guarantees, please consider using tmpfiles.d(5).

If only foreman-installer / satellite-installer didn’t remove the drop-in files after each installer run…

But what is your problem? The directories are created by tmpfiles. That’s how Red Hat does it. What would be the purpose of additionally putting this into the service unit?
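
If /run/foreman is missing you can apply the packaged rule by hand and check the setup service’s log, e.g.:

systemd-tmpfiles --create /usr/lib/tmpfiles.d/foreman.conf
ls -ld /run/foreman
journalctl -u systemd-tmpfiles-setup.service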

Regardless of why, I am still very much interested in knowing whether there is a way to add things in https://github.com/theforeman/foreman-installer/blob/develop/config/custom-hiera.yaml to modify the systemd units.

But to answer your question: almost every time we reboot our Foreman or Satellite server, the services don’t start. From our troubleshooting we found that the rules managed by tmpfiles.d are prone to not being applied right when they are needed; there seems to be a delay of some sort. If you put the creation of these directories directly into the service unit files, it works reliably and just as you would expect.

To additionally quote the man page for tmpfiles.d:

System daemons frequently require private runtime directories below /run/ to store communication sockets and similar. For these, it is better to use RuntimeDirectory= in their unit files (see systemd.exec(5) for details), if the flexibility provided by tmpfiles.d is not required. The advantages are that the configuration required by the unit is centralized in one place, and that the lifetime of the directory is tied to the lifetime of the service itself.

And honestly, the content/rule is not so complex that it would require tmpfiles.d:

# cat /usr/lib/tmpfiles.d/foreman.conf
d /run/foreman 0750 foreman foreman -

So :person_shrugging:

It’s still basically a Red Hat decision to do it with tmpfiles; it’s like that for most services. Working against your distribution is often not a good idea. Plus, if you define it in the unit, you have to remove the configuration from tmpfiles.d, otherwise the two will work against each other whenever the configurations are not identical.
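
(For reference, the way tmpfiles.d(5) documents disabling a vendor rule is to mask it from /etc/tmpfiles.d instead of deleting the RPM-owned file:)

ln -s /dev/null /etc/tmpfiles.d/foreman.conf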

All services start properly on every Foreman and Smart Proxy here. So if it doesn’t work for you, you should find out why. On our Foreman servers the tmpfiles are set up well before the foreman service starts; sysinit.target sits in between.

OK, so after a bit of research I think I have found a workaround for all of us who have the problem the OP posted about.

You do the following:

root@foreman-test:[~] #: tail -6 /etc/foreman-installer/custom-hiera.yaml

systemd::dropin_files:
  run-tmp-fix.conf:
    unit: foreman.service
    content: "[Service]\nRuntimeDirectory=foreman\n"

(I’ll extend this for the services that are affected as well)
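
Extended to the other affected services it could look like this (a sketch assuming the same systemd::dropin_files mechanism; the keys are just file names I picked, adjust the unit list to what you actually need):

systemd::dropin_files:
  run-tmp-fix.conf:
    unit: foreman.service
    content: "[Service]\nRuntimeDirectory=foreman\n"
  httpd-run-tmp-fix.conf:
    unit: httpd.service
    content: "[Service]\nRuntimeDirectory=httpd\n"
  postgresql-run-tmp-fix.conf:
    unit: postgresql.service
    content: "[Service]\nRuntimeDirectory=postgresql\n"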

Then you run foreman-installer / satellite-installer and watch it create the drop-in file in the log:

root@satellite-test:[~] #: satellite-installer --scenario satellite --verbose-log-level INFO
2023-07-08 13:08:33 [INFO  ] [pre_migrations] <Array> ["Executing hooks in group pre_migrations"]
2023-07-08 13:08:33 [INFO  ] [pre_migrations] <Array> ["All hooks in group pre_migrations finished"]
2023-07-08 13:08:33 [INFO  ] [boot] <Array> ["Executing hooks in group boot"]
2023-07-08 13:08:35 [INFO  ] [boot] <Array> ["All hooks in group boot finished"]
2023-07-08 13:08:35 [NOTICE] [root] Loading installer configuration. This will take some time.
2023-07-08 13:08:35 [INFO  ] [init] Executing hooks in group init
2023-07-08 13:08:35 [INFO  ] [init] All hooks in group init finished
2023-07-08 13:08:35 [INFO  ] [root] Loading default values from puppet modules...
2023-07-08 13:08:50 [INFO  ] [root] ... finished loading default values from puppet modules.
2023-07-08 13:08:50 [INFO  ] [pre_values] Executing hooks in group pre_values
2023-07-08 13:08:50 [INFO  ] [pre_values] All hooks in group pre_values finished
2023-07-08 13:08:50 [NOTICE] [root] Running installer with log based terminal output at level INFO.
2023-07-08 13:08:50 [NOTICE] [root] Use -l to set the terminal output log level to ERROR, WARN, NOTICE, INFO, or DEBUG. See --full-help for definitions.
2023-07-08 13:08:52 [INFO  ] [pre_validations] Executing hooks in group pre_validations
2023-07-08 13:08:52 [INFO  ] [pre_validations] All hooks in group pre_validations finished
2023-07-08 13:08:52 [INFO  ] [root] Running validation checks.
2023-07-08 13:08:52 [INFO  ] [pre_commit] Executing hooks in group pre_commit
Package versions are locked. Continuing with unlock.
2023-07-08 13:08:53 [INFO  ] [root] Package versions are locked. Continuing with unlock.
2023-07-08 13:08:56 [INFO  ] [pre_commit] All hooks in group pre_commit finished
2023-07-08 13:08:56 [INFO  ] [pre] Executing hooks in group pre
2023-07-08 13:08:56 [INFO  ] [pre] Ensuring foreman-selinux, katello-selinux, candlepin-selinux, pulpcore-selinux to package state installed
2023-07-08 13:09:11 [INFO  ] [pre] All hooks in group pre finished
2023-07-08 13:09:11 [NOTICE] [configure] Starting system configuration.
2023-07-08 13:11:08 [INFO  ] [configure] Compiled catalog for satellite-test.home.arpa in environment production in 3.07 seconds
2023-07-08 13:11:16 [NOTICE] [configure] 250 configuration steps out of 1535 steps complete.
2023-07-08 13:11:19 [NOTICE] [configure] 500 configuration steps out of 2430 steps complete.
2023-07-08 13:11:19 [NOTICE] [configure] 750 configuration steps out of 2430 steps complete.
2023-07-08 13:11:20 [NOTICE] [configure] 1000 configuration steps out of 2430 steps complete.
2023-07-08 13:11:20 [NOTICE] [configure] 1250 configuration steps out of 2437 steps complete.
2023-07-08 13:11:26 [NOTICE] [configure] 1500 configuration steps out of 2439 steps complete.
2023-07-08 13:11:29 [NOTICE] [configure] 1750 configuration steps out of 2444 steps complete.
2023-07-08 13:11:29 [INFO  ] [configure] /Stage[main]/Systemd/Systemd::Dropin_file[run-tmp-fix.conf]/File[/etc/systemd/system/foreman.service.d/run-tmp-fix.conf]/content:
2023-07-08 13:11:29 [INFO  ] [configure] --- /etc/systemd/system/foreman.service.d/run-tmp-fix.conf     2023-07-08 12:58:06.420351479 +0200
2023-07-08 13:11:29 [INFO  ] [configure] +++ /tmp/puppet-file20230708-165314-10qy32s    2023-07-08 13:11:29.024426188 +0200
2023-07-08 13:11:29 [INFO  ] [configure] @@ -1 +1,2 @@
2023-07-08 13:11:29 [INFO  ] [configure] -[Service]\nRuntimeDirectory=foreman\n
2023-07-08 13:11:29 [INFO  ] [configure] \ No newline at end of file
2023-07-08 13:11:29 [INFO  ] [configure] +[Service]
2023-07-08 13:11:29 [INFO  ] [configure] +RuntimeDirectory=foreman
2023-07-08 13:11:29 [INFO  ] [configure] /Stage[main]/Systemd/Systemd::Dropin_file[run-tmp-fix.conf]/File[/etc/systemd/system/foreman.service.d/run-tmp-fix.conf]/content: content changed '{sha256}0c74177c8bd8d61f2a2acd94180ec9ea2f682f7e04d0614f7230d5f7006988a6' to '{sha256}ce95088e465c54ac2784b8e2829082a7eee0d54a88f2875ba840b85a430d1e5d'
2023-07-08 13:11:29 [INFO  ] [configure] /Stage[main]/Foreman::Config/Systemd::Dropin_file[foreman-service]/Systemd::Daemon_reload[foreman.service]/Exec[systemd-foreman.service-systemctl-daemon-reload]: Triggered 'refresh' from 1 event

Check whether the file was created:

root@foreman-test:[~] #: cat /etc/systemd/system/foreman.service.d/run-tmp-fix.conf
[Service]
RuntimeDirectory=foreman
root@foreman-test:[~] #:

as well as:

root@satellite-test:[~] #: systemctl cat foreman.service
# /usr/lib/systemd/system/foreman.service
[Unit]
Description=Foreman
Documentation=https://theforeman.org
After=network.target remote-fs.target nss-lookup.target
Requires=foreman.socket

[Service]
Type=notify
User=foreman
TimeoutSec=300
PrivateTmp=true
WorkingDirectory=/usr/share/foreman
ExecStart=/usr/share/foreman/bin/rails server --environment $FOREMAN_ENV
Environment=FOREMAN_ENV=production
Environment=MALLOC_ARENA_MAX=2

SyslogIdentifier=foreman

[Install]
WantedBy=multi-user.target

# /etc/systemd/system/foreman.service.d/installer.conf
[Service]
User=foreman
Environment=FOREMAN_ENV=production
Environment=FOREMAN_HOME=/usr/share/foreman
Environment=FOREMAN_PUMA_THREADS_MIN=5
Environment=FOREMAN_PUMA_THREADS_MAX=5
Environment=FOREMAN_PUMA_WORKERS=15

# /etc/systemd/system/foreman.service.d/run-tmp-fix.conf
[Service]
RuntimeDirectory=foreman
root@satellite-test:[~] #:

Yes, you’re right on both points there, in principle. From our troubleshooting there seem to be racy situations where the tmpfiles service does not create the directories under /run/ when they are needed after a reboot, so the services won’t start. If you manually kick the tmpfiles service that creates them, it all works, but since this should happen right after a reboot anyway… the solution of having it in the service unit works best for us. We will have to keep an eye on this with every update, of course, but that’s a price we’re willing to pay.

Sounds to me as if the tmpfiles setup service isn’t running at all. If the directories are missing, even after boot is complete, then the setup service didn’t run or failed.

And now you have systemd and tmpfiles working against each other, as permissions are different…

It’s spontaneous. There is no clear pattern; sometimes it happens and sometimes it doesn’t.

root@satellite-test:[~] #: systemctl is-active is-enabled systemd-tmpfiles-setup.service
inactive
active
root@satellite-test:[~] #:
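
(Note: is-active and is-enabled are separate verbs, so the command above treats "is-enabled" as a unit name. Running them one at a time gives a meaningful answer:)

systemctl is-enabled systemd-tmpfiles-setup.service
systemctl is-active systemd-tmpfiles-setup.service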

Easily fixable by using RuntimeDirectoryMode=
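
For example, in the same drop-in, so the mode matches the 0750 from the tmpfiles rule:

[Service]
RuntimeDirectory=foreman
RuntimeDirectoryMode=0750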

You need to check the status and use list-dependencies to see when/if the service has run. systemd-analyze plot may also help.
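
For example:

systemctl status systemd-tmpfiles-setup.service
systemctl list-dependencies --after foreman.service   # units ordered before foreman.service
systemd-analyze plot > boot.svg                        # timeline of the last boot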

Well, until the tmpfiles configuration is changed. In addition, there is the systemd-tmpfiles-clean service, which may interfere. Basically, you would have to remove the tmpfiles configuration file for the service to make sure it does not interfere. Of course, that’s not possible, because foreman-installer and the other system RPMs will restore the files.

As I wrote before: it may break at times you don’t expect, and you won’t even notice that it’s related to this. Don’t work against the system. Fix the original issue and find out why the tmpfiles setup service doesn’t always seem to work. It works perfectly for me, so I suspect something else is causing this, possibly something like what you are trying to do here. Making conflicting or competing changes may cause things to fail at times you don’t expect.

If the directories are missing, check what happened. Check the status of the service. Check the systemd execution chain. My guess is the service has failed for some reason.

Sort of. The RPM might restore it during an update of foreman / satellite but the foreman-installer / satellite-installer command does not. This is not managed by Puppet and thus I can remove it.

You are absolutely correct about the way to troubleshoot this, and I appreciate your help! We will follow up on that when we have enough time. For now this is what works; we have it documented, and the side effects are tolerable to us.

@gvde Your comments encouraged me to go down the rabbit hole.

TL;DR: You were right, I found out why and solved it.

The reason the directories in /var/run could not be created was an NFS mount. We moved the /var/lib/pulp/ directory onto an NFS share for business reasons. The /var/lib/pulp/tmp folder is bind-mounted locally to /var/pulp_tmp – it really is just a local bind to make that I/O as fast as possible for those temporary files (at least from the kernel’s point of view, as the VMDK itself is also “on the network”).

I ran systemctl status systemd-tmpfiles-setup and it showed interesting and very indicative error messages.

Then I looked into it further: mounting the NFS share caused the following errors:

[root@satellite ~]# journalctl | awk '/Found ordering cycle/,/break ordering cycle/'
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found ordering cycle on auditd.service/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on local-fs.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on var-lib-pulp-tmp.mount/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on var-lib-pulp.mount/start
Jul 14 14:09:54 satellite.home.arpa kernel: audit: type=1400 audit(1689336594.680:6): avc:  denied  { getattr } for  pid=839 comm="systemd-tmpfile" name="/" dev="dm-0" ino=128 scontext=system_u:system_r:rpcbind_t:s0 tcontext=system_u:object_r:fs_t:s0 tclass=filesystem permissive=0
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on remote-fs-pre.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on nfs-client.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on gssproxy.service/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on basic.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on sockets.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on pulpcore-api.socket/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on sysinit.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Job auditd.service/start deleted to break ordering cycle starting with sysinit.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found ordering cycle on import-state.service/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on local-fs.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on var-lib-pulp-tmp.mount/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on var-lib-pulp.mount/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on remote-fs-pre.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on nfs-client.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on gssproxy.service/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on basic.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on sockets.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on pulpcore-api.socket/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found dependency on sysinit.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Job import-state.service/start deleted to break ordering cycle starting with sysinit.target/start
Jul 14 14:09:54 satellite.home.arpa systemd[1]: sysinit.target: Found ordering cycle on systemd-update-done

This gave me enough information to find out more via Google.

My solution was to create my own systemd mount units instead of putting the mounts into /etc/fstab.

The unit files look like this:

[root@satellite ~]# systemctl cat var-lib-pulp.mount var-pulp_tmp.mount
# /etc/systemd/system/var-lib-pulp.mount
[Unit]
Description=Pulp mount /var/lib/pulp
[Mount]
What=filer.home.arpa:/vol_daten_nfs/qtree-lxupdate/satellite
Where=/var/lib/pulp
Type=nfs
Options=context="system_u:object_r:var_lib_t:s0",_netdev,tcp,rw,vers=3,rsize=32768,wsize=32768,hard,timeo=600,bg,nointr
TimeoutIdleSec=600
[Install]
WantedBy=multi-user.target


# /etc/systemd/system/var-pulp_tmp.mount
[Unit]
Description=Local pulp mount /var/pulp_tmp
[Mount]
What=/var/lib/pulp/tmp
Where=/var/pulp_tmp
Options=context="system_u:object_r:var_lib_t:s0",bind,X-mount.mkdir
[Install]
WantedBy=multi-user.target
[root@satellite ~]#
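
After creating the unit files, reload systemd and enable the mounts so they come up at boot:

systemctl daemon-reload
systemctl enable var-lib-pulp.mount var-pulp_tmp.mount
systemctl start var-lib-pulp.mount var-pulp_tmp.mount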

After several tests I am confident that this way we won’t be running into those problems again.

For all the others in this topic: do you also use NFS mounts? Did you solve it, or did you just live with it?