Foreman-Proxy service won't start after reboot of server

Problem:
Foreman-Proxy service won’t start
Expected outcome:
Proxy service should start
Foreman and Proxy versions:
Foreman 3.2.1 Katello 4.4 Proxy 3.2.1
Foreman and Proxy plugin versions:
The following were installed at installation time (this is still the current config)
foreman-plugin-ansible
foreman-plugin-discovery
foreman-plugin-openscap
foreman-proxy-plugin-remote-execution-ssh
Distribution and version:
Rocky 8.6
Other relevant data:
I have a relatively new installation (within the last month). It was installed with the plugins listed above. After rebooting, I discovered through some error messages in the GUI that the foreman-proxy service was not running.

I was able to do some troubleshooting by looking at the basics but it seems the error messages aren’t as useful as I’d like.

[root@gsil-satellite ~]# foreman-maintain health check
Running ForemanMaintain::Scenario::FilteredScenario
================================================================================
Check number of fact names in database:                               [OK]
--------------------------------------------------------------------------------
Check whether all services are running:                               [FAIL]
Following services are not running: foreman-proxy
--------------------------------------------------------------------------------
Continue with step [Restart applicable services]?, [y(yes), n(no)] y
Restart applicable services:                                                    

Stopping the following service(s):
foreman-proxy
- All services stopped                                                          

Starting the following service(s):
foreman-proxy
| starting foreman-proxy                                                        
Job for foreman-proxy.service failed because the control process exited with error code.
See "systemctl status foreman-proxy.service" and "journalctl -xe" for details.
| All services started                                                          
\ Server responded successfully!                                      [OK]      
--------------------------------------------------------------------------------
Rerunning the check after fix procedure
Check whether all services are running:                               [FAIL]
Following services are not running: foreman-proxy
--------------------------------------------------------------------------------
Continue with step [Restart applicable services]?, [y(yes), n(no)] n
Check whether all services are running using the ping call:           [OK]      
--------------------------------------------------------------------------------
Check for paused tasks:                                               [OK]
--------------------------------------------------------------------------------
Scenario [ForemanMaintain::Scenario::FilteredScenario] failed.

The following steps ended up in failing state:

  [services-up]

Resolve the failed steps and rerun the command.
In case the failures are false positives, use
--whitelist="services-up"

[root@gsil-satellite ~]# systemctl start foreman-proxy
Job for foreman-proxy.service failed because the control process exited with error code.
See "systemctl status foreman-proxy.service" and "journalctl -xe" for details.


[root@gsil-satellite ~]# systemctl status foreman-proxy
● foreman-proxy.service - Foreman Proxy
   Loaded: loaded (/usr/lib/systemd/system/foreman-proxy.service; enabled; vendor preset: disabled)
  Drop-In: /etc/systemd/system/foreman-proxy.service.d
           └─90-limits.conf
   Active: failed (Result: exit-code) since Wed 2022-06-29 09:29:25 CDT; 16s ago
  Process: 16205 ExecStart=/usr/share/foreman-proxy/bin/smart-proxy --no-daemonize (code=exited, status=203/EXEC)
 Main PID: 16205 (code=exited, status=203/EXEC)

Jun 29 09:29:25 gsil-satellite.gsil.smil systemd[1]: Starting Foreman Proxy...
Jun 29 09:29:25 gsil-satellite.gsil.smil systemd[1]: foreman-proxy.service: Main process exited, code=exited, status=203/EXEC
Jun 29 09:29:25 gsil-satellite.gsil.smil systemd[1]: foreman-proxy.service: Failed with result 'exit-code'.
Jun 29 09:29:25 gsil-satellite.gsil.smil systemd[1]: Failed to start Foreman Proxy.
[root@gsil-satellite ~]# systemctl start foreman-proxy
Job for foreman-proxy.service failed because the control process exited with error code.
See "systemctl status foreman-proxy.service" and "journalctl -xe" for details.


[root@gsil-satellite ~]# journalctl -xe
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- The unit run-user-993.mount has successfully entered the 'dead' state.
Jun 29 09:30:13 gsil-satellite.gsil.smil systemd[1]: user-runtime-dir@993.service: Succeeded.
-- Subject: Unit succeeded
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- The unit user-runtime-dir@993.service has successfully entered the 'dead' state.
Jun 29 09:30:13 gsil-satellite.gsil.smil systemd[1]: Stopped User runtime directory /run/user/993.
-- Subject: Unit user-runtime-dir@993.service has finished shutting down
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit user-runtime-dir@993.service has finished shutting down.
Jun 29 09:30:13 gsil-satellite.gsil.smil auditd[1771]: Audit daemon log file is larger than max size
Jun 29 09:30:13 gsil-satellite.gsil.smil systemd[1]: Removed slice User Slice of UID 993.
-- Subject: Unit user-993.slice has finished shutting down
-- Defined-By: systemd
-- Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
-- 
-- Unit user-993.slice has finished shutting down.
Jun 29 09:30:24 gsil-satellite.gsil.smil rsyslogd[2266]: cannot resolve hostname 'logcollector': Resource temporarily unavailable [v8.2102.0-7.el8_6.1 try https://www.rsyslog.c>
lines 1551-1573/1573 (END)

I also have the logs from /var/log/foreman-proxy/proxy.log if needed. I didn’t want to clog the forum with that. Just ask and I can send it.

Thanks for your support.

Status 203/EXEC means systemd could not execute the program named in the unit's ExecStart. Can you check that the file /usr/share/foreman-proxy/bin/smart-proxy exists, is executable, has the correct SELinux label, and all the other usual stuff? It could also be the shebang or something similar, so running it in the foreground to see an error might help.
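The basic file checks mentioned above could be scripted roughly like this. It is only a sketch: `check_exec` is a made-up helper name, and it only looks at existence, the execute bit, and the `ls -lZ` output (mode, owner, SELinux label). It cannot see policy layers that may still refuse to execute a file whose permissions look fine.

```shell
# check_exec is a hypothetical helper name, not part of Foreman.
check_exec() {
    target=$1
    if [ ! -e "$target" ]; then
        echo "missing: $target"
        return 1
    fi
    if [ ! -x "$target" ]; then
        echo "not executable: $target"
        return 1
    fi
    # Show mode, owner and (where supported) the SELinux label.
    ls -lZ "$target" 2>/dev/null || ls -l "$target"
    echo "ok: $target"
}
```

For this thread it would be called as `check_exec /usr/share/foreman-proxy/bin/smart-proxy`, ideally on both the working and the failing system so the output can be compared.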

The smart-proxy file does exist, it is executable, the SELinux label is correct.
I compared the label against my system that is working and they are the same.

I also tried to run:

/usr/share/foreman-proxy/bin/smart-proxy --no-daemonize

and the system says:
/usr/share/gems/gems/sequel-5.42.0/lib/sequel/adapters/sqlite.rb:114: warning: rb_check_safe_obj will be removed in Ruby 3.0
and never returns me back to the command prompt until I interrupt it with CTRL + C

I am not sure I fully understand. What do you mean about the shebang? When you say to run it in the foreground I assume you mean:

/usr/share/foreman-proxy/bin/smart-proxy --no-daemonize

Is that right?

This sounds good, so I see no obvious reason why it fails to execute.

The shebang is the #! line at the top of an executable script that names its interpreter. If it is wrong, for example because the interpreter it points to is missing, execution would also fail.
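As a rough illustration of the shebang check, the sketch below reads the first line of a script, extracts the interpreter path, and tests whether that interpreter is executable. Both function names are invented for this example.

```shell
# shebang_interpreter and check_shebang are hypothetical helper names.
shebang_interpreter() {
    # Read the first line and accept it only if it starts with "#!".
    first_line=$(head -n 1 "$1")
    case $first_line in
        '#!'*) ;;
        *) echo "no shebang in $1" >&2; return 1 ;;
    esac
    # Strip "#!" and any blanks after it, then drop interpreter arguments.
    interp=${first_line#'#!'}
    interp=${interp#"${interp%%[! ]*}"}
    echo "${interp%% *}"
}

check_shebang() {
    interp=$(shebang_interpreter "$1") || return 1
    if [ -x "$interp" ]; then
        echo "interpreter ok: $interp"
    else
        echo "interpreter missing or not executable: $interp"
        return 1
    fi
}
```

Running `check_shebang /usr/share/foreman-proxy/bin/smart-proxy` would show which interpreter the script asks for and whether it is present.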

There is a drop-in for systemd at /etc/systemd/system/foreman-proxy.service.d/90-limits.conf, what is it defining and is it the same on the other system working fine?
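To answer the "what is it defining" part quickly, something like the following could print the effective settings from every drop-in in a directory, skipping comments and blank lines. This is a sketch with a made-up function name; on a live system `systemctl cat foreman-proxy.service` shows the unit together with its drop-ins as well.

```shell
# list_dropin_settings is a hypothetical helper name.
list_dropin_settings() {
    dir=$1
    for f in "$dir"/*.conf; do
        # Skip the unexpanded glob when the directory has no .conf files.
        [ -e "$f" ] || continue
        name=$(basename "$f")
        # Drop comment lines (# or ;) and blank lines, prefix the rest
        # with the file name so several drop-ins stay distinguishable.
        grep -Ev '^[[:space:]]*(#|;|$)' "$f" | while IFS= read -r line; do
            printf '%s: %s\n' "$name" "$line"
        done
    done
}
```

Called as `list_dropin_settings /etc/systemd/system/foreman-proxy.service.d` on both systems, the two outputs can be diffed directly.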

Ok, thanks for clarifying the shebang. I wasn’t thinking about looking inside the contents of the script at the time of writing. Now I see it though :grinning:
Both systems have #!/usr/bin/ruby defined

On the working system:
I don’t find the 90-limits.conf file in the directory you listed. The only file like that is listed under /etc/systemd/system/redis.service.d/90-limits.conf
This makes sense considering that this system was installed with
foreman-installer --scenario katello --foreman-initial-organization=GSIL
So, we can sort of compare the two systems, but not totally.

On the “dead” system:
I do find the file path you listed. It is set with:
LimitNOFILE=100000
This system was installed as:

foreman-installer --scenario katello --foreman-initial-organization=GSIL \
--enable-foreman-plugin-ansible \
--enable-foreman-plugin-discovery \
--enable-foreman-plugin-openscap \
--enable-foreman-plugin-remote-execution-ssh

I cannot see how raising the open-file limit would make the service fail to execute, so this is very likely another dead end. I am out of ideas for now. :frowning:

Did you run this as foreman-proxy or as root?

foreman-proxy.service runs it as foreman-proxy, thus the correct way should be

# sudo -u foreman-proxy /usr/share/foreman-proxy/bin/smart-proxy --no-daemonize

I ran the command you suggested.

On the working system:
1. Stop foreman-proxy service with
systemctl stop foreman-proxy
2. Become normal user and run the command as listed. The system never drops back into a command prompt. Use CTRL + C to exit and return to command prompt.
3. Become root and run the command as listed. The system behavior is the same as #2.

On the Non-Working system:
1. foreman-proxy service is already stopped/dead so nothing to do.
2. Become normal user and run the command as listed. The system says sudo: unable to execute /usr/share/foreman-proxy/bin/smart-proxy: Operation not permitted
3. Become root and run the command as listed. The system says sudo: unable to execute /usr/share/foreman-proxy/bin/smart-proxy: Operation not permitted

Okay, so operation not permitted. Sounds like permissions. I compared the smart-proxy file on both systems. It is set 755 and owned root:root.
So no difference there on either system. Both are the same.

I also tried setenforce 0 and ran the command again just in case. That made no difference.

I don’t understand that. Does your “normal user” have full sudo privileges? Otherwise this shouldn’t be possible.

Can you verify the rpm to see whether anything has changed that shouldn’t have:

$ rpm -V foreman-proxy
S.?....T.  c /etc/foreman-proxy/settings.d/bmc.yml
S.?....T.  c /etc/foreman-proxy/settings.d/dhcp.yml
S.?....T.  c /etc/foreman-proxy/settings.d/dhcp_isc.yml
S.?....T.  c /etc/foreman-proxy/settings.d/dhcp_libvirt.yml
..?......  c /etc/foreman-proxy/settings.d/dhcp_native_ms.yml
S.?....T.  c /etc/foreman-proxy/settings.d/dns.yml
..?......  c /etc/foreman-proxy/settings.d/dns_dnscmd.yml
S.?....T.  c /etc/foreman-proxy/settings.d/dns_libvirt.yml
S.?....T.  c /etc/foreman-proxy/settings.d/dns_nsupdate.yml
S.?....T.  c /etc/foreman-proxy/settings.d/dns_nsupdate_gss.yml
..?......  c /etc/foreman-proxy/settings.d/facts.yml
S.?....T.  c /etc/foreman-proxy/settings.d/httpboot.yml
S.?....T.  c /etc/foreman-proxy/settings.d/logs.yml
S.?....T.  c /etc/foreman-proxy/settings.d/puppet.yml
S.?....T.  c /etc/foreman-proxy/settings.d/puppet_proxy_puppet_api.yml
S.?....T.  c /etc/foreman-proxy/settings.d/puppetca.yml
S.?....T.  c /etc/foreman-proxy/settings.d/puppetca_hostname_whitelisting.yml
S.?....T.  c /etc/foreman-proxy/settings.d/puppetca_http_api.yml
S.?....T.  c /etc/foreman-proxy/settings.d/puppetca_puppet_cert.yml
S.?....T.  c /etc/foreman-proxy/settings.d/puppetca_token_whitelisting.yml
S.?....T.  c /etc/foreman-proxy/settings.d/realm.yml
S.?....T.  c /etc/foreman-proxy/settings.d/realm_freeipa.yml
S.?....T.  c /etc/foreman-proxy/settings.d/registration.yml
S.?....T.  c /etc/foreman-proxy/settings.d/templates.yml
S.?....T.  c /etc/foreman-proxy/settings.d/tftp.yml
S.?....T.  c /etc/foreman-proxy/settings.yml
$ rpm -q foreman-proxy
foreman-proxy-3.2.1-1.el8.noarch
$ getent passwd foreman-proxy
foreman-proxy:x:493:493:Foreman Proxy daemon user:/usr/share/foreman-proxy:/bin/false
$ ls -lZ /usr/share/foreman-proxy/bin/smart-proxy
-rwxr-xr-x. 1 root root system_u:object_r:bin_t:s0 175 May 24 12:06 /usr/share/foreman-proxy/bin/smart-proxy
$ lsattr /usr/share/foreman-proxy/bin/smart-proxy
-------------------- /usr/share/foreman-proxy/bin/smart-proxy

It’s a ruby script, so this should work, too:

# echo "puts \"Hello world\", RUBY_VERSION, RUBY_PATCHLEVEL" | sudo -u foreman-proxy /usr/bin/ruby
Hello world
2.7.4
191

Yes, I can run sudo -i and become root without issue as well as run other commands as sudo. So privilege escalation is not an issue.

I agree and this has me confused as well. It is not logical.

I don’t fully understand the output from rpm -V.
Here’s what I do understand:
1. -V is to verify
2. The listing is all the files that are contained in the rpm package and the file path for each
3. I don’t know about the S ? and T. What do those represent?
4. I see some minor differences between your rpm -V listing and mine that I can’t explain. We both have the same rpm version for foreman-proxy
Also, I note there is a difference between my working and non-working system. One of the paths is missing an “S” on the dead system but on the working system that same path contains the S

S.5....T.  c /etc/foreman-proxy/settings.d/templates.yml (working)
..5....T.  c /etc/foreman-proxy/settings.d/templates.yml (non-working)

Yes, I confirmed that the foreman-proxy user is present on both systems. I ran less /etc/passwd and did a quick visual search for the user in the listing, which amounts to the same thing, right? (Though your command is more elegant and clean visually.) :slightly_smiling_face:

Yes, I checked SELinux context earlier in the thread but double checked just in case I missed it the first time.

The attribute listings all match

The Ruby one liner script gave me the same output as you on both systems

The letter in RPM verification have the following meaning:

       S file Size differs
       M Mode differs (includes permissions and file type)
       5 digest (formerly MD5 sum) differs
       D Device major/minor number mismatch
       L readLink(2) path mismatch
       U User ownership differs
       G Group ownership differs
       T mTime differs
       P caPabilities differ

So on the working system the file size changed and on the non-working one it did not, which only means the two copies of the file have different content. But a difference there should break only that feature, not the whole service.
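For reading those flag strings without the table at hand, a small decoder could look like the sketch below. The function name is invented; the position order follows the list above (S, M, 5, D, L, U, G, T, P), a dot means the test passed, and a question mark means the test could not be performed. Lines where rpm prints "missing" instead of a flag string are not handled.

```shell
# decode_rpm_verify is a hypothetical helper name.
decode_rpm_verify() {
    flags=$1
    # One label per flag position, in the order rpm prints them.
    names="Size Mode Digest Device Link User Group mTime caPabilities"
    pos=1
    for name in $names; do
        char=$(printf '%s' "$flags" | cut -c "$pos")
        case $char in
            ''|.) ;;                                      # test passed (or no flag)
            '?') echo "$name: could not be checked" ;;    # e.g. file not readable
            *) echo "$name: differs" ;;
        esac
        pos=$((pos + 1))
    done
}
```

For example, `decode_rpm_verify 'S.5....T.'` reports that size, digest and mTime differ, which is what a locally edited configuration file normally looks like.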

You shouldn’t use sudo -i here. We do not want a login shell.

Full docs are in the man pages for rpm; the verification tests are the ones already listed above.

No. Only if you do rpm -Vv. Without -v it only prints differences from the files in the rpm, i.e. files which have been modified or are missing. The c in the line denotes a configuration file, which may legitimately have been modified (i.e. it won’t be overwritten during an update or reinstall). Seeing modified files like in my output is to be expected. That’s just the foreman-proxy configuration that is in place. You shouldn’t have any other differences, though.

That’s probably just by accident. Both files are different, just the latter happens to have the exact same size as the original file in the rpm. You can check the files and compare.

Generally, it is much easier if you copy the output of those commands into your response. I can’t tell if everything is O.K. without seeing the full output.

Likewise. Simply run the command and paste the output.

Again. Simply copy the output. It’s much easier for everyone to verify, if you post the output…

Mea Culpa, Mea Culpa! I am eating humble pie and wearing egg on my face. I had forgotten that I applied some STIG settings to this system earlier. Clearly there must be a setting that needs to be backed off. I am reviewing my SCAP scan to see what was actually applied to the system and see if I can spot any obvious areas that would cause permission type issues.

@GVDE @Dirk
Thank you both. I have decided it will be far easier to just rebuild the system as it is not fully in production use just yet. It was in a transition stage:

new build > secure the system > production use.

Sorry, I didn’t realize earlier about the STIG settings. Guess I have been so busy that it slipped my mind. It’s funny what you think you remember sometimes.

I believe the issue is with fapolicyd being enabled by the STIG settings. Disabling with systemctl disable --now fapolicyd should fix the issue. The following Red Hat article discusses other workaround options: How to configure fapolicyd in satellite 6? - Red Hat Customer Portal
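Before disabling anything, it may be worth confirming that fapolicyd is actually running. The sketch below uses a made-up function name and falls back to "unknown" when systemctl is unavailable or prints nothing. The commented ausearch line assumes that denials are logged as fanotify audit records, which is how Red Hat's hardening documentation describes it; treat that as an assumption for your system.

```shell
# fapolicyd_state is a hypothetical helper name.
fapolicyd_state() {
    if ! command -v systemctl >/dev/null 2>&1; then
        echo "unknown"
        return 0
    fi
    # is-active prints e.g. "active" or "inactive"; ignore its exit code.
    state=$(systemctl is-active fapolicyd 2>/dev/null || true)
    echo "${state:-unknown}"
}

if [ "$(fapolicyd_state)" = "active" ]; then
    echo "fapolicyd is running and may be blocking untrusted executables"
    # Assumption: denials appear as fanotify audit records (run as root):
    # ausearch --start today -m fanotify
fi
```

If the state is "active" on the failing system and not on the working one, that would fit the "Operation not permitted" error seen earlier in the thread.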

A policy has already been created at GitHub - theforeman/foreman-fapolicyd, but so far it is packaged only for nightly. If it is required, I could try to do a backport.