Foreman cannot fork new processes because it cannot allocate memory

Problem:
Foreman seems to crash often: it cannot fork new processes because it cannot allocate memory (errno=12)

Expected outcome:
Foreman should not crash :slight_smile:

Foreman and Proxy versions:
I’m running 2.1.4 (just upgraded), but it was the same on 2.1.3

Foreman and Proxy plugin versions:

Distribution and version:
RHEL 7.9

Other relevant data:

My first thought was out of memory at the OS level, but every time I check free -m there is plenty of free memory.

One thing to add, which might be a coincidence: the problems started after this machine was updated to RHEL 7.9! I cannot find any such log entries from before that upgrade.
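For anyone diagnosing the same symptom: fork() can fail with ENOMEM even when free -m looks fine, because the kernel checks its commit limit, not just free RAM. A quick sketch of checks, using standard RHEL 7 tooling (nothing Foreman-specific assumed):

# Strict overcommit (vm.overcommit_memory=2) can make fork() fail despite free RAM
sysctl vm.overcommit_memory vm.overcommit_ratio
# Compare the kernel's commit limit against memory already committed
grep -E 'CommitLimit|Committed_AS' /proc/meminfo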

This is the last part of the log; the first 100 lines are all the same “cannot fork a new process” message.

[ 2020-10-29 13:42:08.0453 14310/7f6d64457700 Pool2/Pool.h:760 ]: ERROR: Cannot fork() a new process: Cannot allocate memory (errno=12)
Backtrace:
in 'long long unsigned int Passenger::ApplicationPool2::Pool::realCollectAnalytics()' (Pool.h:867)
in 'static void Passenger::ApplicationPool2::Pool::collectAnalytics(Passenger::ApplicationPool2::PoolPtr)' (Pool.h:754)
terminate called after throwing an instance of 'boost::exception_detail::clone_impl<boost::exception_detail::error_info_injector<boost::thread_resource_error> >'
what(): boost::thread_resource_error: Resource temporarily unavailable
ERROR: cannot fork a process for executing 'tee'
[ pid=14310, timestamp=1603975328 ] Process aborted! signo=SIGABRT(6), reason=SI_TKILL, signal sent by PID 14310 with UID 0, si_addr=0x37d6, randomSeed=1603975300
[ pid=14310 ] Could not create crash log file, so dumping to stderr only.
[ pid=14310 ] Could not fork a child process for dumping diagnostics: fork() failed with errno=12
[ 2020-10-29 13:42:08.0600 13868/7fbae9320700 agents/Watchdog/AgentWatcher.cpp:96 ]: Phusion Passenger helper agent (pid=14310) crashed with signal SIGABRT, restarting it...
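One detail in that backtrace worth noting: “boost::thread_resource_error: Resource temporarily unavailable” is EAGAIN from pthread_create(), which usually points at a process/thread limit rather than exhausted RAM. A couple of checks (standard procps/shell commands; the foreman user name assumes a packaged install):

ps -eLf | wc -l                     # total threads currently running on the system
cat /proc/sys/kernel/threads-max    # system-wide thread cap
sudo -u foreman sh -c 'ulimit -u'   # nproc (max user processes) for the foreman user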

Can you please share the output of passenger-status while Foreman is running? Also, is there any specific request that causes the crash? What are the last requests in production.log and in the Apache logs before the crashes?

Hi,

Thanks! Well, passenger-status says:
ERROR: Phusion Passenger doesn't seem to be running.
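(A possible explanation for that output, assuming the Passenger shipped with Foreman on EL7 runs embedded in Apache: passenger-status usually has to be run as root so it can locate the instance registry, and older Passenger releases keep per-instance directories under /tmp. A sketch:)

sudo passenger-status                 # run as root to find the Apache-embedded instance
sudo passenger-memory-stats           # per-process memory breakdown of Passenger/Apache
ls -d /tmp/passenger.* 2>/dev/null    # instance dirs used by older Passenger releases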

About the trigger: we see a lot of failed reports from random servers calling the Puppet ENC, e.g.:
/node/xxxxxxx?format=yml
But I have also hit it randomly in the web interface.

And for example:

2020-10-29T13:42:06 [I|app|6b345db2] Processing by SmartProxiesController#show as HTML
2020-10-29T13:42:06 [I|app|6b345db2] Parameters: {"id"=>"1-xxxxxxxxxxxxxxxxxxxx"}
2020-10-29T13:42:06 [I|app|6b345db2] Rendering smart_proxies/show.html.erb within layouts/application
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/plugins/_plugin_version.html.erb (Duration: 0.4ms | Allocations: 172)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/plugins/_no_template.html.erb (Duration: 1.5ms | Allocations: 517)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/plugins/_plugin_version.html.erb (Duration: 0.1ms | Allocations: 25)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered /opt/theforeman/tfm/root/usr/share/gems/gems/foreman_openscap-4.0.2/app/views/smart_proxies/plugins/_openscap.html.erb (Duration: 2.5ms | Allocations: 997)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/plugins/_plugin_version.html.erb (Duration: 0.1ms | Allocations: 25)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/plugins/_tftp.html.erb (Duration: 1.9ms | Allocations: 536)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/plugins/_logs.html.erb (Duration: 0.7ms | Allocations: 206)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/plugins/_plugin_version.html.erb (Duration: 0.1ms | Allocations: 25)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/plugins/_puppet.html.erb (Duration: 50.0ms | Allocations: 1122)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/plugins/_plugin_version.html.erb (Duration: 0.1ms | Allocations: 25)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/plugins/_puppet_ca.html.erb (Duration: 65.8ms | Allocations: 1661)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered smart_proxies/show.html.erb within layouts/application (Duration: 495.7ms | Allocations: 63481)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered layouts/_application_content.html.erb (Duration: 3.0ms | Allocations: 1023)
2020-10-29T13:42:06 [I|app|6b345db2] Rendering layouts/base.html.erb
2020-10-29T13:42:06 [I|app|6b345db2] Rendered layouts/base.html.erb (Duration: 103.7ms | Allocations: 32595)
2020-10-29T13:42:06 [W|app|6b345db2] Action failed
2020-10-29T13:42:06 [I|app|6b345db2] Rendering common/500.html.erb within layouts/application
2020-10-29T13:42:06 [I|app|6b345db2] Rendered common/500.html.erb within layouts/application (Duration: 3.1ms | Allocations: 1179)
2020-10-29T13:42:06 [I|app|6b345db2] Rendered layouts/_application_content.html.erb (Duration: 0.4ms | Allocations: 158)
2020-10-29T13:42:06 [I|app|6b345db2] Rendering layouts/base.html.erb
2020-10-29T13:42:06 [I|app|6b345db2] Rendered layouts/base.html.erb (Duration: 13.7ms | Allocations: 1089)
2020-10-29T13:42:06 [I|app|6b345db2] Completed 500 Internal Server Error in 690ms (Views: 19.3ms | ActiveRecord: 297.5ms | Allocations: 105756)
2020-10-29T13:42:08 [I|dyn|] start terminating throttle_limiter…
2020-10-29T13:42:08 [I|dyn|] start terminating client dispatcher…
2020-10-29T13:42:08 [I|dyn|] stop listening for new events…
2020-10-29T13:42:08 [I|dyn|] start terminating clock…
2020-10-29T13:42:08 [I|dyn|] start terminating throttle_limiter…
2020-10-29T13:42:08 [I|dyn|] start terminating client dispatcher…
2020-10-29T13:42:08 [I|dyn|] stop listening for new events…
2020-10-29T13:42:08 [I|dyn|] start terminating clock…
2020-10-29T13:42:09 [I|dyn|] start terminating throttle_limiter…
2020-10-29T13:42:09 [I|dyn|] start terminating client dispatcher…
2020-10-29T13:42:09 [I|dyn|] stop listening for new events…
2020-10-29T13:42:09 [I|dyn|] start terminating clock…
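To find the exception hiding behind the “Action failed” line, it helps to pull every line for that request ID and, if no backtrace shows up, to raise Foreman’s log level (the request ID below is taken from the log above; the settings path is the standard one for a packaged install):

grep '6b345db2' /var/log/foreman/production.log
# If the backtrace is still missing, set ':level: debug' under :logging: in
# /etc/foreman/settings.yaml, restart Foreman, then reproduce the request.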

Hmm, I disabled SELinux for testing and have not had any errors yet. Would you advise reinstalling the foreman-selinux rpm?

Perhaps; it could be that the RHEL update caused some files to be relabeled. If you still see issues when SELinux is enabled, please share the audit.log entries for them so we can fix the Foreman policy if needed.
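For reference, pulling the relevant denials and restoring file contexts uses standard RHEL tooling (the /usr/share/foreman path assumes an RPM install):

ausearch -m avc -ts recent                  # recent SELinux (AVC) denials from audit.log
ausearch -m avc -ts recent | audit2allow    # summarize what a policy change would need
restorecon -Rv /usr/share/foreman           # relabel files if the update left stale contexts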

@tbrisker I could have reinstalled the SELinux package, but why not update to 2.2 right away? So I did! I saw the new SELinux trigger working immediately, and after a few days the SELinux warnings are gone for now!

The frequency of the out-of-memory errors/restarts is very low now and acceptable. Maybe we should keep this thread around as a future reference, since other people might run into the same problem!

Kudos for trying to help me out though!
