Problem:
Hi, I would like some help. We are encountering random rh-mongodb34-mongod service crashes, mostly during very heavy operations such as calculating errata. Unfortunately, I am a bit lost on where to start diagnosing the issue, as systemd only indicates that the service received a kill 9 signal (SIGKILL).
Expected outcome:
MongoDB is no longer unstable
Foreman and Proxy versions:
1.22.0
Foreman and Proxy plugin versions:
Pulp 1.4.1
Pulp server version 2.19.1
Other relevant data:
katello-service status indicates that the service below also failed when MongoDB went down, most likely because port 27017 was no longer listening, as indicated by the "Connection refused" error.
pulp_celerybeat.service - Pulp’s Celerybeat
Loaded: loaded (/usr/lib/systemd/system/pulp_celerybeat.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2019-07-24 12:50:37 UTC; 22min ago
Process: 10605 ExecStart=/usr/bin/celery beat --app=pulp.server.async.celery_instance.celery --scheduler=pulp.server.async.scheduler.Scheduler (code=exited, status=1/FAILURE)
Main PID: 10605 (code=exited, status=1/FAILURE)
Jul 24 12:50:36 SERVERNAME.REMOVED celery[10605]: File "/usr/lib64/python2.7/site-packages/pymongo/mongo_client.py", line 712, in _get_socket
Jul 24 12:50:36 SERVERNAME.REMOVED celery[10605]: server = self._get_topology().select_server(selector)
Jul 24 12:50:36 SERVERNAME.REMOVED celery[10605]: File "/usr/lib64/python2.7/site-packages/pymongo/topology.py", line 141, in select_server
Jul 24 12:50:36 SERVERNAME.REMOVED celery[10605]: address))
Jul 24 12:50:36 SERVERNAME.REMOVED celery[10605]: File "/usr/lib64/python2.7/site-packages/pymongo/topology.py", line 117, in select_servers
Jul 24 12:50:36 SERVERNAME.REMOVED celery[10605]: self._error_message(selector))
Jul 24 12:50:36 SERVERNAME.REMOVED celery[10605]: pymongo.errors.ServerSelectionTimeoutError: localhost:27017: [Errno 111] Connection refused
Jul 24 12:50:37 SERVERNAME.REMOVED systemd[1]: pulp_celerybeat.service: main process exited, code=exited, status=1/FAILURE
Jul 24 12:50:37 SERVERNAME.REMOVED systemd[1]: Unit pulp_celerybeat.service entered failed state.
Jul 24 12:50:37 SERVERNAME.REMOVED systemd[1]: pulp_celerybeat.service failed.
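For anyone wanting to confirm the "port not listening" theory independently of Pulp, here is a minimal sketch of a TCP check against localhost:27017 (the address from the traceback above). The helper name `port_is_listening` and the 2-second timeout are my own choices for illustration, and the script assumes Python 3. It only shows whether something accepts connections on the port; it says nothing about why mongod was killed.

```python
import socket


def port_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds, False otherwise."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # Errno 111 on Linux) when nothing is accepting on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Address taken from the pymongo error in the log above.
    host, port = "localhost", 27017
    if port_is_listening(host, port):
        print("{}:{} is accepting connections".format(host, port))
    else:
        print("{}:{} is not accepting connections".format(host, port))
```

Running this right after a crash and again after restarting rh-mongodb34-mongod should show the state change if the theory is right.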
logs