Candlepin does not start at boot; but OK after a "manual" restart

Katello 4.2.2. So, lately I have noticed that candlepin does not start properly after a reboot. But after that, a “foreman-maintain service restart” always works and then everything is running just fine. Hoping someone has an idea what’s happening here.

After the reboot, the candlepin logs show NOTHING. But the foreman production log looks like this:

2022-02-09T10:20:22 [I|app|] Everything ready for world: acdf84cf-260b-4fb6-81b1-fd963f83e39a
2022-02-09T10:20:22 [I|dyn|] Performing validity checks
2022-02-09T10:20:22 [I|dyn|] Finished performing validity checks
2022-02-09T10:20:36 [E|app|cbacd850] Error occurred while starting Katello::CandlepinEventListener
2022-02-09T10:20:36 [E|app|cbacd850] Connection refused - connect(2) for "localhost" port 61613
2022-02-09T10:20:36 [E|app|cbacd850] /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:461:in `initialize'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:461:in `open'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:461:in `block in open_ssl_socket'
 cbacd850 | /usr/share/ruby/timeout.rb:78:in `timeout'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:460:in `open_ssl_socket'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:520:in `open_socket'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/utils.rb:116:in `block in socket'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/utils.rb:109:in `synchronize'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/utils.rb:109:in `socket'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/connection.rb:173:in `initialize'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:134:in `new'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:134:in `create_connection'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:101:in `block in initialize'
 cbacd850 | /usr/share/ruby/timeout.rb:95:in `block in timeout'
 cbacd850 | /usr/share/ruby/timeout.rb:33:in `block in catch'
 cbacd850 | /usr/share/ruby/timeout.rb:33:in `catch'
 cbacd850 | /usr/share/ruby/timeout.rb:33:in `catch'
 cbacd850 | /usr/share/ruby/timeout.rb:110:in `timeout'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:99:in `initialize'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/messaging/stomp_connection.rb:69:in `new'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/messaging/stomp_connection.rb:69:in `client'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/messaging/stomp_connection.rb:43:in `subscribe'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/services/katello/candlepin_event_listener.rb:34:in `run'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/event_daemon/monitor.rb:40:in `block in check_services'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/event_daemon/monitor.rb:30:in `each'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/event_daemon/monitor.rb:30:in `check_services'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/event_daemon/monitor.rb:16:in `block (2 levels) in start'
 cbacd850 | /usr/share/gems/gems/activesupport-6.0.3.7/lib/active_support/execution_wrapper.rb:88:in `wrap'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/event_daemon/monitor.rb:15:in `block in start'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/event_daemon/monitor.rb:14:in `loop'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/event_daemon/monitor.rb:14:in `start'
 cbacd850 | /usr/share/gems/gems/katello-4.2.2/app/lib/katello/event_daemon/runner.rb:78:in `block in start_monitor_thread'
 cbacd850 | /usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'
2022-02-09T10:20:38 [I|app|07a3c103] Started GET "/" for 141.106.224.170 at 2022-02-09 10:20:38 -0600
2022-02-09T10:20:38 [W|app|07a3c103] Failed to open TCP connection to localhost:23443 (Connection refused - connect(2) for "localhost" port 23443)
 07a3c103 | /usr/share/ruby/net/http.rb:960:in `initialize'
 07a3c103 | /usr/share/ruby/net/http.rb:960:in `open'
 07a3c103 | /usr/share/ruby/net/http.rb:960:in `block in connect'
 07a3c103 | /usr/share/ruby/timeout.rb:95:in `block in timeout'
 07a3c103 | /usr/share/ruby/timeout.rb:105:in `timeout'
 07a3c103 | /usr/share/ruby/net/http.rb:958:in `connect'
 07a3c103 | /usr/share/ruby/net/http.rb:943:in `do_start'
 07a3c103 | /usr/share/ruby/net/http.rb:932:in `start'
 07a3c103 | /usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:715:in `transmit'
 07a3c103 | /usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:145:in `execute'
 07a3c103 | /usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:52:in `execute'
 07a3c103 | /usr/share/gems/gems/rest-client-2.0.2/lib/restclient/resource.rb:51:in `get'
 07a3c103 | /usr/share/gems/gems/katello-4.2.2/app/models/katello/ping.rb:256:in `backend_status'
 07a3c103 | /usr/share/gems/gems/katello-4.2.2/app/models/katello/ping.rb:89:in `block in ping_candlepin_without_auth'
 07a3c103 | /usr/share/gems/gems/katello-4.2.2/app/models/katello/ping.rb:138:in `exception_watch'
 07a3c103 | /usr/share/gems/gems/katello-4.2.2/app/models/katello/ping.rb:88:in `ping_candlepin_without_auth'
 07a3c103 | /usr/share/gems/gems/katello-4.2.2/app/models/katello/ping.rb:227:in `ping_services_for_capsule'
 07a3c103 | /usr/share/gems/gems/katello-4.2.2/app/models/katello/ping.rb:25:in `ping!'
 07a3c103 | /usr/share/gems/gems/katello-4.2.2/app/services/katello/organization_creator.rb:14:in `create_all_organizations!'
 07a3c103 | /usr/share/gems/gems/katello-4.2.2/lib/katello/middleware/organization_created_enforcer.rb:12:in `call'
 07a3c103 | /usr/share/gems/gems/katello-4.2.2/lib/katello/middleware/event_daemon.rb:10:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/static.rb:126:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/static.rb:126:in `call'
 07a3c103 | /usr/share/gems/gems/apipie-dsl-2.4.0/lib/apipie_dsl/static_dispatcher.rb:67:in `call'
 07a3c103 | /usr/share/gems/gems/apipie-rails-0.5.17/lib/apipie/static_dispatcher.rb:66:in `call'
 07a3c103 | /usr/share/gems/gems/apipie-rails-0.5.17/lib/apipie/extractor/recorder.rb:137:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/static.rb:126:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/static.rb:126:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/static.rb:126:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/static.rb:126:in `call'
 07a3c103 | /usr/share/foreman/lib/foreman/middleware/telemetry.rb:10:in `call'
 07a3c103 | /usr/share/gems/gems/apipie-rails-0.5.17/lib/apipie/middleware/checksum_in_headers.rb:27:in `call'
 07a3c103 | /usr/share/foreman/lib/foreman/middleware/catch_json_parse_errors.rb:9:in `call'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/tempfile_reaper.rb:15:in `call'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/etag.rb:27:in `call'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/conditional_get.rb:27:in `call'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/head.rb:12:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/http/content_security_policy.rb:18:in `call'
 07a3c103 | /usr/share/foreman/lib/foreman/middleware/logging_context_session.rb:22:in `call'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/session/abstract/id.rb:266:in `context'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/session/abstract/id.rb:260:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/cookies.rb:648:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/callbacks.rb:27:in `block in call'
 07a3c103 | /usr/share/gems/gems/activesupport-6.0.3.7/lib/active_support/callbacks.rb:101:in `run_callbacks'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/callbacks.rb:26:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/actionable_exceptions.rb:18:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/debug_exceptions.rb:32:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/show_exceptions.rb:33:in `call'
 07a3c103 | /usr/share/gems/gems/railties-6.0.3.7/lib/rails/rack/logger.rb:37:in `call_app'
 07a3c103 | /usr/share/gems/gems/railties-6.0.3.7/lib/rails/rack/logger.rb:28:in `call'
 07a3c103 | /usr/share/gems/gems/sprockets-rails-3.2.1/lib/sprockets/rails/quiet_assets.rb:13:in `call'
 07a3c103 | /usr/share/foreman/lib/foreman/middleware/logging_context_request.rb:11:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/remote_ip.rb:81:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/request_id.rb:27:in `call'
 07a3c103 | /usr/share/gems/gems/katello-4.2.2/lib/katello/prevent_json_parsing.rb:12:in `call'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/method_override.rb:24:in `call'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/runtime.rb:22:in `call'
 07a3c103 | /usr/share/gems/gems/activesupport-6.0.3.7/lib/active_support/cache/strategy/local_cache_middleware.rb:29:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/executor.rb:14:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/static.rb:126:in `call'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/sendfile.rb:110:in `call'
 07a3c103 | /usr/share/gems/gems/actionpack-6.0.3.7/lib/action_dispatch/middleware/host_authorization.rb:76:in `call'
 07a3c103 | /usr/share/gems/gems/secure_headers-6.3.0/lib/secure_headers/middleware.rb:11:in `call'
 07a3c103 | /usr/share/gems/gems/railties-6.0.3.7/lib/rails/engine.rb:527:in `call'
 07a3c103 | /usr/share/gems/gems/railties-6.0.3.7/lib/rails/railtie.rb:190:in `public_send'
 07a3c103 | /usr/share/gems/gems/railties-6.0.3.7/lib/rails/railtie.rb:190:in `method_missing'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/urlmap.rb:74:in `block in call'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/urlmap.rb:58:in `each'
 07a3c103 | /usr/share/gems/gems/rack-2.2.3/lib/rack/urlmap.rb:58:in `call'
 07a3c103 | /usr/share/gems/gems/puma-5.3.2/lib/puma/configuration.rb:249:in `call'
 07a3c103 | /usr/share/gems/gems/puma-5.3.2/lib/puma/request.rb:77:in `block in handle_request'
 07a3c103 | /usr/share/gems/gems/puma-5.3.2/lib/puma/thread_pool.rb:338:in `with_force_shutdown'
 07a3c103 | /usr/share/gems/gems/puma-5.3.2/lib/puma/request.rb:76:in `handle_request'
 07a3c103 | /usr/share/gems/gems/puma-5.3.2/lib/puma/server.rb:438:in `process_client'
 07a3c103 | /usr/share/gems/gems/puma-5.3.2/lib/puma/thread_pool.rb:145:in `block in spawn_thread'
 07a3c103 | /usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'
2022-02-09T10:20:38 [I|app|07a3c103] Processing by DashboardController#index as */*
2022-02-09T10:20:38 [I|app|07a3c103] Redirected to https://katello.ctsi.mcw.edu/users/login
2022-02-09T10:20:38 [I|app|07a3c103] Filter chain halted as :require_login rendered or redirected
2022-02-09T10:20:38 [I|app|07a3c103] Completed 302 Found in 104ms (ActiveRecord: 18.1ms | Allocations: 29383)
2022-02-09T10:20:51 [E|app|cbacd850] Error occurred while starting Katello::CandlepinEventListener
2022-02-09T10:20:51 [E|app|cbacd850] Connection refused - connect(2) for "localhost" port 61613
2022-02-09T10:20:51 [E|app|cbacd850] /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:461:in `initialize'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:461:in `open'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:461:in `block in open_ssl_socket'
 cbacd850 | /usr/share/ruby/timeout.rb:78:in `timeout'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:460:in `open_ssl_socket'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:520:in `open_socket'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/utils.rb:116:in `block in socket'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/utils.rb:109:in `synchronize'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/utils.rb:109:in `socket'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/connection.rb:173:in `initialize'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:134:in `new'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:134:in `create_connection'
 cbacd850 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:101:in `block in initialize'
 cbacd850 | /usr/share/ruby/timeout.rb:95:in `block in timeout'
 cbacd850 | /usr/share/ruby/timeout.rb:33:in `block in catch'
 cbacd850 | /usr/share/ruby/timeout.rb:33:in `catch'
 cbacd850 | /usr/share/ruby/timeout.rb:33:in `catch'
 cbacd850 | /usr/share/ruby/timeout.rb:110:in `timeout'

When you say Candlepin are you referring to the tomcat service where candlepin is running or just the listener which receives messages which is indicated as failing in the log you shared? The distinction is important.

When you say there’s ‘NOTHING’ in the candlepin log I am taking that to mean that /var/log/candlepin/candlepin.log is empty or did you mean there’s nothing that looks pertinent to you? Sharing your candlepin.log could be useful from the timeframe at and preceding the production.log stack trace you provided above.

Also, double-check that tomcat is enabled (I have no idea why it would not be already) => systemctl enable tomcat

1 Like

Huzzah! Your hints helped me figure it out. But I’ll be thorough here, just in case someone could possibily have a similar issue.

First, to answer your questions, there were no candlepin logs, period. And I am a bit unsure of the terminology here, but I assume my situation is the “tomcat service where candepin runs”.

The status I had after booting was such (from hammer status):

Compute Resources:

candlepin:
    Status:          FAIL
    Server Response: Message: Failed to open TCP connection to localhost:23443 (Connection refused - connect(2) for "localhost" port 23443)
candlepin_auth:
    Status:          FAIL
    Server Response: Message: A backend service [ Candlepin ] is unreachable
candlepin_events:
    Status:          FAIL
    message:         Not running
    Server Response: Duration: 0ms
katello_events:
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms
pulp3:
    Status:          ok
    Server Response: Duration: 55ms
foreman_tasks:
    Status:          ok
    Server Response: Duration: 3ms

But your repy gave me things to look at. Checked tomcat. ENABLED, but… also NOT running.

[root@katello ~]# systemctl status tomcat
● tomcat.service - Apache Tomcat Web Application Container
   Loaded: loaded (/usr/lib/systemd/system/tomcat.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Wed 2022-02-09 12:48:21 CST; 2min 6s ago
  Process: 1760 ExecStart=/usr/libexec/tomcat/server start (code=exited, status=217/USER)
 Main PID: 1760 (code=exited, status=217/USER)

Feb 09 12:48:21 katello.ctsi.mcw.edu systemd[1]: Started Apache Tomcat Web Application Container.
Feb 09 12:48:21 katello.ctsi.mcw.edu systemd[1]: tomcat.service: Main process exited, code=exited, status=217/USER
Feb 09 12:48:21 katello.ctsi.mcw.edu systemd[1]: tomcat.service: Failed with result 'exit-code'.

And there’s the answer. “217/USER” means, the user did not exist when started. What?

AH… so in my setup, I have a freeipa server for user accounts and some service accounts. And… “tomcat” is one such user.

SO… in my situation, somehow the installer didn’t create the /etc/passwd entry for tomcat, and instead used my ldap user. And… apparently when tomcat starts, the tomcat user isn’t quite available yet because it’s defined remotely. I have to check on this as I am not sure I really USE that ldap user anymore…

I have another katello setup (supports a diffeent batch of servers) where I do not have this issue. And there… there’s an /etc/passsd entry for tomcat. But NOT on the host I am talking about here.

In the end, the fix was simply to create the /etc/passwd entry for the tomcat user and voila. All good.

tomcat:x:91:91:Apache Tomcat:/usr/share/tomcat:/sbin/nologin

The second katello server whee this worked out of the box does not have a tomcat user defined in ldap. Which is why only this server was affected. Both katello servers were built at about the same time.

Executive summary: Due to circumsances with my environment, the tomcatt user existed in LDAP and affected the tomcat service on my katello host. Rectified by definining tocat locally.

Off now to investigate if I really need that tomcat user anymore… OR, to limit it to only certain hosts which is probably the real answer for my setup.

Glad you got it working and sharing what you found. It’s definitely not where my mind would have gone next if you said things still weren’t working. Your setup also explains the missing/empty candlepin logs!

I’ve never tried (and it may be that no one else has either) what I am about to mention, but it appears that the tomcat user and group can be configured via /etc/foreman-installer/custom-hiera.yaml by adding the following keys:

‘candlepin::user’ and ‘candlepin::group’

It’s probably safe to assume you’d need to create the user and group yourself and then re-run the installer. Of course, if you try it, backup your system first.

Well in my case, I actually do still need the tomcat user defined centrally. Long story. I’m a bit miffed at freeipa; I have it set so that only certain hosts use that user id. BUT… that really only affects “login” etc, regardless of the host-based access rules, the user is still useable for things like “su” and starting services. Booooooo.

I’ve made a note to myself that if I end up doing a new reinstall (we all know it’ll have to happen some day again), that I need to do the install BEFORE hooking up my system to ldap. Not a big deal.

And now I’ve learned a bit more about how things work behind the scenes (ie tomcat)

1 Like