"Katello::Errors::CandlepinNotRunning" after "dnf update" / CentOS8

Foreman 2.5.2 installed on up to date CentOS8, all working good. Yesterday I run “dnf update” and there was two packages ready to update:

candlepin-4.0.6-1.el8.noarch
candlepin-selinux-4.0.6-1.el8.noarch

Currently installed versions was 4.0.4-1

After upgrade foreman UI shows only:
**Oops, we're sorry but something went wrong** Katello::Errors::CandlepinNotRunning

I installed original versions and it started working again:
dnf install candlepin-4.0.4-1.el8

This is our testing server, so no harm done. However I wonder what is best practice to keep in-production Foreman server up to date? How to dare install everything with “dnf update”? Should I prevent updating Foreman-related packages and install only security related fixes?

Well, why is candlepin not running?

You can check the general health of the server:

# foreman-maintain health check

Is the tomcat service running and did it load the updated candlepin?

# systemctl status tomcat

Also check the logs in /var/log/foreman/production.log and /var/log/candlepin/candlepin.log for errors.

[root@foreman ~]# foreman-maintain health check
Running ForemanMaintain::Scenario::FilteredScenario
================================================================================
Check number of fact names in database:                               [OK]
--------------------------------------------------------------------------------
Check to verify no empty CA cert requests exist:                      [OK]
--------------------------------------------------------------------------------
Check whether all services are running:                               [OK]
--------------------------------------------------------------------------------
Check whether all services are running using the ping call:           [FAIL]
Couldn't connect to the server: undefined method `to_sym' for nil:NilClass
--------------------------------------------------------------------------------
Continue with step [Restart applicable services]?, [y(yes), n(no)] n
Scenario [ForemanMaintain::Scenario::FilteredScenario] failed.

The following steps ended up in failing state:

  [server-ping]

Health check does not complete before or after upgrade. It has never worked. Symptoms seems identical to this problem.

[root@foreman ~]# systemctl status tomcat
● tomcat.service - Apache Tomcat Web Application Container
   Loaded: loaded (/usr/lib/systemd/system/tomcat.service; enabled; vendor preset: disabled)
   Active: active (running) since Fri 2021-08-06 13:21:08 EEST; 4min 25s ago
 Main PID: 1035 (java)
    Tasks: 29 (limit: 49477)
   Memory: 653.0M
   CGroup: /system.slice/tomcat.service
           └─1035 /usr/lib/jvm/jre-11/bin/java -Xms1024m -Xmx4096m -Djava.security.auth.login.config=/usr/share/tomcat/conf/login.config -classpath /usr/share/tomcat/bin/bootstrap.jar:/usr/share/tomcat/bin/tomcat-juli.jar:/usr/share/jav>

Aug 06 13:22:40 foreman.varjomgmt.samk.fi server[1035]: 06-Aug-2021 13:22:40.915 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.RulesCurator.delete(org.candlepin.model.Persisted)] i>
Aug 06 13:22:40 foreman.varjomgmt.samk.fi server[1035]: 06-Aug-2021 13:22:40.916 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public org.candlepin.model.Persisted org.candlepin.model.RulesCurator.create(org.cand>
Aug 06 13:22:41 foreman.varjomgmt.samk.fi server[1035]: 06-Aug-2021 13:22:41.794 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.ContentCurator.delete(org.candlepin.model.Persisted)]>
Aug 06 13:22:42 foreman.varjomgmt.samk.fi server[1035]: 06-Aug-2021 13:22:42.105 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.EntitlementCertificateCurator.delete(org.candlepin.mo>
Aug 06 13:23:16 foreman.varjomgmt.samk.fi server[1035]: 06-Aug-2021 13:23:16.608 SEVERE [main] org.apache.catalina.core.StandardContext.startInternal One or more listeners failed to start. Full details will be found in the appropriate c>
Aug 06 13:23:16 foreman.varjomgmt.samk.fi server[1035]: 06-Aug-2021 13:23:16.706 SEVERE [main] org.apache.catalina.core.StandardContext.startInternal Context [/candlepin] startup failed due to previous errors
Aug 06 13:23:17 foreman.varjomgmt.samk.fi server[1035]: 06-Aug-2021 13:23:17.011 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesJdbc The web application [candlepin] registered the JDBC driver [org.postgre>
Aug 06 13:23:17 foreman.varjomgmt.samk.fi server[1035]: 06-Aug-2021 13:23:17.082 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deployment of web application directory [/var/lib/tomcat/webapps/candlepin] has finished>
Aug 06 13:23:17 foreman.varjomgmt.samk.fi server[1035]: 06-Aug-2021 13:23:17.198 INFO [main] org.apache.coyote.AbstractProtocol.start Starting ProtocolHandler ["https-jsse-nio-127.0.0.1-23443"]
Aug 06 13:23:17 foreman.varjomgmt.samk.fi server[1035]: 06-Aug-2021 13:23:17.548 INFO [main] org.apache.catalina.startup.Catalina.start Server startup in [107,876] milliseconds

[root@foreman ~]# tail -f /var/log/candlepin/candlepin.log
2021-08-06 13:22:23,891 [thread=main] [=, org=, csid=] INFO  org.candlepin.async.JobManager - Registering job: RefreshPoolsJob: org.candlepin.async.tasks.RefreshPoolsJob
2021-08-06 13:22:23,891 [thread=main] [=, org=, csid=] INFO  org.candlepin.async.JobManager - Registering job: RegenEnvEntitlementCertsJob: org.candlepin.async.tasks.RegenEnvEntitlementCertsJob
2021-08-06 13:22:23,892 [thread=main] [=, org=, csid=] INFO  org.candlepin.async.JobManager - Registering job: RegenProductEntitlementCertsJob: org.candlepin.async.tasks.RegenProductEntitlementCertsJob
2021-08-06 13:22:23,892 [thread=main] [=, org=, csid=] INFO  org.candlepin.async.JobManager - Registering job: UndoImportsJob: org.candlepin.async.tasks.UndoImportsJob
2021-08-06 13:22:23,892 [thread=main] [=, org=, csid=] INFO  org.candlepin.async.JobManager - Registering job: UnmappedGuestEntitlementCleanerJob: org.candlepin.async.tasks.UnmappedGuestEntitlementCleanerJob
2021-08-06 13:23:02,402 [thread=main] [=, org=, csid=] WARN  org.hibernate.id.UUIDHexGenerator - HHH000409: Using org.hibernate.id.UUIDHexGenerator which does not generate IETF RFC 4122 compliant UUID values; consider using org.hibernate.id.UUIDGenerator instead
2021-08-06 13:23:02,952 [thread=main] [=, org=, csid=] WARN  org.hibernate.mapping.RootClass - HHH000038: Composite-id class does not override equals(): org.candlepin.model.PoolAttribute
2021-08-06 13:23:02,953 [thread=main] [=, org=, csid=] WARN  org.hibernate.mapping.RootClass - HHH000039: Composite-id class does not override hashCode(): org.candlepin.model.PoolAttribute
2021-08-06 13:23:15,751 [thread=main] [=, org=, csid=] WARN  org.hibernate.metamodel.internal.MetadataContext - HHH015011: Unable to locate static metamodel field : org.candlepin.model.KeyPair_#privateKey; this may or may not indicate a problem with the static metamodel
2021-08-06 13:23:15,751 [thread=main] [=, org=, csid=] WARN  org.hibernate.metamodel.internal.MetadataContext - HHH015011: Unable to locate static metamodel field : org.candlepin.model.KeyPair_#publicKey; this may or may not indicate a problem with the static metamodel

/var/log/foreman/production.log:

2021-08-06T13:28:48 [E|app|45da5be1] Error occurred while starting Katello::CandlepinEventListener
2021-08-06T13:28:48 [E|app|45da5be1] Connection refused - connect(2) for "localhost" port 61613
2021-08-06T13:28:48 [E|app|45da5be1] /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:461:in `initialize'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:461:in `open'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:461:in `block in open_ssl_socket'
 45da5be1 | /usr/share/ruby/timeout.rb:78:in `timeout'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:460:in `open_ssl_socket'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/netio.rb:520:in `open_socket'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/utils.rb:116:in `block in socket'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/utils.rb:109:in `synchronize'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/connection/utils.rb:109:in `socket'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/connection.rb:173:in `initialize'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:134:in `new'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:134:in `create_connection'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:101:in `block in initialize'
 45da5be1 | /usr/share/ruby/timeout.rb:95:in `block in timeout'
 45da5be1 | /usr/share/ruby/timeout.rb:33:in `block in catch'
 45da5be1 | /usr/share/ruby/timeout.rb:33:in `catch'
 45da5be1 | /usr/share/ruby/timeout.rb:33:in `catch'
 45da5be1 | /usr/share/ruby/timeout.rb:110:in `timeout'
 45da5be1 | /usr/share/gems/gems/stomp-1.4.9/lib/stomp/client.rb:99:in `initialize'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/messaging/stomp_connection.rb:69:in `new'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/messaging/stomp_connection.rb:69:in `client'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/messaging/stomp_connection.rb:43:in `subscribe'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/services/katello/candlepin_event_listener.rb:34:in `run'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/event_daemon/monitor.rb:40:in `block in check_services'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/event_daemon/monitor.rb:30:in `each'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/event_daemon/monitor.rb:30:in `check_services'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/event_daemon/monitor.rb:16:in `block (2 levels) in start'
 45da5be1 | /usr/share/gems/gems/activesupport-6.0.3.7/lib/active_support/execution_wrapper.rb:88:in `wrap'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/event_daemon/monitor.rb:15:in `block in start'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/event_daemon/monitor.rb:14:in `loop'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/event_daemon/monitor.rb:14:in `start'
 45da5be1 | /usr/share/gems/gems/katello-4.1.2.1/app/lib/katello/event_daemon/runner.rb:78:in `block in start_monitor_thread'
 45da5be1 | /usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'

I hit the same.

As a workaround, downgrading those 2 candlepin packages gets you running again (dnf history rollback).

The tomcat container fails to start. From /var/log/tomcat/catalina.log

06-Aug-2021 11:11:20.641 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.EntitlementCertificateCurator.delete(org.candlepin.model.Persisted)] is synthetic an
d is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@5178bad7]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
06-Aug-2021 11:11:24.864 SEVERE [main] org.apache.catalina.core.StandardContext.startInternal One or more listeners failed to start. Full details will be found in the appropriate container log file
06-Aug-2021 11:11:24.867 SEVERE [main] org.apache.catalina.core.StandardContext.startInternal Context [/candlepin] startup failed due to previous errors

/var/log/candlepin/error.log

2021-08-06 11:11:23,267 [thread=main] [=, org=, csid=] WARN  org.hibernate.id.UUIDHexGenerator - HHH000409: Using org.hibernate.id.UUIDHexGenerator which does not generate IETF RFC 4122 compliant UUID values; consider using org.hibernate.id.UUIDGenerator instead
2021-08-06 11:11:23,347 [thread=main] [=, org=, csid=] WARN  org.hibernate.mapping.RootClass - HHH000038: Composite-id class does not override equals(): org.candlepin.model.PoolAttribute
2021-08-06 11:11:23,347 [thread=main] [=, org=, csid=] WARN  org.hibernate.mapping.RootClass - HHH000039: Composite-id class does not override hashCode(): org.candlepin.model.PoolAttribute
2021-08-06 11:11:24,718 [thread=main] [=, org=, csid=] WARN  org.hibernate.metamodel.internal.MetadataContext - HHH015011: Unable to locate static metamodel field : org.candlepin.model.KeyPair_#publicKey; this may or may not indicate a problem with the static metamodel
2021-08-06 11:11:24,718 [thread=main] [=, org=, csid=] WARN  org.hibernate.metamodel.internal.MetadataContext - HHH015011: Unable to locate static metamodel field : org.candlepin.model.KeyPair_#privateKey; this may or may not indicate a problem with the static metamodel

I’m not sure how upgrading that candlepin package would break in this manner. Can you try running the foreman-installer again after upgrading candlepin?

I took a deeper look and it turns out that the new version of Candlepin you found in the repos requires a database migration. If the DB schema is out of date then Candlepin doesn’t start and re-running the installer should fix it.

To run the db migration without all the installer stuff you can do this:

  • note the postgres username (candlepin) and password from /etc/candlepin/candlepin.conf
  • run /usr/share/candlepin/cpdb --update -u candlepin -p $PASSWORD
  • systemctl restart tomcat
1 Like

Wouldn’t it be a good idea, if candlepin would print an error message like “please run database migration” or something pointing to the reason why it stopped? Why is there nothing in the logs about this?

I filed a Bugzilla against Candlepin with an RFE for this → https://bugzilla.redhat.com/show_bug.cgi?id=1993223