Katello 4.0 installation on RHEL 8 failed

Adding a note that might be interesting or relevant, perhaps to @ekohl / @wbclark: we are seeing that 404 in our puppet-candlepin spec testing, but only on EL8, e.g.

  1) candlepin works Command "curl -k -s -o /dev/null -w '%{http_code}' https://localhost:8443/candlepin/status" stdout is expected to eq "200"
     Failure/Error: its(:stdout) { should eq "200" }
       
       expected: "200"
            got: "404"
       
       (compared using ==)
       
     # ./spec/acceptance/basic_candlepin_spec.rb:13:in `block (3 levels) in <top (required)>'

This is also a case where we see it in one environment but cannot replicate it in others.
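
A single curl only shows the state at that instant. To tell a slow Candlepin start apart from a deployment that has actually failed, the same check can be polled for a while. This is only a sketch: the function name `wait_for_candlepin` and the attempt and delay defaults are made up for illustration; the URL and curl flags are the ones from the spec above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of puppet-candlepin): poll the status
# endpoint until it returns 200 or the attempts run out.
wait_for_candlepin() {
  local url="${1:-https://localhost:8443/candlepin/status}"
  local attempts="${2:-30}"   # assumed default
  local delay="${3:-10}"      # assumed default, in seconds
  local code i
  for ((i = 1; i <= attempts; i++)); do
    # Same curl invocation as in the spec; "|| true" keeps a refused
    # connection from aborting the loop.
    code="$(curl -k -s -o /dev/null -w '%{http_code}' "$url" || true)"
    if [ "$code" = "200" ]; then
      echo "attempt $i: 200"
      return 0
    fi
    echo "attempt $i: got $code"
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_candlepin https://localhost:8443/candlepin/status 30 10` polls for about five minutes. If it keeps reporting 404 for that long, the webapp context most likely failed to deploy rather than being slow to come up.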

Could it be a connection problem between the PostgreSQL database and Candlepin while tomcat/candlepin is starting (e.g. a timeout)? How could I debug it?

Two questions to maybe help us debug:

  1. How much RAM does the system have?
  2. What version of postgres is running?
  1. Both systems (VM and bare metal) have 8 GB of memory. On bare metal it currently looks like this:
[root@scotty ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:           7664        3895        2123          83        1645        3387
Swap:          4095           0        4095
[root@scotty ~]#
  2. PostgreSQL 12 is running:
[postgres@scotty ~]$ psql
psql (12.5)
Type "help" for help.
postgres=#

Bear with me; I am trying to replicate this and gather enough info to figure out what is happening here.

Could you try restarting tomcat and then paste the most recent output? It should all show up via:

systemctl restart tomcat
journalctl -xef -t server

Just a few hours ago I once again did a completely new installation (wipe everything, install the OS, install Katello) with the same result: tomcat/candlepin does not start.
For this run I set SELinux to permissive, so I think we can rule that out as the source of the error.
Here is the output:

[root@scotty tomcat]# systemctl stop tomcat
[root@scotty tomcat]# rm -f *log
[root@scotty tomcat]# systemctl start tomcat &&  journalctl -xef -t server
-- Logs begin at Wed 2021-05-12 21:47:01 CEST. --
May 12 21:47:18 scotty.home.petersen20.de server[1082]: Java virtual machine used: /usr/lib/jvm/jre-11/bin/java
May 12 21:47:18 scotty.home.petersen20.de server[1082]: classpath used: /usr/share/tomcat/bin/bootstrap.jar:/usr/share/tomcat/bin/tomcat-juli.jar:/usr/share/java/ant.jar:/usr/share/java/ant-launcher.jar:/usr/lib/jvm/java/lib/tools.jar
May 12 21:47:18 scotty.home.petersen20.de server[1082]: main class used: org.apache.catalina.startup.Bootstrap
May 12 21:47:18 scotty.home.petersen20.de server[1082]: flags used: -Xms1024m -Xmx4096m -Djava.security.auth.login.config=/usr/share/tomcat/conf/login.config
May 12 21:47:18 scotty.home.petersen20.de server[1082]: options used: -Dcatalina.base=/usr/share/tomcat -Dcatalina.home=/usr/share/tomcat -Djava.endorsed.dirs= -Djava.io.tmpdir=/var/cache/tomcat/temp -Djava.util.logging.config.file=/usr/share/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
May 12 21:47:18 scotty.home.petersen20.de server[1082]: arguments used: start
May 12 21:47:24 scotty.home.petersen20.de server[1082]: 12-May-2021 21:47:24.266 WARNING [main] org.apache.catalina.startup.SetAllPropertiesRule.begin [SetAllPropertiesRule]{Server/Service/Connector} Setting property 'sslProtocols' to 'TLSv1.2' did not find a matching property.
May 12 21:47:24 scotty.home.petersen20.de server[1082]: 12-May-2021 21:47:24.804 WARNING [main] org.apache.tomcat.util.digester.SetPropertiesRule.begin Match [Server/Service/Engine/Host] failed to set property [xmlValidation] to [false]
May 12 21:47:24 scotty.home.petersen20.de server[1082]: 12-May-2021 21:47:24.805 WARNING [main] org.apache.tomcat.util.digester.SetPropertiesRule.begin Match [Server/Service/Engine/Host] failed to set property [xmlNamespaceAware] to [false]
May 12 21:47:24 scotty.home.petersen20.de server[1082]: 12-May-2021 21:47:24.850 INFO [main] org.apache.catalina.core.AprLifecycleListener.lifecycleEvent The APR based Apache Tomcat Native library which allows optimal performance in production environments was not found on the java.library.path: [/usr/java/packages/lib:/usr/lib64:/lib64:/lib:/usr/lib]
May 12 21:47:31 scotty.home.petersen20.de server[1082]: 12-May-2021 21:47:31.227 INFO [main] org.apache.coyote.AbstractProtocol.init Initializing ProtocolHandler ["https-jsse-nio-127.0.0.1-23443"]
May 12 21:47:40 scotty.home.petersen20.de server[1082]: 12-May-2021 21:47:40.728 INFO [main] org.apache.catalina.startup.Catalina.load Server initialization in [18,578] milliseconds
May 12 21:47:41 scotty.home.petersen20.de server[1082]: 12-May-2021 21:47:41.882 INFO [main] org.apache.catalina.core.StandardService.startInternal Starting service [Catalina]
May 12 21:47:41 scotty.home.petersen20.de server[1082]: 12-May-2021 21:47:41.883 INFO [main] org.apache.catalina.core.StandardEngine.startInternal Starting Servlet engine: [Apache Tomcat/9.0.30]
May 12 21:47:42 scotty.home.petersen20.de server[1082]: 12-May-2021 21:47:42.053 INFO [main] org.apache.catalina.startup.HostConfig.deployDirectory Deploying web application directory [/var/lib/tomcat/webapps/candlepin]
May 12 21:48:31 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:31.611 INFO [main] org.apache.jasper.servlet.TldScanner.scanJars At least one JAR was scanned for TLDs yet contained no TLDs. Enable debug logging for this logger for a complete list of JARs that were scanned but no TLDs were found in them. Skipping unneeded JARs during scanning can improve startup time and JSP compilation time.
May 12 21:48:31 scotty.home.petersen20.de server[1082]: WARNING: An illegal reflective access operation has occurred
May 12 21:48:31 scotty.home.petersen20.de server[1082]: WARNING: Illegal reflective access by org.candlepin.pki.impl.JSSProviderLoader (file:/var/lib/tomcat/webapps/candlepin/WEB-INF/classes/) to field java.lang.ClassLoader.usr_paths
May 12 21:48:31 scotty.home.petersen20.de server[1082]: WARNING: Please consider reporting this to the maintainers of org.candlepin.pki.impl.JSSProviderLoader
May 12 21:48:31 scotty.home.petersen20.de server[1082]: WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
May 12 21:48:31 scotty.home.petersen20.de server[1082]: WARNING: All illegal access operations will be denied in a future release
May 12 21:48:43 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:43.430 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public org.candlepin.model.Persisted org.candlepin.model.OwnerCurator.create(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:43 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:43.802 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public org.candlepin.model.Persisted org.candlepin.model.ProductCurator.merge(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:43 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:43.803 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.ProductCurator.delete(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:43 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:43.804 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public org.candlepin.model.Persisted org.candlepin.model.ProductCurator.create(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:44 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:44.031 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.EntitlementCurator.delete(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:44 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:44.083 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.ConsumerCurator.delete(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:44 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:44.085 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public org.candlepin.model.Persisted org.candlepin.model.ConsumerCurator.create(org.candlepin.model.Persisted,boolean)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:44 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:44.382 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.CdnCurator.delete(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:44 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:44.448 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.PoolCurator.delete(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:44 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:44.839 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.RulesCurator.delete(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:44 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:44.840 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public org.candlepin.model.Persisted org.candlepin.model.RulesCurator.create(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:44 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:44.926 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.ContentCurator.delete(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:48:45 scotty.home.petersen20.de server[1082]: 12-May-2021 21:48:45.560 WARNING [main] com.google.inject.internal.ProxyFactory.<init> Method [public void org.candlepin.model.EntitlementCertificateCurator.delete(org.candlepin.model.Persisted)] is synthetic and is being intercepted by [com.google.inject.persist.jpa.JpaLocalTxnInterceptor@3ae3dfd2]. This could indicate a bug.  The method may be intercepted twice, or may not be intercepted at all.
May 12 21:49:18 scotty.home.petersen20.de server[1082]: 12-May-2021 21:49:18.538 SEVERE [main] org.apache.catalina.core.StandardContext.startInternal One or more listeners failed to start. Full details will be found in the appropriate container log file
May 12 21:49:18 scotty.home.petersen20.de server[1082]: 12-May-2021 21:49:18.567 SEVERE [main] org.apache.catalina.core.StandardContext.startInternal Context [/candlepin] startup failed due to previous errors
May 12 21:49:18 scotty.home.petersen20.de server[1082]: 12-May-2021 21:49:18.743 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesJdbc The web application [candlepin] registered the JDBC driver [org.postgresql.Driver] but failed to unregister it when the web application was stopped. To prevent a memory leak, the JDBC Driver has been forcibly unregistered.
May 12 21:49:18 scotty.home.petersen20.de server[1082]: 12-May-2021 21:49:18.774 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [candlepin] appears to have started a thread named [C3P0PooledConnectionPoolManager[identityToken->1hgf027ahr854qk1annxnp|1aff6dc1]-AdminTaskTimer] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
May 12 21:49:18 scotty.home.petersen20.de server[1082]:  java.base@11.0.11/java.lang.Object.wait(Native Method)
May 12 21:49:18 scotty.home.petersen20.de server[1082]:  java.base@11.0.11/java.util.TimerThread.mainLoop(Timer.java:553)
May 12 21:49:18 scotty.home.petersen20.de server[1082]:  java.base@11.0.11/java.util.TimerThread.run(Timer.java:506)
May 12 21:49:18 scotty.home.petersen20.de server[1082]: 12-May-2021 21:49:18.792 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [candlepin] appears to have started a thread named [C3P0PooledConnectionPoolManager[identityToken->1hgf027ahr854qk1annxnp|1aff6dc1]-HelperThread-#0] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
May 12 21:49:18 scotty.home.petersen20.de server[1082]:  java.base@11.0.11/java.lang.Object.wait(Native Method)
May 12 21:49:18 scotty.home.petersen20.de server[1082]:  com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:683)
May 12 21:49:18 scotty.home.petersen20.de server[1082]: 12-May-2021 21:49:18.793 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [candlepin] appears to have started a thread named [C3P0PooledConnectionPoolManager[identityToken->1hgf027ahr854qk1annxnp|1aff6dc1]-HelperThread-#1] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
May 12 21:49:18 scotty.home.petersen20.de server[1082]:  java.base@11.0.11/java.lang.Object.wait(Native Method)
May 12 21:49:18 scotty.home.petersen20.de server[1082]:  com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:683)
May 12 21:49:18 scotty.home.petersen20.de server[1082]: 12-May-2021 21:49:18.795 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [candlepin] appears to have started a thread named [C3P0PooledConnectionPoolManager[identityToken->1hgf027ahr854qk1annxnp|1aff6dc1]-HelperThread-#2] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
May 12 21:49:18 scotty.home.petersen20.de server[1082]:  java.base@11.0.11/java.lang.Object.wait(Native Method)
May 12 21:49:18 scotty.home.petersen20.de server[1082]:  com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:683)
May 12 21:49:18 scotty.home.petersen20.de server[1082]: 12-May-2021 21:49:18.796 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesThreads The web application [candlepin] appears to have started a thread named [Thread-0 (-scheduled-threads)] but has failed to stop it. This is very likely to create a memory leak. Stack trace of thread:
......
May 12 21:51:21 scotty.home.petersen20.de server[2098]:         at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
May 12 21:51:21 scotty.home.petersen20.de server[2098]: Caused by: java.lang.ClassNotFoundException: Illegal access: this web application instance has been stopped already. Could not load [ch.qos.logback.classic.spi.ThrowableProxy]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
May 12 21:51:21 scotty.home.petersen20.de server[2098]:         at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForClassLoading(WebappClassLoaderBase.java:1375)
May 12 21:51:21 scotty.home.petersen20.de server[2098]:         at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1226)
May 12 21:51:21 scotty.home.petersen20.de server[2098]:         at org.apache.catalina.loader.WebappClassLoaderBase.loadClass(WebappClassLoaderBase.java:1188)
May 12 21:51:21 scotty.home.petersen20.de server[2098]:         ... 16 more
May 12 21:51:21 scotty.home.petersen20.de server[2098]: Caused by: java.lang.IllegalStateException: Illegal access: this web application instance has been stopped already. Could not load [ch.qos.logback.classic.spi.ThrowableProxy]. The following stack trace is thrown for debugging purposes as well as to attempt to terminate the thread which caused the illegal access.
May 12 21:51:21 scotty.home.petersen20.de server[2098]:         at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForResourceLoading(WebappClassLoaderBase.java:1385)
May 12 21:51:21 scotty.home.petersen20.de server[2098]:         at org.apache.catalina.loader.WebappClassLoaderBase.checkStateForClassLoading(WebappClassLoaderBase.java:1373)
May 12 21:51:21 scotty.home.petersen20.de server[2098]:         ... 18 more
May 12 21:51:26 scotty.home.petersen20.de server[2098]: Exception in thread "Thread-2 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@52cfd82b)" java.lang.NoClassDefFoundError: ch/qos/logback/classic/spi/ThrowableProxy
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at ch.qos.logback.classic.spi.LoggingEvent.<init>(LoggingEvent.java:119)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at ch.qos.logback.classic.Logger.buildLoggingEventAndAppend(Logger.java:419)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at ch.qos.logback.classic.Logger.filterAndLog_0_Or3Plus(Logger.java:383)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at ch.qos.logback.classic.Logger.log(Logger.java:765)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at jdk.internal.reflect.GeneratedMethodAccessor49.invoke(Unknown Source)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at java.base/java.lang.reflect.Method.invoke(Method.java:566)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at org.jboss.logging.Slf4jLocationAwareLogger.doLog(Slf4jLocationAwareLogger.java:89)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at org.jboss.logging.Slf4jLocationAwareLogger.doLog(Slf4jLocationAwareLogger.java:75)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at org.jboss.logging.Logger.warn(Logger.java:1236)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:47)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at org.apache.activemq.artemis.utils.actors.OrderedExecutor.doTask(OrderedExecutor.java:31)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at org.apache.activemq.artemis.utils.actors.ProcessorBase.executePendingTasks(ProcessorBase.java:65)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
May 12 21:51:26 scotty.home.petersen20.de server[2098]:         at org.apache.activemq.artemis.utils.ActiveMQThreadFactory$1.run(ActiveMQThreadFactory.java:118)
^C
[root@scotty tomcat]# 

(The complete logs are too long to paste, so I attached them as files.) candlepin.log (48.3 KB) catalina.2021-05-12.log (69.2 KB) error.log (14.0 KB) localhost.2021-05-12.log (8.8 KB)
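
Most of the journal output above is Guice and classloader noise, and Tomcat itself says the details are in the container log. A small filter can pull out the SEVERE entries and exception causes first. This is a sketch only; the function name `severe_lines` and the pattern list are made up for illustration, not something Tomcat or Candlepin ships.

```shell
#!/usr/bin/env bash
# Hypothetical filter: read log text on stdin and keep only SEVERE
# entries and lines that name an exception cause.
severe_lines() {
  grep -E 'SEVERE|Caused by:|NoClassDefFoundError|NoSuchFieldError'
}
```

It reads from stdin, so it works on the journal (`journalctl -t server --no-pager | severe_lines`) or on any of the attached log files.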

To help level-set the environment, what do you get for:

/usr/lib/jvm/jre-11/bin/java -version
rpm -q candlepin
[root@scotty ~]# /usr/lib/jvm/jre-11/bin/java -version
openjdk version "11.0.11" 2021-04-20 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.11+9-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.11+9-LTS, mixed mode, sharing)
[root@scotty ~]# rpm -q candlepin
candlepin-3.2.11-1.el8.noarch
[root@scotty ~]#

Hello Gentlemen.

While I have absolutely nothing meaningful to contribute, I have been experiencing this issue as well. I have noticed the following.

Scenario:
I have an automated build (kickstart) with additional build scripts that execute after the node has been rebooted to ensure a consistent build. 4 out of 5 builds result in the same issue with the “Candlepin 404” error. The logs are similar to those above.

On the 5th build, the entire Foreman/Katello system builds out perfectly fine.

The system is an 8-core, 32 GB RAM RHEL 8.3 VM. The hypervisor it runs on is heavily underutilized.

Happy to contribute with any troubleshooting, log collection, etc.

I switched to CentOS 8 and everything works fine. Unbelievable! I have never seen such a mismatch between CentOS and RHEL before.
For years I used only CentOS and was satisfied with it. After Red Hat made its decision about CentOS 8 Stream, I started trying to switch my systems to RHEL. After this experience I will stop trying and stay with one of the CentOS derivatives.
Thanks to everyone here who was willing to help me.

Glad to hear you got unstuck. We’ll continue to investigate this issue
since what’s in RHEL comes to CentOS eventually and part of our CI which
tests on a CentOS 8 container is still hitting this same issue.

If I can help in some way, let me know and I will try as best I can.
From your point of view, what is your medium-term recommendation: stay with CentOS (including derivatives) or switch to RHEL?

I don’t have the luxury of switching to CentOS, so I am happy to continue working on this.

You don’t have /tmp as noexec by chance?

/tmp does not contain “noexec” in its mount options.

/dev/mapper/rootvg-tmplv on /tmp type ext4 (rw,nosuid,nodev,relatime,seclabel)
/dev/mapper/rootvg-tmplv on /var/tmp type ext4 (rw,nosuid,nodev,relatime,seclabel)
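
For anyone else checking this, the mount options can be tested mechanically instead of read by eye. A minimal sketch; `has_noexec` is a hypothetical helper name, and it only inspects an option string passed to it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if a comma-separated mount option
# string (as shown by mount or findmnt) contains "noexec".
has_noexec() {
  case ",$1," in
    *,noexec,*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With util-linux, something like `has_noexec "$(findmnt -n -o OPTIONS --target /tmp)" && echo "/tmp is noexec"` should feed it the live options; for the output above it reports nothing.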

Just to let you know: I never installed F2.4/K4.0 on Red Hat, but I did so successfully on CentOS 8.3.
I used a VM (KVM) for it with 6 vCPUs and 8 GB. All databases are LOCAL but externally managed by PostgreSQL 12.7.
I created an iSCSI volume for /var/lib/pulp on my Synology.
This configuration has worked fine for a couple of days now. Not a single issue.
Even a clean install on Rocky Linux 8.3 RC1 (one of the CentOS 8 alternatives) worked fine.
As I mentioned before, the pgsql databases are local. I tried to install (using foreman-installer options) with remote PostgreSQL databases, but this installation got stuck (the “Do you mean enabled?” problem).
A patch for this problem is suggested in this forum, but that did not work out for me.
Because of CentOS Stream, I am looking for a reliable CentOS-compatible solution as well.
rgds,
-gw

So here is a fun test. Since it is virtual and the build is currently automated, I built two identical VMs (CPU/Mem/Disk).
One completed the installation successfully and one failed.

The failure is consistent. It's always Candlepin 404 Not Found. Looking back through the Tomcat logs (see way above), it is the same error on every failure.

It is just odd that it sometimes does succeed.

It really sounds like a race condition. Perhaps there’s some async task that runs and if the service is started before it finishes, it breaks?

I was thinking the same thing; it happens in the 1400 block. I enabled DEBUG on the installer, but I didn’t get much in the output.
The whole build (OS, Foreman install, etc.) takes about 20-30 minutes from start to finish. I can enable anything you think would help you guys. When looking at the tomcat output, it complains that it can’t find classes. I couldn’t locate the class path, but they are indeed in the “webapps” directory for Candlepin.

May 17 03:25:52 sec1syslog server[1178]: Exception in thread "Thread-43522 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$6@2d981b7b)" java.lang.NoClassDefFoundError: ch/qos/logback/classic/spi/ThrowableProxy

14-May-2021 14:46:43.271 SEVERE [main] org.apache.catalina.core.StandardContext.listenerStart Exception sending context initialized event to listener instance of class [org.candlepin.guice.CandlepinContextListener] java.lang.NoSuchFieldError: SERVER_SENT_EVENTS_TYPE

Ignore the first timestamp, but that is the same error it throws.
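
A NoSuchFieldError at startup generally means a class was compiled against one version of a library while a different version got loaded at runtime. One possible cause, and this is only a guess, is two versions of the same jar on the classpath. A rough heuristic to look for that follows; `duplicate_artifacts` is a hypothetical helper, and the version-stripping pattern is a simple assumption that will misjudge some jar names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read jar paths on stdin, drop the directory and
# the "-<version>.jar" suffix, and print artifact names seen more than once.
duplicate_artifacts() {
  sed -E 's#.*/##; s/-[0-9][0-9A-Za-z._-]*\.jar$//' | sort | uniq -d
}
```

Feed it a jar listing, for example `ls /var/lib/tomcat/webapps/candlepin/WEB-INF/lib/*.jar | duplicate_artifacts` (the `WEB-INF/lib` location is the standard webapp layout, assumed here). Empty output means no duplicates were found by this heuristic, which does not rule out a conflict with jars elsewhere on the Tomcat classpath.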

I’m still trying to dig into this to figure out what might be the cause. With the release of RHEL 8.4, is anyone able to test an installation on it (instead of 8.3) to see if the issue persists across RHEL versions?