Getting errors in WebUI after login

Problem:
Getting errors in WebUI after login.

Error: Oops, we're sorry but something went wrong Katello::Errors::CandlepinNotRunning

Expected outcome:

Foreman and Proxy versions:

rpm -qa | grep -i foreman

rubygem-foreman_maintain-1.2.1-1.el8.noarch
foreman-service-3.5.1-1.el8.noarch
foreman-vmware-3.5.1-1.el8.noarch
foreman-postgresql-3.5.1-1.el8.noarch
rubygem-foreman_remote_execution-8.2.0-1.fm3_5.el8.noarch
rubygem-foreman_discovery-22.0.2-1.fm3_5.el8.noarch
foreman-selinux-3.5.1-1.el8.noarch
foreman-debug-3.5.1-1.el8.noarch
rubygem-foreman_default_hostgroup-7.0.0-1.fm3_5.el8.noarch
foreman-proxy-3.5.1-1.el8.noarch
rubygem-hammer_cli_foreman_remote_execution-0.2.2-1.fm3_0.el8.noarch
rubygem-hammer_cli_foreman-3.5.0-1.el8.noarch
rubygem-hammer_cli_foreman_tasks-0.0.18-1.fm3_5.el8.noarch
foreman-dynflow-sidekiq-3.5.1-1.el8.noarch
foreman-release-3.5.1-1.el8.noarch
foreman-installer-katello-3.5.1-1.el8.noarch
rubygem-foreman_column_view-0.4.0-6.fm3_3.el8.noarch
foreman-3.5.1-1.el8.noarch
rubygem-foreman-tasks-7.1.1-2.fm3_5.el8.noarch
foreman-ec2-3.5.1-1.el8.noarch
foreman-installer-3.5.1-1.el8.noarch
foreman-cli-3.5.1-1.el8.noarch

Foreman and Proxy plugin versions:

Distribution and version:
Red Hat Enterprise Linux release 8.6 (Ootpa)

Other relevant data:
I will attach a screenshot and the foreman-rake errors.

[Screenshot: error on GUI]

Try looking at which services are running (well, not running) via systemctl --failed or foreman-maintain service status. I bet tomcat is not running, probably because it was killed by the OOM killer.
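
To make the OOM killer suspicion checkable, here is a minimal sketch that filters kernel log lines for OOM activity. The helper name oom_kills is made up for illustration, and the patterns assume the usual kernel wording ("Out of memory: Killed process ..." and "... invoked oom-killer"), which may differ between kernel versions.

```shell
# Filter kernel log lines on stdin for OOM-killer activity.
# Hypothetical helper; the patterns are an assumption about kernel wording.
oom_kills() {
    # "|| true" keeps the exit status 0 when nothing matches
    grep -i -E 'out of memory|oom-kill' || true
}

# Typical use on the Foreman host, as root:
#   systemctl --failed
#   foreman-maintain service status
#   journalctl -k | oom_kills
```

No output from the last command suggests the kernel did not kill anything in the period the journal covers, so the cause would have to be looked for elsewhere.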


Attaching the output of the commands below.

  1. foreman-debug
    foreman-rake-errors.txt.gz (2.9 KB)

  2. foreman-rake errors:fetch_log request_id=d156bc79
    Not sure how to upload this (larger than 3 MB).

  3. hammer status

# hammer -p cadence status
Version:           3.5.1
API Version:       v2
Database:
    Status:          ok
    Server Response: Duration: 0ms
Plugins:
 1) Name:    foreman-tasks
    Version: 7.1.1
 2) Name:    foreman_column_view
    Version: 0.4.0
 3) Name:    foreman_default_hostgroup
    Version: 7.0.0
 4) Name:    foreman_discovery
    Version: 22.0.2
 5) Name:    foreman_remote_execution
    Version: 8.2.0
 6) Name:    katello
    Version: 4.7.1
Smart Proxies:
 1) Name:     sjprdsatapp01.cadence.com
    Version:  3.5.1
    Status:   ok
    Features:
     1) Name:    pulpcore
        Version: 3.2.0
     2) Name:    logs
        Version: 3.5.1
Compute Resources:

candlepin:
    Status:          FAIL
    Server Response: Message: 404 Not Found
candlepin_auth:
    Status:          FAIL
    Server Response: Message: Katello::Errors::CandlepinNotRunning
candlepin_events:
    Status:          FAIL
    message:         Not running
    Server Response: Duration: 0ms
katello_events:
    Status:          ok
    message:         0 Processed, 0 Failed
    Server Response: Duration: 0ms
pulp3:
    Status:          ok
    Server Response: Duration: 454ms
pulp3_content:
    Status:          ok
    Server Response: Duration: 232ms
foreman_tasks:
    Status:          ok
    Server Response: Duration: 3ms
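
The status output above is long, so a small filter helps to see only the failing components. This is a sketch, assuming the layout shown here (a top-level "name:" line followed by an indented "Status:" line); failed_components is a hypothetical helper, not part of hammer.

```shell
# Print the name of every top-level section whose Status is FAIL.
# Reads `hammer status` output on stdin; assumes the layout shown above.
failed_components() {
    awk '
        # unindented line ending in ":" starts a new section
        /^[^ \t].*:[ \t]*$/ { section = $0; sub(/:[ \t]*$/, "", section); next }
        # indented Status line reporting FAIL belongs to the current section
        /^[ \t]+Status:[ \t]+FAIL/ { print section }
    '
}

# Typical use:
#   hammer status | failed_components
```

Against the output above this should list candlepin, candlepin_auth and candlepin_events, which points at Candlepin alone rather than at Foreman as a whole.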

This is curious. I’d expect a “Not running” message if tomcat were actually not running.

Does foreman-maintain service restart --only tomcat help?

If not, I think we’ll need to see your production.log

  1. I restarted tomcat (systemctl restart tomcat) and can now reach the Foreman web UI.
    However, there are still errors in the logs.
    Attaching the output.
    tomcat-status.log (2.2 KB)

Error in /var/log/messages:

Feb  3 00:17:06 sjprdsatapp01 server[176131]: 03-Feb-2023 00:17:06.941 WARNING [main] org.apache.catalina.loader.WebappClassLoaderBase.clearReferencesJdbc The web application [candlepin] registered the JDBC driver [org.postgresql.Driver] but failed to unregister it when the web application was stopped. To prevent a memory leak, the JDBC Driver has been forcibly unregistered.

  2. Attaching the output of ‘foreman-maintain service status’, taken before restarting the tomcat service.
    foreman-maintain-service-status.log (32.1 KB)

@evgeni @jeremylenz
Thanks for all the help.

Restarting tomcat does solve the issue, but we see the same error again after some time.
Attaching production.log as well.
production.log.gz (502.7 KB)

2023-02-02T03:46:08 [E|app|e979f5b7] Error occurred while starting Katello::CandlepinEventListener
2023-02-02T03:46:08 [E|app|e979f5b7] Connection refused - connect(2) for "localhost" port 61613
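
"Connection refused" on port 61613 means nothing was accepting connections there when the event listener started. A minimal sketch for checking this on the host, assuming ss from iproute2 is available; port_listening is a hypothetical helper name. As far as I know that port is served from inside the Candlepin webapp in tomcat, so the tomcat journal is the next place to look.

```shell
# Succeed if `ss -tln` output on stdin shows a listener on the given port.
# Hypothetical helper; assumes the usual ss column layout
# (State, Recv-Q, Send-Q, Local Address:Port, ...).
port_listening() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            n = split($4, parts, ":")      # port is the last ":"-separated field
            if (parts[n] == port) found = 1
        }
        END { exit found ? 0 : 1 }
    '
}

# Typical use on the Foreman host:
#   ss -tln | port_listening 61613 && echo "listening" || echo "not listening"
#   journalctl -u tomcat --since "1 hour ago"
```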

How much RAM does the system have? tomcat gets killed a lot if there’s not enough…

Restarting the tomcat service only works for some time. After that I get the same error.

The system has a lot of RAM.

[root@sjprdsatapp01 ~]# free -g
              total        used        free      shared  buff/cache   available
Mem:            754          18         733           0           3         731
Swap:           127           0         127