Foreman 2.5.3-1/Katello 4.1.3-1 Smart-Proxy issues: puppet storage is not report to central server. Clients cannot connect

Problem: After an upgrade from Foreman 2.3.3/Katello 3.18 via the required steps (2.4.1) we encounter the issue that we need to run foreman-installer daily on the Proxy to fix the issues.

Expected outcome: Proxy to run without failing daily.

Foreman and Proxy versions:

Foreman and Proxy plugin versions: 2.5.3/4.1.3

Distribution and version: CentOS 7.9

Other relevant data:

[root@suct2v833 ~]# (zcat /var/log/foreman-proxy/proxy.log-2021090*.gz; cat /var/log/foreman-proxy/proxy.log)|grep -i error
2021-09-08T13:42:56 a8e659d6 [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-08T13:42:56 a8e659d6 [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-08T13:42:57 a8e659d6 [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-08T13:42:57 a8e659d6 [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-08T13:42:57 a8e659d6 [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-10T11:14:07 9f6f69ec [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-10T11:14:07 9f6f69ec [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-10T11:14:07 9f6f69ec [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-10T11:14:07 9f6f69ec [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-10T11:14:07 9f6f69ec [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'

Hi @jgkootstra

Could you please share the full log entries for one of these failures?
An HTTP500 indicates a internal server error, but without further logs it’s pretty much impossible to hunt down the original cause of that error.

First some errors from /var/log/messages

 grep -i error /var/log/messages|tail -n 5
Sep 10 13:31:31 suct2v833 qdrouterd: 2021-09-10 13:31:31.572406 +0200 SERVER (info) [C278] Connection from 145.222.242.189:59722 (to :5647) failed: amqp:connection:framing-error SSL Failure: Unknown error
Sep 10 13:31:33 suct2v833 qdrouterd: 2021-09-10 13:31:33.538646 +0200 SERVER (info) [C279] Connection from 145.222.242.203:36596 (to :5647) failed: amqp:connection:framing-error SSL Failure: Unknown error
Sep 10 13:31:39 suct2v833 qdrouterd: 2021-09-10 13:31:39.062575 +0200 SERVER (info) [C280] Connection from 145.222.242.178:52330 (to :5647) failed: amqp:connection:framing-error SSL Failure: Unknown error
Sep 10 13:31:39 suct2v833 qdrouterd: 2021-09-10 13:31:39.066996 +0200 SERVER (info) [C281] Connection from 145.222.242.182:39524 (to :5647) failed: amqp:connection:framing-error SSL Failure: Unknown error
Sep 10 13:31:39 suct2v833 qdrouterd: 2021-09-10 13:31:39.695002 +0200 SERVER (info) [C282] Connection from 145.222.243.107:40766 (to :5647) failed: amqp:connection:framing-error SSL Failure: Unknown error
Sep 10 13:41:28 suct2v833 qdrouterd: 2021-09-10 13:41:28.865097 +0200 ROUTER_CORE (info) [C1][L1347] Link detached: del=0 presett=0 psdrop=0 acc=0 rej=0 rel=0 mod=0 delay1=0 delay10=0 blocked=no
Sep 10 13:41:28 suct2v833 qdrouterd: 2021-09-10 13:41:28.879001 +0200 SERVER (info) [C676] Accepted connection to :5647 from 145.222.242.178:53758
Sep 10 13:41:28 suct2v833 qdrouterd: 2021-09-10 13:41:28.895309 +0200 SERVER (info) [C677] Accepted connection to :5647 from 145.222.242.182:40116
Sep 10 13:41:28 suct2v833 qdrouterd: 2021-09-10 13:41:28.904667 +0200 ROUTER_CORE (info) [C676] Connection Opened: dir=in host=145.222.242.178:53758 vhost= encrypted=TLSv1/SSLv3 auth=ANONYMOUS user=anonymous container_id=7925ada5-7b77-4427-b4bf-0868b18320d0 props=
Sep 10 13:41:28 suct2v833 qdrouterd: 2021-09-10 13:41:28.906675 +0200 ROUTER_CORE (info) [C676][L1380] Link attached: dir=out source={pulp.agent.1aed8972-20b3-43f2-8b46-c81d99cc8cf8 expire:sess} target={<none> expire:sess}
Sep 10 13:41:28 suct2v833 qdrouterd: 2021-09-10 13:41:28.918116 +0200 ROUTER_CORE (info) [C677] Connection Opened: dir=in host=145.222.242.182:40116 vhost= encrypted=TLSv1/SSLv3 auth=ANONYMOUS user=anonymous container_id=b1ef8201-554d-4d7e-8b9b-394a9ec2bade props=
Sep 10 13:41:28 suct2v833 qdrouterd: 2021-09-10 13:41:28.920112 +0200 ROUTER_CORE (info) [C677][L1382] Link attached: dir=out source={pulp.agent.cd0c912e-f05b-4336-a1c7-0c3f7a0365b4 expire:sess} target={<none> expire:sess}
Sep 10 13:41:29 suct2v833 qdrouterd: 2021-09-10 13:41:29.359810 +0200 SERVER (info) [C660] Connection from 145.222.243.107:42088 (to :5647) failed: amqp:connection:framing-error SSL Failure: Unknown error
Sep 10 13:41:29 suct2v833 qdrouterd: 2021-09-10 13:41:29.359962 +0200 ROUTER_CORE (info) [C660][L1348] Link closed due to connection loss: del=0 presett=0 psdrop=0 acc=0 rej=0 rel=0 mod=0 delay1=0 delay10=0 blocked=no
Sep 10 13:41:29 suct2v833 qdrouterd: 2021-09-10 13:41:29.360005 +0200 ROUTER_CORE (info) [C660] Connection Closed
2021-09-10T11:14:07 9f6f69ec [I] Started GET /v2/features
2021-09-10T11:14:07 9f6f69ec [I] Started GET /v2/features
2021-09-10T11:14:07 9f6f69ec [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-10T11:14:07 9f6f69ec [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-10T11:14:07 9f6f69ec [E] Could not fetch capabilities: 783: unexpected token at '<h1>Server Error (500)</h1>'
2021-09-10T11:14:07 9f6f69ec [I] Finished GET /v2/features with 200 (94.78 ms)
2021-09-10T11:14:07 9f6f69ec [I] Finished GET /v2/features with 200 (94.78 ms)
2021-09-10T11:14:07 9f6f69ec [I] Finished GET /v2/features with 200 (94.78 ms)
2021-09-10T11:14:08 7ec4b5b4 [I] Finished GET /puppet/ca with 200 (3146.4 ms)
2021-09-10T11:14:08 7ec4b5b4 [I] Finished GET /puppet/ca with 200 (3146.4 ms)

The first to log segments from journal/messages should not be related to your issue. qdrouterd is part of qpid and that is not used in communication with the smartproxy. The errors there should not have any impact on your environment, since qpid should only be used for communication with (the long deprecated) katello-agent.

The last segment is the one with actually important errors, though. Sadly, it does not say what’s actually going wrong. Could you raise the loglevel for foreman-proxy in /etc/foreman-proxy/settings.yml to DEBUG and restart the smart proxy? Maybe we can detect at what step it actually screws up. I assume the smart-proxy is having problems talking to some backend process that throws the actual
Also, the output of curl -k --cacert /etc/foreman-proxy/foreman_ssl_ca.pem --cert /etc/foreman-proxy/foreman_ssl_cert.pem --key /etc/foreman-proxy/foreman_ssl_key.pem "https://localhost:9090/v2/features" | python -m json.tool might prove helpful, though this could possibly cause your “daily lockup” again.

Please be aware that both things I asked above will most likely include sensitive data like FQDNs and IPs, so you might want to anonymize those before putting them up here.

I will come back on these requests on Monday

Unable to reproduce