Actions::Katello::Host::Register stuck with message "Unable to verify server's identity: ('The read operation timed out',)"

Hello everyone,
I’m new to the forum, but not to Katello/Foreman; we have used it for a while here.
By the way, I’m French, so my English may be a little weird from time to time, sorry about that.

We have a lot of blocked tasks on Katello:
“Actions::Katello::Host::Update” and "Actions::Katello::Host::UploadPackageProfile"
The tasks stay in status “running” with result “pending” and never change. We kill the tasks via pulp-admin or the foreman-rake console on a daily basis, but I’m looking for a more permanent solution (i.e. figuring out why the tasks get stuck), pretty much like [Katello] Continue to have Actions::Katello::Host::UploadPackageProfile backfilled.

Can anyone tell me which log file I should start with to find an explanation?

More importantly, we have an issue with clients that can’t register anymore.
The Actions::Katello::Host::Register task gets stuck the same way, and subscription-manager register on the client fails with the message "Unable to verify server’s identity: (‘The read operation timed out’,)".
After the registration the server appears in the content hosts, but the client stays in status “unknown” and doesn’t have any repo available (CentOS and RHEL alike).

Any hint?

We use Foreman 1.15.4-1 and Katello 3.4.5-1 on CentOS 7.

I guess /var/log/foreman/production.log would be a good place to start debugging? Just a presumption: is there enough free disk space, especially on the disk used by Pulp (/var/lib/pulp)? How many GB of RAM have you assigned to your Foreman/Katello?

Best regards,
Bernhard


ATIX AG - https://atix.de

Hello Bernhard,

Yes, I think /var/log/foreman/production.log is a good start :smile:
I’ve traced the client’s connection in this log:

2018-03-15 17:24:32 da377788 [app] [I] Started GET "/rhsm/" for 10.0.0.1 at 2018-03-15 17:24:32 +0100
2018-03-15 17:24:32 e9bf0721 [app] [I] Started GET "/rhsm/status" for 10.0.0.1 at 2018-03-15 17:24:32 +0100
2018-03-15 17:24:36 a99acb57 [app] [I] Started GET "/rhsm/" for 10.0.0.1 at 2018-03-15 17:24:36 +0100
2018-03-15 17:24:36 a65657eb [app] [I] Started GET "/rhsm/status" for 10.0.0.1 at 2018-03-15 17:24:36 +0100
2018-03-15 17:24:36 80458304 [app] [I] Started POST "/rhsm/consumers?owner=INT&activation_keys=centos6" for 10.0.0.1 at 2018-03-15 17:24:36 +0100

but everything seems OK in this log, and the task is still stuck in the task list.
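For what it’s worth, every line in production.log carries a per-request ID (the hex field after the timestamp), so one way to check how far a registration got server-side is to grep the log for that request’s ID and see whether a "Completed …" line ever follows the "Started POST". A minimal sketch, using a saved excerpt of the lines above (the file name excerpt.log is just an example; on the server you would grep /var/log/foreman/production.log directly):

```shell
# Save a few lines from the excerpt above, then follow one request by its ID.
cat > excerpt.log <<'EOF'
2018-03-15 17:24:36 a99acb57 [app] [I] Started GET "/rhsm/" for 10.0.0.1 at 2018-03-15 17:24:36 +0100
2018-03-15 17:24:36 a65657eb [app] [I] Started GET "/rhsm/status" for 10.0.0.1 at 2018-03-15 17:24:36 +0100
2018-03-15 17:24:36 80458304 [app] [I] Started POST "/rhsm/consumers?owner=INT&activation_keys=centos6" for 10.0.0.1 at 2018-03-15 17:24:36 +0100
EOF
# 80458304 is the request ID of the POST /rhsm/consumers call above; on a live
# log this prints every line logged for that registration request.
grep '80458304' excerpt.log
```

If that ID never reaches a "Completed" line, the request is hanging inside the server, not on the client.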

We have 1 TB in /var (not only /var/lib/pulp) with 234 GB still free, and only 54% of the inodes in /var are used.

For memory, we have 32 GB allocated:

free -m
              total        used        free      shared  buff/cache   available
Mem:          32012       12176        1564          35       18271       19320
Swap:         12287        2589        9698

I’ve checked sar and the monitoring, but I don’t see any sign of swapping on the server.

In /var/log/foreman/production.log I mainly have two types of errors:
“RestClient::ResourceNotFound: 404 Resource Not Found” and “ForemanTasks::Lock::LockConflict”

The locks are due to the tasks being stuck while the servers re-subscribe.
The RestClient::ResourceNotFound: 404 Resource Not Found error seems to be linked to Candlepin:

2018-03-16 03:39:53 e0d5c27c [app] [I] Started GET "/rhsm/consumers/0d2bd6a5-26d5-41d9-89ba-b63d5311284a/content_overrides" for 10.0.0.20 at 2018-03-16 03:39:53 +0100
2018-03-16 03:39:53 e0d5c27c [app] [I] Processing by Katello::Api::Rhsm::CandlepinProxiesController#get as JSON
2018-03-16 03:39:53 e0d5c27c [app] [I]   Parameters: {"id"=>"0d2bd6a5-26d5-41d9-89ba-b63d5311284a"}
2018-03-16 03:39:53 e0d5c27c [app] [I] Current user: 0d2bd6a5-26d5-41d9-89ba-b63d5311284a (regular user)
2018-03-16 03:39:53 e0d5c27c [app] [E] RestClient::ResourceNotFound: 404 Resource Not Found
2018-03-16 03:39:53 e0d5c27c [app] [I]   Rendered text template (0.0ms)
2018-03-16 03:39:53 e0d5c27c [app] [I] Completed 404 Not Found in 37ms (Views: 0.9ms | ActiveRecord: 0.7ms
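Since the 404 is returned through Katello’s Candlepin proxy controller, it may be worth asking Candlepin directly whether it still knows that consumer UUID. A sketch to run on the Katello server, not a recipe: port 8443 and the /candlepin/status path are Candlepin defaults, and the consumer lookup normally needs Candlepin credentials or a client certificate, which vary by install:

    # Candlepin status endpoint -- a small JSON reply means Candlepin is up
    curl -ks https://localhost:8443/candlepin/status

    # The consumer the 404 was about (UUID taken from the log above);
    # a 404 here would mean Candlepin has lost a consumer Katello still references
    curl -ks https://localhost:8443/candlepin/consumers/0d2bd6a5-26d5-41d9-89ba-b63d5311284a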

Hi, I’ve got the exact same problem.
The register task appears in Satellite; within this task there are five subtasks (Actions::Katello::Host::Register, Actions::Pulp::Consumer::Create…) which are really short (less than 1 second), but most of the time they are pending. The whole task lasts more than 10 minutes.
Meanwhile, on the client, subscription-manager times out with the output:
Unable to verify server’s identity: (‘The read operation timed out’,)

Hello, we had the same problem:

  • a long time to register on Foreman/Katello;
  • a timeout on the client.

There were lots of tasks in a pending state, causing delays on the PostgreSQL database. We identified those tasks and deleted them with a foreman-rake command.
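A sketch of that kind of cleanup, assuming the foreman_tasks:cleanup rake task shipped with the foreman-tasks plugin; the TASK_SEARCH filter, STATES list, and AFTER age here are examples to adapt, and it is worth checking what actually matches in the tasks page before deleting anything:

    # Delete stuck UploadPackageProfile tasks older than one day
    foreman-rake foreman_tasks:cleanup \
      TASK_SEARCH='label = Actions::Katello::Host::UploadPackageProfile' \
      STATES='running' AFTER='1d'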