"Bulk generate applicability" stuck at Pending after Alma 8 > 9 Leapp Upgrade

Problem:
After upgrading my Foreman server from Alma 8 to Alma 9 using Leapp on my Prod box, I seem to have a few instances of “Bulk generate applicability” tasks sitting around at Pending for a subset of about 10-15 of the approx. 120 hosts on my box. There’s no obvious error in the logs; they just seem to start, sit at pending, and hang indefinitely.

Some are generic “Bulk generate applicability” tasks, while others are specific to individual boxes, e.g. “Bulk generate applicability for host dcautlprdnet01.REDACTED.local”.

There were a couple of minor issues to iron out in situ during the upgrade, but nothing all that noteworthy, and most of those had already been ironed out on the Pre-Prod Foreman box. All packages on the box are now el9, and everything else seems to be functioning normally.
There’s no obvious pattern to the hosts involved, they use a variety of different lifecycles.

Can anybody help me figure out where to go next with troubleshooting please?

Expected outcome:
Bulk generate applicability tasks run normally.

Foreman and Proxy versions:

Foreman and Proxy plugin versions:
Foreman 3.11.5-1
Katello 4.13.1-1
Pulpcore 3.49.22-1

Distribution and version:
Alma 9.5

Other relevant data:

[root@dca-foreman-al foreman]# grep bf453852-214f-4350-9548-3a5a67337859 production.log
2024-12-19T20:29:46 [I|bac|bb024630] Task {label: Actions::Katello::Applicability::Hosts::BulkGenerate, id: bf453852-214f-4350-9548-3a5a67337859, execution_plan_id: 3c29671f-41b0-427d-985f-dc13be9540b1} state changed: planning
2024-12-19T20:29:46 [I|bac|bb024630] Task {label: Actions::Katello::Applicability::Hosts::BulkGenerate, id: bf453852-214f-4350-9548-3a5a67337859, execution_plan_id: 3c29671f-41b0-427d-985f-dc13be9540b1} state changed: planned
2024-12-19T20:29:46 [I|bac|bb024630] Task {label: Actions::Katello::Applicability::Hosts::BulkGenerate, id: bf453852-214f-4350-9548-3a5a67337859, execution_plan_id: 3c29671f-41b0-427d-985f-dc13be9540b1} state changed: running
2024-12-20T09:11:45 [I|app|ee2c74b3] Started GET "/foreman_tasks/tasks/bf453852-214f-4350-9548-3a5a67337859" for 192.168.249.200 at 2024-12-20 09:11:45 +0000
2024-12-20T09:11:45 [I|app|ee2c74b3]   Parameters: {"id"=>"bf453852-214f-4350-9548-3a5a67337859"}
2024-12-20T09:12:36 [I|app|4eac8189] Started GET "/foreman_tasks/tasks/bf453852-214f-4350-9548-3a5a67337859" for 192.168.249.200 at 2024-12-20 09:12:36 +0000
2024-12-20T09:12:36 [I|app|4eac8189]   Parameters: {"id"=>"bf453852-214f-4350-9548-3a5a67337859"}
2024-12-20T09:12:37 [I|app|7e44088a] Started GET "/foreman_tasks/api/tasks/bf453852-214f-4350-9548-3a5a67337859/details?include_permissions" for 192.168.249.200 at 2024-12-20 09:12:37 +0000
2024-12-20T09:12:37 [I|app|7e44088a]   Parameters: {"include_permissions"=>nil, "id"=>"bf453852-214f-4350-9548-3a5a67337859"}
2024-12-20T09:12:42 [I|app|ff280cf6] Started GET "/foreman_tasks/api/tasks/bf453852-214f-4350-9548-3a5a67337859/details?include_permissions" for 192.168.249.200 at 2024-12-20 09:12:42 +0000
2024-12-20T09:12:42 [I|app|ff280cf6]   Parameters: {"include_permissions"=>nil, "id"=>"bf453852-214f-4350-9548-3a5a67337859"}
2024-12-20T09:12:47 [I|app|70546c35] Started GET "/foreman_tasks/api/tasks/bf453852-214f-4350-9548-3a5a67337859/details?include_permissions" for 192.168.249.200 at 2024-12-20 09:12:47 +0000
2024-12-20T09:12:47 [I|app|70546c35]   Parameters: {"include_permissions"=>nil, "id"=>"bf453852-214f-4350-9548-3a5a67337859"}
2024-12-20T09:12:53 [I|app|0a0d91bb] Started GET "/foreman_tasks/api/tasks/bf453852-214f-4350-9548-3a5a67337859/details?include_permissions" for 192.168.249.200 at 2024-12-20 09:12:53 +0000
2024-12-20T09:12:53 [I|app|0a0d91bb]   Parameters: {"include_permissions"=>nil, "id"=>"bf453852-214f-4350-9548-3a5a67337859"}

Hi @alexz,

So the result is “Pending”; what is the state of these tasks?

Assuming that those tasks are old and have been hanging around a while, they should be safe to clean up:

foreman-rake foreman_tasks:cleanup TASK_SEARCH='result = pending and action ~ "Bulk generate"' VERBOSE=true NOOP=true
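
With NOOP=true that should just be a dry run; assuming it lists only the stuck tasks you expect, re-running the same search without NOOP=true should perform the actual cleanup (just a sketch of the same command, not re-tested against your exact version):

foreman-rake foreman_tasks:cleanup TASK_SEARCH='result = pending and action ~ "Bulk generate"' VERBOSE=true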

If the tasks are failing for specific hosts all the time, then I would recommend trying the following:

# First, pick a host that is failing to have its applicability regenerated.
# You can find the host inside the Dynflow console for the pending task.
host = ::Host.find_by(name: '_host_name_')
host.content_facet.calculate_and_import_applicability

See if that triggers okay.
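
If it returns true, you could also spot-check the result from the same console session. A minimal sketch, assuming the errata associations on the content facet (applicable_errata / installable_errata) look the same in your Katello version:

# Counts should roughly line up with what the Content Hosts page shows for that host
host.content_facet.applicable_errata.count
host.content_facet.installable_errata.count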

If the tasks are pending, it could also mean that Katello’s ::Katello::HOST_TASKS_QUEUE is misbehaving somehow – we might need some help from @aruzicka if that is the case.
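
As a quick sanity check on that queue, assuming a fairly standard Katello install where a dedicated dynflow-sidekiq worker serves the hosts queue (the unit and file names below are my assumption and may differ on your box):

# List the dynflow-sidekiq workers and check none of them have died
systemctl list-units 'dynflow-sidekiq@*'
# Find which worker config references the hosts queue, then check that unit, e.g.
grep -r hosts_queue /etc/foreman/dynflow/
systemctl status 'dynflow-sidekiq@worker-hosts-queue-1'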

So, where did we end up with this?

Hi @iballou @aruzicka

Sorry for the delay in coming back to you on this. As well as being on annual leave over Christmas and New Year, I went off on a bit of a tangent: it looked like the subset of hosts I was seeing had a common factor in that they had some additional repos configured aside from the standard list accessed through the Foreman server.

Having tidied these up, however, more have appeared, and these have no unusual repos configured, so it looks like that was a red herring; the tasks continue to stack up and trigger pending-task e-mails.

I’ve run the command specified against the hosts that were listed as hung, and cleared down all the currently stalled Bulk generate applicability tasks.

I guess the next step is to just wait a few days and see if more tasks hang, or is there anything else I can do to investigate proactively in the meantime?

[root@dca-foreman-al ~]# foreman-rake console
Loading production environment (Rails 6.1.7.9)
irb(main):001:0> host = ::Host.find_by(name: 'dcbztststhap01.REDACTED')
=>
#<Host::Managed:0x00007f66fd3bff18
...
irb(main):002:0> host.content_facet.calculate_and_import_applicability
=> true

Thanks.


I would say let’s watch it and see if more pile up. Wishful thinking says it was a one-time mystery.

Hi @iballou @aruzicka

To confirm, this hasn’t resolved the issue unfortunately; the Bulk generate applicability tasks are still building up. In fact, I’ve been performing monthly patching runs over the last 2 weeks and now far more hosts are reporting the hung/pending task. I now have somewhere in the region of 250, so it’s essentially failing on all hosts. Looking at the timestamps on the tasks, they’re being generated by the boxes being patched.

The boxes have still picked up their required updates (I patch using Ansible and a standard dnf module update; the update mechanism is triggered from the host end), although the updates that have been installed are still showing as uninstalled on the Content Hosts page.


I’ve been trying to hunt down some further details on the actual technical mechanism that Bulk generate applicability uses, to try to understand it a bit better, but information on this seems a bit sparse and, in a couple of instances, hidden behind Red Hat subscriber paywalls.

I also enabled debug logging on the Foreman server and reran an update. This has given a bit more information in the logs for the task, but I’m not sure it points to anything that further hints at a root cause of the issue, although a more experienced eye might be able to pick something out from it:

[root@dca-foreman-al ~]# grep 85ede4c4-f8d6-4e1d-b9f9-3e954035a930 /var/log/foreman/production.log
2025-01-28T13:11:53 [I|bac|e6e80500] Task {label: , execution_plan_id: 85ede4c4-f8d6-4e1d-b9f9-3e954035a930} state changed: pending
2025-01-28T13:11:53 [D|dyn|e6e80500] ExecutionPlan 85ede4c4-f8d6-4e1d-b9f9-3e954035a930      pending >>  planning
2025-01-28T13:11:53 [I|bac|e6e80500] Task {label: Actions::Katello::Applicability::Hosts::BulkGenerate, id: 7312bdd0-87a6-4d6f-a093-8e231810e693, execution_plan_id: 85ede4c4-f8d6-4e1d-b9f9-3e954035a930} state changed: planning
2025-01-28T13:11:53 [D|dyn|e6e80500]          Step 85ede4c4-f8d6-4e1d-b9f9-3e954035a930: 1   pending >>   running in phase     Plan Actions::Katello::Applicability::Hosts::BulkGenerate
2025-01-28T13:11:53 [D|dyn|e6e80500]          Step 85ede4c4-f8d6-4e1d-b9f9-3e954035a930: 1   running >>   success in phase     Plan Actions::Katello::Applicability::Hosts::BulkGenerate
2025-01-28T13:11:53 [D|dyn|e6e80500] ExecutionPlan 85ede4c4-f8d6-4e1d-b9f9-3e954035a930     planning >>   planned
2025-01-28T13:11:53 [I|bac|e6e80500] Task {label: Actions::Katello::Applicability::Hosts::BulkGenerate, id: 7312bdd0-87a6-4d6f-a093-8e231810e693, execution_plan_id: 85ede4c4-f8d6-4e1d-b9f9-3e954035a930} state changed: planned
2025-01-28T13:11:53 [D|dyn|] ExecutionPlan 85ede4c4-f8d6-4e1d-b9f9-3e954035a930      planned >>   running
2025-01-28T13:11:53 [I|bac|e6e80500] Task {label: Actions::Katello::Applicability::Hosts::BulkGenerate, id: 7312bdd0-87a6-4d6f-a093-8e231810e693, execution_plan_id: 85ede4c4-f8d6-4e1d-b9f9-3e954035a930} state changed: running
2025-01-28T13:12:37 [D|app|03a5c0d0] Body: {"parent_task_id":null,"start_at":"2025-01-28 13:11:53 +0000","start_before":null,"external_id":"85ede4c4-f8d6-4e1d-b9f9-3e954035a930","id":"7312bdd0-87a6-4d6f-a093-8e231810e693","label":"Actions::Katello::Applicability::Hosts::BulkGenerate","pending":true,"action":"Bulk generate applicability for host dcautlfmana9.REDACTED.local","username":"foreman_admin","started_at":"2025-01-28 13:11:53 +0000","ended_at":null,"duration":"43.80548","state":"running","result":"pending","progress":0.0,"input":{"host_ids":[16],"current_request_id":"e6e80500-09f1-4cb7-ab3d-3b544fd2ca58","current_timezone":"UTC","current_organization_id":null,"current_location_id":null,"current_user_id":1},"output":{},"humanized":{"action":"Bulk generate applicability for host dcautlfmana9.REDACTED.local","input":[],"output":"","errors":[]},"cli_example":null,"can_edit":true,"can_delete":true,"available_actions":{"cancellable":false,"resumable":false},"execution_plan":{"state":"running","cancellable":false},"failed_steps":[],"running_steps":[],"help":null,"has_sub_tasks":false,"locks":[],"links":[],"username_path":"foreman_admin","dynflow_enable_console":true}
2025-01-28T13:12:42 [D|app|26648951] Body: {"parent_task_id":null,"start_at":"2025-01-28 13:11:53 +0000","start_before":null,"external_id":"85ede4c4-f8d6-4e1d-b9f9-3e954035a930","id":"7312bdd0-87a6-4d6f-a093-8e231810e693","label":"Actions::Katello::Applicability::Hosts::BulkGenerate","pending":true,"action":"Bulk generate applicability for host dcautlfmana9.REDACTED.local","username":"foreman_admin","started_at":"2025-01-28 13:11:53 +0000","ended_at":null,"duration":"48.786321","state":"running","result":"pending","progress":0.0,"input":{"host_ids":[16],"current_request_id":"e6e80500-09f1-4cb7-ab3d-3b544fd2ca58","current_timezone":"UTC","current_organization_id":null,"current_location_id":null,"current_user_id":1},"output":{},"humanized":{"action":"Bulk generate applicability for host dcautlfmana9.REDACTED.local","input":[],"output":"","errors":[]},"cli_example":null,"can_edit":true,"can_delete":true,"available_actions":{"cancellable":false,"resumable":false},"execution_plan":{"state":"running","cancellable":false},"failed_steps":[],"running_steps":[],"help":null,"has_sub_tasks":false,"locks":[],"links":[],"username_path":"foreman_admin","dynflow_enable_console":true}
2025-01-28T13:12:43 [I|app|f17065ac] Started GET "/foreman_tasks/dynflow/85ede4c4-f8d6-4e1d-b9f9-3e954035a930" for 172.31.16.164 at 2025-01-28 13:12:43 +0000

The logs there actually look like business as usual: the task is pending for a bit and then moves to running. When these stuck tasks move to running, do they run to completion?

I’m asking because, if they get stuck while running, it could mean that the task itself is just spinning (stuck in a loop, perhaps). If they get stuck in pending, then it seems like there aren’t enough Dynflow workers to pick up the applicability calculation tasks.
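
If you want a quick way to tell which of those it is without clicking through every task, something like this from foreman-rake console should group the non-stopped BulkGenerate tasks by state (just a sketch; the label string is taken from your log excerpt above):

# Shows how many of the stuck tasks are sitting in each state
ForemanTasks::Task.where(label: 'Actions::Katello::Applicability::Hosts::BulkGenerate')
                  .where.not(state: 'stopped')
                  .group(:state)
                  .count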

The only loop I found related to applicability is in app/models/katello/events/generate_host_applicability.rb, but that is the event that triggers the actions that you’re seeing get stuck.
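
On that note, those events pass through Katello’s event queue before the Dynflow task is spawned. If you want to peek at it, something like this from foreman-rake console might show whether events are backing up (a sketch; I’m assuming the Katello::Event model and its event_type column look the same in Katello 4.13):

# A large, growing count of generate_host_applicability events would suggest
# the event queue isn't draining
Katello::Event.group(:event_type).count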


I took a look at the Red Hat articles and they really just pointed to Bug #36850: Slow generate applicability for Hosts with multiple modulestreams installed - Katello - Foreman, which is fixed. One tip, by the way: I think the free (?) developer subscription should give access to the knowledgebase articles.

Hi @iballou @aruzicka

The tasks were just hanging indefinitely, for days; the oldest today were a couple of weeks old. It’s not a particularly busy server (it only serves content to around 150 hosts), so there’s not really any contention for resources most of the time. All other tasks, scheduled or otherwise, were running normally.

However, the boxes were due an outstanding upgrade from F3.11/K4.13 to F3.12/K4.14 anyway, so last week I cleared down all the stuck tasks as a precaution and performed the upgrade, in the hope that updated package code might flush something through (and I could roll back if it made anything worse rather than better). It seems to have done the trick: I’ve flushed through the checks for all the Content Hosts on the box, performed a handful more updates, and the Bulk generate applicability tasks are once again functioning normally.

Thanks again for your help.

Also, to confirm, Free Developer access unfortunately no longer gives access to the knowledgebase content; this used to be the case, but Red Hat seem to have locked it down in recent months.


Well, I’m glad to hear that things seem normal now; let us know if this occurs again. We have some good data in this thread.

Thanks for checking; that’s too bad. I use them to help answer folks’ questions, so I guess I’ll keep at it.