"Required lock is already taken by other running tasks" error after forcefully cancelling sync task on a repo

Problem:
After forcefully canceling a hanging repo sync task (maybe not the best idea), whenever I try to interact with that repo in Foreman, I receive a “Required lock is already taken by other running tasks” error message with a link to the task that I forcefully canceled. The linked task is in a stopped state, not running. I assume the error is caused by orphaned lock files, but am unsure of the process I need to follow to clear the lock files.

Expected outcome:
Lock files are cleared and I am able to interact with the repo again, say to re-initiate a sync process.

Foreman and Proxy versions:
Foreman 2.5.1

Foreman and Proxy plugin versions:

Distribution and version:
CentOS Linux release 7.9.2009 (Core)

Other relevant data:
In the Dynflow console for the forcefully cancelled task, I see:
[…]
Actions::Pulp3::Repository::RefreshDistribution (error) [ 391.88s / 179.80s ]
[…]
Actions::Katello::Repository::FetchPxeFiles (pending)

In my searches, I’ve seen the “foreman-rake foreman_tasks:cleanup” command mentioned. Is this the tool I should use? For example, would the following be useful in cleaning up any orphaned lock files associated with these actions?
foreman-rake foreman_tasks:cleanup TASK_SEARCH='label = Actions::Pulp3::Repository::RefreshDistribution' STATES='stopped' VERBOSE=true [NOOP=true]
foreman-rake foreman_tasks:cleanup TASK_SEARCH='label = Actions::Katello::Repository::FetchPxeFiles' STATES='stopped' VERBOSE=true [NOOP=true]

Or is there a more general command to use?

Thanks for any pointers!
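
For anyone wanting to try the cleanup commands from the question, here is a minimal sketch that only builds the command line and defaults to NOOP=true, so nothing is deleted until you have reviewed it. The helper name cleanup_cmd is made up for illustration. The label is also an assumption: as far as I know, TASK_SEARCH matches the label of the top-level task (assumed here to be Actions::Katello::Repository::Sync) rather than the Dynflow sub-steps listed above, so check the real label under Monitor > Tasks first.

```shell
#!/bin/sh
# Build a foreman_tasks:cleanup command line as text (nothing is executed here).
# $1: task label to search for (assumption: the label of the top-level task)
# $2: value for NOOP; defaults to "true" so the printed command is a dry run
cleanup_cmd() {
    label=$1
    noop=${2:-true}
    printf "foreman-rake foreman_tasks:cleanup TASK_SEARCH='label = %s' STATES='stopped' VERBOSE=true NOOP=%s\n" "$label" "$noop"
}

# Print the dry-run command for review; copy and run it on the Foreman server.
cleanup_cmd "Actions::Katello::Repository::Sync"
```

Once the dry-run output lists only the tasks you expect, passing "false" as the second argument prints the same command with NOOP=false.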

Hi @cbcbcb,

Let’s try restarting all the services, including foreman-tasks/Dynflow, and see if that clears the lock. Can you run foreman-maintain service restart and see if that helps? If it does not, the next steps would be to try to clear the task state via the Dynflow console and then use pulp-cli to make sure there are no hung tasks.
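
A minimal sketch of that first step, assuming it is run as root on the Foreman server. The run helper and the DRY_RUN variable are made up for illustration: by default the script only prints the commands, and it executes them only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Print each command; execute it only when DRY_RUN=0.
DRY_RUN=${DRY_RUN:-1}

run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# Restart all Foreman services, then report their status.
# The status check is skipped if the restart fails.
restart_and_check() {
    run foreman-maintain service restart &&
    run foreman-maintain service status
}

restart_and_check
```

After the restart, retry the action that raised the lock error and check whether it still links to the same task.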


Hi @cintrix84,
Thanks for your time and suggestion. I appreciate your response.

I ran the foreman-maintain command to restart Foreman this morning. The restart was successful. I then tried to disable the repo in question but again received the “Required lock is already taken by other running tasks” error. However, the task linked to in the error message was different than the one previously linked to. So that’s good.

The task currently blocking further actions is a repo sync that kicked off late last night as part of the sync plan. I’ve gone ahead and removed that sync plan from the product that contains this repo to avoid this happening again.

The sync task is currently in a “paused” state. The Dynflow console is showing:
[…]
20: Actions::Pulp3::Repository::RefreshDistribution (error) [ 391.88s / 179.80s ] [Skip]
[…]
29: Actions::Katello::Repository::FetchPxeFiles (pending)

I tried to use the skip link above to see if the task would continue, but it led to an exception page:

RuntimeError at /foreman_tasks/dynflow/64b7c476-55ef-481f-8c45-40d866b51fa4/skip/20
Skipping step in skipped is not supported

The buttons for “Resume”, “Unlock”, and “Force Unlock” are enabled. Is there a recommended path forward to resume paused tasks?

Thanks, all.

Hi @cbcbcb

Would you be free for a remote session tomorrow? I am in the EST time zone.

I am free any time before 2 pm EST.

We can use Microsoft Teams, Google Hangouts, or BlueJeans. Let me know what works. We have a remote support tool called Bomgar for screen sharing and remote control.

Thanks for the guidance. Clicking “Resume” worked in that the task is now showing as “100% complete”, although it has “Task canceled” in the errors box. In the Dynflow console, I now see the following for the RefreshDistribution and FetchPxeFiles subtasks:

[...]
Actions::Pulp3::Repository::RefreshDistribution (success) [ 567684.39s / 180.15s ]
[...]
Actions::Katello::Repository::FetchPxeFiles (success) [ 0.07s / 0.07s ]

In the “Locks” tab, everything is unlocked for this task. So that seems like progress.

I then went back and tried to disable the repo and didn’t receive a lock error! Instead, it notified me that the repo was still in use by some content views.

I started to remove the content view versions referencing this repo. I was successful in removing many versions. However, now I’m running into a different error. I’ll start a new thread about that error. I did also try to re-sync the affected repo and was able to sync it successfully.

So, in short, in my case, I think the restart via “foreman-maintain service restart” cleared the original orphaned lock, and clicking “Resume” on the paused sync task released the remaining one. Thank you all for your guidance!

@cintrix84, thank you for the offer. I don’t think that’s needed at this time.

@cintrix84 If a restart does not clear the locks, what would be the next step? We have a task that shows as “stopped” with an error status in Monitor; it was cancelled (not paused in the UI). However, nothing can happen now, because everything (repo syncs, CV deletes) conflicts with the stopped task. The stopped task was a CV publish. Postgres shows the task as stopped, but the Dynflow console seems to say “paused”. I just want to clear this so I can upgrade to 2.5.

Jim
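
One read-only way to see which tasks still hold locks is to query the database directly. This is only a sketch: the table and column names (foreman_tasks_locks, foreman_tasks_tasks) are assumptions based on the foreman-tasks schema as I understand it and may differ between versions, and the helper name locks_query is made up. The script just prints the SQL and changes nothing.

```shell
#!/bin/sh
# Print a read-only SQL query listing tasks that still hold locks.
# Assumption: table and column names follow the foreman-tasks schema.
locks_query() {
    cat <<'SQL'
SELECT t.id, t.label, t.state, t.result, l.resource_type, l.resource_id
FROM foreman_tasks_locks l
JOIN foreman_tasks_tasks t ON t.id = l.task_id
ORDER BY t.started_at;
SQL
}

locks_query
# To run it for real (assumes the database is named "foreman"):
#   locks_query | sudo -u postgres psql foreman
```

If a task listed here shows as stopped, that would match the mismatch described above between the task state and the Dynflow console.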