Creating incremental update blew up every repo, now I have an unkillable task

Katello 4.2.1

I wanted to apply errata (for the recent polkit issue). Everything started great and, as it has done before, Katello was going to create several incremental content views for my products.

But then it blew up spectacularly (see below). Worse, I have a task that’s mid-error and I just can NOT seem to kill it and clear the locks. I am completely stuck and now cannot create new content views or anything… help?

My biggest question: how do I REALLY kill this task? I have tried ‘resume’, ‘force cancel’, all that, but it remains and is interfering with everything. The Dynflow console has nothing I can act on (no skip or kill or…).

Error message: the server returns an error
HTTP status code: 400
Response headers: {"date"=>"Fri, 28 Jan 2022 18:02:35 GMT", "server"=>"gunicorn", "content-type"=>"application/json", "vary"=>"Accept,Cookie", "allow"=>"POST, OPTIONS", "x-frame-options"=>"SAMEORIGIN", "content-length"=>"118", "correlation-id"=>"0febd80a-cde3-4e9a-be17-725a4dd32452", "access-control-expose-headers"=>"Correlation-ID", "via"=>"1.1 katello.ctsi.mcw.edu", "connection"=>"close"}
Response body: ["Version 3 does not exist for repository 'Red_Hat_Enterprise_Linux_7_Server_-Optional_RPMs_x86_64_7Server-605771'."]

Error message: the server returns an error
HTTP status code: 400
Response headers: {"date"=>"Fri, 28 Jan 2022 18:21:15 GMT", "server"=>"gunicorn", "content-type"=>"application/json", "vary"=>"Accept,Cookie", "allow"=>"POST, OPTIONS", "x-frame-options"=>"SAMEORIGIN", "content-length"=>"70", "correlation-id"=>"0febd80a-cde3-4e9a-be17-725a4dd32452", "access-control-expose-headers"=>"Correlation-ID", "via"=>"1.1 katello.ctsi.mcw.edu", "connection"=>"close"}
Response body: ["Version 7 does not exist for repository 'rocky8_appstream-362112'."]

Error message: the server returns an error
HTTP status code: 400
Response headers: {"date"=>"Fri, 28 Jan 2022 18:21:15 GMT", "server"=>"gunicorn", "content-type"=>"application/json", "vary"=>"Accept,Cookie", "allow"=>"POST, OPTIONS", "x-frame-options"=>"SAMEORIGIN", "content-length"=>"107", "correlation-id"=>"0febd80a-cde3-4e9a-be17-725a4dd32452", "access-control-expose-headers"=>"Correlation-ID", "via"=>"1.1 katello.ctsi.mcw.edu", "connection"=>"close"}
Response body: ["Version 9 does not exist for repository 'Red_Hat_Enterprise_Linux_7_Server_RPMs_x86_64_7Server-611152'."]

Error message: the server returns an error
HTTP status code: 400
Response headers: {"date"=>"Fri, 28 Jan 2022 18:21:15 GMT", "server"=>"gunicorn", "content-type"=>"application/json", "vary"=>"Accept,Cookie", "allow"=>"POST, OPTIONS", "x-frame-options"=>"SAMEORIGIN", "content-length"=>"111", "correlation-id"=>"0febd80a-cde3-4e9a-be17-725a4dd32452", "access-control-expose-headers"=>"Correlation-ID", "via"=>"1.1 katello.ctsi.mcw.edu", "connection"=>"close"}
Response body: ["Version 15 does not exist for repository 'Red_Hat_Enterprise_Linux_8_for_x86_64-_AppStream_RPMs_8-772288'."]

Error message: the server returns an error
HTTP status code: 400
Response headers: {"date"=>"Fri, 28 Jan 2022 18:21:15 GMT", "server"=>"gunicorn", "content-type"=>"application/json", "vary"=>"Accept,Cookie", "allow"=>"POST, OPTIONS", "x-frame-options"=>"SAMEORIGIN", "content-length"=>"71", "correlation-id"=>"0febd80a-cde3-4e9a-be17-725a4dd32452", "access-control-expose-headers"=>"Correlation-ID", "via"=>"1.1 katello.ctsi.mcw.edu", "connection"=>"close"}
Response body: ["Version 8 does not exist for repository 'rocky8_powertools-395486'."]
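
For anyone who wants to see at a glance which repository versions these 400 responses complain about, here is a minimal Python sketch that pulls the (repository, version) pairs out of pasted error text like the above. The function name `missing_versions` is hypothetical and not part of Katello or Pulp; the pattern only assumes the "Version N does not exist for repository '...'" wording seen in this output, and it tolerates both straight and typographic quotes as well as a line break inside a repository name.

```python
import re

# Accept straight and typographic quotes, since pasted logs often mix them.
QUOTES = "'\u2018\u2019"
PATTERN = re.compile(
    r"Version (\d+) does not exist for repository\s*"
    r"[" + QUOTES + r"]([^" + QUOTES + r"]+)[" + QUOTES + r"]"
)


def missing_versions(log_text):
    """Return a list of (repository, version) pairs found in the error text."""
    pairs = []
    for version, repo in PATTERN.findall(log_text):
        # A paste may wrap a long repository name across lines; rejoin it.
        repo = repo.replace("\r", "").replace("\n", "")
        pairs.append((repo, int(version)))
    return pairs


# Usage with one response body copied from the output above.
sample = "Response body: [\"Version 8 does not exist for repository 'rocky8_powertools-395486'.\"]"
for repo, version in missing_versions(sample):
    print(f"{repo}: version {version} is missing")
```

This only summarizes the error text; it does not talk to the server or change any task state.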

This is where I’m stuck and I have no idea how to resolve this. Tried resume, cancel, force cancel, no go.

ANY ideas to get me out of this would be appreciated

Hi @caseybea
This looks very similar to what I ran into as well :thinking:
Please try going into the Dynflow console while the job is running (after clicking resume) and then skip the Actions::Pulp3::Repository::MultiCopyUnits step. That helped me get out of the locked state :slight_smile:

Or better said, it will go into the fault state again right away, but the Dynflow console will then show the skip button :slight_smile:

A little earlier, I did see the SKIP links as shown; clicking on those did something (perhaps advanced the task a little bit?). But it’s now STILL stuck, and there are no longer ANY “clickable” items in the Dynflow console (as shown above for the “Skip”).

I am … stuck. And a bit panicked because I can no longer do anything with content, cannot create a new content view for example.

Hm maybe I needed to resume in the Foreman interface once again, but it did definitely complete then :slight_smile:
(advanced, yes, also for me ^^)

You will then have to promote the previous CV version again and remove the incomplete version that was created.

OK, so I did try “promoting” the prior CV to remove the promoted status of the half-broken one. BUT… catch-22, I cannot, as there is a LOCK (which is this half-completed process…).

In re-checking the Dynflow console, I do now surprisingly see a few SKIP items offered up. Huzzah! BUT— clicking on any of them results in a fatal blowup as shown below:

SO… Cannot remove promoted/broken CV. cannot cancel. Cannot resume. Cannot force unlock. Dynflow “Skip” button blows up.

Hopefully someone can continue to help get me out of this predicament…!

Oh, maybe that made the difference; it’s never a good idea to force unlock ^^
I could even reproduce it now: I applied for another incremental CV → got stuck → clicked on resume → got stuck again → went to the Dynflow console and skipped the aforementioned step → went back to resume → tracked in the Dynflow console that it continued → finished broken but unlocked → promoted the previous version → deleted the incremental version

Clearly the force unlock was a last resort, but it didn’t work. If anyone has any ideas I’d appreciate it. I hate having this hosed up and being unable to do anything…

OK, so I found some info on cleaning out a hung/broken task via foreman-rake and I think I’m finally out of hot water. I was then able to successfully “unpromote” the newly created (and likely hosed) incremental content views as well. I’m back to square 1, which is good.

The thing that caused this in the first place is an issue with incremental content views; a few of us all had the same issue (Unable to publish an incremental CV update - #9 by caseybea) and @iballou has confirmed a bug and so eventually there will be a fix.

Anyone else recently having this issue (attempt to create an incremental content view, things blow up) should watch that thread identified above for updates.
