Katello 3.17 to 3.18, cannot migrate to pulp3

We might be getting somewhere.

I could not find a history of the orphan cleanup, so it’s running now. It cruised along for a while, but it’s now seemingly “stuck” at 63% progress. The messages log is now just a bazillion of these:

Apr 28 15:36:12 katello pulpcore-api: - - [28/Apr/2021:20:36:12 +0000] "GET /pulp/api/v3/tasks/d595448f-d74c-43a7-b66d-65c3ba28f57c/ HTTP/1.1" 200 379 "-" "OpenAPI-Generator/3.7.1/ruby"
Apr 28 15:36:28 katello pulpcore-api: - - [28/Apr/2021:20:36:28 +0000] "GET /pulp/api/v3/tasks/d595448f-d74c-43a7-b66d-65c3ba28f57c/ HTTP/1.1" 200 379 "-" "OpenAPI-Generator/3.7.1/ruby"
Apr 28 15:36:44 katello pulpcore-api: - - [28/Apr/2021:20:36:44 +0000] "GET /pulp/api/v3/tasks/d595448f-d74c-43a7-b66d-65c3ba28f57c/ HTTP/1.1" 200 379 "-" "OpenAPI-Generator/3.7.1/ruby"
Apr 28 15:37:00 katello pulpcore-api: - - [28/Apr/2021:20:37:00 +0000] "GET /pulp/api/v3/tasks/d595448f-d74c-43a7-b66d-65c3ba28f57c/ HTTP/1.1" 200 379 "-" "OpenAPI-Generator/3.7.1/ruby"
Apr 28 15:37:16 katello pulpcore-api: - - [28/Apr/2021:20:37:16 +0000] "GET /pulp/api/v3/tasks/d595448f-d74c-43a7-b66d-65c3ba28f57c/ HTTP/1.1" 200 379 "-" "OpenAPI-Generator/3.7.1/ruby"

Could there be some data inconsistencies in my Pulp data that are preventing this (the orphan cleanup), and also the more important Pulp 3 migration, from happening?

I’m open to trying various things to check the data for consistency etc. Let me know if you think I’m on the right path here. :slight_smile:

@caseybea you might want to curl the /pulp/api/v3/tasks/d595448f-d74c-43a7-b66d-65c3ba28f57c/ href too. I wonder if your Pulp 3 just isn’t running tasks which is why you’re getting stuck. If orphan cleanup doesn’t complete I’m going to guess this is the case. Your migration definitely shouldn’t be stuck at “waiting”, it should be “running”. I’ll see if the Pulp team has any recommendations for debugging stuck Pulp 3 tasks.
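A rough sketch of that check, for anyone following along. The helper name `task_state` is made up for illustration; it reads the task JSON on stdin and prints the top-level `state` field using only `grep`, `head`, and `cut`. It assumes the first `"state"` key in the response is the top-level one, which is an assumption about field order rather than anything the API promises.

```shell
#!/bin/sh
# Hypothetical helper: read Pulp 3 task JSON on stdin and print its "state".
# Assumption: the first "state" key in the document is the task's own state
# (nested progress reports, if any, come later in the response).
task_state() {
  grep -o '"state"[[:space:]]*:[[:space:]]*"[^"]*"' | head -n 1 | cut -d '"' -f 4
}
```

Piping the output of the task curl (with `-s` added, and the same `--cert` and `--key` options used in this thread) into `task_state` should print a single word such as `waiting` or `running`.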

Here’s the curl output (yes, it says waiting). Orphan cleanup is still stuck. I’ll kill it at the end of the day if it doesn’t move…

I also am including a hammer status just for good measure.

[root@katello ~]# curl https://katello.ctsi.mcw.edu:/pulp/api/v3/tasks/d595448f-d74c-43a7-b66d-65c3ba28f57c/ --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key
{"pulp_href":"/pulp/api/v3/tasks/d595448f-d74c-43a7-b66d-65c3ba28f57c/","pulp_created":"2021-04-28T20:20:43.534728Z","state":"waiting","name":"pulpcore.app.tasks.orphan.orphan_cleanup","started_at":null,"finished_at":null,"error":null,"worker":null,"parent_task":null,"child_tasks":[],"task_group":null,"progress_reports":[],"created_resources":[],"reserved_resources_record":[]}
[root@katello ~]#
[root@katello ~]#
[root@katello ~]# hammer status
Version: 2.3.3
API Version: v2
Database:
    Status: ok
    Server Response: Duration: 0ms
Plugins:
     1) Name: foreman-tasks
        Version: 3.0.5
     2) Name: foreman_remote_execution
        Version: 4.2.2
     3) Name: katello
        Version: 3.18.2.1
Smart Proxies:
     1) Name: katello.ctsi.mcw.edu
        Version: 2.3.3
        Status: ok
        Features:
             1) Name: pulp
                Version: 2.1.0
             2) Name: pulpcore
                Version: 2.1.0
             3) Name: dynflow
                Version: 0.3.0
             4) Name: ssh
                Version: 0.3.1
             5) Name: templates
                Version: 2.3.3
             6) Name: tftp
                Version: 2.3.3
             7) Name: puppetca
                Version: 2.3.3
             8) Name: puppet
                Version: 2.3.3
             9) Name: logs
                Version: 2.3.3
            10) Name: httpboot
                Version: 2.3.3
            11) Name: registration
                Version: 2.3.3
Compute Resources:

candlepin:
    Status: ok
    Server Response: Duration: 17ms
candlepin_events:
    Status: ok
    message: 2 Processed, 0 Failed
    Server Response: Duration: 0ms
candlepin_auth:
    Status: ok
    Server Response: Duration: 15ms
katello_events:
    Status: ok
    message: 1 Processed, 0 Failed
    Server Response: Duration: 0ms
pulp:
    Status: ok
    Server Response: Duration: 27ms
pulp_auth:
    Status: ok
    Server Response: Duration: 12ms
pulp3:
    Status: ok
    Server Response: Duration: 35ms
foreman_tasks:
    Status: ok
    Server Response: Duration: 2ms

[root@katello ~]#

I truly appreciate all the help! I look forward to seeing what you hear back. Thank you so much.

@iballou So just to confirm, the orphan cleanup never completed, it got stuck right away as previously noted. So at present, I cannot run that, and I cannot run the “content prepare” as we know. Let me know what you hear from the pulp team, thank you!

Just confirming where we’re at.

I did try one extra thing, which was performing a db.repairDatabase() on the Pulp database. It executed just fine, with no errors, but it did not make a difference: both the orphan cleanup and the prepare still get stuck. It was worth a shot!

@caseybea so if a task is waiting in Pulp 3, that means either something else is running and taking up all of the task time (maybe something is stuck?) or there might be something else relating to tasks orphaned in the Pulp 3 database.

To check for other running tasks, curl the following:

curl "https://$(hostname)/pulp/api/v3/tasks/?state=running" --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key

If anything comes up, we should kill it like so:

curl --request PATCH --header "Content-Type: application/json" --data '{ "state": "canceled" }' https://`hostname`/pulp/api/v3/tasks/<task_id>/   --cert /etc/pki/katello/certs/pulp-client.crt  --key /etc/pki/katello/private/pulp-client.key
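If the running-tasks query does return results, a small sketch like the following could list the task hrefs so each one can be cancelled in turn. The function name `task_hrefs` is hypothetical, and it assumes `pulp_href` appears exactly once per task in the list response.

```shell
#!/bin/sh
# Hypothetical helper: print every "pulp_href" value found in a Pulp 3
# task list response read from stdin, one href per line.
# Assumption: each task object carries a single "pulp_href" key.
task_hrefs() {
  grep -o '"pulp_href"[[:space:]]*:[[:space:]]*"[^"]*"' | cut -d '"' -f 4
}
```

Each printed href has the form `/pulp/api/v3/tasks/<task_id>/`, so it can be appended to the host name in the PATCH request above. With an empty result list the function prints nothing.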

If that doesn’t work, the last option I can think of for now would be to drop the Pulp 3 database, re-create it, and re-migrate it. There may be a simpler way, but I’ll have to check. We’ll get to that after we try checking the tasks.

booooo (not you, but my situation- ha!)

Did the curl check— and… nothing:

{"count":0,"next":null,"previous":null,"results":[]}

I of course am more than willing to try the DB drop as suggested. Let me know what you wish me to try!

(and again, thank you SO much for the continued help. I of course am hoping to resolve my own situation, but I suspect as more and more people get closer to Katello 4 and the required Pulp 3 migration, there are probably a few other folks that may end up with the same issue)

@caseybea no problem! The more issues we solve here the better for the future.

The Pulp team gave me a command to try before we drop the entire DB:

sudo systemctl stop 'pulpcore*' --all
sudo -u pulp PULP_SETTINGS='/etc/pulp/settings.py'  /usr/bin/pulpcore-manager shell -c "import pulpcore; pulpcore.app.models.ReservedResource.objects.all().delete()"
sudo systemctl restart 'pulpcore*' --all

For other people joining us, please only run the above command if you have not run the Pulp 3 switchover and are willing to have to run through the full Pulp 3 migration again. It is a dangerous operation!

Then, I’d say try running orphan cleanup because it should take much less time than the migration. If orphan cleanup doesn’t get stuck, you should be good to run the migration.

If orphan cleanup still gets stuck, here are the commands to reset the database:

sudo systemctl stop 'pulpcore*'
sudo su - postgres
dropdb pulpcore
createdb pulpcore
exit
cd /tmp
sudo -u pulp PULP_SETTINGS='/etc/pulp/settings.py' DJANGO_SETTINGS_MODULE='pulpcore.app.settings' /usr/bin/pulpcore-manager  migrate --no-input
sudo -u pulp PULP_SETTINGS='/etc/pulp/settings.py' DJANGO_SETTINGS_MODULE='pulpcore.app.settings'  /usr/bin/pulpcore-manager reset-admin-password --password <some random password>
sudo systemctl restart 'pulpcore*'

Aha! I think you have finally removed the thorn from the lion’s paw.

After the first command (the object delete one), the orphaned-delete operation completed successfully. I have now begun the “content prepare” operation. It is moving along AND I’m actually getting output.

I will report back after this too has completed and I’ve attempted the actual migration command.

To clarify for anyone else watching, I only did the first command. I did NOT have to delete the entire database as noted above “if the first command didn’t work”.

Glad to hear that pulpcore-manager shell command helped! That’s a bug that the Pulp team fixed only recently, which explains why it’s happening in 3.18. Let us know if you have any other issues.

Thanks @iballou for looking at this. I marked your comment with the steps as the solution.

OK, well, certainly a lot of progress, but not out of the woods yet.

As the content prepare got to the end of the RPMs, it failed with this error:

2021-04-29 15:36:30 -0500: Migrating rpm content to Pulp 3 erratum 2997/234938
Migration failed, You will want to investigate: https://katello.ctsi.mcw.edu/foreman_tasks/tasks/c3ace595-9285-49e9-adf3-6acac5daccdc
rake aborted!
ForemanTasks::TaskError: Task c3ace595-9285-49e9-adf3-6acac5daccdc: Katello::Errors::Pulp3Error: No declared artifact with relative path ".treeinfo" for content "<DistributionTree: pk=26685499-74b8-4337-adc6-5bb2d5325ae3>"
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.2.1/lib/katello/tasks/pulp3_migration.rake:33:in `block (2 levels) in <top (required)>'
/opt/rh/rh-ruby25/root/usr/share/gems/gems/rake-12.3.0/exe/rake:27:in `<top (required)>'
Tasks: TOP => katello:pulp3_migration
(See full trace by running task with --trace)
                                                                      [FAIL]
Failed executing foreman-rake katello:pulp3_migration, exit status 1
--------------------------------------------------------------------------------
Scenario [Prepare content for Pulp 3] failed.

The following steps ended up in failing state:

  [content-prepare]

Resolve the failed steps and rerun
the command. In case the failures are false positives,
use --whitelist="content-prepare"

The error in the task monitor was:

No declared artifact with relative path “.treeinfo” for content “<DistributionTree: pk=26685499-74b8-4337-adc6-5bb2d5325ae3>”

And here is the /var/log/messages content from the time of the abort:

Apr 29 15:36:01 katello pulpcore-api: - - [29/Apr/2021:20:36:01 +0000] "GET /pulp/api/v3/tasks/9bedaf30-97ad-417c-a16a-11a037f76ce2/ HTTP/1.1" 200 7017 "-" "OpenAPI-Generator/3.7.1/ruby"
Apr 29 15:36:01 katello pulpcore-api: - - [29/Apr/2021:20:36:01 +0000] "GET /pulp/api/v3/task-groups/211a1157-9f21-40d2-ad0e-dc89b54e91a3/ HTTP/1.1" 200 440 "-" "OpenAPI-Generator/3.7.1/ruby"
Apr 29 15:36:17 katello pulpcore-api: - - [29/Apr/2021:20:36:17 +0000] "GET /pulp/api/v3/tasks/9bedaf30-97ad-417c-a16a-11a037f76ce2/ HTTP/1.1" 200 7020 "-" "OpenAPI-Generator/3.7.1/ruby"
Apr 29 15:36:17 katello pulpcore-api: - - [29/Apr/2021:20:36:17 +0000] "GET /pulp/api/v3/task-groups/211a1157-9f21-40d2-ad0e-dc89b54e91a3/ HTTP/1.1" 200 440 "-" "OpenAPI-Generator/3.7.1/ruby"
Apr 29 15:36:25 katello pulpcore-worker-8: pulp: rq.worker:ERROR: Traceback (most recent call last):
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/rq/worker.py", line 936, in perform_job
Apr 29 15:36:25 katello pulpcore-worker-8: rv = job.perform()
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/rq/job.py", line 684, in perform
Apr 29 15:36:25 katello pulpcore-worker-8: self._result = self._execute()
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/rq/job.py", line 690, in _execute
Apr 29 15:36:25 katello pulpcore-worker-8: return self.func(*self.args, **self.kwargs)
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/tasks/migrate.py", line 81, in migrate_from_pulp2
Apr 29 15:36:25 katello pulpcore-worker-8: migrate_content(plan, skip_corrupted=skip_corrupted)
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/migration.py", line 47, in migrate_content
Apr 29 15:36:25 katello pulpcore-worker-8: plugin.migrator.migrate_content_to_pulp3(skip_corrupted=skip_corrupted)
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/plugin/rpm/migrator.py", line 145, in migrate_content_to_pulp3
Apr 29 15:36:25 katello pulpcore-worker-8: loop.run_until_complete(dm.create())
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib64/python3.6/asyncio/base_events.py", line 484, in run_until_complete
Apr 29 15:36:25 katello pulpcore-worker-8: return future.result()
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/plugin/content.py", line 89, in create
Apr 29 15:36:25 katello pulpcore-worker-8: await pipeline
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
Apr 29 15:36:25 katello pulpcore-worker-8: await asyncio.gather(*futures)
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py", line 43, in __call__
Apr 29 15:36:25 katello pulpcore-worker-8: await self.run()
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/artifact_stages.py", line 244, in run
Apr 29 15:36:25 katello pulpcore-worker-8: RemoteArtifact.objects.bulk_get_or_create(self._needed_remote_artifacts(batch))
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/artifact_stages.py", line 301, in _needed_remote_artifacts
Apr 29 15:36:25 katello pulpcore-worker-8: msg.format(rp=content_artifact.relative_path, c=d_content.content)
Apr 29 15:36:25 katello pulpcore-worker-8: ValueError: No declared artifact with relative path ".treeinfo" for content "<DistributionTree: pk=26685499-74b8-4337-adc6-5bb2d5325ae3>"
Apr 29 15:36:25 katello pulpcore-worker-8: Traceback (most recent call last):
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/rq/worker.py", line 936, in perform_job
Apr 29 15:36:25 katello pulpcore-worker-8: rv = job.perform()
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/rq/job.py", line 684, in perform
Apr 29 15:36:25 katello pulpcore-worker-8: self._result = self._execute()
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/rq/job.py", line 690, in _execute
Apr 29 15:36:25 katello pulpcore-worker-8: return self.func(*self.args, **self.kwargs)
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/tasks/migrate.py", line 81, in migrate_from_pulp2
Apr 29 15:36:25 katello pulpcore-worker-8: migrate_content(plan, skip_corrupted=skip_corrupted)
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/migration.py", line 47, in migrate_content
Apr 29 15:36:25 katello pulpcore-worker-8: plugin.migrator.migrate_content_to_pulp3(skip_corrupted=skip_corrupted)
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/plugin/rpm/migrator.py", line 145, in migrate_content_to_pulp3
Apr 29 15:36:25 katello pulpcore-worker-8: loop.run_until_complete(dm.create())
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib64/python3.6/asyncio/base_events.py", line 484, in run_until_complete
Apr 29 15:36:25 katello pulpcore-worker-8: return future.result()
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/plugin/content.py", line 89, in create
Apr 29 15:36:25 katello pulpcore-worker-8: await pipeline
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py", line 225, in create_pipeline
Apr 29 15:36:25 katello pulpcore-worker-8: await asyncio.gather(*futures)
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py", line 43, in __call__
Apr 29 15:36:25 katello pulpcore-worker-8: await self.run()
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/artifact_stages.py", line 244, in run
Apr 29 15:36:25 katello pulpcore-worker-8: RemoteArtifact.objects.bulk_get_or_create(self._needed_remote_artifacts(batch))
Apr 29 15:36:25 katello pulpcore-worker-8: File "/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/artifact_stages.py", line 301, in _needed_remote_artifacts
Apr 29 15:36:25 katello pulpcore-worker-8: msg.format(rp=content_artifact.relative_path, c=d_content.content)
Apr 29 15:36:25 katello pulpcore-worker-8: ValueError: No declared artifact with relative path ".treeinfo" for content "<DistributionTree: pk=26685499-74b8-4337-adc6-5bb2d5325ae3>"
Apr 29 15:36:26 katello pulpcore-worker-8: pulp: rq.worker:INFO: Cleaning registries for queue: 46938@katello.ctsi.mcw.edu
Apr 29 15:36:26 katello pulpcore-worker-8: pulp: rq.worker:INFO: 46938@katello.ctsi.mcw.edu: 94e7cb44-25cd-4a47-a067-3f8f241fcf12
Apr 29 15:36:26 katello pulpcore-worker-8: pulp: rq.worker:INFO: 46938@katello.ctsi.mcw.edu: Job OK (94e7cb44-25cd-4a47-a067-3f8f241fcf12)
Apr 29 15:36:34 katello pulpcore-api: - - [29/Apr/2021:20:36:34 +0000] "GET /pulp/api/v3/tasks/9bedaf30-97ad-417c-a16a-11a037f76ce2/ HTTP/1.1" 200 8979 "-" "OpenAPI-Generator/3.7.1/ruby"
Apr 29 15:36:34 katello pulpcore-api: - - [29/Apr/2021:20:36:34 +0000] "GET /pulp/api/v3/task-groups/211a1157-9f21-40d2-ad0e-dc89b54e91a3/ HTTP/1.1" 200 440 "-" "OpenAPI-Generator/3.7.1/ruby"
Apr 29 15:39:31 katello pulp: celery.beat:INFO: Scheduler: Sending due task download_deferred_content (pulp.server.controllers.repository.queue_download_deferred)
Apr 29 15:39:31 katello pulp: celery.worker.strategy:INFO: Received task: pulp.server.controllers.repository.queue_download_deferred[7a750e82-7138-43d8-983b-fe0ba643eff0]
Apr 29 15:39:31 katello pulp: celery.app.trace:INFO: [7a750e82] Task pulp.server.controllers.repository.queue_download_deferred[7a750e82-7138-43d8-983b-fe0ba643eff0] succeeded in 0.00519433498266s: None
Apr 29 15:39:31 katello pulp: celery.worker.strategy:INFO: Received task: pulp.server.controllers.repository.download_deferred[d50c4d29-3fcf-4790-8740-c05990d89db5]
Apr 29 15:39:32 katello pulp: celery.app.trace:INFO: [d50c4d29] Task pulp.server.controllers.repository.download_deferred[d50c4d29-3fcf-4790-8740-c05990d89db5] succeeded in 1.007954431s: None
Apr 29 15:39:59 katello pulpcore-worker-7: pulp: rq.worker:INFO: Cleaning registries for queue: 46941@katello.ctsi.mcw.edu
Apr 29 15:40:00 katello pulpcore-worker-6: pulp: rq.worker:INFO: Cleaning registries for queue: 46942@katello.ctsi.mcw.edu
Apr 29 15:40:00 katello pulpcore-worker-2: pulp: rq.worker:INFO: Cleaning registries for queue: 46939@katello.ctsi.mcw.edu
Apr 29 15:40:00 katello pulpcore-worker-1: pulp: rq.worker:INFO: Cleaning registries for queue: 46935@katello.ctsi.mcw.edu

Same error I have: [ContentMigration] Katello::Errors::Pulp3Error

Do you have CentOS 8 Stream repositories?

CentOS 8, yes. CentOS 8 STREAM, no.

Can you check in the Postgres database which distribution it is? Instructions are in the other thread.
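The database lookup needs the primary key from the error message. As a small sketch, assuming the message has the same `<DistributionTree: pk=...>` shape seen earlier in this thread, the hypothetical helper below pulls the pk out of any text fed to it on stdin. The actual query lives in the other thread and is not reproduced here.

```shell
#!/bin/sh
# Hypothetical helper: print the pk from the first
# "<DistributionTree: pk=...>" fragment found on stdin.
distribution_tree_pk() {
  grep -o 'DistributionTree: pk=[0-9a-f-]*' | head -n 1 | cut -d '=' -f 2
}
```

It can be fed the failed task's error text or the relevant lines from /var/log/messages.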

Note that this creates the pulpcore database using the current locale. In the installer we enforce en_US.UTF-8 and UTF-8 encoding. If your OS locale is en_US.UTF-8 then it’ll be ok but this affects the results of sorting and can cause unexpected behavior.

For example:

$ echo -e "a\nb\nc\nd\nch\nh\ni" | LC_ALL=en_US.UTF-8 sort
a
b
c
ch
d
h
i
$ echo -e "a\nb\nc\nd\nch\nh\ni" | LC_ALL=cs_CZ.UTF-8 sort
a
b
c
d
h
ch
i

I’d recommend being explicit:

createdb -E UTF-8 -l en_US.UTF-8 pulpcore
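For a database that has already been re-created, a sketch like this could confirm whether it ended up with the expected settings. The function name `is_expected_locale` is made up for illustration; it only compares two strings against the values mentioned above, and it assumes PostgreSQL reports the encoding as `UTF8`.

```shell
#!/bin/sh
# Hypothetical check: succeed only if the given collation and encoding match
# what the installer is said to enforce (en_US.UTF-8 locale, UTF-8 encoding).
# $1 = collation (e.g. en_US.UTF-8), $2 = encoding name (e.g. UTF8)
is_expected_locale() {
  collate=$1
  encoding=$2
  [ "$collate" = "en_US.UTF-8" ] || return 1
  case "$encoding" in
    UTF8|UTF-8) return 0 ;;
    *) return 1 ;;
  esac
}
```

The two values can be read from the `pg_database` catalog, for example with `sudo -u postgres psql -Atc "SELECT datcollate, pg_encoding_to_char(encoding) FROM pg_database WHERE datname = 'pulpcore';"`, and then passed to the function as its two arguments.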

@caseybea @gvde since you’re both hitting the same error, mind if we take the remaining discussion over to [ContentMigration] Katello::Errors::Pulp3Error ? I’ll be looking for potentially-related Pulp 3 bugs.

@iballou @gvde Yes I was going to propose that but you beat me to it.

My original issue, the “content prepare” operation getting just… STUCK, has been solved above. I’ll continue over in the other thread. Thank you very much.

Absolutely. Maybe you can change the subject of that topic? “Katello::Errors::Pulp3Error” is very generic and I cannot change the subject anymore.

I can’t change it either, but hopefully searching for the error you’re both seeing will lead to the thread.