Actions::Katello::CapsuleContent::Sync waiting for pulp to start the task

Problem:
Syncing content to capsules does not start at all; Dynflow only shows ‘waiting for Pulp to start the task’.

pulp task list with --state=waiting or --state=running shows nothing (0); --state=completed does show local tasks (like orphan cleanup and delete_version), but no sync tasks.

Expected outcome:
To sync or not to sync, that’s the question…

Foreman and Proxy versions:
2.4.1.1 (central and capsule)

Foreman and Proxy plugin versions:
katello 4.0.3-1

Distribution and version:
CentOS Linux release 7.9.2009

Other relevant data:

I used all the suggestions from the thread "Tasks stuck with 'waiting for Pulp to start the task' - foreman-2.4 / Katello 4.0.0 / Pulp3" (post #4 by iballou), and also ran foreman-installer on both the Foreman server and the proxies.

  • hammer ping: ok
  • stopped all running (but waiting for pulp) tasks via foreman-rake console find_task=‘id…’.destroy
  • canceled all (very old) waiting pulp tasks (2 months old)
  • foreman-rake cleanup
  • foreman-installer

pulp worker list shows 22667 entries, some dating back a very long time (with a last heartbeat that’s also very old)
pulp worker list --missing shows 22621 entries… yikes
pulp worker list --online shows only 5 workers, which should be ok.

Hope you can help
Regards, Eize Speerstra

Hi @Speertje,

Is your stuck smart proxy also running Pulp 3? Is “Pulpcore” in the “Active features”? Are there Pulp3-related tasks in the smart proxy sync Dynflow console?

It would be good to see the following on the smart proxy:

sudo pip3 list | grep pulp

If it is running Pulp 3, then perhaps your stuck workers are on the smart proxy instead of the main Katello server. Try running this from the other thread on the smart proxy:

sudo -u pulp PULP_SETTINGS='/etc/pulp/settings.py' DJANGO_SETTINGS_MODULE='pulpcore.app.settings' pulpcore-manager shell <<EOF
from pulpcore.app.models import ReservedResource, Worker
# Map each worker ID to a resource it has reserved
worker_to_res = {}
for rr in ReservedResource.objects.all():
  worker_to_res[rr.worker_id] = rr.pulp_id
# IDs of the workers Pulp currently considers online
workers = [w.pulp_id for w in Worker.objects.online_workers()]
# Report reservations held by workers that are no longer online
for rwork in worker_to_res:
  if rwork not in workers:
    print(f'Worker {rwork} owns ReservedResource {worker_to_res[rwork]} and is not in online_workers!!')
EOF
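The check in the snippet above is essentially a lookup of reservations whose owning worker is not in the online set. Here is a minimal sketch of that logic in plain Python, using made-up IDs instead of the Pulp models. The function name `stalled_owners` and the sample data are hypothetical and not part of Pulp. Unlike the dictionary in the snippet above, it keeps every resource per worker rather than only the last one seen.

```python
from collections import defaultdict


def stalled_owners(reservations, online_worker_ids):
    """Map each offline worker ID to the list of resource IDs it still holds.

    reservations: iterable of (worker_id, resource_id) pairs
    online_worker_ids: iterable of worker IDs currently considered online
    """
    online = set(online_worker_ids)
    held = defaultdict(list)
    for worker_id, resource_id in reservations:
        if worker_id not in online:
            held[worker_id].append(resource_id)
    return dict(held)


if __name__ == "__main__":
    # Hypothetical sample data: w1 is online, w2 is not.
    sample = [("w1", "r1"), ("w2", "r2"), ("w2", "r3")]
    for worker, resources in stalled_owners(sample, ["w1"]).items():
        print(f"Worker {worker} owns {resources} and is not online")
```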

If Pulpcore is older than 3.14, then the old tasking system is in use. That one was susceptible to having stalled workers.

Hi Ian,

Thanks for the reply. Pulpcore was indeed on 3.9 (Foreman 2.4).
I’ve just upgraded to Foreman 2.5 (Pulpcore 3.14).

Using your command line before the upgrade I saw 13 workers owning resources and not in online_workers on both my proxies.

After the upgrade (which I had hoped would fix it) I still see the 13 stuck workers.

Pulp can however start new synchronization tasks, so at least it looks like I can sync all content-views now.

2 remaining questions:

  • Should I clear the stuck workers, and if so, how?
  • How can I use ‘pulp task list’ on the proxies to monitor tasks? There is no Pulp key/cert on the proxies to put in .config/pulp/cli.toml.

Thanks again!

Hi @Speertje,

Did you also upgrade Pulp 3 on your smart proxies? If you did, do you see USE_NEW_WORKER_TYPE = True in /etc/pulp/settings.py?

If you want to use pulp-cli against your proxy, you have a couple of options. You can copy the Katello certs from your Katello machine over to your smart proxy and use pulp-cli from there. Or you can configure a pulp-cli profile on your Katello machine that talks to your smart proxy. That way you wouldn’t need to copy certs.

I did this, and it looked like the following:

[cli]
base_url = "https://centos7-katello-devel.example.com"
cert = "/etc/pki/katello/certs/pulp-client.crt"
key = "/etc/pki/katello/private/pulp-client.key"

[cli-sandbox]
base_url = "https://pulp3-sandbox-centos7.example.com"
cert = "/etc/pki/katello/certs/pulp-client.crt"
key = "/etc/pki/katello/private/pulp-client.key"
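As far as I know, a section named `[cli-sandbox]` is selected with `pulp --profile sandbox`, and a one-off target can be given with `--base-url` instead. Here is a small sketch of how a script might build either command line. The helper name `pulp_argv` is hypothetical, and the sketch only assembles the arguments; it does not run pulp-cli.

```python
import shlex


def pulp_argv(subcommand, profile=None, base_url=None):
    """Build an argument list for pulp-cli.

    subcommand: list of words such as ["worker", "list", "--online"]
    profile:    optional profile name (assumed to map to a [cli-<name>] section)
    base_url:   optional URL overriding the configured base_url
    """
    argv = ["pulp"]
    if profile:
        argv += ["--profile", profile]
    if base_url:
        argv += ["--base-url", base_url]
    return argv + list(subcommand)


# Print the command that would target the "sandbox" profile shown above.
print(shlex.join(pulp_argv(["worker", "list", "--online"], profile="sandbox")))
```

The resulting list can be passed to `subprocess.run` on the Katello machine, where the client cert and key from the profile are available.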

Hi Ian,

Yes, the proxies are on Pulp 3 too.
This line is present in /etc/pulp/settings.py:
USE_NEW_WORKER_TYPE = True

Syncing works just fine now.
It’s now just a matter of deleting the ‘old’ workers or any resource locks they might still hold.

I also figured out that I can pass --base-url to the pulp command, so there is no need to alter the cli.toml file.

Kind regards,

Eize Speerstra
DevOps Engineer

https://www.kpn.com/

+316 20 49 31 79
eize.speerstra@kpn.com

B2BS DSP Tooling&Monitoring Services I
Maastricht-De Colonel

Team mail: dsp-ts@kpn.com

The Linux Community on TeamKPN: https://teamkpn.kpnnet.org/group/detail/groep-the-linux-community
The Linux Community on Microsoft Teams: https://teams.microsoft.com/l/team/19%3A486307bdd51545c4925aa69b2aa2dc2f%40thread.skype/conversations?groupId=d985429b-37fc-4efc-86c6-ce47664d67ae&tenantId=d7790549-8c35-40ea-ad75-954ac3e86be8

The information transmitted is intended only for use by the addressee and may contain confidential and/or privileged material. Any review, re-transmission, dissemination or other use of it, or the taking of any action in reliance upon this information by persons and/or entities other than the intended recipient is prohibited. If you received this in error, please inform the sender and/or addressee immediately and delete the material. Thank you.

Glad your syncing is working. I’ll ask the Pulp team about cleaning up the old missing workers.

The old workers should be getting cleaned up if you’re on Pulpcore 3.14. Do you mind pasting what you are seeing regarding the workers?

Hi Ian,

I used:

pulp worker list --not-online
pulp worker list --missing
On the central node, both commands returned 0.

But using
pulp worker list --base-url https://proxy_1 --not-online → 4658 ( --online → 4 )
pulp worker list --base-url https://proxy_1 --missing → 4652 ( --not-missing → 12)

pulp worker list --base-url https://proxy_2 --not-online → 4612 ( --online → 5 )
pulp worker list --base-url https://proxy_2 --missing → 4602 ( --not-missing → 14)
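Collecting these counts by hand for each proxy gets tedious, so here is a rough sketch of how it could be scripted. It assumes pulp-cli prints the worker list as a JSON array on stdout, and that client credentials are already configured. The function names and the injectable `run` parameter are hypothetical. Depending on the pulp-cli version, list output may be paginated, in which case the counts would be capped unless a larger limit is requested.

```python
import json
import subprocess


def count_workers(base_url, flag, run=subprocess.run):
    """Count entries printed by `pulp --base-url URL worker list FLAG`.

    Assumes the command prints a JSON array; `run` can be replaced
    with a stub so the function can be exercised without pulp-cli.
    """
    result = run(
        ["pulp", "--base-url", base_url, "worker", "list", flag],
        capture_output=True,
        text=True,
        check=True,
    )
    return len(json.loads(result.stdout))


def summarize(base_urls, flags=("--online", "--missing"), run=subprocess.run):
    """Return {base_url: {flag: count}} for every URL and flag given."""
    return {
        url: {flag: count_workers(url, flag, run) for flag in flags}
        for url in base_urls
    }
```

Calling `summarize(["https://proxy_1", "https://proxy_2"])` on the central node would then return one dictionary of counts per proxy.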

Kind regards,

Eize Speerstra

Thanks for the info. And, just to be sure, the Smart Proxies also have Pulpcore 3.14 and not an older version?

Hi Ian,

Indeed all three are running python3-pulpcore.noarch 3.14.8-2.el7 @pulpcore

Kind regards,

Eize Speerstra

We’ve made the change in upstream Pulp to clean up dead workers after 7 days (Issue #8988: `pulpcore-worker` startup should remove old worker records). @iballou, please file an issue if you’d like it backported.
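To illustrate the idea of a 7-day cutoff, here is a minimal sketch that picks out worker records whose last heartbeat is older than a given age. It is not Pulp's actual implementation. The function name `stale_workers` and the dictionary keys `name` and `last_heartbeat` are assumptions chosen for the example.

```python
from datetime import datetime, timedelta, timezone


def stale_workers(workers, now=None, max_age=timedelta(days=7)):
    """Return the names of workers whose last heartbeat is older than max_age.

    workers: iterable of dicts with "name" and a timezone-aware
             "last_heartbeat" datetime (hypothetical record layout)
    now:     reference time; defaults to the current UTC time
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    return [w["name"] for w in workers if w["last_heartbeat"] < cutoff]
```

A cleanup job following this idea would delete only the records returned here and leave recently seen workers untouched.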

@Speertje do you see the missing workers causing any functional problems besides them just existing?

Hi Ian,

No, there are no issues, everything functions normally. As long as it’s not blocking anything or taking up resources, I’m fine with it.

Kind regards,

Eize Speerstra

Okay, thanks @Speertje. We’re leaning towards not backporting that change since the functionality of Pulp isn’t affected.