It looks like the part that is failing is the Pulp status check, interestingly enough. That status endpoint is usually wide open. So, I wonder then if the CA cert is the issue:
The next thing worth checking would be to open up foreman-rake console and see what file the following returns:
::Cert::Certs.backend_ca_cert_file(:pulp3)
That should return the path to the CA cert used. You could then test:
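One way to run such a test from Ruby is sketched below. It assumes a placeholder host name (foreman.example.com) and takes the CA path as a parameter; the helper names pulp_status_client and pulp_status are illustrative and not part of Katello. The idea is to request the Pulp status endpoint while trusting only the given CA file, so that a verification failure points at the CA rather than at Pulp itself.

```ruby
require "net/http"
require "openssl"

# Hypothetical helper: build an HTTPS client that trusts only the given CA file.
def pulp_status_client(host, ca_file, port = 443)
  http = Net::HTTP.new(host, port)
  http.use_ssl = true
  http.verify_mode = OpenSSL::SSL::VERIFY_PEER
  http.ca_file = ca_file
  http
end

# Hypothetical helper: return the HTTP status code, or the TLS error message.
def pulp_status(host, ca_file)
  pulp_status_client(host, ca_file).get("/pulp/api/v3/status/").code
rescue OpenSSL::SSL::SSLError => e
  "TLS verification failed: #{e.message}"
end

# Example (host name and CA path are placeholders):
#   puts pulp_status("foreman.example.com", "/etc/foreman/proxy_ca.pem")
```

If the call reports a TLS verification failure with the path returned by the console, the CA file is a likely suspect; a plain status code suggests the CA is fine for this endpoint.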
In 4.13, Katello doesn’t explicitly state which CA should be used with Pulp. As of Katello 4.15.0, the PR above made it in and the CA cert gets explicitly set. That’s confusing my debugging with your 4.13 machine a bit.
If you’re willing to add a hack to test out a theory of mine, what happens if you add
to the very top of the method pulp3_ssl_configuration in app/models/katello/concerns/smart_proxy_extensions.rb? That should be under /usr/share/gems/gems/katello-4.13...
I will add that we now use whatever is set in the “SSL CA file” Setting by default. For me that’s /etc/foreman/proxy_ca.pem. That path is also worth trying as the cert in my hack above if the first attempt doesn’t help.
I added the line to the method pulp3_ssl_configuration in /usr/share/gems/gems/katello-4.13.1/app/models/katello/concerns/smart_proxy_extensions.rb
This does not appear to change anything as the task still fails with the same error. Changing the ca_file to /etc/foreman/proxy_ca.pem also did not change anything.
Curl of pulp/api/v3/status/ without providing any cert is successful. To curl, however, I do need to provide --noproxy; I am not sure if this is relevant for the remove orphaned content task.
That’s interesting, and it could be relevant. I suppose that means your environment is operating with an HTTP proxy set up somewhere? What happens when you don’t circumvent the proxy?
When I do not circumvent the proxy, curl fails on the cert.
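For context, whether a client skips the proxy for a given host is commonly controlled by a no_proxy list in the environment. The sketch below is a simplified illustration of that matching, not the exact rules curl or Foreman apply; the helper name bypass_proxy?, the host names, and the no_proxy value are all hypothetical.

```ruby
# Simplified illustration of no_proxy matching; real tools differ in details
# (ports, IP ranges, wildcards), so treat this as a mental model only.
def bypass_proxy?(host, no_proxy)
  return false if no_proxy.nil?

  host = host.downcase
  no_proxy.split(",").map(&:strip).reject(&:empty?).any? do |entry|
    next true if entry == "*" # "*" conventionally means "never use the proxy"

    suffix = entry.downcase.sub(/\A\./, "") # ignore a leading dot
    host == suffix || host.end_with?(".#{suffix}")
  end
end

# Hypothetical values: the Foreman host is covered by the list, the mirror is not.
no_proxy = "localhost,.example.com"
puts bypass_proxy?("foreman.example.com", no_proxy) # true
puts bypass_proxy?("mirror.example.org", no_proxy)  # false
```

A host that is not covered by the list goes through the proxy, which is where a certificate presented by the proxy (rather than by the server) can cause verification to fail.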
I looked into the proxy as a cause a bit further and ended up adjusting the proxy config in /etc/profile
As expected the curl test succeeded.
After restarting Foreman the task no longer instantly fails on a certificate error.
We do however get a different error and the cleanup orphaned content task enters the paused state:
["The repository version cannot be deleted because it (or its publications) are currently being used to distribute content. Please update the necessary distributions first."]
We have cancelled the task and will look into this further ourselves first, since the original issue of the cert error has been resolved for now.
Essentially, it’s triggered by Katello trying to clean up repositories that are being distributed unexpectedly. Pulp stopped allowing users to delete repository versions that are still in use by a distribution – the distribution now must be deleted instead.
To work around it, the distribution that is distributing the content of the repository version that failed to be deleted needs to be deleted itself. I would expect this cleanup to be a one time thing – Katello shouldn’t be making a habit of leaving behind orphaned distributions in Pulp.
Another thing to try: do a ‘complete sync’ of your smart proxy. It seems this bug may be caused by the smart proxy distributing old repository versions for whatever reason. It might be worth checking old smart proxy sync tasks to see if any RefreshDistribution steps were skipped in Dynflow.
I have checked the bug you linked and will follow it further.
In the meantime we will trigger complete syncs of repositories to see if this resolves the issue. We do however also have some EOL repos like CentOS which we won’t be able to sync.
This issue is on the Foreman server itself; we do not have a second Smart Proxy configured in the environment.
Ah right, this issue has so far manifested primarily on external smart proxies, so I forgot you have only one.
In that case, if the complete syncs don’t help, the first thing I’d recommend discovering is what repository version Katello is trying to delete that Pulp fails on. What we want to find is the pulp_href for it (that looks like /pulp/api/v3/repositories/container/container/019447d7-54a4-7ac6-b9ff-4401d96d771d/versions/3/). I’m not certain if the target version for deletion gets logged by Pulp in /var/log/messages around the error. If not, it may be visible in the Dynflow console for the failing orphan deletion task.
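To make the shape of that pulp_href concrete, here is a small sketch that pulls repository version hrefs out of arbitrary text (for example a log excerpt or copied Dynflow output) and splits one into its parts. The regular expression is inferred from the example href above and is an assumption about the general format; the helper names are hypothetical.

```ruby
# Pattern inferred from the example href; assumed, not taken from Pulp's docs.
VERSION_HREF = %r{/pulp/api/v3/repositories/(?<plugin>[^/\s]+)/(?<type>[^/\s]+)/(?<id>[0-9a-f-]{36})/versions/(?<number>\d+)/}

# Return every distinct repository version href found in the given text.
def version_hrefs_in(text)
  hrefs = []
  text.scan(VERSION_HREF) { hrefs << Regexp.last_match[0] }
  hrefs.uniq
end

# Split one href into the repository href, plugin name and version number.
def parse_version_href(href)
  m = VERSION_HREF.match(href)
  return nil unless m

  {
    repository_href: href.sub(%r{versions/\d+/\z}, ""),
    plugin: m[:plugin],
    number: m[:number].to_i
  }
end

example = "/pulp/api/v3/repositories/container/container/019447d7-54a4-7ac6-b9ff-4401d96d771d/versions/3/"
p parse_version_href(example)
```

Running version_hrefs_in over a saved excerpt of the log around the error would give a short list of candidate versions to look up in Pulp.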
If that is difficult, the next thing I would ask for would be:
That will generate the list of repository versions that Katello is trying to delete. Cleaning up this issue may involve going through those and seeing which ones are attached to Publications / Distributions in Pulp.
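The cross-check described here can be sketched as follows, assuming the publication and distribution lists have already been fetched from Pulp and parsed into arrays of hashes. The field names ("pulp_href", "repository_version", "publication") are assumptions about the JSON Pulp returns and may differ by plugin; the helper name and the sample hrefs are hypothetical.

```ruby
# Field names are assumptions about Pulp's JSON; verify them against the API
# of the plugin you are working with before relying on this.
def distributions_using(version_href, publications, distributions)
  # Publications that were built from the version in question
  pub_hrefs = publications
              .select { |pub| pub["repository_version"] == version_href }
              .map { |pub| pub["pulp_href"] }

  # Distributions pointing at the version directly or via such a publication
  distributions.select do |dist|
    dist["repository_version"] == version_href ||
      pub_hrefs.include?(dist["publication"])
  end
end

# Hypothetical sample data, shortened for readability
version = "/pulp/api/v3/repositories/rpm/rpm/aaa/versions/2/"
pubs  = [{ "pulp_href" => "/pulp/api/v3/publications/rpm/rpm/p1/", "repository_version" => version }]
dists = [
  { "name" => "old-dist", "publication" => "/pulp/api/v3/publications/rpm/rpm/p1/" },
  { "name" => "other",    "publication" => "/pulp/api/v3/publications/rpm/rpm/p2/" }
]
puts distributions_using(version, pubs, dists).map { |d| d["name"] }.inspect # ["old-dist"]
```

Any distribution returned for a version that Katello wants to delete is a candidate for the cleanup discussed earlier in the thread.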