Katello throws "File already exists" on publication

Problem:
We have one repository which throws the error:

"response"=>
  {"task"=>"/pulp/api/v3/tasks/01969f3f-040d-77df-97ad-42a2f286c7a6/"},
 "pulp_tasks"=>
  [{"pulp_href"=>"/pulp/api/v3/tasks/01969f3f-040d-77df-97ad-42a2f286c7a6/",
    "pulp_created"=>"2025-05-05T07:00:29.326+00:00",
    "pulp_last_updated"=>"2025-05-05T07:00:29.326+00:00",
    "state"=>"failed",
    "name"=>"pulp_rpm.app.tasks.publishing.publish",
    "logging_cid"=>"2126b05a-416d-4812-b31b-64e01fb82e96",
    "created_by"=>"/pulp/api/v3/users/2/",
    "unblocked_at"=>"2025-05-05T07:00:29.342+00:00",
    "started_at"=>"2025-05-05T07:00:29.391+00:00",
    "finished_at"=>"2025-05-05T07:01:20.407+00:00",
    "error"=>
     {"traceback"=>
       "  File \"/usr/lib/python3.11/site-packages/pulpcore/tasking/tasks.py\", line 66, in _execute_task\n" +
       "    result = func(*args, **kwargs)\n" +
       "             ^^^^^^^^^^^^^^^^^^^^^\n" +
       "  File \"/usr/lib/python3.11/site-packages/pulp_rpm/app/tasks/publishing.py\", line 385, in publish\n" +
       "    generate_repo_metadata(\n" +
       "  File \"/usr/lib/python3.11/site-packages/pulp_rpm/app/tasks/publishing.py\", line 584, in generate_repo_metadata\n" +
       "    upd_xml = cr.UpdateInfoXmlFile(upd_xml_path, compressiontype=cr_compression_type)\n" +
       "              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n" +
       "  File \"/usr/lib64/python3.11/site-packages/createrepo_c/__init__.py\", line 306, in __init__\n" +
       "    XmlFile.__init__(self, path, XMLFILE_UPDATEINFO,\n",
      "description"=>"File already exists"},
    "worker"=>"/pulp/api/v3/workers/01967f9a-472b-7ca1-8f70-d85bd3087ce1/",
    "child_tasks"=>[],
    "progress_reports"=>
     [{"message"=>"Generating repository metadata",
       "code"=>"publish.generating_metadata",
       "state"=>"failed",
       "total"=>1,
       "done"=>0}],
    "created_resources"=>[],
    "reserved_resources_record"=>
     ["shared:/pulp/api/v3/repositories/rpm/rpm/0191e152-5f17-7cd9-a8d4-8ede25437446/",
      "shared:/pulp/api/v3/domains/01914b7a-8bfd-7257-8d41-150489261881/"]}],
 "task_groups"=>[],
 "poll_attempts"=>{"total"=>22, "failed"=>1}}

We tried:

  • complete sync
  • metadata recreation

This also causes the orphan cleanup to fail.

Expected outcome:
A new version of the repository can be published.

Foreman and Proxy versions:
Foreman and Proxies 3.12.1
Katello 4.14.3 (No Capsules/Smart Proxies with Katello)

Foreman and Proxy plugin versions:

  • candlepin-4.4.20-1.el9.noarch
  • candlepin-selinux-4.4.20-1.el9.noarch
  • dynflow-utils-1.6.3-1.el9.x86_64
  • foreman-3.12.1-1.el9.noarch
  • foreman-cli-3.12.1-1.el9.noarch
  • foreman-dynflow-sidekiq-3.12.1-1.el9.noarch
  • foreman-installer-3.12.1-1.el9.noarch
  • foreman-installer-katello-3.12.1-1.el9.noarch
  • foreman-obsolete-packages-1.10-1.el9.noarch
  • foreman-postgresql-3.12.1-1.el9.noarch
  • foreman-proxy-3.12.1-1.el9.noarch
  • foreman-redis-3.12.1-1.el9.noarch
  • foreman-release-3.12.1-1.el9.noarch
  • foreman-selinux-3.12.1-1.el9.noarch
  • foreman-service-3.12.1-1.el9.noarch
  • foreman-vmware-3.12.1-1.el9.noarch
  • katello-4.14.3-1.el9.noarch
  • katello-certs-tools-2.10.0-1.el9.noarch
  • katello-client-bootstrap-1.7.9-2.el9.noarch
  • katello-common-4.14.3-1.el9.noarch
  • katello-host-tools-4.4.0-2.el9.noarch
  • katello-host-tools-tracer-4.4.0-2.el9.noarch
  • katello-repos-4.14.3-1.el9.noarch
  • katello-selinux-5.0.2-1.el9.noarch
  • python3.11-pulp-ansible-0.21.8-1.el9.noarch
  • python3.11-pulp-cli-0.29.2-2.el9.noarch
  • python3.11-pulp-container-2.20.3-1.el9.noarch
  • python3.11-pulp-deb-3.2.1-1.el9.noarch
  • python3.11-pulp-glue-0.29.2-2.el9.noarch
  • python3.11-pulp-python-3.11.3-1.el9.noarch
  • python3.11-pulp-rpm-3.26.1-1.el9.noarch
  • python3.11-pulpcore-3.49.22-1.el9.noarch
  • rubygem-dynflow-1.9.0-1.el9.noarch
  • rubygem-foreman-tasks-9.2.3-1.fm3_12.el9.noarch
  • rubygem-foreman_ansible-14.2.1-1.fm3_12.el9.noarch
  • rubygem-foreman_dhcp_browser-0.0.8-6.fm3_10.el9.noarch
  • rubygem-foreman_discovery-24.0.2-1.fm3_12.el9.noarch
  • rubygem-foreman_dlm-3.0.0-1.fm3_11.el9.noarch
  • rubygem-foreman_expire_hosts-8.2.0-2.fm3_12.el9.noarch
  • rubygem-foreman_leapp-1.2.1-2.fm3_11.el9.noarch
  • rubygem-foreman_maintain-1.7.6-1.el9.noarch
  • rubygem-foreman_monitoring-3.2.0-2.fm3_12.el9.noarch
  • rubygem-foreman_puppet-7.0.0-2.fm3_12.el9.noarch
  • rubygem-foreman_remote_execution-13.2.5-1.fm3_12.el9.noarch
  • rubygem-foreman_rescue-4.0.1-1.fm3_11.el9.noarch
  • rubygem-foreman_scc_manager-3.1.1-1.fm3_12.el9.noarch
  • rubygem-foreman_snapshot_management-3.0.1-1.fm3_11.el9.noarch
  • rubygem-foreman_statistics-2.1.0-3.fm3_11.el9.noarch
  • rubygem-foreman_supervisory_authority-0.1.1-1.el9.noarch
  • rubygem-foreman_templates-9.5.1-1.fm3_12.el9.noarch
  • rubygem-foreman_vault-2.0.0-1.fm3_11.el9.noarch
  • rubygem-foreman_virt_who_configure-0.5.23-1.fm3_12.el9.noarch
  • rubygem-foreman_webhooks-3.2.3-1.fm3_12.el9.noarch
  • rubygem-foreman_wreckingball-5.0.0-1.fm3_11.el9.noarch
  • rubygem-hammer_cli-3.12.0-1.el9.noarch
  • rubygem-hammer_cli_foreman-3.12.0-1.el9.noarch
  • rubygem-hammer_cli_foreman_remote_execution-0.3.0-1.el9.noarch
  • rubygem-hammer_cli_foreman_ssh-0.0.3-1.el9.noarch
  • rubygem-hammer_cli_foreman_tasks-0.0.21-1.fm3_11.el9.noarch
  • rubygem-hammer_cli_foreman_virt_who_configure-0.1.1-1.fm3_12.el9.noarch
  • rubygem-hammer_cli_katello-1.14.3-1.el9.noarch
  • rubygem-katello-4.14.3-1.el9.noarch
  • rubygem-pulp_ansible_client-0.21.7-1.el9.noarch
  • rubygem-pulp_certguard_client-3.49.17-1.el9.noarch
  • rubygem-pulp_container_client-2.20.2-1.el9.noarch
  • rubygem-pulp_deb_client-3.2.1-1.el9.noarch
  • rubygem-pulp_file_client-3.49.17-1.el9.noarch
  • rubygem-pulp_ostree_client-2.3.2-1.el9.noarch
  • rubygem-pulp_python_client-3.11.2-1.el9.noarch
  • rubygem-pulp_rpm_client-3.26.1-1.el9.noarch
  • rubygem-pulpcore_client-3.49.17-1.el9.noarch
  • rubygem-puppetdb_foreman-6.0.2-1.fm3_10.el9.noarch
  • rubygem-smart_proxy_pulp-3.3.0-1.el9.noarch

Distribution and version:
RHEL 9.5

Hi @ochnerd,

Thanks for your post.

We’d like to get some more information about your server for debugging.
Has this repository ever published successfully? Does the issue affect only this one repository, or every repository?
Is there anything special about the repository that fails to publish? What changes, if any, were made since it last published?
Do you think the issue is reproducible? If you can provide the steps to recreate it, that would help speed up the troubleshooting.

Thank you.

Hi @lfu,

It is only this repository that has a problem. Every other repository works completely fine.
The Remove orphans job fails with the error:

The repository version cannot be deleted because it (or its publications) are currently being used to distribute content. Please update the necessary distributions first.

And I see the following in /var/log/messages:

/var/log/messages:May  5 09:30:16 infra-tfmaio-01 pulpcore-api[2747386]: pulp [51f86280dac74000ba3fe3e03db1c1e3]: django.request:WARNING: Bad Request: /pulp/api/v3/repositories/rpm/rpm/0191e152-5f17-7cd9-a8d4-8ede25437446/versions/91/

Via pulp-cli I checked which repository is the problem, and it is the same one that throws the File already exists error.
It is possible that I resumed a failed sync and that this left the repository in this state.
The orphan cleanup failure led me to the following Satellite ticket: https://issues.redhat.com/browse/SAT-31400.
But I could not find out how to fix the error.

We hit the same problem. The upstream repomd.xml lists two <data type="updateinfo"> entries, each pointing to a different errata file, but one of the type attributes is prefixed with a hash. DNF silently skips the hashed entry, whereas Pulp parses every data‑type and fails on the duplicate. It might be worth checking if the repomd.xml of your repository has similar issues.

For what it’s worth, current pulp‑rpm releases no longer stumble over this.
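
A minimal sketch of such a check, assuming the repomd.xml has already been downloaded and that the "hash-prefixed" type looks like `<hex digest>-updateinfo` (that exact shape is a guess, not something confirmed in this thread). The function name `find_duplicate_metadata` and the sample data are hypothetical; only the Python standard library is used.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace used by repomd.xml for its <data> and <location> elements.
REPO_NS = {"repo": "http://linux.duke.edu/metadata/repo"}

# Assumption: a "hash-prefixed" type looks like "<hex digest>-updateinfo".
HASH_PREFIX = re.compile(r"^[0-9a-f]{32,128}-")


def find_duplicate_metadata(repomd_xml):
    """Return {metadata type: [location hrefs]} for types listed more than once."""
    root = ET.fromstring(repomd_xml)
    hrefs_by_type = defaultdict(list)
    for data in root.findall("repo:data", REPO_NS):
        # Strip the assumed digest prefix so both spellings land in one bucket.
        metadata_type = HASH_PREFIX.sub("", data.get("type", ""))
        location = data.find("repo:location", REPO_NS)
        href = location.get("href") if location is not None else "<no location>"
        hrefs_by_type[metadata_type].append(href)
    return {t: h for t, h in hrefs_by_type.items() if len(h) > 1}


# Hypothetical sample; the digest and the file names are made up.
FAKE_DIGEST = "ab" * 32
SAMPLE_REPOMD = f"""<repomd xmlns="http://linux.duke.edu/metadata/repo">
  <data type="primary"><location href="repodata/1-primary.xml.gz"/></data>
  <data type="updateinfo"><location href="repodata/2-updateinfo.xml.gz"/></data>
  <data type="{FAKE_DIGEST}-updateinfo"><location href="repodata/3-updateinfo.xml.gz"/></data>
</repomd>"""

if __name__ == "__main__":
    for metadata_type, hrefs in find_duplicate_metadata(SAMPLE_REPOMD).items():
        print(f"{metadata_type} is listed {len(hrefs)} times: {hrefs}")
```

To check a real repository, pass in the contents of its `repodata/repomd.xml`. An empty result means no metadata type is listed twice under this assumption.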

In our case the affected repository is a SLES repo.

@hstct, interesting: we have the same issue with a SLES repository … it might be the same problem and the same repo.

This is fixed upstream in Bug #38205 (Orphan deletion fails with "The repository version cannot be deleted because it (or its publications) are currently being used to distribute content. Please update the necessary distributions first.") and will land in Katello 4.17… That should at least allow you to delete the content in question.

OK, but it seems like I have to live with this problem … @hstct, what did you end up doing with this repository?

I was also involved here and our current summary is:

  1. This is a problem with the metadata in the SLES repo, and users should open a ticket with SUSE to get the repo fixed. As far as we can tell, they somehow added two versions of the same updateinfo metadata file to a single repo, which does not make any sense and necessarily leads to undefined behavior in clients.
  2. If you are using the latest version of pulp_rpm, you will still get some undefined behavior in your Pulp publication, but it is no longer a hard fail. (I believe pulp_rpm will add one of the updateinfo files to your publication at random and then try to add the second one, albeit in a broken way in which it does not actually become part of the repo metadata.) Of course, as a Katello user you can’t just upgrade pulp_rpm to the latest version; you need to wait for a new Katello version to do that for you…

=> But yes, the only one who can really fix this is whoever maintains that repo at SUSE. The only thing anyone else (yum, dnf, zypper, Pulp) can do is guess at random which of the two updateinfo files should be used. This implies that when you try to apply security errata from this repo, something random will happen. It might happen to be what you want or what SUSE intended, or it might not. You don’t want guessing to be involved in applying security patches, so do open a ticket with SUSE!
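
When opening that ticket, it may help to show how the two updateinfo files differ. The sketch below compares the advisory IDs in two updateinfo documents. It assumes both files have already been downloaded and decompressed, and that they use the usual `<updates>/<update>/<id>` layout without an XML namespace. The function names and sample advisory IDs are hypothetical.

```python
import xml.etree.ElementTree as ET


def advisory_ids(updateinfo_xml):
    """Collect the <id> text of every <update> element in an updateinfo document."""
    root = ET.fromstring(updateinfo_xml)
    ids = set()
    for update in root.iter("update"):
        advisory_id = (update.findtext("id") or "").strip()
        if advisory_id:
            ids.add(advisory_id)
    return ids


def compare_updateinfo(first_xml, second_xml):
    """Report which advisory IDs appear in only one file and which appear in both."""
    first, second = advisory_ids(first_xml), advisory_ids(second_xml)
    return {
        "only_in_first": sorted(first - second),
        "only_in_second": sorted(second - first),
        "in_both": sorted(first & second),
    }


# Hypothetical, heavily trimmed samples; the advisory IDs are made up.
FIRST_UPDATEINFO = """<updates>
  <update type="security"><id>EXAMPLE-2025-0001</id></update>
  <update type="security"><id>EXAMPLE-2025-0002</id></update>
</updates>"""
SECOND_UPDATEINFO = """<updates>
  <update type="security"><id>EXAMPLE-2025-0002</id></update>
  <update type="security"><id>EXAMPLE-2025-0003</id></update>
</updates>"""

if __name__ == "__main__":
    for label, ids in compare_updateinfo(FIRST_UPDATEINFO, SECOND_UPDATEINFO).items():
        print(f"{label}: {ids}")
```

This compares only the IDs. Two files can list the same advisory with different package lists, so identical ID sets do not prove the files are equivalent.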

If you don’t care about errata and just want to sync the repo so you have those packages available, it would be possible to patch pulp_rpm with an ugly workaround, but we currently don’t want to do this.

What I did not test is whether you might be able to work around this issue by using “mirror complete” mode on the sync. That way pulp_rpm should use the upstream metadata verbatim, and you get exactly what you would get from the original SUSE repo. You can try this, but be aware that mirror complete comes with certain advantages and disadvantages as a package deal.

@quba42, thanks for the explanation and clarification. Now I understand the issue completely.
I will open a ticket with SUSE to resolve this issue. Thanks for your help!
