Sync errors on all syncs including the initial sync between new katello server and content proxy

Problem:

For migration from EL7 to EL8 I have installed and configured a new Katello server with 4.5.0 and added a content proxy also with 4.5.0. Repositories, environments, etc. have been set up with the ansible modules. Now I have run the very first syncs to get my testing and production environments synced to the content proxy.

However, I always get sync errors. I have tried multiple times, either optimized sync or complete sync but each time I get one or more errors, except for a single optimized sync in between.

  1. Sync (optimized?) started by publish and promote:
duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq"
DETAIL:  Key (repository_id, number)=(94f2daba-1ae5-4808-99fb-2248c3e8e929, 1) already exists.
Could not lookup a publication_href for repo 429
  2. Sync (optimized?) started at almost the same time as no. 1 by publish and promote:
insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(b21d5610-806a-4f0b-abd5-467741881dfd) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 490
  3. Manual complete sync:
insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_removed_id_4d75bc32_fk_core_repo"
DETAIL:  Key (version_removed_id)=(28b7e2a4-54ad-4357-aa5f-bd03c58b1ff6) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 1119
insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(c9ed5dab-ece0-48be-9d4e-f5fdb6c2abb0) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 1090
  4. Yet, another manual complete sync after no. 3 finished with an error:
duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq"
DETAIL:  Key (repository_id, number)=(94f2daba-1ae5-4808-99fb-2248c3e8e929, 4) already exists.
Could not lookup a publication_href for repo 1119
  5. Manual, optimized sync with no errors.
  6. Manual complete sync:
insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(25e11040-2be6-4f0a-8141-8e870d7fa430) is not present in table "core_repositoryversion".
  7. Manual optimized sync:
duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq"
DETAIL:  Key (repository_id, number)=(94f2daba-1ae5-4808-99fb-2248c3e8e929, 5) already exists.
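
To see at a glance which ids keep coming back in errors like the ones listed above, the DETAIL lines can be grouped by the kind of constraint violated. This is only a minimal sketch: the regular expressions are written against the message text quoted in this post, and the function name `summarize` and the `SAMPLE` excerpt are made up for illustration.

```python
import re
from collections import defaultdict

# Patterns written against the DETAIL lines quoted in this post (an assumption:
# other Pulp versions may word these messages differently).
DUPLICATE_KEY = re.compile(
    r"Key \(repository_id, number\)=\(([0-9a-f-]{36}), (\d+)\) already exists"
)
MISSING_VERSION = re.compile(
    r"Key \((version_added_id|version_removed_id)\)=\(([0-9a-f-]{36})\) is not present"
)


def summarize(log_text):
    """Group sync error details by the kind of constraint that was violated."""
    duplicates = defaultdict(list)  # repository_id -> version numbers that collided
    missing = []                    # (column, repository version id) pairs
    for line in log_text.splitlines():
        match = DUPLICATE_KEY.search(line)
        if match:
            duplicates[match.group(1)].append(int(match.group(2)))
            continue
        match = MISSING_VERSION.search(line)
        if match:
            missing.append((match.group(1), match.group(2)))
    return dict(duplicates), missing


# Short excerpt of the errors above, used only to exercise the function.
SAMPLE = """\
DETAIL:  Key (repository_id, number)=(94f2daba-1ae5-4808-99fb-2248c3e8e929, 1) already exists.
DETAIL:  Key (version_added_id)=(b21d5610-806a-4f0b-abd5-467741881dfd) is not present in table "core_repositoryversion".
DETAIL:  Key (repository_id, number)=(94f2daba-1ae5-4808-99fb-2248c3e8e929, 4) already exists.
DETAIL:  Key (repository_id, number)=(94f2daba-1ae5-4808-99fb-2248c3e8e929, 5) already exists.
"""

duplicates, missing = summarize(SAMPLE)
for repository_id, numbers in duplicates.items():
    print(repository_id, "collided on versions", numbers)
for column, version_id in missing:
    print(column, "references missing version", version_id)
```

Fed with the full error text, every duplicate key error in the list lands on the same repository_id, which is what makes that one repository stand out.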

Now how do I get this fixed?

Expected outcome:
No sync errors.

Foreman and Proxy versions:
Foreman 3.3.0, Katello 4.5.0 up-to-date

Distribution and version:
AlmaLinux 8.6

Other relevant data:
All problems seem to be related to the AlmaLinux 8 and CentOS Stream 8 BaseOS repositories.

I dug up the ids from the databases on the main server and the content proxy:

94f2daba-1ae5-4808-99fb-2248c3e8e929

pulpcore=# select repository_ptr_id,last_sync_details from rpm_rpmrepository where repository_ptr_id = '94f2daba-1ae5-4808-99fb-2248c3e8e929';
          repository_ptr_id           | last_sync_details
--------------------------------------+-------------------
 94f2daba-1ae5-4808-99fb-2248c3e8e929 | {"url": "https://foreman8.example.com/pulp/content/ORG/Testing/el8-epel8/custom/almalinux8/BaseOS_x86_64_os/"...

which matches repo 429 on the main server:

foreman=# select id,relative_path,publication_href from katello_repositories where id in (429);
 id  |                    relative_path                    |                            publication_href                             
-----+-----------------------------------------------------+-------------------------------------------------------------------------
 429 | ORG/Testing/el8/custom/almalinux8/BaseOS_x86_64_os | /pulp/api/v3/publications/rpm/rpm/942ffc44-dd34-4403-8b7d-84662d8cb6e6/

b21d5610-806a-4f0b-abd5-467741881dfd doesn’t exist (anymore?) in the database, but repo 490 does:

foreman=# select id,relative_path,publication_href from katello_repositories where id in (490);
 id  |                       relative_path                       |                            publication_href                             
-----+-----------------------------------------------------------+-------------------------------------------------------------------------
 490 | ORG/Production/el8/custom/centos8stream/BaseOS_x86_64_os | /pulp/api/v3/publications/rpm/rpm/e242c2d0-ab31-4a0f-ac5b-c79257ab221c/

Neither do 28b7e2a4-54ad-4357-aa5f-bd03c58b1ff6, c9ed5dab-ece0-48be-9d4e-f5fdb6c2abb0, and 25e11040-2be6-4f0a-8141-8e870d7fa430:

pulpcore=# select * from core_repositoryversion where pulp_id in ( '28b7e2a4-54ad-4357-aa5f-bd03c58b1ff6', 'c9ed5dab-ece0-48be-9d4e-f5fdb6c2abb0', '25e11040-2be6-4f0a-8141-8e870d7fa430');
 pulp_id | pulp_created | pulp_last_updated | number | complete | base_version_id | repository_id 
---------+--------------+-------------------+--------+----------+-----------------+---------------
(0 rows)

But repo 1119 and repo 1090 are two more of the BaseOS repositories:

foreman=# select id,relative_path,publication_href from katello_repositories where id in (1119,1090);
  id  |                        relative_path                         |                            publication_href                             
------+--------------------------------------------------------------+-------------------------------------------------------------------------
 1090 | ORG/Testing/el8-epel8/custom/centos8stream/BaseOS_x86_64_os | /pulp/api/v3/publications/rpm/rpm/4a5ee05b-7966-4c48-8eb1-74545536c754/
 1119 | ORG/Production/el8-epel8/custom/almalinux8/BaseOS_x86_64_os | /pulp/api/v3/publications/rpm/rpm/b8142cdb-ff00-4c93-a6bd-a0eaabac6214/
(2 rows)

So it seems to revolve around repo id 94f2daba-1ae5-4808-99fb-2248c3e8e929, affecting all other BaseOS versions. I guess the first two parallel syncs initiated by CV promotions caused some inconsistency in the database, and all following syncs cannot straighten it out.

O.K. It gets worse. I have removed all environments from the proxy, then ran an optimized sync, a complete sync, and reclaimed space. Then I ran

# foreman-rake katello:delete_orphaned_content RAILS_ENV=production SMART_PROXY_ID=4

four times. The first time it cleared out a lot. After that, it basically completes immediately.

So there really shouldn’t be anything on the proxy. However, there is still something left:

pulpcore=# select * from core_repository;
               pulp_id                |         pulp_created          |       pulp_last_updated       |                                    name                                    | description | next_version | pulp_type | remote_id | retain_repo_versions | user_hidden 
--------------------------------------+-------------------------------+-------------------------------+----------------------------------------------------------------------------+-------------+--------------+-----------+-----------+----------------------+-------------
 11e10445-b7a3-4d48-aecf-3ae78a49374b | 2022-07-25 12:59:55.244584+02 | 2022-07-25 14:01:20.511506+02 | AppStream-666405902494e37e80c80752e70918811bab2a89ab2011a50e22fc21926760cd |             |            2 | rpm.rpm   |           |                      | t
 94f2daba-1ae5-4808-99fb-2248c3e8e929 | 2022-07-25 12:59:44.676468+02 | 2022-07-25 13:57:01.049942+02 | AppStream-cafaa7d8a979743d2c39308ba5c31a702ee94aeea4bab81ccb5b4d7a9b668ae7 |             |            5 | rpm.rpm   |           |                      | t
 b706e419-be27-4cdd-b9f8-7d2df7e5baf4 | 2022-07-25 13:25:39.703301+02 | 2022-07-25 14:01:19.850977+02 | AppStream-ecc86edecfd6b3ed311a9eabac6d21c8d313c6f80cf813b703cb1da72613b7a7 |             |            2 | rpm.rpm   |           |                      | t
(3 rows)

pulpcore=# select count(*) from core_remote;
 count 
-------
   410
(1 row)

pulpcore=# select count(*) from core_artifact ;
 count 
-------
  1308
(1 row)

pulpcore=# select count(*) from core_content;
 count  
--------
 140871
(1 row)

pulpcore=# select repository_ptr_id,last_sync_details from rpm_rpmrepository ;
          repository_ptr_id           | last_sync_details
--------------------------------------+-------------------
 94f2daba-1ae5-4808-99fb-2248c3e8e929 | {"url": "https://foreman8.example.com/pulp/content/ORG/Testing/el8-epel8/custom/almalinux8/BaseOS_x86_64_os/", "revision": "1657778086", "sync_policy": "mirror_complete", "download_policy": "on_demand", "repomd_checksum": "7c2ded7e22200c611023891637f53e36b64bf8a112c3d18142c9aeac93a9e715", "most_recent_version": 5}
 b706e419-be27-4cdd-b9f8-7d2df7e5baf4 | {"url": "https://foreman8.example.com/pulp/content/ORG/Production/el8-epel8/custom/centos8stream/BaseOS_x86_64_os/", "revision": "1657777519", "sync_policy": "mirror_complete", "download_policy": "on_demand", "repomd_checksum": "012d3187dc83ce5da15a88e081c08288e0f9ab90f298cb48783234552699bc92", "most_recent_version": 1}
 11e10445-b7a3-4d48-aecf-3ae78a49374b | {"url": "https://foreman8.example.com/pulp/content/ORG/Production/el8/custom/centos8stream/BaseOS_x86_64_os/", "revision": "1658329738", "sync_policy": "mirror_complete", "download_policy": "on_demand", "repomd_checksum": "cb1ac8fddc809d6d5529749b2e7501239345b692ddef0a881b5d8bb6dd90fe7b", "most_recent_version": 1}
(3 rows)


So basically, three BaseOS repositories are still there. There seems to be some serious sync issue with the BaseOS repository. On the main server I have Alma 8 and CentOS Stream 8, in testing and production environments, in content view el8 and el8-epel8. So technically there are a total of 8 remote repositories to sync from the main server. Of those 8, 3 have been left behind.

The 94f2daba-1ae5-4808-99fb-2248c3e8e929 is in the first error.

How do I get this fixed or cleared out? Otherwise, I would simply reinstall the content proxy as it’s not yet in production anyway…

The leftover repos seem to be user_hidden or subrepos of repos not explicitly synced…I am wondering if this is related to the other issue you filed with pulp that was recently fixed. https://github.com/pulp/pulp_rpm/issues/2459 ?

I just briefly checked on my phone:

The three repositories in core_repository have user_hidden=t.

I don’t quite see whether the patch made it into the latest Katello version. I have

python39-pulp-rpm-3.17.5-3.el8.noarch
python39-pulpcore-3.18.5-2.el8.noarch

It’s a completely new installation with Katello 4.5.0 and a new content proxy. It’s not an upgrade of the system for which I opened that issue.

From the commit history (Commits · pulp/pulp_rpm · GitHub), it seems like it made it into 3.17.7.
4.5 is shipping with 3.17.5, though.

If you’re on pulp_rpm 3.17.5 then you wouldn’t have the patch; as mentioned, it landed in 3.17.7.

Re-installing should of course be fine… It should also be safe to delete those 3 repositories and run orphan cleanup again.

I could never manage to re-create the scenario that led to this, so if you can work with Katello to get a reliable reproducer scenario, that would be enormously helpful. And we can verify for certain that the patch which landed is a 100% fix. I expect that it is, it’s just hard to verify.

Thanks. For the fun of it I have applied the commit “Fix a bug preventing orphan cleanup for some subrepos” (pulp/pulp_rpm@d41c097 on GitHub) on the content proxy and ran katello:delete_orphaned_content again, but the repositories remain there. I didn’t find a patch regarding user_hidden=t. Is there another patch I have to apply?

I really would like to test that it’s completely removing everything if I remove all environments from the content proxy.

How would I delete those 3 repositories? Simply in the database, or is there a CLI command for that?

Later I can reinstall the content proxy and try the initial sync again. Currently, it looks to me as if it is related to promoting the same CV into multiple synced environments at the same time. I had occasional sync errors on my production server (katello 4.4 and before) which I mostly avoided by putting a five minute delay between the first promotion and the second. That would match what I saw on the new installation.
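
The five-minute delay mentioned above can be sketched roughly like this. It is only an illustration of the workaround, not a fix: `promote_serially` and its `promote` callable are hypothetical names, and `promote` stands in for whatever actually triggers the promotion (for example a wrapper around a hammer call). A fixed delay also cannot guarantee that the proxy syncs never overlap if a sync runs longer than the delay.

```python
import time


def promote_serially(promotions, promote, delay_seconds=300, sleep=time.sleep):
    """Run promotions one at a time, pausing between consecutive ones.

    `promote` is a hypothetical callable supplied by the caller; `sleep` is
    injectable so the pause can be replaced in a dry run.
    """
    for index, promotion in enumerate(promotions):
        if index > 0:
            # Give the proxy sync triggered by the previous promotion a head start.
            sleep(delay_seconds)
        promote(promotion)


# Dry run: record what would happen instead of promoting or sleeping for real.
actions = []
promote_serially(
    ["EL8 -> Testing", "EL8-EPEL8 -> Testing"],
    promote=lambda promotion: actions.append(("promote", promotion)),
    sleep=lambda seconds: actions.append(("sleep", seconds)),
)
print(actions)
```

In a real script the default `time.sleep` would be left in place and only `promote` would be supplied.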

However, I also checked the errors on my production server and they are always related to the AlmaLinux 8 or CentOS Stream 8 BaseOS repository.

For one error on the production server I have also just noticed something odd:

duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq"
DETAIL:  Key (repository_id, number)=(ac46caa8-0b8b-4da6-bcb0-e2e40ec77e19, 12) already exists.
duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq"
DETAIL:  Key (repository_id, number)=(ac46caa8-0b8b-4da6-bcb0-e2e40ec77e19, 12) already exists.

Database on production content proxy:

pulpcore=# select * from core_repository where pulp_id = 'ac46caa8-0b8b-4da6-bcb0-e2e40ec77e19';
               pulp_id                |         pulp_created          |       pulp_last_updated       |                                    name                                    | description | next_version | pulp_type | remote_id | retain_repo_versions | user_hidden 
--------------------------------------+-------------------------------+-------------------------------+----------------------------------------------------------------------------+-------------+--------------+-----------+-----------+----------------------+-------------
 ac46caa8-0b8b-4da6-bcb0-e2e40ec77e19 | 2022-05-13 03:55:29.965914+02 | 2022-07-26 04:14:37.209507+02 | AppStream-cafaa7d8a979743d2c39308ba5c31a702ee94aeea4bab81ccb5b4d7a9b668ae7 |             |           15 | rpm.rpm   |           |                      | t
(1 row)

pulpcore=# select * from rpm_rpmrepository where repository_ptr_id = 'ac46caa8-0b8b-4da6-bcb0-e2e40ec77e19';
          repository_ptr_id           | metadata_signing_service_id |                                                     original_checksum_types                                                     | retain_package_versions | autopublish | gpgcheck | metadata_
checksum_type | package_checksum_type | repo_gpgcheck | sqlite_metadata |                                                                                                                                                    last_sync_details                      
                                                                                                                               
--------------------------------------+-----------------------------+---------------------------------------------------------------------------------------------------------------------------------+-------------------------+-------------+----------+----------
--------------+-----------------------+---------------+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------------------------------
 ac46caa8-0b8b-4da6-bcb0-e2e40ec77e19 |                             | {"group": "sha256", "other": "sha256", "modules": "sha256", "primary": "sha256", "filelists": "sha256", "updateinfo": "sha256"} |                       0 | f           |        0 |          
              |                       |             0 | f               | {"url": "https://foreman.example.com/pulp/content/ORG/Testing/el8-epel8/custom/almalinux8/BaseOS_x86_64_os/", "revision": "1658774516", "sync_policy": "mirror_complete", "download_policy": 
"on_demand", "repomd_checksum": "80e8bc521df91d3bf99d41ca3172f93ef72fd8104083bfe6090d079a1f56f34f", "most_recent_version": 14}
(1 row)

Why is the BaseOS repository named AppStream-cafaa7d8a979743d2c39308ba5c31a702ee94aeea4bab81ccb5b4d7a9b668ae7?

@dralley I have reverted my content proxy to the initial pre-installation snapshot and reinstalled it again.

My main server has two LEs “Testing” and “Production” and two CVs containing EL8 repositories “EL8” and “EL8-EPEL8”. Both CVs contain AlmaLinux 8 and CentOS Stream 8 repositories including the BaseOS repository of either one.

So basically, I have 8 BaseOS repositories in my LEs which I sync to the content proxy.

I have assigned both LEs to my content proxy and started an optimized sync. Of course, it took a while but in the end generated errors on 5 repositories again:

insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(b7a79953-1816-45b2-bc4e-b705c5deaf26) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 1076
insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(74ae3a16-9c3a-4d60-9b77-1e7151c51bff) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 480
insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(7fcf8b11-598c-431c-ba49-00163eea0564) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 429
insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(439afd64-cfcc-47d4-9414-8a788901d8ab) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 1090
insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(e252af3a-a57a-4360-b7b1-c7fc59e5adfd) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 490

Looking up those repo ids on the main server:

foreman=# select id,relative_path from katello_repositories where id in ( 1076,480,429,1090,490);
  id  |                        relative_path
------+--------------------------------------------------------------
  480 | ORG/Production/el8/custom/almalinux8/BaseOS_x86_64_os
  490 | ORG/Production/el8/custom/centos8stream/BaseOS_x86_64_os
 1090 | ORG/Testing/el8-epel8/custom/centos8stream/BaseOS_x86_64_os
  429 | ORG/Testing/el8/custom/almalinux8/BaseOS_x86_64_os
 1076 | ORG/Testing/el8-epel8/custom/almalinux8/BaseOS_x86_64_os
(5 rows)

The remaining three paths I can see in the /pulp/content/ view on the content proxy. They look O.K.:

ORG/Production/el8-epel8/custom/almalinux8/BaseOS_x86_64_os/
ORG/Production/el8-epel8/custom/centos8stream/BaseOS_x86_64_os/
ORG/Testing/el8/custom/centos8stream/BaseOS_x86_64_os/
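
As a cross-check of the counting above, the eight expected BaseOS paths can be generated from the two LEs, two CVs and two distributions, and the five failing paths subtracted. This is just a small sketch; the variable names are made up, and the path pattern is copied from the relative_path values shown in the query output.

```python
from itertools import product

environments = ["Testing", "Production"]
content_views = ["el8", "el8-epel8"]
distributions = ["almalinux8", "centos8stream"]

# Every LE / CV / distribution combination yields one BaseOS repository.
expected = {
    f"ORG/{env}/{cv}/custom/{distro}/BaseOS_x86_64_os"
    for env, cv, distro in product(environments, content_views, distributions)
}

# The five relative paths returned for the failing repo ids.
failed = {
    "ORG/Production/el8/custom/almalinux8/BaseOS_x86_64_os",
    "ORG/Production/el8/custom/centos8stream/BaseOS_x86_64_os",
    "ORG/Testing/el8-epel8/custom/centos8stream/BaseOS_x86_64_os",
    "ORG/Testing/el8/custom/almalinux8/BaseOS_x86_64_os",
    "ORG/Testing/el8-epel8/custom/almalinux8/BaseOS_x86_64_os",
}

synced = sorted(expected - failed)
print(len(expected), "expected,", len(failed), "failed,", len(synced), "synced")
for path in synced:
    print(path)
```

The three paths left over are the same three that show up in the /pulp/content/ view.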

The first of those three was one of the repos with errors in my initial post. So it doesn’t seem to be necessarily the identical set of repos showing errors, but looks more like some race condition confusing the sync. And it must be something about the BaseOS repository which causes this.

I don’t know if this is enough to reproduce the issue on a test system. If you need more information, please let me know.

Yesterday, I had an error again on my production system (CentOS 7.9, Foreman 3.2, Katello 4.4, up-to-date).

 duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq" DETAIL: Key (repository_id, number)=(b208852b-1335-4f4b-b559-9783d0cf14e7, 1) already exists. 

There were new CentOS 8 Stream updates in BaseOS. I have a script running which publishes a new version of my CVs every morning and promotes it automatically into Testing. So yesterday morning, this also happened for my CVs “EL8” and “EL8 & EPEL8”. The hammer commands in the script don’t use async calls but of course, the synchronization to the content proxies after that is started automatically.

I have started looking into the execution plans in the dynflow console and what else was running during the time of the failing sync step. I haven’t gone into full detail, yet, but started with the long running steps and immediately found that the BaseOS syncs for both CVs overlapped:

first sync:

smart_proxy_history_id: 30599
pulp_tasks:
- pulp_href: "/pulp/api/v3/tasks/1f1d1557-6f33-4624-953f-6d90789da74e/"
  pulp_created: '2022-08-01T02:05:20.131+00:00'
  state: completed
  name: pulp_rpm.app.tasks.synchronizing.synchronize
  logging_cid: 0cce1144-cf72-49d5-9c51-75dbc4f95027
  started_at: '2022-08-01T02:05:20.450+00:00'
  finished_at: '2022-08-01T02:14:29.222+00:00'
  worker: "/pulp/api/v3/workers/59082a71-8787-44e0-87ed-b71824df86a8/"
...
  created_resources:
  - "/pulp/api/v3/repositories/rpm/rpm/beec8c74-156f-418b-aaa7-0d0bb1557d6a/versions/50/"
  - "/pulp/api/v3/publications/rpm/rpm/3bd6d3ff-deda-4775-a028-41a2986f03e1/"
  reserved_resources_record:
  - "/pulp/api/v3/repositories/rpm/rpm/beec8c74-156f-418b-aaa7-0d0bb1557d6a/"
  - shared:/pulp/api/v3/remotes/rpm/rpm/e2631782-486f-4ce2-904e-7719fee96152/
task_groups: []
poll_attempts:
  total: 45
  failed: 0

Remote e2631782-486f-4ce2-904e-7719fee96152 is linked to https://foreman.example.com/pulp/content/ORG/Testing/el8/custom/centos8stream/BaseOS_x86_64_os/

second sync:

smart_proxy_history_id: 30608
pulp_tasks:
- pulp_href: "/pulp/api/v3/tasks/eec66b70-c676-444d-adfa-2032dbe964bf/"
  pulp_created: '2022-08-01T02:10:57.926+00:00'
  state: failed
  name: pulp_rpm.app.tasks.synchronizing.synchronize
  logging_cid: c238f5fc-d4bb-4b0b-bd1a-365fe765be00
  started_at: '2022-08-01T02:10:58.029+00:00'
  finished_at: '2022-08-01T02:14:52.391+00:00'
  error:
    traceback: !ruby/string:Sequel::SQL::Blob |2
...
    description: |
      duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq"
      DETAIL:  Key (repository_id, number)=(b208852b-1335-4f4b-b559-9783d0cf14e7, 1) already exists.
  worker: "/pulp/api/v3/workers/a87f89a0-5da6-42d3-a168-9d556f9e9e22/"
...
  created_resources:
  - "/pulp/api/v3/repositories/rpm/rpm/4382d46b-7003-4086-b137-2ab0ddb90327/versions/50/"
  reserved_resources_record:
  - "/pulp/api/v3/repositories/rpm/rpm/4382d46b-7003-4086-b137-2ab0ddb90327/"
  - shared:/pulp/api/v3/remotes/rpm/rpm/32ad020b-966d-450e-aa67-81c64c56be55/
task_groups: []
poll_attempts:
  total: 30
  failed: 1

Remote 32ad020b-966d-450e-aa67-81c64c56be55 is linked to https://foreman.example.com/pulp/content/ORG/Testing/el8-epel8/custom/centos8stream/BaseOS_x86_64_os/

So, two syncs for the same repo have to overlap. Syncing BaseOS takes very long on my servers. The first sync took more than 9 minutes. The second, overlapping one got the error after 4 minutes.
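
To put a number on the overlap, the started_at and finished_at values of the two pulp tasks above can be compared directly. A minimal sketch, with `overlap_seconds` being a made-up helper name and the timestamps copied from the two task dumps:

```python
from datetime import datetime


def overlap_seconds(a_start, a_end, b_start, b_end):
    """Return how many seconds two time windows overlap (0.0 if they do not)."""
    start = max(datetime.fromisoformat(a_start), datetime.fromisoformat(b_start))
    end = min(datetime.fromisoformat(a_end), datetime.fromisoformat(b_end))
    return max((end - start).total_seconds(), 0.0)


# started_at / finished_at of the completed and the failed sync task.
first = ("2022-08-01T02:05:20.450+00:00", "2022-08-01T02:14:29.222+00:00")
second = ("2022-08-01T02:10:58.029+00:00", "2022-08-01T02:14:52.391+00:00")

shared = overlap_seconds(*first, *second)
print(f"the two syncs ran concurrently for {shared:.3f} seconds")
```

With these timestamps the two tasks were running at the same time for roughly three and a half minutes.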

The logs I have checked so far only show errors for BaseOS, either CentOS Stream 8 or AlmaLinux 8; both are in those two CVs. I can’t tell if it is just coincidence, because BaseOS takes so long to sync and is thus much more likely to overlap, or if it is something else about BaseOS causing this.

However, looking at the failed syncs of the last four weeks, those are all related to BaseOS.

@dralley From my tests it seems as if you should be able to reproduce the issue, if you have two CVs containing the same BaseOS repository and you quickly publish and promote both CVs into the same environment, e.g. using hammer. The content proxy syncs start automatically after promotion and due to that they overlap. At least on my production servers it’s very likely that the actual pulp syncs for the BaseOS repo in both CVs overlap because it takes a couple of minutes. And checking the update history of my CentOS Stream 8 servers it seems as if the sync error happened each time there were new updates in BaseOS.

And for my new foreman/katello servers on AlmaLinux 8, the initial sync for a new content proxy failed each of the 3 times I have tried. I think you should be able to reproduce the issue with a setup like that.

The katello:delete_orphaned_content task still doesn’t remove everything from a content proxy with 3.17.7 if all environments are removed. I just ran it three times. The first run removed most, the second some more, and the third and following runs finish immediately.

Now there are still two repositories and some artifacts etc. left in the database, as well as some files in the file system. The repos are AlmaLinux BaseOS in the production environment and CentOS Stream 8 BaseOS in testing. The files in the file system are all metadata, I think. Considering both are BaseOS, I guess it’s related to the sync issue above, which seems to have left information in the database broken in a way that remove_orphan doesn’t clean up…

I came across this:

Now I am wondering whether the sync issues have something to do with that. It would also explain why the BaseOS repository has a name starting with “AppStream-…” in the core_repository table.

Well, it seems that although

# foreman-rake katello:delete_orphaned_content RAILS_ENV=production SMART_PROXY_ID=4

does not fully remove all orphaned content, something else at some point does. I have just checked the database and file system again and now they are empty. That’s good. So that leaves these very annoying BaseOS sync errors…

After I have removed all environments from my content proxy and it eventually really was completely empty, I have just tried again for the fun of it as I have just added CentOS Stream 9 repositories.

I have got 11 errors:

insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(cf38fff6-0d34-461d-ac6a-e6f4b4916013) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 1076

insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(98012f5e-ceb2-49b7-b116-b142246c115c) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 1449

duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq"
DETAIL:  Key (repository_id, number)=(b34d636c-73d1-4fbc-bdb1-5d52699ca1fc, 1) already exists.
Could not lookup a publication_href for repo 480

duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq"
DETAIL:  Key (repository_id, number)=(b34d636c-73d1-4fbc-bdb1-5d52699ca1fc, 1) already exists.
Could not lookup a publication_href for repo 1490

duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq"
DETAIL:  Key (repository_id, number)=(b34d636c-73d1-4fbc-bdb1-5d52699ca1fc, 1) already exists.
Could not lookup a publication_href for repo 1629

insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(bdf1fdf3-51a0-4a25-9c66-9fc7edab1be9) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 429

insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(6050a3f9-7f37-405c-b9ff-16d8a971ae2b) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 439

insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(b9b59f21-2c96-46b1-90c3-b4e01fe75d52) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 1794

insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(3aaf24b8-0283-4463-90e2-78d171461750) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 490

duplicate key value violates unique constraint "core_repositoryversion_repository_id_number_3c54ce50_uniq"
DETAIL:  Key (repository_id, number)=(a3d96d2e-b2bd-4fb2-a079-ae5dbaa44ba4, 1) already exists.
Could not lookup a publication_href for repo 1937

insert or update on table "core_repositorycontent" violates foreign key constraint "core_repositoryconte_version_added_id_d5113f18_fk_core_repo"
DETAIL:  Key (version_added_id)=(a7301b58-be8d-406f-8dfe-22b43bbba781) is not present in table "core_repositoryversion".
Could not lookup a publication_href for repo 1903

which all match BaseOS repositories:

foreman=# select id,relative_path from katello_repositories where id in ( 1076,1449,480,1490,1629,429,430,1794,490,1937,1903) order by relative_path;
  id  |                          relative_path   
------+-----------------------------------------------------------------
 1490 | ORG/Production/alma8/custom/almalinux8/BaseOS_x86_64_os
 1629 | ORG/Production/alma8-epel8/custom/almalinux8/BaseOS_x86_64_os
 1794 | ORG/Production/cs8/custom/centos8stream/BaseOS_x86_64_os
 1937 | ORG/Production/cs8-epel8/custom/centos8stream/BaseOS_x86_64_os
  480 | ORG/Production/el8/custom/almalinux8/BaseOS_x86_64_os
  490 | ORG/Production/el8/custom/centos8stream/BaseOS_x86_64_os
 1449 | ORG/Testing/alma8/custom/almalinux8/BaseOS_x86_64_os
 1903 | ORG/Testing/cs8-epel8/custom/centos8stream/BaseOS_x86_64_os
  430 | ORG/Testing/el8/custom/almalinux8/AppStream_x86_64_os
  429 | ORG/Testing/el8/custom/almalinux8/BaseOS_x86_64_os
 1076 | ORG/Testing/el8-epel8/custom/almalinux8/BaseOS_x86_64_os
(11 rows)

Available on the content proxy are the following BaseOS repositories, taken from the output of curl -s 'http://foreman8-content.dkrz.de/pulp/content/' | grep 'ORG.*BaseOS'

ORG/Production/cs9-epel9/custom/centos9/BaseOS_x86_64_os/
ORG/Testing/alma8-epel8/custom/almalinux8/BaseOS_x86_64_os/
ORG/Testing/cs8/custom/centos8stream/BaseOS_x86_64_os/
ORG/Testing/cs9-epel9/custom/centos9/BaseOS_x86_64_os/
ORG/Testing/el8-epel8/custom/centos8stream/BaseOS_x86_64_os/

which are exactly those 5 repositories for which there was no error. Notably, both CentOS Stream 9 BaseOS repos got through. I haven’t tested more, yet, so it could be just coincidence.

The two UUIDs b34d636c-73d1-4fbc-bdb1-5d52699ca1fc and a3d96d2e-b2bd-4fb2-a079-ae5dbaa44ba4, for which there was a duplicate key value error, look like this in core_repository:

pulpcore=# select * from core_repository where pulp_id in ('b34d636c-73d1-4fbc-bdb1-5d52699ca1fc','a3d96d2e-b2bd-4fb2-a079-ae5dbaa44ba4');
               pulp_id                |         pulp_created          |       pulp_last_updated       |                                    name                                    | description | next_version | pulp_type | remote_id | retain_repo_versions | user_hidden 
--------------------------------------+-------------------------------+-------------------------------+----------------------------------------------------------------------------+-------------+--------------+-----------+-----------+----------------------+-------------
 b34d636c-73d1-4fbc-bdb1-5d52699ca1fc | 2022-08-16 08:20:46.476563+02 | 2022-08-16 08:23:47.698258+02 | AppStream-cafaa7d8a979743d2c39308ba5c31a702ee94aeea4bab81ccb5b4d7a9b668ae7 |             |            2 | rpm.rpm   |           |                      | t
 a3d96d2e-b2bd-4fb2-a079-ae5dbaa44ba4 | 2022-08-16 08:25:18.121076+02 | 2022-08-16 08:33:44.662956+02 | AppStream-0312709cc54e5821c07aca3160d7da33860d901a8348c493e7bb2949bbc60b97 |             |            2 | rpm.rpm   |           |                      | t
(2 rows)

Both are named AppStream-… as you can see, even though they point to BaseOS repositories:

pulpcore=# select last_sync_details from rpm_rpmrepository where repository_ptr_id in ('b34d636c-73d1-4fbc-bdb1-5d52699ca1fc','a3d96d2e-b2bd-4fb2-a079-ae5dbaa44ba4');
 last_sync_details
-------------------
 {"url": "https://foreman8.example.com/pulp/content/ORG/Testing/alma8-epel8/custom/almalinux8/BaseOS_x86_64_os/", "revision": "1659710660", "sync_policy": "mirror_complete", "download_policy": "on_demand", "repomd_checksum": "2dbc15a2c91e47985e09af92f66f6bbc342942668c91b56e4dfb8784ade14112", "most_recent_version": 1}
 {"url": "https://foreman8.example.com/pulp/content/ORG/Testing/cs8/custom/centos8stream/BaseOS_x86_64_os/", "revision": "1659711971", "sync_policy": "mirror_complete", "download_policy": "on_demand", "repomd_checksum": "d2f72b0905bb16fedfc0c70187306c868ad7d097e469bdd1e0bdb9827e4244ff", "most_recent_version": 1}
(2 rows)

So it’s always BaseOS repositories with those issues and they are linked in some way with AppStream…
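To spot this kind of mismatch across more repositories, a check like the following could be run over rows exported from the two queries above. It is a sketch under one assumption that is not confirmed anywhere in this thread: that the part of the Pulp repository name before the first "-" is the repository label and should match the start of the last path segment of the synced URL. The function name `name_matches_url` is hypothetical.

```python
import json
from urllib.parse import urlparse


def name_matches_url(name: str, last_sync_details: str) -> bool:
    """Check whether a repository name's label prefix agrees with its last synced URL.

    Assumption (inferred from the two rows shown above, not from Pulp or
    Katello documentation): names look like "<label>-<hash>".
    """
    label = name.split("-", 1)[0]
    url = json.loads(last_sync_details)["url"]
    # Last path segment, e.g. "BaseOS_x86_64_os"
    last_segment = urlparse(url).path.rstrip("/").rsplit("/", 1)[-1]
    return last_segment.startswith(label)


if __name__ == "__main__":
    # Name taken from the first core_repository row; details are a trimmed
    # copy of a last_sync_details value shown above.
    name = "AppStream-cafaa7d8a979743d2c39308ba5c31a702ee94aeea4bab81ccb5b4d7a9b668ae7"
    details = json.dumps({
        "url": "https://foreman8.example.com/pulp/content/ORG/Testing/alma8-epel8/custom/almalinux8/BaseOS_x86_64_os/",
        "most_recent_version": 1,
    })
    print(name_matches_url(name, details))
```

Since both affected rows are named AppStream-… and both URLs end in BaseOS_x86_64_os, the check reports a mismatch regardless of which URL is paired with which name.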

@gvde I’m deeply sorry for not responding more quickly. I have been on PTO for most of the past week, and the prior 2 weeks were extremely busy for both work and personal reasons. I’m going to make sure this gets looked into.

Tracking here: Repository sync fails with duplicate key value violates unique constraint "core_repositoryversion_repository_id_number · Issue #3135 · pulp/pulpcore · GitHub

Is there an ETA for when the fix comes into 4.6? I never really know which patches go into which version…

It’s really giving me a hard time. Whenever there is an update in an EL8 BaseOS repository I have to run multiple complete and optimized syncs until I get everything available in the proxy.

Do I need the patch on the server or the proxy?

Is there a Katello patch portion to this? I haven’t found a Foreman redmine yet.

As for bumping Pulp z versions in our packaging repository, that can be done outside of a Katello z-release. We just need to package the newer Pulpcore plugin version.

We’ll just need to be alerted to the pulp-rpm version that is desired once it’s released.

Whoops, sorry, I confused this with a different bug.

This is fixed upstream in pulp_rpm 3.18.9, but we can’t easily backport it to 3.17 or previous unfortunately. You’re right, there shouldn’t be a Katello patch for this.

So it looks like we’ll need to just update to 3.18.9 in our Pulpcore 3.18 repository.