Published Pulp repodata directories contain extra metadata files. Is this normal?

Problem:

I have several published repos whose repodata directories contain metadata files that are not referenced in their repomd.xml. Telling Katello to “Regenerate Repository Metadata” seems to “fix” this.

Expected outcome:

Repodata directories should only contain files referenced in repomd.xml. More specifically, there should be at most one each of the “other”, “filelists”, “updateinfo”, and “primary” files.
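
To check a directory against that expectation, something like the following could work. It is only a rough sketch: the function name `list_unreferenced` is made up for illustration, and it assumes every metadata file is named by a `<location href="repodata/NAME"/>` element in repomd.xml, which it reads with grep rather than a real XML parser.

```shell
#!/bin/sh
# Print files in a repodata directory that repomd.xml does not reference.
# Hypothetical helper; assumes <location href="repodata/NAME"/> entries.
list_unreferenced() {
    dir=$1
    # Collect the basenames of every href found in repomd.xml.
    referenced=$(grep -o 'href="[^"]*"' "$dir/repomd.xml" \
        | sed -e 's/^href="//' -e 's/"$//' -e 's|.*/||')
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        name=${f##*/}
        # repomd.xml never references itself, so skip it.
        [ "$name" = "repomd.xml" ] && continue
        # Print the name unless it matches a referenced name exactly.
        printf '%s\n' "$referenced" | grep -qxF -- "$name" || printf '%s\n' "$name"
    done
}
```

Calling `list_unreferenced` with the path to a published repodata directory should print one leftover file name per line, and nothing at all for a clean directory.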

Foreman and Proxy versions:
Foreman 2.0
Katello 3.15.0
Pulp 2.21

Note: This installation started on earlier versions and has since been upgraded. It originally ran Katello 3.12 with the associated Foreman/Pulp versions.

Distribution and version:

CentOS 7, up to date as of ~5/15/2020

Other relevant data:

Before “Regenerate Repository Metadata”

root@katello ~ # ls -l /var/lib/pulp/published/yum/master/yum_distributor/3fd459f6-6505-4e33-8a1b-1827549bc982/1590524279.87/repodata -ltr
total 5342112
-rw-r--r-- 1 apache apache  15840175 May 13 20:15 24b94daf5a634e5026f9750c7489f86b0d129277-filelists.xml.gz
-rw-r--r-- 1 apache apache 546372792 May 13 20:15 82f58814a0ba0449e14b3f7731755da9479b7693-other.xml.gz
-rw-r--r-- 1 apache apache  17244004 May 13 20:17 8423c03498c32d69b8831ddc0659d783a3b67adf-primary.xml.gz
-rw-r--r-- 1 apache apache   1311512 May 13 20:18 429c3ae6c67bc26cc365ac13e218e2f6b4334a20-updateinfo.xml.gz
-rw-r--r-- 1 apache apache  16244076 May 14 20:36 d02c049b81ac68b3a216e5813b4e4d73f6018808-filelists.xml.gz
-rw-r--r-- 1 apache apache 575495021 May 14 20:36 871f06c338d6065e145e74cd39c0a5a2e0a9f31c-other.xml.gz
-rw-r--r-- 1 apache apache  17953284 May 14 20:37 8e77e6888d6a7b6e0f4d08e917dce923f58133a1-primary.xml.gz
-rw-r--r-- 1 apache apache   1328141 May 14 20:39 64254e4d1ddbd67ceb8a57ef015081e44de1644a-updateinfo.xml.gz
-rw-r--r-- 1 apache apache  16244076 May 15 19:48 b855754dac8ea21e8972f12dfc11457a4759af7b-filelists.xml.gz
-rw-r--r-- 1 apache apache 575495021 May 15 19:48 3512e13e70028ce0d3dd956b944e81a296b76021-other.xml.gz
-rw-r--r-- 1 apache apache  17953284 May 15 19:49 6e91d7460cd7320581ad6f2c1deeb4bd63fa96ad-primary.xml.gz
-rw-r--r-- 1 apache apache   1328141 May 15 19:50 dcb30608b4f89f5b3b9d6e8914774806a65fc9e5-updateinfo.xml.gz
-rw-r--r-- 1 apache apache  16244076 May 16 19:57 fd4bb8810d4bc1b6646307fa69a91c3c86e48035-filelists.xml.gz
-rw-r--r-- 1 apache apache 575495021 May 16 19:57 46ff160bd56078edd2fcb44a40e48a21dcf7fc48-other.xml.gz
-rw-r--r-- 1 apache apache  17953284 May 16 19:59 522f4392d1719a594e3ccf27e7da5f14bd0cf66b-primary.xml.gz
-rw-r--r-- 1 apache apache   1328141 May 16 20:00 5c005999a6e5e2efbd7b48c8dc7ef3a44423b87d-updateinfo.xml.gz
-rw-r--r-- 1 apache apache  16244076 May 17 19:58 080e9d11b1cab95679242b97c5f07a150f8cb004-filelists.xml.gz
-rw-r--r-- 1 apache apache 575495021 May 17 19:58 4a2f6b233e38863a74e647c2b4c82aa0a617d3b6-other.xml.gz
-rw-r--r-- 1 apache apache  17953284 May 17 19:59 7c149db9e68dbffa8652d5d9f547c79e2f79a549-primary.xml.gz
-rw-r--r-- 1 apache apache   1328141 May 17 20:00 4c25e72698f03b609785230cd495aaa9ff908922-updateinfo.xml.gz
-rw-r--r-- 1 apache apache  16252155 May 18 20:00 86bebbc2652d8d4025c7c62d6593e9202f856c63-filelists.xml.gz
-rw-r--r-- 1 apache apache 575609676 May 18 20:00 092b1f63eec1e68f96aeb5c99b86d58338ec91f2-other.xml.gz
-rw-r--r-- 1 apache apache  17956650 May 18 20:01 b548fb3f9a39c5cb07bc2933573a06beef87c46d-primary.xml.gz
-rw-r--r-- 1 apache apache   1328891 May 18 20:03 b8b29418710176dada86c827788ab1e66e040bb5-updateinfo.xml.gz
-rw-r--r-- 1 apache apache  16252155 May 19 19:57 9640eb08e5f9ffcc4f1422f207ef2553aef4192b-filelists.xml.gz
-rw-r--r-- 1 apache apache 575609676 May 19 19:57 45d8aa65d2fcd8b3fe22798dcd6dc7883a971b0c-other.xml.gz
-rw-r--r-- 1 apache apache  17956650 May 19 19:58 7824174494033b303dd1c83c1c160c89d3c474f7-primary.xml.gz
-rw-r--r-- 1 apache apache   1328891 May 19 19:59 878cf259b0a908941b242946acf83cbb25546e30-updateinfo.xml.gz
-rw-r--r-- 1 apache apache  16252155 May 20 20:04 7b2f2ca71b1bc6977de0289e4313476099595e29-filelists.xml.gz
-rw-r--r-- 1 apache apache 575609676 May 20 20:04 56a5ea8df7e8c4325e68d930e34e571314d55531-other.xml.gz
-rw-r--r-- 1 apache apache  17956650 May 20 20:06 14bbe31506092c4baa4143f396379b0a072a6c9e-primary.xml.gz
-rw-r--r-- 1 apache apache   1328891 May 20 20:07 7b58f3069d62ad19cad51914cb74c568b85b4f51-updateinfo.xml.gz
-rw-r--r-- 1 apache apache  16252155 May 26 13:21 92d0986c0839910db42100b075df213c98d17d15-filelists.xml.gz
-rw-r--r-- 1 apache apache 575609676 May 26 13:21 cfd719e645ceec1a6281118fd5f6e494abc1e5f3-other.xml.gz
-rw-r--r-- 1 apache apache  17956650 May 26 13:23 242317b3d6f9336b4e3ce2399e3dac89820afd10-primary.xml.gz
-rw-r--r-- 1 apache apache   1328891 May 26 13:24 6499b4c737931bf259d48409609d8d0ff2c876f1-updateinfo.xml.gz
-rw-r--r-- 1 apache apache    817244 May 26 13:24 52565fb3129e9a0973cc4c3a81921a0978b31ba9-comps.xml
-rw-r--r-- 1 apache apache      1836 May 26 13:24 repomd.xml

After “Regenerate Repository Metadata”

root@katello ~ # ls -l /var/lib/pulp/published/yum/master/yum_distributor/3fd459f6-6505-4e33-8a1b-1827549bc982/1590532553.46/repodata -ltrh
total 584M
-rw-r--r-- 1 apache apache  16M May 26 15:43 300d047dddbb3a83a0a3e9b8a08d8497f91eb22c-filelists.xml.gz
-rw-r--r-- 1 apache apache 550M May 26 15:43 e34fd94941faac50f6561934e84cf31ca851abe8-other.xml.gz
-rw-r--r-- 1 apache apache  18M May 26 15:44 ae832adc63fc0200279f726b967c674490c1d215-primary.xml.gz
-rw-r--r-- 1 apache apache 1.3M May 26 15:45 0818342d57eb3904ef535265066555230d4afd34-updateinfo.xml.gz
-rw-r--r-- 1 apache apache 799K May 26 15:45 52565fb3129e9a0973cc4c3a81921a0978b31ba9-comps.xml
-rw-r--r-- 1 apache apache 1.8K May 26 15:45 repomd.xml

@Justin_Sherrill any input on this one? Note that it has been upgraded from 3.12.

Do you only see this with ‘Library’ repos, or do you also see it with Content View repos?

It looks like the one mentioned might have been a Library repo.

Sorry, I didn’t see that there had been a reply.

Yes, it does look like it’s just Library repos, but I have not used CVs extensively in this deployment yet, so I may not have enough data to say with certainty that it’s ONLY Library repos.

Here are the worst offenders and what they are. These repos are on a daily sync plan and I suspect they are receiving regular updates upstream.

root@katello ~ # for D in $(find /var/lib/pulp/published/yum/master/yum_distributor/ -type d -name repodata) ; do echo $D: $(ls $D/*-primary.xml.gz | wc -l) ; done | awk '$NF > 9'
/var/lib/pulp/published/yum/master/yum_distributor/145e1fd0-177e-41ae-87a2-7f4aec39c8c9/1591066490.36/repodata: 11
/var/lib/pulp/published/yum/master/yum_distributor/92197cb1-6c75-48e3-af15-a9c7699fe467/1591066450.8/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/0ed7d1ae-ec03-4cdb-a31c-e775b2abe069/1591066454.11/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/d69b6514-74f8-4efa-9025-fd39ce7ebb41/1591066465.26/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/66120d0a-f4b2-490c-b325-2f50dd6ede83/1591066479.51/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/74fb6149-bcec-4c97-babe-ec81ebf21434/1591066485.55/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/e223c6fe-71d5-4785-b2a4-97404a925945/1591066491.63/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/246b2526-e2ae-4f09-b8d8-cff4ad93f276/1591066493.09/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/6d0b9a88-7b6d-40df-ba92-690195ddd531/1591066568.59/repodata: 11
/var/lib/pulp/published/yum/master/yum_distributor/d14f7ed5-06b3-49a7-a1cb-422226029c19/1591066595.23/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/7f52a640-788e-4837-8445-03e4687e919a/1591066565.71/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/e1dcc716-480c-4a01-87fc-597204edccd2/1591066566.59/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/b044aa3f-029b-49ad-9f2c-ccfacca95842/1591066741.59/repodata: 12
/var/lib/pulp/published/yum/master/yum_distributor/14496987-c263-4c90-a8c3-1ff583ffbbff/1591066734.86/repodata: 12
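
For anyone who wants to repeat the count above without parsing `ls` output, a variant along these lines might be a bit more robust. It is a sketch only: the function name `count_primaries` and its two arguments (root directory, minimum count) are invented for illustration, and it assumes a `find` that supports `-maxdepth`, as GNU find on CentOS does.

```shell
#!/bin/sh
# Report repodata directories holding more than MIN *-primary.xml.gz files.
# Hypothetical helper; usage: count_primaries ROOT MIN
count_primaries() {
    root=$1
    min=$2
    find "$root" -type d -name repodata | while IFS= read -r d; do
        # Arithmetic expansion strips any padding that wc -l may add.
        n=$(( $(find "$d" -maxdepth 1 -type f -name '*-primary.xml.gz' | wc -l) ))
        if [ "$n" -gt "$min" ]; then
            printf '%s: %d\n' "$d" "$n"
        fi
    done
}
```

Running `count_primaries /var/lib/pulp/published/yum/master/yum_distributor/ 9` should produce the same kind of `path: count` lines as the loop above.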

root@katello ~ # for D in $(find /var/lib/pulp/published/yum/master/yum_distributor/ -type d -name repodata) ; do echo $D: $(ls $D/*-primary.xml.gz | wc -l) ; done | awk '$NF > 9' | cut -d / -f 9 | xargs -I{} pulp-admin rpm repo list --repo-id {} --fields id,display_name | grep -e ^Id -e ^Display
Id:            145e1fd0-177e-41ae-87a2-7f4aec39c8c9
Display Name:  Oracle Linux 7 Optional (x86_64)
Id:            92197cb1-6c75-48e3-af15-a9c7699fe467
Display Name:  Oracle Linux 7 Add ons (x86_64)
Id:            0ed7d1ae-ec03-4cdb-a31c-e775b2abe069
Display Name:  Oracle Linux 7 Security Validation (x86_64)
Id:            d69b6514-74f8-4efa-9025-fd39ce7ebb41
Display Name:  UEK Release 5 for Oracle Linux 7 (x86_64)
Id:            66120d0a-f4b2-490c-b325-2f50dd6ede83
Display Name:  UEK Release 4 for Oracle Linux 7 (x86_64)
Id:            74fb6149-bcec-4c97-babe-ec81ebf21434
Display Name:  Oracle Linux 6 Add ons (x86_64)
Id:            e223c6fe-71d5-4785-b2a4-97404a925945
Display Name:  Oracle Linux 6 Security Validation (x86_64)
Id:            246b2526-e2ae-4f09-b8d8-cff4ad93f276
Display Name:  UEK Release 4 for Oracle Linux 6 (x86_64)
Id:            6d0b9a88-7b6d-40df-ba92-690195ddd531
Display Name:  UEK Release 3 for Oracle Linux 6 (x86_64)
Id:            d14f7ed5-06b3-49a7-a1cb-422226029c19
Display Name:  UEK Release 2 for Oracle Linux 6 (x86_64)
Id:            7f52a640-788e-4837-8445-03e4687e919a
Display Name:  Foreman Client el7 (x86_64)
Id:            e1dcc716-480c-4a01-87fc-597204edccd2
Display Name:  Foreman Client el6 (x86_64)
Id:            b044aa3f-029b-49ad-9f2c-ccfacca95842
Display Name:  CentOS 8 BaseOS (x86_64)
Id:            14496987-c263-4c90-a8c3-1ff583ffbbff
Display Name:  CentOS 8 PowerTools (x86_64)

Let me do a little testing, maybe reach out to the Pulp developers, and get back to you.