Well, maybe we have a basic misunderstanding, but I am describing only one problem: the repository isn't synced from the main server to the content proxy. Any upstream issue doesn't really bother me at the moment, because I ran a republish of the metadata for the repository on the main server, so on the main server the repository is fully consistent with its metadata.
Yes, because the content proxy doesn't sync from the main server. The content proxy still has the state of the repository as synced from upstream, because it doesn't pick up the repository from the main server.
The server is connected to the content proxy, which still has the upstream (broken) primary with the duplicates. The content proxy only has the packages/ path, not the RPMS/ path. It seems pulp only puts the second path into its database (i.e. packages/; or it processes the file in order and the second identical rpm overwrites the first entry), while yum prefers to use the first path (RPMS/). That's why it breaks.
But that's not my problem. While it is annoying and should be fixed upstream and/or properly handled in pulp, I have "fixed" the issue for the time being by republishing the metadata on the main server. Now I have a good presentation of the repository on the main server (even though it's no longer a mirror view), but the content proxy doesn't pick it up. That's currently my problem.
The repository is not on demand. The content proxy is on demand. But it's my understanding that this shouldn't matter if I do a complete sync on the content proxy.
Pulp mirrors the repodata metadata but only mirrors the content in packages/ and not the duplicates in RPMS/. To find the rpm, yum picks up the first entry from the metadata, which points to RPMS/, while pulp only mirrors the second entry in the metadata. The packages/ path is also what is stored in location_href in the rpm_package table on the content proxy.
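To confirm the duplicate entries in a mirrored primary.xml, something like the following Python sketch could be used. It groups the `<package>` entries by checksum and reports every checksum that is listed under more than one location href. The function name and the sample data (package "example", checksums "abc123" and "def456") are made up for illustration; only the primary.xml element names and the namespace are taken from the repodata format.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace used by the <metadata> root of a repodata primary.xml file.
NS = {"c": "http://linux.duke.edu/metadata/common"}

# Hypothetical sample: "example" is listed twice with the same checksum
# but two different location hrefs; "other" is listed only once.
SAMPLE = """<metadata xmlns="http://linux.duke.edu/metadata/common" packages="3">
  <package type="rpm">
    <name>example</name>
    <checksum type="sha256" pkgid="YES">abc123</checksum>
    <location href="RPMS/example-1.0-1.x86_64.rpm"/>
  </package>
  <package type="rpm">
    <name>example</name>
    <checksum type="sha256" pkgid="YES">abc123</checksum>
    <location href="packages/example-1.0-1.x86_64.rpm"/>
  </package>
  <package type="rpm">
    <name>other</name>
    <checksum type="sha256" pkgid="YES">def456</checksum>
    <location href="packages/other-2.0-1.x86_64.rpm"/>
  </package>
</metadata>"""


def duplicate_locations(primary_xml):
    """Map each checksum to its hrefs, keeping only checksums seen more than once."""
    root = ET.fromstring(primary_xml)
    hrefs = defaultdict(list)
    for pkg in root.findall("c:package", NS):
        checksum = pkg.find("c:checksum", NS)
        location = pkg.find("c:location", NS)
        if checksum is None or location is None:
            continue  # skip incomplete entries
        hrefs[checksum.text.strip()].append(location.get("href"))
    return {pkgid: paths for pkgid, paths in hrefs.items() if len(paths) > 1}


if __name__ == "__main__":
    for pkgid, paths in duplicate_locations(SAMPLE).items():
        print(pkgid, paths)
```

A real primary.xml is usually published compressed (for example as primary.xml.gz), so it would have to be decompressed before being passed to this function.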
It's kind of neither nor. It's mirrored upstream metadata containing duplicate entries for all rpms, one in RPMS/ and one in packages/, with pulp only picking up/presenting one of those, i.e. packages/.
yum picks the first path from the metadata, which is the one pulp doesn't present.
As I wrote before: I suspect pulp reads the metadata entry by entry. First it sees the RPMS/ location_href and puts that into the database. Later it finds the same RPM with the exact same name and exact same checksum/hash but with the packages/ location_href, and now changes the previous entry to the latest location_href, probably because it thinks the repository has been reorganized…
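To illustrate what I suspect, here is a minimal Python sketch. It is not pulp's code and not yum's code, just a model of the two behaviors under my assumptions: the store keeps one row per (name, epoch, version, release, arch, checksum type, pkgId) and the last location_href seen wins, while the client takes the first matching entry from the metadata. All names and the sample package are hypothetical.

```python
from typing import NamedTuple, Optional


class PrimaryEntry(NamedTuple):
    name: str
    epoch: str
    version: str
    release: str
    arch: str
    checksum_type: str
    pkg_id: str
    location_href: str


def unique_key(entry):
    # Everything except location_href identifies the package (assumed key).
    return entry[:7]


def store_last_wins(entries):
    """Model of the suspected store: a later duplicate overwrites location_href."""
    table = {}
    for entry in entries:
        table[unique_key(entry)] = entry.location_href
    return table


def client_first_match(entries, name) -> Optional[str]:
    """Model of the assumed client behavior: use the first entry with that name."""
    for entry in entries:
        if entry.name == name:
            return entry.location_href
    return None


# Hypothetical metadata: the same rpm listed under RPMS/ first, packages/ second.
ENTRIES = [
    PrimaryEntry("example", "0", "1.0", "1", "x86_64", "sha256", "abc123",
                 "RPMS/example-1.0-1.x86_64.rpm"),
    PrimaryEntry("example", "0", "1.0", "1", "x86_64", "sha256", "abc123",
                 "packages/example-1.0-1.x86_64.rpm"),
]

if __name__ == "__main__":
    served = set(store_last_wins(ENTRIES).values())
    wanted = client_first_match(ENTRIES, "example")
    print("served paths:", served)
    print("client asks for:", wanted)
    print("request can be satisfied:", wanted in served)
```

In this model the store ends up serving only the packages/ path while the client asks for the RPMS/ path, which matches the breakage I am seeing.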
I get why this happens. There is even an index on the rpm_package table in the database enforcing this:
"rpm_package_name_epoch_version_relea_c9003ffa_uniq" UNIQUE CONSTRAINT, btree (name, epoch, version, release, arch, checksum_type, "pkgId")
There is no "pulp" command… I can offer you this:
# rpm -qa *pulp* | sort
pulpcore-selinux-1.2.7-1.el7.x86_64
python3-pulp-ansible-0.9.0-2.el7.noarch
python3-pulp-certguard-1.4.0-3.el7.noarch
python3-pulp-container-2.8.1-0.2.el7.noarch
python3-pulpcore-3.14.9-1.el7.noarch
python3-pulp-deb-2.14.1-2.el7.noarch
python3-pulp-file-1.8.2-2.el7.noarch
python3-pulp-rpm-3.14.8-1.el7.noarch
tfm-rubygem-smart_proxy_pulp-3.1.0-1.fm2_6.el7.noarch
It's the latest available version for 4.2/3.0…
But all this doesn't explain why the content proxy doesn't pick up the perfectly correct, republished repository from the main server.