Katello 4.2.2 - Not syncing all repositories to content proxy

@gvde it sounds like the distribution in pulp3 on the smart proxy is referring to an older repository version. Or else a new repository version was never created.

On your katello server, it’d be interesting to run:

curl --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key https://$SMART_PROXY_HOSTNAME/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/

Where your smart proxy’s hostname replaces $SMART_PROXY_HOSTNAME.

That should include the latest version of the repository. You should see something like:

"latest_version_href":"/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/versions/1/"

Then let's see if there's a publication that was created from it:

curl --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key "https://$SMART_PROXY_HOSTNAME/pulp/api/v3/publications/rpm/rpm/?repository_version=$REPOSITORY_VERSION_HREF"

replacing $REPOSITORY_VERSION_HREF with the latest version href from above.

That will give something like:

{
  "count": 1,
  "next": null,
  "previous": null,
  "results": [
    {
      "pulp_href": "/pulp/api/v3/publications/rpm/rpm/776af70c-f09f-4eee-9d24-fa005bc73020/",
      "pulp_created": "2022-02-08T20:44:08.296472Z",
      "repository_version": "/pulp/api/v3/repositories/rpm/rpm/09b8d64d-2cba-4233-a534-5fa297467b83/versions/1/",
      "repository": "/pulp/api/v3/repositories/rpm/rpm/09b8d64d-2cba-4233-a534-5fa297467b83/",
      "metadata_checksum_type": "sha256",
      "package_checksum_type": "sha256",
      "gpgcheck": 0,
      "repo_gpgcheck": 0,
      "sqlite_metadata": false
    }
  ]
}

So we can get our publication_href of /pulp/api/v3/publications/rpm/rpm/776af70c-f09f-4eee-9d24-fa005bc73020/
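If jq happens to be installed on the Katello server, the two lookups can also be chained into one go; this is just a convenience sketch, the curl calls above are all you really need:

# Convenience sketch only: requires jq; $SMART_PROXY_HOSTNAME as above.
CERT="--cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key"
REPO_HREF=/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/
VERSION_HREF=$(curl -s $CERT "https://$SMART_PROXY_HOSTNAME$REPO_HREF" | jq -r '.latest_version_href')
curl -s $CERT "https://$SMART_PROXY_HOSTNAME/pulp/api/v3/publications/rpm/rpm/?repository_version=$VERSION_HREF" | jq .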

Let's switch over to the Pulp database on the smart proxy:

sudo -u postgres psql pulpcore
pulpcore=# select base_path, publication_id from core_distribution;
                  base_path                   |            publication_id            
----------------------------------------------+--------------------------------------
 Default_Organization/Library/custom/test/zoo | 776af70c-f09f-4eee-9d24-fa005bc73020

In theory you should see the base_path for that repository, alongside the publication_id from the previous step. I'm curious what yours shows.
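On a real proxy that table can be long, so filtering on one of the columns helps; e.g. for the example repository above (a sketch, adjust the pattern to your base_path):

sudo -u postgres psql pulpcore -c \
  "select base_path, publication_id from core_distribution where base_path like '%custom/test/zoo';"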


O.K. RPM repository:

[root@foreman ~]# curl --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key https://foreman-content.example.com/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/  | python -m json.tool
{
    "autopublish": false,
    "description": null,
    "gpgcheck": 0,
    "latest_version_href": "/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/versions/6/",
    "metadata_checksum_type": "sha256",
    "metadata_signing_service": null,
    "name": "1-centos7-epel7-Production-a74843f9-144f-4651-9ef7-23d0c759e259",
    "package_checksum_type": "sha256",
    "pulp_created": "2021-07-07T14:26:57.094511Z",
    "pulp_href": "/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/",
    "pulp_labels": {},
    "remote": null,
    "repo_gpgcheck": 0,
    "retain_package_versions": 0,
    "retained_versions": null,
    "sqlite_metadata": false,
    "versions_href": "/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/versions/"
}

Publication of the latest_version_href:

[root@foreman ~]# curl --cert /etc/pki/katello/certs/pulp-client.crt --key /etc/pki/katello/private/pulp-client.key 'https://foreman-content.example.com/pulp/api/v3/publications/rpm/rpm/?repository_version=/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/versions/6/' | python -m json.tool
{
    "count": 1,
    "next": null,
    "previous": null,
    "results": [
        {
            "gpgcheck": 0,
            "metadata_checksum_type": "unknown",
            "package_checksum_type": "unknown",
            "pulp_created": "2022-02-08T05:00:16.110366Z",
            "pulp_href": "/pulp/api/v3/publications/rpm/rpm/f28363b1-dc93-4809-92e2-04a2e1fb8d57/",
            "repo_gpgcheck": 0,
            "repository": "/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/",
            "repository_version": "/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/versions/6/",
            "sqlite_metadata": true
        }
    ]
}

Searching in the database on the content proxy:

pulpcore=# select base_path, publication_id from core_distribution  where publication_id = 'f28363b1-dc93-4809-92e2-04a2e1fb8d57';
                          base_path                           |            publication_id            
--------------------------------------------------------------+--------------------------------------
 ORG/Production/centos7-epel7/custom/perfsonar/perfsonar-el7 | f28363b1-dc93-4809-92e2-04a2e1fb8d57
(1 row)

That actually all looks fine :frowning: The latest repository version has a publication created and it's associated with a distribution.

So to be clear, when you navigate to https://HOSTNAME/ORG/Production/centos7-epel7/custom/perfsonar/perfsonar-el7/repodata/repomd.xml on the smart proxy, you see a file that doesn't match the same path on the Katello server?

With mirrored metadata, they should match exactly.
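A quick way to compare is to checksum the repomd.xml on both ends; a sketch, assuming the main server is foreman.example.com and using the base_path from your psql output:

for h in foreman.example.com foreman-content.example.com; do
  echo "== $h"
  curl -s "https://$h/pulp/content/ORG/Production/centos7-epel7/custom/perfsonar/perfsonar-el7/repodata/repomd.xml" | sha256sum
done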

I'm curious what your Apache logs look like, and if you see any POST requests to the /pulp/api/v3/publications/rpm/rpm/ endpoint. Something like this might reveal it:

grep POST /var/log/httpd/* | grep 'publications/rpm/rpm'

Yes. repomd.xml files on both servers are different:
https://foreman.example.com/pulp/content/ORG/Production/centos7-epel7/custom/perfsonar/perfsonar-el7/repodata/repomd.xml

<?xml version="1.0" encoding="UTF-8"?>
<repomd xmlns="http://linux.duke.edu/metadata/repo" xmlns:rpm="http://linux.duke.edu/metadata/rpm">
  <revision>1644310078</revision>
  <data type="primary">
    <checksum type="sha256">0703588fcb6583294605ebc1b2c1b6bd5b26c6f77a5ccc391a453d0483f8571a</checksum>
    <open-checksum type="sha256">c354922f4a19b221e7b8cd1409f53f1aec46a9748017f558c32279a330b87f2d</open-checksum>
    <location href="repodata/0703588fcb6583294605ebc1b2c1b6bd5b26c6f77a5ccc391a453d0483f8571a-primary.xml.gz"/>
    <timestamp>1644310078</timestamp>
    <size>70169</size>
    <open-size>513178</open-size>
  </data>
...

https://foreman-content.dkrz.de/pulp/content/DKRZ/Production/centos7-epel7/custom/perfsonar/perfsonar-el7/repodata/repomd.xml

<?xml version="1.0" encoding="UTF-8"?>
<repomd xmlns="http://linux.duke.edu/metadata/repo" xmlns:rpm="http://linux.duke.edu/metadata/rpm">
  <revision>1644248069</revision>
  <data type="primary">
    <checksum type="sha256">f9d176ea6fa1dc923dcaea11834416e9be6eb866296b60dfc50252e9dac1a099</checksum>
    <open-checksum type="sha256">62a3e2d0b6644619d1a6992c2b863b605a006a7357df13d3b8bacfc3a13dcbb1</open-checksum>
    <location href="repodata/f9d176ea6fa1dc923dcaea11834416e9be6eb866296b60dfc50252e9dac1a099-primary.xml.gz"/>
    <timestamp>1644248068</timestamp>
    <size>69657</size>
    <open-size>1024355</open-size>
  </data>
...

As are the other files, the directory structure, and the linked RPMs, which are broken on the content proxy.

There are no such POST requests on the content proxy since the upgrade from Katello 4.1.4 to 4.2.2. (There are some on the main server, if that matters.)

Same problem:

Pulp task error [Errno 2] No such file or directory: './tmpahbdykvv'
[Errno 2] No such file or directory: './tmpw4r8z_9n'
Could not lookup a publication_href for repo 6593
Pulp task error [Errno 2] No such file or directory: './tmpcw3qd5x6'
Could not lookup a publication_href for repo 4590

How can I troubleshoot? Thank you.

That's nowhere near the same problem. Please open a new thread.

This is potentially relevant:

It looks like the upstream repo http://software.internet2.edu/rpms/el7/x86_64/main/ has the same packages listed multiple times in multiple places (see: RPMS/ and packages/ have the same files listed, and the metadata includes both)
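If anyone wants to verify that, a rough check of the upstream primary metadata looks something like this (a sketch; assumes GNU grep and zcat, and the primary filename changes whenever upstream regenerates its metadata):

base=http://software.internet2.edu/rpms/el7/x86_64/main
primary=$(curl -s "$base/repodata/repomd.xml" | grep -oP 'href="\Krepodata/[^"]*primary\.xml\.gz')
# list package basenames that appear more than once in <location href="...">
curl -s "$base/$primary" | zcat | grep -oP 'location href="\K[^"]*' \
  | awk -F/ '{print $NF}' | sort | uniq -c | awk '$1 > 1' | head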

If I had to guess, something like this is happening: Pulp is seeing both packages and only publishing one of them, and either A) the metadata includes both and DNF / Yum are picking "the other one", or B) the metadata includes only one of them and links to the wrong copy.

At the end of the day, this upstream repo metadata is terrible and asking for issues, but if either A or B is true then we need to handle that better on the Pulp side, because we come across a repo like this every month or two.

I don’t have time to confirm that theory at the moment but I’ll investigate further tomorrow.

Seriously: don’t hijack threads about other issues. Open a new thread!

@gvde, I am talking about your issue, please read my comment.

My point is, the metadata itself has a few problems which may contribute to Pulp getting confused. You say that the content proxy only has

    config.repo
    packages/
    repodata/

whereas Yum is giving you a 404 on an RPM with a path prefix of RPMS/:

https://foreman-content.example.com/pulp/content/ORG/Production/centos7-epel7/custom/perfsonar/perfsonar-el7/RPMS/perfsonar-core-4.4.3-1.el7.noarch.rpm: [Errno 14] HTTPS Error 404 - Not Found

If the upstream repo has the exact same packages present twice in the metadata, with one copy in packages/ and one copy in RPMS/, and Pulp is only writing links to one copy of those, but Yum / DNF is picking the other one, then this feels like a potential explanation for your issue. Or maybe not, it could be something else. I’ll investigate further tomorrow morning.

Although even if this isn’t the root cause or the only cause of your issue, we should still try to reach out to get them to fix their metadata.

  1. Your screenshot is a screenshot. It’s almost unreadable.
  2. Please read my issue: My problem is that the repository doesn’t sync metadata between the main Katello server and the content proxy. The repository is looking good on the main server.

Yes, still another topic and not even for this board.

@gvde I have read your issue. You are describing multiple problems and in any event it seems very likely to me that there could be multiple simultaneous causes for the behavior you are seeing. In any event, I am trying to help so maybe chill a bit?

You say the metadata on the content proxy has a “revision” of “1644248069”, which matches the current state of the upstream repo, and likewise the checksum of primary.xml is the same. But the layout is different. The upstream repo and therefore the metadata has an RPMS/ directory and the Pulp mirror does not. Yum hits a 404 on that file. The 404 could either be caused by

A) Yum picks a package but the location_href does not exist in the repository
B) This repo is on_demand, the content proxy is storing the wrong upstream URL for some reason, Pulp tries to download that file (on_demand) but hits a 404 and then proxies that forward to Yum.

Please help me determine whether it is A or B. I promise, we will get to the sync issue, but this is relevant to how that ought to be investigated.
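Two quick checks on the content proxy should tell us which it is (a sketch; the table and column names are from the pulpcore/pulp_rpm versions in this thread, and the filters are just guesses to narrow the output):

# A) which location_href(s) does Pulp actually have for the package yum 404s on?
sudo -u postgres psql pulpcore -c \
  "select name, version, release, location_href from rpm_package where name = 'perfsonar-core';"
# B) which upstream URL and download policy does the proxy's remote for this repo use?
sudo -u postgres psql pulpcore -c \
  "select name, url, policy from core_remote where url like '%perfsonar%';"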

Also, could you provide the Pulp component versions of the currently running processes on the content proxy (pulp status | jq '.versions')?
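If the pulp-cli isn't installed there, the same version information is exposed by the status endpoint; a sketch, queried from the Katello server with the same client certificate as before:

curl -s --cert /etc/pki/katello/certs/pulp-client.crt \
     --key /etc/pki/katello/private/pulp-client.key \
     https://foreman-content.example.com/pulp/api/v3/status/ | python -m json.tool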


Well, maybe we have a basic misunderstanding, but I am describing only one problem: the repository isn't synced from the main server to the content proxy. Any upstream issue doesn't really bother me at the moment, because I ran a metadata republish for the repository on the main server, so there the repository is fully consistent with its metadata.

Yes, because the content proxy doesn't sync from the main server. The content proxy still has the state of the repository as synced from upstream, because it doesn't pick up the repository from the main server.

The server is connected to the content proxy, which still has the upstream (broken) primary with the duplicates. The content proxy only has the packages/ path, not the RPMS/ path. It seems Pulp only puts the second path into its database (i.e. packages/; it probably processes the file in order and the second identical RPM overwrites the first entry) while yum prefers to use the first path (RPMS/). That's why it breaks.

But that's not my problem. While it's annoying and should be fixed upstream and/or handled properly in Pulp, I have "fixed" the issue for the time being by republishing metadata on the main server. Now I have a good presentation of the repository on the main server (even though it's no longer a mirror view), but the content proxy doesn't pick it up. That's currently my problem.

The repository is not on_demand; the content proxy is on_demand. But it's my understanding that this shouldn't matter if I do a complete sync on the content proxy.
Pulp mirrors the repodata metadata but only mirrors the content in packages/ and not the duplicates in RPMS/. Yum picks up the first entry from the metadata to find the RPM, which points to RPMS/, while Pulp only mirrors the second entry in the metadata. The packages/ path is also what's in location_href in the rpm_package table on the content proxy.

It's kind of neither A nor B. It's mirrored upstream metadata containing duplicate entries for all RPMs, one in RPMS/ and one in packages/, with Pulp only picking up/presenting one of those, i.e. packages/. Yum picks the first path from the metadata, which isn't served by Pulp.
As I wrote before: I suspect Pulp reads the metadata entry by entry. First it sees the RPMS/ location_href and puts that into the database. Later it finds the same RPM with the exact same name and exact same checksum/hash, but with the packages/ location_href, and changes the previous entry to the latest location_href, probably because it thinks the repository has been reorganized…

I get why this happens. There is even a unique constraint on the rpm_package table in the database enforcing this:

    "rpm_package_name_epoch_version_relea_c9003ffa_uniq" UNIQUE CONSTRAINT, btree (name, epoch, version, release, arch, checksum_type, "pkgId")

There is no “pulp” command… I can offer you this:

# rpm -qa *pulp* | sort
pulpcore-selinux-1.2.7-1.el7.x86_64
python3-pulp-ansible-0.9.0-2.el7.noarch
python3-pulp-certguard-1.4.0-3.el7.noarch
python3-pulp-container-2.8.1-0.2.el7.noarch
python3-pulpcore-3.14.9-1.el7.noarch
python3-pulp-deb-2.14.1-2.el7.noarch
python3-pulp-file-1.8.2-2.el7.noarch
python3-pulp-rpm-3.14.8-1.el7.noarch
tfm-rubygem-smart_proxy_pulp-3.1.0-1.fm2_6.el7.noarch

It's the latest available version for Katello 4.2 / Foreman 3.0…

But all this doesn’t explain why the content proxy doesn’t pick up the perfectly correct, republished repository from the main server.

It's fair to focus on the second issue (the sync issue), but it doesn't make discussion of the first issue irrelevant. Piecing together a step-by-step picture of what happened is important to actually resolving your problems if the two issues are interrelated somehow. I'm just trying to be methodical, not ignore your concerns. And it sounds like the 404s came first, and the sync issues were noticed when trying to work around that.

Anyway - I just remembered a discussion from a few months ago and it seems to describe what is happening here pretty well.

TL;DR, the repository (probably) synced perfectly fine, it just didn't replace the metadata because it thought everything was the same. And this is addressed in a newer version, but wasn't backported because it's tied in with a bunch of significant behavior changes that didn't fit the risk profile for a pre-emptive backport.

That explains why only this repo was affected, and only after regenerating the metadata and trying to sync again.

So the workaround would be: just regenerate the metadata on the content proxy, too.

I will see if the relevant portion of that PR can be split off and backported to 3.14.

Upstream issues filed:

Well, republishing the metadata of a repository followed by a complete sync of the content proxy has been my fail-safe workaround for a variety of issues in the past. If that doesn't work here, it bothers me, because it means I may end up in a state where a quick workaround isn't possible and I potentially cannot use some repositories for a while.

There are only two functions for the content proxy: "Optimized Sync" and "Complete Sync". I thought "Complete Sync" would ignore previously mirrored metadata and pick up the version from the main server. But it seems it does not.
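For reference, the hammer equivalents of those two buttons should be something like the following (a sketch; I'm assuming --skip-metadata-check is the option behind the "Complete Sync" button):

# optimized sync (the default)
hammer capsule content synchronize --name foreman-content.example.com
# complete sync, assuming --skip-metadata-check maps to the "Complete Sync" button
hammer capsule content synchronize --name foreman-content.example.com --skip-metadata-check true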

So your workaround is unfortunately not possible with standard Foreman/Katello means…

I may take another go at a Katello 4.3 upgrade tonight, which comes with Pulp 3.16. However, during my three previous, failed attempts I got a number of backtraces on a couple of repository syncs, after which I reverted the servers to my pre-upgrade 4.1 snapshot…

@gvde, You should be able to install the pulp-cli on your capsule, and manually publish this one repo.

And we’ll roll out a patch that prevents mirroring the broken metadata in the first place.
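If installing the pulp-cli is awkward, the equivalent is a POST against the publications endpoint this thread has been querying; a sketch using the latest version href from your earlier output (the distribution would then still need to be pointed at the new publication, which Katello normally handles):

curl -s --cert /etc/pki/katello/certs/pulp-client.crt \
     --key /etc/pki/katello/private/pulp-client.key \
     -H 'Content-Type: application/json' \
     -d '{"repository_version": "/pulp/api/v3/repositories/rpm/rpm/60d89e2e-a443-4d56-9c67-e191fd0739d4/versions/6/"}' \
     https://foreman-content.example.com/pulp/api/v3/publications/rpm/rpm/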

I don't really like using "low-level" tools, as I don't really know what else it may affect if you make a change in Pulp which is not matched in the Katello database.

And still, it’s bothering me that the complete sync on the content proxy/capsule is not picking up the new repository from the main server. I thought that’s the whole purpose of the complete sync on the capsule to ignore the metadata but get everything again from the main server. As I wrote before, in the past that was my fail-safe for a couple of bugs.

@gvde, It shouldn't hurt anything to just republish the repo on the content proxy. Katello doesn't really manage them; it just tells them to sync occasionally.


I get it, it's unexpected behavior; that's why it was fixed last October. But it ended up being a sizable patch, built on top of a few other sizable patches, which don't make sense to backport to 4.2.

You said you would attempt to upgrade to Katello 4.3, are you still facing blockers there? What kind of exceptions were you hitting?

I had to deal with a couple of other things, which is why I wasn't able to try the upgrade again sooner. Last Friday I tried again but immediately ran into Katello 4.3 - Repo Sync Error - Errno1 Operation Not Permitted and PulpRpmClient::ApiError HTTP 500 during sync repository.

As I wasn't sure how much time I'd have this week to deal with these, and the sync error affected CentOS 7 Base on my system, I reverted back to 4.2, where everything is working at the moment except for that one repository, which is only used on a single host, and that I can live with…