Oracle Linux 8 Appstream sync fails

Syncing OL 8 Appstream repo fails for me, too. Exact same message, and increasing the sync_connect_timeout parameter to 600 seconds didn’t help, either.

It would help if there were SOME indication of which action specifically times out. Quite frankly, end users need meaningful error messages to be able to investigate the problem, whether it is an infrastructure or an application issue. Python/Ruby stack traces that do not include any specifics (which URL is being retrieved, etc.) are not good enough.

We set the download concurrency for the OL 8 Appstream repo to 5 using the Hammer CLI and it seems to have fixed the issue. It has successfully synced overnight twice since, so hopefully it works for you too.

hammer repository list

Take note of your Appstream repo id, then replace X with your id:

hammer repository update --id=X --download-concurrency=5
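
If several repositories need the same change, the command above can be generated per repository id. This is only a minimal sketch: the helper name `hammer_update_commands` and the ids 12 and 13 are made up for illustration, and it uses only the `--id`, `--download-concurrency` and `--organization` options that appear in this thread. It prints the commands rather than running them.

```python
import shlex


def hammer_update_commands(repo_ids, concurrency=5, organization=None):
    """Build one `hammer repository update` argv list per repository id."""
    commands = []
    for repo_id in repo_ids:
        argv = [
            "hammer", "repository", "update",
            f"--id={repo_id}",
            f"--download-concurrency={concurrency}",
        ]
        # Some setups require the organization to be named explicitly.
        if organization is not None:
            argv.append(f"--organization={organization}")
        commands.append(argv)
    return commands


# Ids 12 and 13 are placeholders; take the real ids from `hammer repository list`.
for argv in hammer_update_commands([12, 13]):
    print(shlex.join(argv))
```

Printing first makes it easy to review the commands before pasting them into a shell on the Foreman server.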

@dmann
Ah, so maybe it’s some throttling on Oracle’s side, then? I will try tomorrow, thanks for the hint.

We’re getting this for OL8 Appstream too, I’m going to try the download concurrency workaround @dmann has suggested.

Not 100% fixed it for us, still getting some sync failures on both OL8 ‘baseos’ and ‘appstream’ repositories.

So, after having spent a number of hours yesterday chatting with the knowledgeable people over in the Pulp users chat channel, we’ve learnt more about this issue, and come up with a combination of changes which effect a workaround.

The timeout issue is caused by a number of things:

  1. yum.oracle.com/Akamai download throttling - this one can be really obvious: at times I saw really high download speeds for a while, which would later drop down to much slower speeds
  2. The default download timeout in Katello (“Sync Connection Timeout” in Foreman settings, under “Content”) is 300 s, or 5 minutes. When the downloads are being throttled heavily, this is easily hit.
  3. The OL8 appstream repo has quite a few large SRPMs which vary in size from 1.3GiB to 2.5GiB. When “mirror on sync” is enabled (even if you’ve enabled “Ignore SRPMs”) SRPMs get downloaded, as Pulp is providing a complete mirror of the repo.
  4. This is an arguable one…Pulp is meant to retry a download a number of times (defaults to 3), but doesn’t do this when a download hits the timeout period.
  5. If you are performing one of these large syncs, and get part way through before the sync fails with the timeout, it seems everything which was downloaded into the tmp directory is thrown away, and not kept. So, the next time the sync is tried, it has to download everything again.
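
To put numbers on points 2 and 3, here is a rough back-of-the-envelope sketch. It assumes the timeout covers the whole download of a single file (which is what the behaviour described in this thread suggests, but is not confirmed here), and, purely as a hypothetical model, that a throttled total bandwidth is shared evenly between concurrent downloads. The 10 MiB/s figure is invented for illustration.

```python
MIB_PER_GIB = 1024


def min_rate_mib_per_s(size_gib, timeout_s):
    """Lowest average rate (MiB/s) that finishes a file before the timeout."""
    return size_gib * MIB_PER_GIB / timeout_s


def seconds_needed(size_gib, total_mib_per_s, concurrency):
    """Download time if total bandwidth is split evenly (an assumption)."""
    per_download_rate = total_mib_per_s / concurrency
    return size_gib * MIB_PER_GIB / per_download_rate


# Largest SRPM mentioned above is about 2.5 GiB.
for timeout_s in (300, 600, 3000):
    rate = min_rate_mib_per_s(2.5, timeout_s)
    print(f"timeout {timeout_s:>4} s: needs >= {rate:.2f} MiB/s sustained")

# Hypothetical throttled link of 10 MiB/s in total.
for concurrency in (10, 5):
    secs = seconds_needed(2.5, 10, concurrency)
    print(f"concurrency {concurrency:>2}: about {secs:.0f} s for 2.5 GiB")
```

With the default 300 s, a 2.5 GiB file needs roughly 8.5 MiB/s sustained for that one download alone, which makes it plausible that throttling pushes large files past the limit.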

So, the combination of changes to manage a workaround:

  1. I set the timeout to 3000 seconds, but without the other changes this still wasn’t enough, because even that timeout (50 minutes!) was getting hit.
  2. Change concurrent downloads from the default of 10 to 5. As with the timeout, this alone wasn’t enough without the other changes.
  3. Change the repository to turn off “Mirror on sync” and turn on “Ignore SRPMs”
  4. Point 3. above then managed to hit another Pulp issue for these repositories, which would cause the sync to fail for another reason, albeit a lot sooner in the sync…
  5. In /etc/pulp/settings.py set “RPM_ITERATIVE_PARSING = False” - this changes the RPM metadata parsing to an older routine, the routine which led to very high RAM usage in pulpcore-worker instances during sync of repositories with large metadata sets, like OL8/RHEL repositories.

All of the above leads to syncs which work, but don’t download SRPMs, so you have to accept that restriction. Also, RAM usage is high during the sync runs of these repositories.
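
For reference, the settings change from step 5 would look roughly like the fragment below. Only the setting name and value come from this thread; the comments are editorial, and it is an assumption (though usual for a settings file like this) that the Pulp services must be restarted before the change takes effect.

```python
# /etc/pulp/settings.py (fragment)
# Fall back to the older, non-iterative RPM metadata parser.
# Trade-off reported in this thread: high RAM usage in pulpcore-worker
# while syncing repositories with large metadata, such as OL8 or RHEL.
RPM_ITERATIVE_PARSING = False
```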

Hmm, that is a workaround with awfully many restrictions and gotchas. Might as well just update using the official Oracle repositories for the time being, then…

I only found one relevant Pulp issue so far: #9233, which I think is important, since resuming from where the previous failed job left off would at least allow progress over time instead of an infinite loop of failed attempts.

I encountered a number of issues downloading/syncing the Oracle repositories, but came up with a workable (yet wasteful) workaround… I have a second server set up which uses reposync/createrepo to pull down all the RPMs from Oracle and then presents them using a standard Apache HTTPD Web Server. Foreman then connects to this server and synchronizes everything normally (without the throttling or timeout issues). Not the most ideal thing, but it does avoid my repositories going to 0 packages due to the failed sync and “Mirror on Sync” being checked.

We already had the same setup as yours in place, but we were still facing the same issue. The only workaround was changing the Download Concurrency. We also changed the Apache config to allow for larger timeouts etc., but it didn’t help.

I’m not entirely sure the Download Concurrency was the only change that enabled it to work, as we also increased the “Sync Connection Timeout” that @John_Beranek mentioned… so it seems to be a combination of both settings that fixed our issue.

Was struggling to sync CentOS 8 Base/Appstream/Epel and Puppet 6.0 on a fresh Foreman/Katello for quite a while now… Steps 1-3 did the trick. For me the major difference was the first step, since I went from 3 to 1 and then tried different timeout settings (3600 was enough) though just changing the timeout settings without steps 2 and 3 did not help.
Thank you.

Hi,

what, exactly, is this “Download Concurrency” parameter you guys are talking about? Is this the foreman_proxy_content_batch_size setting? That one was set to 100 by default in our installation.

By disabling mirror_on_sync, increasing the sync_connect_timeout setting to 3600, and reducing the foreman_proxy_content_batch_size to 5, I now get a completely new error message:

Task 55409e4f-c0d6-4245-aa2f-fa7964d722be failed (Package id from primary metadata (831cbe1e389d947e8015934b72c0dbb5edd8e866), does not match package id from filelists, other metadata (6681ff57427e630e80c63761b02d00455efbe23b))

I found a Pulp 3 issue (8944) that kind of matches the error message and has been marked as “closed - currentrelease”, but I am on python3-pulpcore-3.14.5 and still have this error.

A manually triggered “complete resync” did not help, either. Any ideas?

Kind Regards
Florian

Hi,

No, I think that’s a different setting. Restore it to the original value. Connect to your Foreman master server by ssh and run these commands:

hammer repository list --organization="your_organization"

From the output get the id of the required repository and then run the next command:

hammer repository update --id="your_repository_id" --download-concurrency=5 --organization="your_organization"

This issue is why I had to set “RPM_ITERATIVE_PARSING = False” in Pulp’s settings.

Hey @fbachmann and @John_Beranek, regarding the error “Package id from primary metadata (…), does not match package id from filelists, other metadata (…)”:

I figured out this issue yesterday so it should be fixed in the next release of pulp_rpm. Until then, as a workaround, if you disable “skip SRPMs” (I think this is what the Katello option is called) it should resolve the issue. If you don’t have the repo configured to skip SRPMs in the first place or if turning it off doesn’t help, let me know.

This would also mean that @John_Beranek you should be able to flip RPM_ITERATIVE_PARSING back on

Regretfully, I’ve disabled SRPM downloads for my AppStream repo, and it’s still failing.

Url:                     http://yum.oracle.com/repo/OracleLinux/OL8/appstream/x86_64
Publish Via HTTP:        yes
Published At:            <elided>
Relative Path:           <elided>
Download Policy:         immediate
Ignorable Content Units: srpm

I have 19,752 RPMs downloaded, which matches what I have in my legacy Spacewalk system, but I can’t get beyond this. It fails repeatedly with a Pulp task error, and I haven’t figured out what to pull out of the error log.

@jkalchik Sorry, I was not clear enough. Skipping SRPMs is what triggers this error. If you do not skip the SRPMs (e.g. you do not ignore anything), that would avoid it (I believe)
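
As a toy illustration only (this is not Pulp’s parser, and the real cause is not spelled out in this thread), the sketch below shows one hypothetical way that skipping SRPMs could yield a “does not match” error: if a parser walks the primary and filelists metadata in lockstep but drops source packages from only one of the two streams, the ids drift apart. The package ids and the `check_lockstep` helper are invented for the example.

```python
def check_lockstep(primary_ids, filelists_ids):
    """Raise ValueError at the first position where the id streams disagree."""
    for primary_id, filelists_id in zip(primary_ids, filelists_ids):
        if primary_id != filelists_id:
            raise ValueError(
                f"Package id from primary metadata ({primary_id}), does not "
                f"match package id from filelists, other metadata ({filelists_id})"
            )


# Hypothetical repo: (pkgid, arch) pairs in primary, plain ids in filelists.
primary = [("aaa", "x86_64"), ("bbb", "src"), ("ccc", "x86_64")]
filelists = ["aaa", "bbb", "ccc"]

# Nothing skipped: both streams line up, so no error is raised.
check_lockstep([pkgid for pkgid, _ in primary], filelists)

# SRPMs dropped from primary only: the streams no longer line up.
binary_only = [pkgid for pkgid, arch in primary if arch != "src"]
try:
    check_lockstep(binary_only, filelists)
except ValueError as err:
    print(err)
```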

But I’ve just released the new pulp_rpm, so you could also just wait until Katello builds those packages in a day or two.

Okay. I’ve never checked the box to skip SRPMs before, and this repository has been a pain in my side (if not a little lower south.) I see some updates tonight, but nothing for pulp. I’ll check the repos again in the morning.

I was purposely skipping SRPMs to:

  • Stop the timeouts downloading from yum.oracle.com
  • Save lots of disk space

I eagerly await the pulp_rpm release.

That feature won’t be in this release, but it will be in the very next one.

https://github.com/pulp/pulp_rpm/pull/2115

Well - since it is a change in behavior it may not be backportable. But now that the patch for the parser bug is out, you could disable “mirror” and thereafter the SRPMs will be skipped over as you expect.