I've seen this same behavior in multiple environments we have upgraded.
Before you upgrade to 4.1.x, make sure your smart proxies have at LEAST 16GB of RAM per CPU. By default Pulp runs one pulpcore-worker per CPU, and I am seeing each worker grow to as much as 16GB of RAM during a full sync to the smart proxy. Optimized syncs use far less, but sometimes you have to do a full sync, so you need the headroom.
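For reference, you can see how many workers a smart proxy is running with systemd (they are templated units on my EL8 installs). The installer option for lowering the count is my best guess, so verify it against foreman-installer --full-help before relying on it:

systemctl list-units 'pulpcore-worker@*'                              # one unit per worker
foreman-installer --foreman-proxy-content-pulpcore-worker-count 2     # assumed option name, verify first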
Also note that the symptom is syncs failing for no apparent reason; if you look at the journal/messages on the smart proxy, you will see the Linux OOM killer terminating pulpcore-workers when you do not have enough RAM.
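If you want to confirm it is the OOM killer, the kernel log on the smart proxy will show the kills (the pattern below matches the usual "Out of memory: Killed process" lines):

journalctl -k | grep -iE 'out of memory|oom'   # kernel messages via journald
grep -i oom /var/log/messages                  # or the classic syslog file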
@Viwon 16GB should not be necessary for a single sync, no matter how large the repository is. Which version of pulp_rpm do you have installed? And have you performed any tweaks, like disabling iterative parsing?
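In case it helps, two quick ways to check the installed pulp_rpm version (package naming and certificate handling may differ on your install, so treat these as a sketch):

rpm -qa | grep -i pulp                                                  # installed packages
curl -sk https://localhost/pulp/api/v3/status/ | python3 -m json.tool   # the "versions" list includes pulp_rpm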
I definitely won't disagree with you or say your findings are wrong; I've noticed the big spike in hardware requirements too. I will say, though, that I experienced all kinds of sync problems ("pulp task error" errors) with 4.1 until I changed download-concurrency down to 2:
hammer repository list                                      # get the list of repos I'm syncing
hammer repository update --id=1 --download-concurrency=2    # change the setting per repo
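If you have a lot of repos, a loop saves some typing. This is a sketch using hammer's --no-headers and --fields options; double-check the output format on your version before running it:

for id in $(hammer --no-headers repository list --fields Id); do
  hammer repository update --id="$id" --download-concurrency=2
done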
The sync errors got more exciting in the version I'm currently testing (katello-4.2.0.rc1-1.el8); now I'm getting intermittent
deadlock detected
DETAIL: Process 2496 waits for ShareLock on transaction 18298; blocked by process 2463.
Process 2463 waits for ShareLock on transaction 18361; blocked by process 2496.
HINT: See server log for query details.
CONTEXT: while inserting index tuple (443,4) in relation "rpm_package_pkgId_key"
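As the HINT says, the queries involved are in the PostgreSQL server log on the Katello box; something like this pulls them out with surrounding context (the log path is the EL8 default and may differ on your install):

grep -B2 -A6 'deadlock detected' /var/lib/pgsql/data/log/postgresql-*.log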
@Viwon So a few minutes ago I did come across a repo that causes pathologically bad memory issues that the vast majority of repos don't seem to fall victim to.
Interesting. Our Katello content is all RHEL 8 based, and that is one of the repos I have as well, because we have some people who connect Linux apps to MSSQL databases.
That's good to hear; hopefully it will turn out to be an issue with this particular repository. I did notice it has some duplicate packages with the same versions but different checksums. I doubt that's at all related to this issue, but perhaps it bodes well for finding other issues with this repo.
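For anyone who wants to check their own copy of a repo for those duplicates, this is roughly how I'd look for repeated name-version-release entries (the repo URL is a placeholder; adjust for your mirror):

dnf repoquery --repofrompath=tmp,https://example.com/repo --disablerepo='*' --enablerepo=tmp \
    --qf '%{name}-%{version}-%{release}.%{arch}' | sort | uniq -d   # prints any NVRA that appears more than once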