Problem syncing repos behind a heavily loaded proxy

Problem:
Synchronization fails when random 502 errors occur

Expected outcome:
Would like to either retry 502 failures or synchronize the repo in smaller chunks

Foreman and Proxy versions:
foreman-selinux-1.23.0-1.el7.noarch
foreman-cli-1.23.0-1.el7.noarch
foreman-installer-katello-1.23.0-1.el7.noarch
foreman-proxy-1.23.0-1.el7.noarch
foreman-1.23.0-1.el7.noarch
foreman-postgresql-1.23.0-1.el7.noarch
foreman-installer-1.23.0-1.el7.noarch
foreman-debug-1.23.0-1.el7.noarch
foreman-vmware-1.23.0-1.el7.noarch
python-nectar-1.6.0-1.el7.noarch
pulp-selinux-2.19.1-1.el7.noarch
pulp-puppet-plugins-2.19.1-1.el7.noarch
pulp-docker-plugins-3.2.4-1.el7.noarch
pulp-deb-plugins-1.9.1-1.el7.noarch
pulp-server-2.19.1-1.el7.noarch
pulp-puppet-tools-2.19.1-1.el7.noarch
pulp-client-1.0-1.noarch
pulp-rpm-plugins-2.19.1-1.el7.noarch
pulp-katello-1.0.3-1.el7.noarch

Distribution and version:

Other relevant data:
Initial sync of Ubuntu Universe is failing. The sync goes through a proxy we have no control over, and we occasionally get “failed with code 502: cannotconnect”, which is enough to fail the entire initial sync.

Nov 13 00:16:51 foreman.internal pulp: nectar.downloaders.threaded:INFO: Download succeeded: http://ca.archive.ubuntu.com/ubuntu/pool/universe/g/gcc-7-cross/lib32stdc++-7-dev-s390x-cross_7.3.0-16ubuntu3cross1_all.deb.


Nov 13 16:29:52 foreman.internal pulp: nectar.downloaders.threaded:INFO: Download failed: Download of http://ca.archive.ubuntu.com/ubuntu/pool/universe/g/gcc-7-cross/lib32stdc++-7-dev-s390x-cross_7.3.0-16ubuntu3cross1_all.deb failed with code 502: cannotconnect
Nov 13 16:35:04 foreman.internal pulp: celery.app.trace:ERROR: [f0cd450f] (9977-61920) IOError: [Errno 2] No such file or directory: u'/var/cache/pulp/reserved_resource_worker-0@foreman.internal/f0cd450f-c057-475c-aa32-d8cb99459c53/packages/6b/7c/lib32stdc++-7-dev-s390x-cross_7.3.0-16ubuntu3cross1_all.deb'

Hey @sbeaton, I’m not too sure on this one. Are those all the relevant logs from journalctl (or /var/log/messages)? I see the connection error but not sure why you would get it. Is there a way to try and reproduce locally?

The logs are from /var/log/messages. My working theory is that I am having trouble connecting to one of the multiple IPs associated with ca.archive.ubuntu.com. The 502 errors are frequent but not associated with any one file.
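
To check whether the 502s really are spread across files rather than clustered on a few, the failures can be tallied from the log. This is a minimal sketch that assumes the nectar "Download failed" line format quoted above; the function name `count_failures` is hypothetical and not part of Pulp or nectar.

```python
import re
from collections import Counter

# Matches the "Download of <url> failed with code <n>" part of the nectar log lines.
FAILED = re.compile(r"Download of (?P<url>\S+) failed with code (?P<code>\d+)")


def count_failures(lines, code="502"):
    """Return a Counter mapping each URL to its number of failures with `code`."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match and match.group("code") == code:
            counts[match.group("url")] += 1
    return counts
```

Passing an open handle on /var/log/messages to `count_failures` and printing `most_common(20)` on the result would show whether any single package dominates the failures.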

It would still be nice to have Foreman/Pulp either retry 502 errors or at least fail outright when it gets one. As it stands, the failure only surfaces once Pulp thinks it has finished syncing and then tries to publish (after downloading gigs of data).

@sbeaton is this on the initial sync of the repo or when publishing/promoting content views?

The 502 errors occur during the sync. When publishing/promoting occurs, it fails because it cannot locate the files that failed to download due to the 502. Once this happens, the Pulp cache/staging area is cleared out (meaning I start from scratch again), and I end up with 0 packages in the Foreman repo.

It looks like I have managed to sync the Ubuntu repo by switching to a more local mirror. Apparently the geo mirror for Canada (ca.archive.ubuntu.com) actually points to the US one (us.archive.ubuntu.com), and I do not really have access to our proxy to debug what is happening there.
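
A quick way to confirm that two mirror hostnames resolve to the same servers is to compare their address lists. This is a rough sketch using the standard library's `socket.gethostbyname_ex` (IPv4 only); `shared_addresses` is a hypothetical helper name. Since the proxy performs its own DNS lookups, results from the Foreman host may differ from what the proxy sees.

```python
import socket


def shared_addresses(host_a, host_b, resolver=socket.gethostbyname_ex):
    """Return the sorted IPv4 addresses that both hostnames resolve to.

    `resolver` must return (canonical_name, alias_list, address_list),
    which is the shape socket.gethostbyname_ex returns.
    """
    addresses_a = set(resolver(host_a)[2])
    addresses_b = set(resolver(host_b)[2])
    return sorted(addresses_a & addresses_b)
```

A non-empty result for `shared_addresses("ca.archive.ubuntu.com", "us.archive.ubuntu.com")` would be consistent with the Canadian name pointing at the US mirror.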

It would still be nice to figure out how to handle 502 errors efficiently when they occur, so that we do not needlessly hammer mirror sites (and our proxy server).
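
For illustration, the kind of handling being asked for is a retry with exponential backoff on gateway errors. The sketch below is a standalone client-side example built on Python's `urllib`. It is not an existing Pulp or nectar option, and the name `fetch_with_retry`, its parameters, and the set of retryable codes are assumptions.

```python
import time
import urllib.error
import urllib.request

# Gateway-type errors that are often transient behind a loaded proxy (assumed set).
RETRYABLE = {502, 503, 504}


def fetch_with_retry(url, attempts=5, base_delay=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch `url`, retrying retryable HTTP errors with exponential backoff.

    Returns the response body as bytes. Non-retryable errors, and the last
    retryable error once `attempts` is exhausted, are re-raised.
    """
    for attempt in range(attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code not in RETRYABLE or attempt == attempts - 1:
                raise
            # Wait base_delay, 2*base_delay, 4*base_delay, ... between tries.
            sleep(base_delay * 2 ** attempt)
```

Doubling the delay between attempts gives the proxy time to recover and keeps the extra load on the mirror small, while a 404 or similar error still fails immediately.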

@Justin_Sherrill any thoughts on this issue? Is there anything further we can do to debug?