OCI container push to Foreman-katello fails for large images

Problem:

I am trying to push an image of about 9 GiB.

podman push registry.gitlab.com/eu-os/workspace-images/eu-os-base-demo/eu-os-demo:latest my-foreman.local/eu_body/eu_os_demo_fedora/eu-os-demo:latest
...
Copying blob 622c9ebd9d02 done   | 
Copying blob fc744af05892 done   | 
Copying blob 7167c79a7535 done   | 
Copying blob 64266ab39f33 done   | 
Copying blob 13b8522a89d2 done   | 
Copying blob 596d46a1191d done   | 
Copying blob 14583fc917a5 done   | 
Error: writing blob: uploading layer chunked: authentication required

Error code: 125

I pushed an alpine image earlier, which worked fine. I used the admin account in both cases.

Expected outcome:

  • there is no error after pulling and pushing
  • I can see the image and its tag in Foreman

Foreman and Proxy versions:

I use Foreman Version 3.15.0.

Package versions

Installed Packages
candlepin-4.4.20-1.el9.noarch
candlepin-selinux-4.4.20-1.el9.noarch
dynflow-utils-1.6.3-1.el9.x86_64
foreman-3.15.0-1.el9.noarch
foreman-cli-3.15.0-1.el9.noarch
foreman-dynflow-sidekiq-3.15.0-1.el9.noarch
foreman-installer-3.15.0-1.el9.noarch
foreman-installer-katello-3.15.0-1.el9.noarch
foreman-postgresql-3.15.0-1.el9.noarch
foreman-proxy-3.15.0-1.el9.noarch
foreman-redis-3.15.0-1.el9.noarch
foreman-release-3.15.0-1.el9.noarch
foreman-selinux-3.15.0-1.el9.noarch
foreman-service-3.15.0-1.el9.noarch
katello-4.17.0-1.el9.noarch
katello-certs-tools-2.10.0-1.el9.noarch
katello-client-bootstrap-1.7.9-2.el9.noarch
katello-common-4.17.0-1.el9.noarch
katello-repos-4.17.0-1.el9.noarch
katello-selinux-5.2.0-1.el9.noarch
python3.12-pulp-ansible-0.24.6-1.el9.noarch
python3.12-pulp-cli-0.32.3-1.el9.noarch
python3.12-pulp-container-2.24.2-2.el9.noarch
python3.12-pulp-deb-3.5.2-1.el9.noarch
python3.12-pulp-glue-0.32.3-1.el9.noarch
python3.12-pulp-ostree-2.4.8-1.el9.noarch
python3.12-pulp-python-3.13.5-1.el9.noarch
python3.12-pulp-rpm-3.29.4-1.el9.noarch
python3.12-pulpcore-3.73.12-1.el9.noarch
rubygem-dynflow-1.9.1-1.el9.noarch
rubygem-foreman-tasks-11.0.0-1.fm3_15.el9.noarch
rubygem-foreman_maintain-1.10.3-1.el9.noarch
rubygem-foreman_remote_execution-16.0.3-1.fm3_15.el9.noarch
rubygem-hammer_cli-3.15.0-1.el9.noarch
rubygem-hammer_cli_foreman-3.15.0-1.el9.noarch
rubygem-hammer_cli_foreman_remote_execution-0.3.2-1.fm3_15.el9.noarch
rubygem-hammer_cli_foreman_tasks-0.0.22-1.fm3_15.el9.noarch
rubygem-hammer_cli_katello-1.17.0-0.1.pre.main.20250514082549git2c8c109.el9.noarch
rubygem-katello-4.17.0-1.el9.noarch
rubygem-pulp_ansible_client-0.24.6-1.el9.noarch
rubygem-pulp_certguard_client-3.73.9-1.el9.noarch
rubygem-pulp_container_client-2.24.2-2.el9.noarch
rubygem-pulp_deb_client-3.5.2-1.el9.noarch
rubygem-pulp_file_client-3.73.9-1.el9.noarch
rubygem-pulp_ostree_client-2.4.8-1.el9.noarch
rubygem-pulp_python_client-3.13.5-1.el9.noarch
rubygem-pulp_rpm_client-3.29.2-1.el9.noarch
rubygem-pulpcore_client-3.73.9-1.el9.noarch
rubygem-smart_proxy_dynflow-0.9.4-1.fm3_14.el9.noarch
rubygem-smart_proxy_pulp-3.4.0-1.fm3_13.el9.noarch

Distribution and version:

AlmaLinux 9.6

Other relevant data:

GitLab once had the same issue:

I have trouble understanding how it was resolved, but that discussion is also about pushes that take longer than the lifetime of the registry token. For GitLab, as far as I understood, the cause seemed to be Cloudflare. I do not use Cloudflare.

I found the code for the token lifetime in Katello:

The lifetime is hardcoded to 3 or 6 minutes.

The upload of my 9 GiB image takes roughly 30 minutes.
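To put rough numbers on that mismatch, here is a back-of-the-envelope sketch. It assumes a constant upload rate and that a token has to stay valid for as long as the upload it authorizes is running; neither is confirmed by the Katello code, and the function name is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: how many MiB fit into one token lifetime,
# assuming a constant upload rate (an assumption, real rates vary).
# Arguments: image size in MiB, observed upload time in minutes,
#            token lifetime in minutes.
mib_within_token() {
    local size_mib="$1" upload_min="$2" token_min="$3"
    # Integer arithmetic, rounds down.
    echo $(( size_mib * token_min / upload_min ))
}

# 9 GiB = 9216 MiB uploaded in roughly 30 minutes (numbers from this thread).
mib_within_token 9216 30 6   # prints 1843
mib_within_token 9216 30 3   # prints 921
```

At that rate, only about 1.8 GiB (6-minute token) or 0.9 GiB (3-minute token) could be transferred before the token expires, which is far less than the 9 GiB image.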

I tried running podman push --log-level=debug ... to get more debug output, but it did not give any more insight.

....
Copying blob 13b8522a89d2 done   | 
Copying blob 596d46a1191d done   | 
Copying blob 14583fc917a5 done   | 
DEBU[1954] Looking up image "registry.gitlab.com/eu-os/workspace-images/eu-os-base-demo/eu-os-demo:latest" in local containers storage 
DEBU[1954] Normalized platform linux/amd64 to {amd64 linux  [] } 
DEBU[1954] Trying "registry.gitlab.com/eu-os/workspace-images/eu-os-base-demo/eu-os-demo:latest" ... 
DEBU[1954] parsed reference into "[overlay@/home/rriemann/.local/share/containers/storage+/run/user/1000/containers]@024fe129075da23167b7040c43caeedce2490d52d2467b2b0f61769494873e5a" 
DEBU[1954] Found image "registry.gitlab.com/eu-os/workspace-images/eu-os-base-demo/eu-os-demo:latest" as "registry.gitlab.com/eu-os/workspace-images/eu-os-base-demo/eu-os-demo:latest" in local containers storage 
Error: writing blob: uploading layer chunked: authentication required
DEBU[1954] Shutting down engines                        
INFO[1954] Received shutdown.Stop(), terminating!        PID=3467450

Thanks for the detailed report, I think you might be spot-on about the token expiration being an issue. I would’ve expected a new token to be provisioned.

We should deal with this soon; I’m a little surprised more people haven’t hit it. I suppose multiple upload attempts might work around the issue, since the uploaded blobs remain in Pulp. Maybe that’s how people are dealing with it.
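If someone wants to try the repeated-attempts idea, a generic retry wrapper could look like the sketch below. The `retry` function and the `RETRY_DELAY` variable are hypothetical names, not part of podman or Foreman. Whether a later attempt actually skips blobs that are already in Pulp depends on the client, so this is a stopgap to experiment with and not a confirmed fix.

```shell
#!/bin/bash
# Hypothetical helper: run a command up to MAX_ATTEMPTS times,
# waiting RETRY_DELAY seconds (default 5) between failed attempts.
# Usage: retry MAX_ATTEMPTS command [args...]
retry() {
    local max_attempts="$1"; shift
    local delay="${RETRY_DELAY:-5}"
    local attempt=1
    while true; do
        # Stop as soon as the command succeeds.
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s..." >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

With the image names from the original report, a call would be:

retry 3 podman push registry.gitlab.com/eu-os/workspace-images/eu-os-base-demo/eu-os-demo:latest my-foreman.local/eu_body/eu_os_demo_fedora/eu-os-demo:latest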

Progress on the issue can be tracked here: 38649

I managed to upload the image today. I used skopeo to copy the image to an oci-archive on the same host that runs Foreman, then copied it from the oci-archive to Foreman successfully. The blobs were already on the server and were not uploaded again, so it took maybe 5 minutes. Podman, by contrast, re-uploaded the blobs every time, which always took 30 minutes.
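For anyone who wants to reproduce this, the two steps can be sketched roughly as follows. The archive path is a made-up example, and pulling from the GitLab registry with the docker:// transport in step 1 is an assumption (copying from local containers storage should work as well). The `run` helper only prints the commands unless DRY_RUN=0 is set, so nothing is transferred by accident.

```shell
#!/bin/bash
# Image names are taken from the original report.
SRC_IMAGE="registry.gitlab.com/eu-os/workspace-images/eu-os-base-demo/eu-os-demo:latest"
DST_IMAGE="my-foreman.local/eu_body/eu_os_demo_fedora/eu-os-demo:latest"
# Hypothetical location of the archive on the Foreman host.
ARCHIVE="/var/tmp/eu-os-demo.oci.tar"

# Print the command by default; execute it only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: store the image as an OCI archive on the Foreman host.
save_to_archive() {
    run skopeo copy "docker://$SRC_IMAGE" "oci-archive:$ARCHIVE"
}

# Step 2: push from the local archive to the Foreman registry.
push_from_archive() {
    run skopeo copy "oci-archive:$ARCHIVE" "docker://$DST_IMAGE"
}
```

Calling save_to_archive and then push_from_archive with DRY_RUN=0 performs the copy. Logging in to the Foreman registry first (for example with skopeo login) is probably needed and is not shown here.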

User João from GitLab wrote a script for testing their issue. Maybe it also helps with debugging here.

Debugging Image Script
#!/bin/bash

repo="jdrpereira/registry-test/issue-361279"

tmp=$(mktemp -d)
echo "Using temporary dir $tmp"
# Abort if the temporary directory is not usable, so later steps never run elsewhere.
cd "$tmp" || exit 1

gcping_bin="gcping_darwin_amd64_latest"
dd_bs="64m"
dd_iflag=""
if [[ $(uname) == 'Linux' ]]; then
    gcping_bin="gcping_linux_amd64_latest"
    dd_bs="64M"
    dd_iflag="iflag=fullblock"
fi

echo "Measuring GCP ping latency with https://github.com/GoogleCloudPlatform/gcping ..."
curl -s https://storage.googleapis.com/gcping-release/$gcping_bin > gcping
chmod +x gcping
./gcping -r us-east1

echo "Measuring internet bandwidth with https://github.com/sivel/speedtest-cli ..."
curl -s https://raw.githubusercontent.com/sivel/speedtest-cli/master/speedtest.py > speedtest-cli
chmod +x speedtest-cli
./speedtest-cli --simple --bytes

echo "Generating random layer files..."
dd if=/dev/urandom of=1GB bs=$dd_bs count=16 $dd_iflag
du -h $tmp/1GB
dd if=/dev/urandom of=2GB bs=$dd_bs count=32 $dd_iflag
du -h $tmp/2GB
dd if=/dev/urandom of=3GB bs=$dd_bs count=48 $dd_iflag
du -h $tmp/3GB

echo "Building image..."
cat <<EOT >> Dockerfile
FROM scratch
ADD 1GB /
ADD 2GB /
ADD 3GB /
EOT
docker build -t registry.gitlab.com/$repo:latest .

echo "Pushing image..."
time docker push registry.gitlab.com/$repo:latest

echo "Cleaning up..."
rm -rf $tmp
docker rmi registry.gitlab.com/$repo:latest

I wanted to give an update: we’re fixing this with a longer token lifetime that is also user-configurable.