Is it possible to ignore/skip the file .treeinfo at the top of a remote YUM repository?

Problem:
We have an external repository, which is extremely (hmm, too extremely) strict with its HTTP permissions. You can neither browse it nor access the .treeinfo file. You can, however, access the repomd.xml file, which lists all the packages in it (and you can download them).

Unfortunately, when pulling the repository it stops with this error:
Katello::Errors::Pulp3Error: 403, message='Forbidden', url=URL('http://repo.somewhere.net/yum/abc-1.0/el7-x86_64/.treeinfo')

We used Spacewalk before and it was able to replicate the repo. Is there any setting I can change to allow Pulp to mirror such a repository?

Expected outcome:
Pulp should be able to continue and go for the /repodata/repomd.xml file.

Foreman and Proxy versions:
Foreman 2.2.1, Katello 3.17

Foreman and Proxy plugin versions:

Distribution and version:
CentOS 7.9

Other relevant data:

Hey @rbremer,

Sorry to hear about the trouble. What content type is the repository?

-John

I see it’s yum now :slight_smile: @Justin_Sherrill have you seen anything like this with pulp3?

Indeed, yum. I can give you the real URL as a PM, if you need it.

I’m also having the same issue. Here’s the output of Hammer:

$ sudo hammer repository synchronize --id 384
[......................................................................................................................] [100%]
Total steps: 0/0
--------------------------------
Error: 403, message='Forbidden', url=URL('http://download.zfsonlinux.org/epel/7.9/kmod/x86_64/.treeinfo')

Is there any other info I can provide that might be useful?

It seems that the sync is a bit too stringent in how it treats this kind of error. If it were a 404, it would probably move on to the next attempted fetch and treat that as non-fatal (because not every repo is going to have a .treeinfo file).

But because this is one of those S3-based repos, we get a 403 for trying to fetch a resource that isn’t available (which, technically speaking, goes against the HTTP spec, but that’s neither here nor there).

This gets treated as a fatal error, and all processing stops, which seems a bit extreme to me. It should at least try to fetch /repodata/repomd.xml, which will succeed for this particular repo:

$ curl -skI http://download.zfsonlinux.org/epel/7.9/kmod/x86_64/repodata/repomd.xml
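To make the behaviour I'm asking for concrete, here is a minimal sketch in Python. It is not Pulp's or Katello's actual code: `plan_sync`, `get_status` and `TREEINFO_ABSENT` are names I made up for illustration, and the status codes are canned rather than fetched. The idea is simply that a 403 or 404 on .treeinfo means "not a kickstart repo", while a failure on repomd.xml stays fatal.

```python
from typing import Callable

# Assumption: statuses that mean "no .treeinfo here" rather than a real failure.
# 403 is included because S3-style hosts answer 403 for objects that do not exist.
TREEINFO_ABSENT = {403, 404}


def plan_sync(base_url: str, get_status: Callable[[str], int]) -> str:
    """Return "kickstart" if .treeinfo is readable, "plain" if it is absent.

    get_status is a hypothetical callable that maps a URL to an HTTP status code.
    """
    base = base_url.rstrip("/")

    treeinfo = get_status(f"{base}/.treeinfo")
    if treeinfo != 200 and treeinfo not in TREEINFO_ABSENT:
        # Anything else (5xx, 401, ...) is still treated as a real error.
        raise RuntimeError(f".treeinfo returned {treeinfo}")

    # repomd.xml is mandatory for a yum repo, so failures here stay fatal.
    repomd = get_status(f"{base}/repodata/repomd.xml")
    if repomd != 200:
        raise RuntimeError(f"repomd.xml returned {repomd}")

    return "kickstart" if treeinfo == 200 else "plain"


# Canned responses standing in for the zfsonlinux repo described above.
fake = {
    "http://download.zfsonlinux.org/epel/7.9/kmod/x86_64/.treeinfo": 403,
    "http://download.zfsonlinux.org/epel/7.9/kmod/x86_64/repodata/repomd.xml": 200,
}
print(plan_sync("http://download.zfsonlinux.org/epel/7.9/kmod/x86_64/", fake.__getitem__))
```

With the canned responses this prints `plain`, meaning the sync would carry on as an ordinary yum repo instead of aborting.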

Any more info on this one? Has a workaround been found?

I’m working with an internal Artifactory setup which returns 401 if no credentials are given, but 403 for missing files if credentials are given.
This worked fine for us with Foreman 2.0, but now on 2.3 it isn’t able to sync the repo any longer.

@Justin_Sherrill Any update on this?

The .treeinfo file is extremely important as it contains kickstart information, image paths, SHA sums, etc. I don’t think Foreman/Katello is the party that should be “fixed” in this case. It’s up to the mirror administrator to publish the content correctly. Tree info files are required, so I am afraid they need to reconfigure the web service to allow access to them.

If you need an argument, well, here is one:

https://vault.centos.org/8.0.1905/BaseOS/x86_64/kickstart/.treeinfo

This is how it’s done. :slight_smile:

Now I see that this happens with a third-party repository which is not a kickstart repository. In this case, I share your opinion that Pulp/Katello should skip this as a non-fatal error.

This 403 on treeinfo can be seen in the access log when you are using your own proxy:

192.168.2.13 - - [15/Jan/2022:02:40:31 +0100] "GET /centos/7/os/x86_64/.treeinfo HTTP/1.1" 403 3992 "-" "urlgrabber/3.10"
192.168.2.13 - - [15/Jan/2022:02:40:31 +0100] "GET /centos/7/os/x86_64/treeinfo HTTP/1.1" 403 3989 "-" "urlgrabber/3.10"
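If you want to pull these out of a larger access log, something like the following works as a rough sketch. It assumes the common/combined log layout shown in the two lines above; `forbidden_treeinfo_requests` is just a helper name I picked.

```python
import re

# Matches the quoted request and the status code that follows it,
# e.g. "GET /path HTTP/1.1" 403
LOG_LINE = re.compile(r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def forbidden_treeinfo_requests(lines):
    """Return the request paths of treeinfo fetches that were answered with 403."""
    hits = []
    for line in lines:
        m = LOG_LINE.search(line)
        if not m or m.group("status") != "403":
            continue
        # Both ".treeinfo" and "treeinfo" are requested, as in the log above.
        if m.group("path").rsplit("/", 1)[-1] in (".treeinfo", "treeinfo"):
            hits.append(m.group("path"))
    return hits


sample = [
    '192.168.2.13 - - [15/Jan/2022:02:40:31 +0100] "GET /centos/7/os/x86_64/.treeinfo HTTP/1.1" 403 3992 "-" "urlgrabber/3.10"',
    '192.168.2.13 - - [15/Jan/2022:02:40:31 +0100] "GET /centos/7/os/x86_64/treeinfo HTTP/1.1" 403 3989 "-" "urlgrabber/3.10"',
]
print(forbidden_treeinfo_requests(sample))
```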