Updating Foreman 2.2.1 to 2.3.1 and Katello 3.17.1 to 3.18 issue

rbremer · December 21, 2020, 3:14pm

So to me that looks like yum repos are served via Pulp 3.

But why is the pulpcore-content then trying to serve it from a Pulp 2 directory?

Justin_Sherrill · December 22, 2020, 5:46pm

I think things may have gotten really off track here. A couple of notes:

Do not disable the pulp3 repository, even if you’re still on pulp2. Upgrades will not be happy and will likely leave you in a broken state.
You hit a dependency issue with ‘pulp-consumer-client’. I’m not sure why this package was even installed; it conflicts with katello-agent and should not be used or installed.
--foreman-proxy-plugin-pulp-pulpcore-enabled false should NOT be run unless you really know what you’re doing.
What makes you think that pulp3 is trying to serve content out of the pulp2 directory? /var/lib/pulp/media/artifact is a pulp3 directory.

It looks to me like there is some content in pulp3 that went missing for some reason. Can you try to ‘repair’ the repository?

Go to Content > Products > click a product > click the repository > in the top right click the drop down > click ‘Verify Content Checksum’

see if that helps at all

rbremer · December 22, 2020, 7:22pm

Hi Justin,

thank you for your assistance here. To clarify a few of your points:

During the original install and all updates since then I always had all repositories as shown enabled.
I do not know why the package pulp-consumer-client was installed. I only installed the additional plugin for VMware; everything else was handled by installing Katello in the first place (I think it was 3.16). Katello-Agent was never installed.
I did not use the switch during install or update.
The reason I am saying that is the error in /v/l/m, but I am really not sure how to say that differently. The content seems to live in /var/lib/pulp/docroot/…, while the error shows me /var/lib/pulp/media/…

Let’s go into a bit more detail on the last point, as this is the situation I am faced with right now. Before the upgrade I had no errors using the Katello instance. I have 30+ hosts subscribed and all got updates.
After the upgrade, the previously shown error started to appear for all hosts. The directory /var/lib/pulp/media was completely empty. That was yesterday. Interestingly enough, today I see a few packages inside this directory, so I assume the daily repo sync schedule started downloading new packages there and no longer into /var/lib/pulp/docroot. I cannot explain why that would happen.

I was under the assumption that Katello 3.16 would default to pulp 3 for a new install, so I never really gave it a second thought until that error started to appear. I will try the content repair and get back to you.

rbremer · December 22, 2020, 7:37pm

I can verify that the directories underneath /var/lib/pulp/media/artifact were all created this morning during the sync schedule; they did not exist before.
docroot has 137G
media has 2.1G

I repaired the CentOS 7 base repository, here is the output of the task:

Total steps: 8362/8362
--------------------------------
Identify corrupted units: 4181/4181
Repair corrupted units: 4181/4181

and it’s still running. I will do the same with CentOS 7 updates afterwards to see whether anything changes, and give more feedback.

rbremer · December 22, 2020, 7:38pm

BTW: during the repair the size of the media directory has already increased to 6.2G …

Justin_Sherrill · December 22, 2020, 7:53pm

Are there still files in /var/lib/pulp/media/? I think this directory changed in either 3.17 or 3.18. @ekohl do you remember the particulars around this? What was supposed to move those files?

rbremer · December 22, 2020, 8:06pm

They are coming into /var/lib/pulp/media now. The three repositories I repaired have the files placed into there, however, they also still seem to exist in /var/lib/pulp/docroot, as this directory size has not changed:

[root@foreman pulp]# du -sh media/
16G media/
[root@foreman pulp]# du -sh docroot/
137G docroot/

media was empty before last night and docroot was at 137G.

Repairing the CentOS 7 updates repository had this output:

Total steps: 2256/2256
--------------------------------
Identify corrupted units: 1128/1128
Repair corrupted units: 1128/1128

rbremer · December 22, 2020, 8:09pm

In addition, I now have 11400 “tmp*” files inside /var/lib/pulp, created during the repository repair.

Justin_Sherrill · December 23, 2020, 12:17am

oh sorry, yes i meant /var/lib/pulp/docroot. That is the old location. It sounds like the app was reconfigured to serve from /var/lib/pulp/media/, but the files weren’t moved. Will chat with @ekohl when i get a chance. (Probably after the new year though)

The files left in /tmp/ kinda sounds like a pulp bug, it sounds familiar but i couldn’t find an existing issue. Those should be safe to delete now. Probably worth filing an issue at pulp.plan.io for that.
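
Before deleting anything, it may be worth listing the leftovers first. The following is a minimal sketch, assuming the files are regular files that sit directly in /var/lib/pulp and match tmp*, as reported above. The function name list_stale_tmp and the 60-minute age threshold are illustrative choices, not something Pulp provides.

```shell
# List regular files named tmp* directly under a directory that have not
# been modified for more than the given number of minutes.
# Nothing is deleted; this only prints candidates for review.
list_stale_tmp() {
    dir="$1"
    minutes="${2:-60}"
    find "$dir" -maxdepth 1 -type f -name 'tmp*' -mmin "+$minutes" -print
}

# Example (hypothetical threshold of one hour):
#   list_stale_tmp /var/lib/pulp 60
```

The age filter is there so that files belonging to a repair or sync that is still running are not picked up by accident.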

rbremer · December 23, 2020, 8:00am

Thanks Justin, I will check on that. Looking forward to hearing from Ewoud.

ekohl · December 23, 2020, 3:40pm

I wrote about the installation layout in our Puppet module since upstream this was undocumented. Looking at it now, I also see I missed one thing (Correct directory tree in README by ekohl · Pull Request #159 · theforeman/puppet-pulpcore · GitHub).

Note that this layout has only been applied since puppet-pulpcore 2.0.0. You can check the version in /usr/share/foreman-installer/modules/pulpcore/metadata.json yourself. It has shipped since Foreman 2.3, which maps to Katello 3.18. I also wrote a migration that’s supposed to migrate the directory layout:

https://github.com/theforeman/foreman-installer/blob/2.3-stable/hooks/pre/34-pulpcore_directory_layout.rb

require 'pathname'

PULP_ROOT = Pathname.new('/var/lib/pulp')
LEGACY_DIR = PULP_ROOT / 'docroot'
DESTINATION = PULP_ROOT / 'media'

if LEGACY_DIR.directory? && !LEGACY_DIR.symlink?
  logger.debug("Migrating #{LEGACY_DIR} to #{DESTINATION}")
  unless app_value(:noop)
    LEGACY_DIR.rename(DESTINATION)
    LEGACY_DIR.make_symlink(DESTINATION)
  end
end

https://github.com/theforeman/foreman-installer/blob/2.3-stable/hooks/post/34-pulpcore_directory_layout.rb

require 'pathname'

LEGACY_DIR = Pathname.new('/var/lib/pulp/docroot')

if LEGACY_DIR.symlink?
  logger.debug("Removing legacy symlink #{LEGACY_DIR}")
  LEGACY_DIR.unlink unless app_value(:noop)
end

It looks like I followed the upstream paths rather than our own paths. That is a major upgrade bug.

Some backstory: initially there was really no standard deployment layout and our layout was different than upstream. Obviously this is undesirable so over the past months I pushed for a standard that works well for our use case. https://github.com/pulp/pulpcore/pull/799 is not yet merged, but there is at least an approval that it’s the correct layout. This is also part of the SELinux policy.

I opened an untested PR but my working day is not that long so it’ll likely not see progress until the new year.
https://github.com/theforeman/foreman-installer/pull/634

rbremer · December 23, 2020, 4:33pm

Thank you very much for your deep analysis Ewoud!

Of course that makes a lot of sense, and in my case the data was not migrated.
However, the path also looks off to me:
/var/lib/pulp/artifact does not exist on my system; it is in /var/lib/pulp/docroot/artifact.
So even if the post hook had worked, it wouldn’t have migrated anything.

So I assume I can safely continue with the repair of the repositories, as this seems to download the data again into the new standard path /var/lib/pulp/media/.
Edit: I checked the PR and there you clearly mention docroot as the legacy path, so I think this would be correctly working.
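
For anyone who wants to check which state their own system is in, a check along these lines might help. It is a minimal sketch and not part of the installer: the function name pulp_layout_state and the state labels are made up for illustration, and it assumes the artifact subdirectory under docroot and media as seen on my system.

```shell
# Classify the Pulp 3 storage layout under a root directory by checking
# which artifact directories exist.
# Prints one of: migrated, legacy, split, none.
#   migrated: only ROOT/media/artifact exists
#   legacy:   only ROOT/docroot/artifact exists (real directory, not a symlink)
#   split:    both exist as real directories (the state seen in this thread)
#   none:     neither exists
pulp_layout_state() {
    root="${1:-/var/lib/pulp}"
    has_legacy=no
    has_current=no
    # A docroot symlink left behind by the pre hook is not legacy data.
    if [ -d "$root/docroot/artifact" ] && [ ! -L "$root/docroot" ]; then
        has_legacy=yes
    fi
    if [ -d "$root/media/artifact" ]; then
        has_current=yes
    fi
    case "$has_legacy/$has_current" in
        yes/yes) echo split ;;
        yes/no)  echo legacy ;;
        no/yes)  echo migrated ;;
        *)       echo none ;;
    esac
}

# Example:
#   pulp_layout_state /var/lib/pulp
```

It only looks at whether the directories exist, not at how much content they hold, so "split" still needs a du comparison to see where the bulk of the data lives.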

Can I assist with anything? Unfortunately, I don’t have a snapshot of the Foreman 2.2 before the migration to 2.3…

rbremer · December 27, 2020, 3:01pm

Interesting: every time I run “yum update” now, it wants to install katello-agent. I found the following three rpms, which got installed during Katello’s initial install, and have now removed them to avoid getting katello-agent:

pulp-consumer-client
python-pulp-agent-lib
pulp-rpm-handlers

rbremer · December 30, 2020, 9:56am

I tried to repair my repositories. Most I could repair; some I had to recreate because they failed during the repair with “too many open files”. I tried setting ulimit -n to 500000, but the error still showed up. I saw it’s fixed in pulp upstream in version 3.9, but we are on 3.7, so that will not help.
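
One possible reason the ulimit change had no effect: ulimit -n only applies to the shell it is run in and to processes started from that shell, not to services that are already running. A minimal sketch for reading the limit a running process actually has, using /proc/<pid>/limits on Linux. The function name nofile_soft_limit is made up for illustration, and the example assumes the Pulp services run as user pulp.

```shell
# Print the soft "Max open files" limit of a running process (Linux only),
# as reported by the kernel in /proc/<pid>/limits.
nofile_soft_limit() {
    pid="$1"
    # Fields on that line: Max open files <soft> <hard> <units>
    awk '/^Max open files/ { print $4 }' "/proc/$pid/limits"
}

# Example: show the limit for every process owned by user "pulp"
# (assumption: the Pulp services run under that user).
#   for pid in $(pgrep -u pulp); do
#       echo "$pid $(nofile_soft_limit "$pid")"
#   done
```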

I raised another support issue in here regarding the repair not building the metadata and aborting when it does not find files upstream. It was really not an easy process; I wish it were more of a real “repair”.

Only one repo left which even after a recreate doesn’t find all files, weird though. I am working on that one now.

Polle · January 7, 2021, 9:47am

I second that - deleted a repo, re-created it and again, packages are all missing.
Also, again a repair results in too many open files - if my analysis about the 256 directories (00…ff) is correct, that was to be expected maybe.
Conclusion: we’re fully blocked after the 3.18 upgrade (which was supposed to fix a smart proxy sync issue …) and getting a few hundred engineers complaining about not being able to yum update/install some needed package(s) … help …

uberlinuxguy · January 28, 2021, 6:00pm

As a workaround, on a running system, you can run this to raise the open-files limit for the running processes:

for i in `ps axu | grep "^pulp" | awk '{print $2}'`; do sudo prlimit -n8192 -p $i; done

ekohl · January 28, 2021, 6:17pm

Isn’t that what pgrep is for? Any time you use ps | grep you should be thinking about pgrep.

uberlinuxguy · January 28, 2021, 6:18pm

sure, you can probably use pgrep. It’s short hand. So many ways to do it, they are all correct. The point being not how you get there, but the fact that you get there.
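
For reference, a pgrep-based variant of the same workaround might look like the sketch below. It assumes the Pulp processes run as user pulp (the grep "^pulp" above matches any user name starting with pulp). The function name raise_nofile and the DRY_RUN switch are illustrative additions, not part of any tool.

```shell
# Raise the open-files limit for every PID read from stdin, one per line.
# Lines that are not plain numbers are skipped.
# With DRY_RUN=1 the prlimit commands are printed instead of executed.
raise_nofile() {
    limit="$1"
    while read -r pid; do
        case "$pid" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "prlimit --pid $pid --nofile=$limit"
        else
            prlimit --pid "$pid" --nofile="$limit"
        fi
    done
}

# Usage, as root (assumption: Pulp services run as user "pulp"):
#   pgrep -u pulp | raise_nofile 8192
# Preview without changing anything:
#   pgrep -u pulp | DRY_RUN=1 raise_nofile 8192
```

Like the one-liner above, this only changes processes that are currently running; the limit is back to its configured value after a service restart.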

macado · February 9, 2021, 10:38pm

I believe I am running into this same issue. When I try to sync Red Hat Enterprise Linux 8 for x86_64 - AppStream RPMs 8 I get this error.

I believe it has to do with files being split between /var/lib/pulp/docroot and /var/lib/pulp/media.

[Errno 2] No such file or directory: '/var/lib/pulp/media/artifact/c6/ad96d3b545d1d24026446a88eb660b31d49827d5d0b66dc33b412ce7a3da54'
Error message: the server returns an error
HTTP status code: 400
Response headers: {"date"=>"Tue, 09 Feb 2021 22:31:53 GMT", "server"=>"gunicorn/20.0.4", "content-type"=>"application/json", "vary"=>"Accept,Cookie", "allow"=>"GET, PUT, PATCH, DELETE, HEAD, OPTIONS", "x-frame-options"=>"SAMEORIGIN", "content-length"=>"62", "via"=>"1.1 puppetmaster-prod-01", "connection"=>"close"}
Response body: {"publication":["Invalid hyperlink - Object does not exist."]}

[Errno 2] No such file or directory: '/var/lib/pulp/media/artifact/c6/ad96d3b545d1d24026446a88eb660b31d49827d5d0b66dc33b412ce7a3da54'
Error message: the server returns an error
HTTP status code: 400
Response headers: {"date"=>"Tue, 09 Feb 2021 22:35:23 GMT", "server"=>"gunicorn/20.0.4", "content-type"=>"application/json", "vary"=>"Accept,Cookie", "allow"=>"GET, PUT, PATCH, DELETE, HEAD, OPTIONS", "x-frame-options"=>"SAMEORIGIN", "content-length"=>"62", "via"=>"1.1 puppetmaster-prod-01", "connection"=>"close"}
Response body: {"publication":["Invalid hyperlink - Object does not exist."]}
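
To see whether the file named in the error exists in the old tree instead, something like the following sketch could be used. It assumes the legacy tree mirrors the new one, that is docroot/artifact/<xx>/<rest> next to media/artifact/<xx>/<rest>. The function name locate_artifact is made up for illustration.

```shell
# Given a Pulp root and an artifact path relative to the artifact directory
# (for example "c6/ad96..."), report which tree contains the file.
# Prints one of: media, docroot, both, missing.
locate_artifact() {
    root="$1"
    rel="$2"
    in_media=no
    in_docroot=no
    if [ -f "$root/media/artifact/$rel" ]; then in_media=yes; fi
    if [ -f "$root/docroot/artifact/$rel" ]; then in_docroot=yes; fi
    case "$in_media/$in_docroot" in
        yes/yes) echo both ;;
        yes/no)  echo media ;;
        no/yes)  echo docroot ;;
        *)       echo missing ;;
    esac
}

# Example, with the relative path copied from the error message:
#   locate_artifact /var/lib/pulp c6/ad96d3b545d1d24026446a88eb660b31d49827d5d0b66dc33b412ce7a3da54
```

A result of "docroot" would point at the unmigrated layout discussed earlier in the thread, while "missing" would suggest the content has to be downloaded again.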

foreman-2.3.0-0.7.rc1
katello 3.18-rc1
pulp-server-2.21.4
python3-pulpcore-3.7.3

Distribution and version:
RHEL 7.9

rbremer · February 13, 2021, 10:15am

You do. For me there was no real solution, just a lot of trial and error. I had to repair all repositories (products->repo->validate), which unfortunately failed for some. The repair is not really a repair: if it doesn’t find a local file, it just aborts. So for a few repos I had to create a new one and download everything again. EPEL gave me the most headaches.