Leapp upgrade with 3.11/4.13 fails

Problem:
To prepare for the inevitable, I tested upgrading our main Foreman/Katello server from AlmaLinux 8 to AlmaLinux 9, following the "Upgrading Foreman to 3.11" guide.

However, it fails during target_userspace_creator with:

OSError: [Errno 24] Too many open files

I have tried increasing the limit with ulimit -n or in /etc/security/limits.conf but it still fails at the same place.
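
A limit raised with ulimit -n only applies to that shell and the processes started from it, and /etc/security/limits.conf is normally applied by pam_limits when a new session is opened, so it is worth confirming what a process started from the leapp terminal actually sees. A minimal Python sketch, assuming a Linux host with /proc mounted; the helper names nofile_limits and open_fd_count are made up for illustration and are not part of leapp:

```python
import os
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def open_fd_count(pid="self"):
    """Count open file descriptors via /proc/<pid>/fd (Linux only).

    Returns None when /proc is not available.
    """
    fd_dir = "/proc/{}/fd".format(pid)
    if not os.path.isdir(fd_dir):
        return None
    return len(os.listdir(fd_dir))


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print("soft limit:", soft)
    print("hard limit:", hard)
    print("descriptors open right now:", open_fd_count())
```

Running this from the same shell that will start leapp shows whether the raised limit is really in effect there. Passing the PID of a running process to open_fd_count (as root) gives a rough view of how close that process is to its limit.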

Expected outcome:
No error.

Foreman and Proxy versions:
Foreman 3.11, Katello 4.13 (latest).

Distribution and version:
AlmaLinux 8.9

Other relevant data:

====> * target_userspace_creator
        Initializes a directory to be populated as a minimal environment to run binaries from the target system.
AlmaLinux 9 - BaseOS                             18 MB/s |  15 MB     00:00    
AlmaLinux 9 - AppStream                          20 MB/s |  15 MB     00:00    
Foreman 3.11                                    5.8 MB/s | 1.7 MB     00:00    
Foreman plugins 3.11                            9.3 MB/s | 1.9 MB     00:00    
Puppet 7 Repository el 9 - x86_64               6.7 MB/s | 6.5 MB     00:00    
Katello 4.13                                    1.9 MB/s | 309 kB     00:00    
Candlepin: an open source entitlement managemen 711 kB/s | 108 kB     00:00    
pulpcore: Fetch, Upload, Organize, and Distribu 2.2 MB/s | 445 kB     00:00    
Dependencies resolved.
================================================================================================
 Package                       Arch    Version                       Repository             Size
================================================================================================
Installing:
 dnf                           noarch  4.14.0-9.el9.alma.1           almalinux9-baseos     468 k
 dnf-plugins-core              noarch  4.3.0-13.el9                  almalinux9-baseos      36 k
...
  xz-libs-5.2.5-8.el9_0.x86_64                                                  
  zlib-1.2.11-40.el9.x86_64                                                     

Complete!
Process Process-455:
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/leapp/libraries/stdlib/__init__.py", line 185, in run
    stdin=stdin, env=env, encoding=encoding)
  File "/usr/lib/python3.6/site-packages/leapp/libraries/stdlib/call.py", line 174, in _call
    ep = EventLoop()
OSError: [Errno 24] Too many open files

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/etc/leapp/repos.d/system_upgrade/common/libraries/overlaygen.py", line 531, in _mount_dnf_cache
    yield cache_mount
  File "/etc/leapp/repos.d/system_upgrade/common/libraries/overlaygen.py", line 600, in create_source_overlay
    yield overlay
  File "/etc/leapp/repos.d/system_upgrade/common/actors/targetuserspacecreator/libraries/userspacegen.py", line 1255, in perform
    _create_target_userspace(context, indata, indata.packages, indata.files, target_repoids)
  File "/etc/leapp/repos.d/system_upgrade/common/actors/targetuserspacecreator/libraries/userspacegen.py", line 1117, in _create_target_userspace
    _prep_repository_access(context, target_path)
  File "/etc/leapp/repos.d/system_upgrade/common/actors/targetuserspacecreator/libraries/userspacegen.py", line 625, in _prep_repository_access
    _copy_certificates(context, target_userspace)
  File "/etc/leapp/repos.d/system_upgrade/common/actors/targetuserspacecreator/libraries/userspacegen.py", line 558, in _copy_certificates
    files_owned_by_rpms = _get_files_owned_by_rpms(target_context, '/etc/pki', recursive=True)
  File "/etc/leapp/repos.d/system_upgrade/common/actors/targetuserspacecreator/libraries/userspacegen.py", line 322, in _get_files_owned_by_rpms
    result = context.call(['rpm', '-qf', os.path.join(dirpath, fname)])
  File "/etc/leapp/repos.d/system_upgrade/common/libraries/mounting.py", line 168, in call
    return run(self.type.make_command(cmd), *args, **kwargs)
  File "/usr/lib/python3.6/site-packages/leapp/libraries/stdlib/__init__.py", line 213, in run
    'process-result', {'id': _id, 'parameters': args, 'result': audit_result, 'env': env}
  File "/usr/lib/python3.6/site-packages/leapp/utils/audit/__init__.py", line 394, in create_audit_entry
    'data': data
  File "/usr/lib/python3.6/site-packages/leapp/utils/audit/__init__.py", line 88, in store
    with get_connection(db) as connection:
  File "/usr/lib/python3.6/site-packages/leapp/utils/audit/__init__.py", line 74, in get_connection
    return create_connection(cfg.get('database', 'path'))
  File "/usr/lib/python3.6/site-packages/leapp/cli/commands/upgrade/util.py", line 26, in wrapper
    return f(*args, **kwargs)
  File "/usr/lib/python3.6/site-packages/leapp/utils/audit/__init__.py", line 61, in create_connection
    return _initialize_database(sqlite3.connect(path))
sqlite3.OperationalError: unable to open database file

During handling of the above exception, another exception occurred:
...
=========================================================================================================
Actor target_userspace_creator unexpectedly terminated with exit code: 1 - Please check the above details
=========================================================================================================
...
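
Reading the traceback above, _get_files_owned_by_rpms appears to call rpm -qf once for every file it finds under /etc/pki, and the first failure is raised while setting up one of those calls. Whether each call leaves a descriptor open is not established here, but if it does, the number of files under /etc/pki would decide whether the limit is reached. A rough Python sketch for counting them; count_files is a hypothetical helper, not leapp code:

```python
import os


def count_files(root):
    """Count regular directory entries reported as files below root."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


if __name__ == "__main__":
    # /etc/pki is the directory named in the traceback.
    print("files under /etc/pki:", count_files("/etc/pki"))
```

Comparing that number with the output of ulimit -n gives a first idea of whether the file count alone could explain the error.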

leapp rpms used:

leapp-0.17.0-1.1.el8.noarch
leapp-data-almalinux-0.4-1.el8.noarch
leapp-deps-0.17.0-1.1.el8.noarch
leapp-upgrade-el8toel9-0.20.0-2.2.el8.noarch
leapp-upgrade-el8toel9-deps-0.20.0-2.2.el8.noarch
python3-leapp-0.17.0-1.1.el8.noarch

Can you try applying the following patch and see if it helps?

Try

ulimit -n 50000

in the same terminal you run leapp from.
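
The reason the terminal matters is that the limit is a per-process setting inherited by child processes, so it has to be set in whatever process starts the command. The same idea as a Python sketch: raise the soft limit only for the child being launched. run_with_nofile is a hypothetical wrapper, and the value is capped at the current hard limit because only root can raise that:

```python
import resource
import subprocess


def run_with_nofile(cmd, soft):
    """Run cmd with its soft RLIMIT_NOFILE set to soft (capped at the hard limit)."""
    _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        soft = min(soft, hard)

    def set_limit():
        # Executed in the child between fork() and exec(),
        # so the calling process keeps its own limit.
        resource.setrlimit(resource.RLIMIT_NOFILE, (soft, hard))

    return subprocess.run(
        cmd,
        preexec_fn=set_limit,
        stdout=subprocess.PIPE,
        universal_newlines=True,
    )


# Example (not executed here): run_with_nofile(["leapp", "upgrade"], 50000)
```

This is equivalent in effect to running ulimit -n and then the command in one shell; it does not change the limit of any other session.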

Thanks. That patch did the trick. It now ran through.

I had some issues with subscription-manager. I have registered the main server with itself for content, which seems to confuse leapp because it thinks it’s registered with Red Hat. I played around with various options, including --no-rhsm, but eventually I removed subscription-manager before the leapp run.

I also ran dnf autoremove to get rid of the python38 and python39 packages, which caused conflicts. As foreman-installer runs after the reboot, I was hoping it wouldn’t break anything, and so far I haven’t found any problems.

When it booted up again, it installed the subscription-manager rpm again, which re-enabled the el8 repos via Foreman. I suppose I should have taken care of that before the upgrade.

Now I have the kdump.service unit in failed status: kdump: No memory reserved for crash kernel.
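
That kdump message usually means the running kernel was booted without a crashkernel= memory reservation on its command line. A small Python sketch to check, assuming the standard Linux /proc/cmdline; the function name crashkernel_setting is hypothetical:

```python
def crashkernel_setting(cmdline):
    """Return the value of crashkernel= from a kernel command line, or None."""
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            return token.split("=", 1)[1]
    return None


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as fh:
            value = crashkernel_setting(fh.read())
    except OSError:
        value = None
    print("crashkernel:", value if value is not None else "not set")
```

If this prints "not set", the reservation is missing from the boot entry of the upgraded system and kdump has nothing to load into.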

dnf also keeps complaining with "warning: Signature not supported. Hash algorithm SHA1 not available." but doesn’t tell me which signature it is…

But Foreman so far seems to be running without complaints.

I’ll revert the VM back to its original state and do another leapp run next week, when I have more time to test the upgraded VM…