Katello 4.1.3 OOM-Kill while syncing large yum repo

Problem:
I am currently rebuilding a Katello server and decided to completely resync all external repos as part of that.
Unfortunately, I’m not able to sync the Microsoft yum repo for RHEL 8, because one of the pulpcore-worker processes eats all of the available memory.
I saw an earlier post on nearly the same topic from some months ago, and it looks like the problem was mitigated there.

I have tried it 7 times now, including after a service restart. The restart lowered the idle memory usage a bit, but the process still uses all of the RAM.
Once, the whole machine even crashed :slight_smile:

Expected outcome:
Successful sync of the repository.

Foreman and Proxy versions:
foreman-2.5.3-1.el8.noarch

Foreman and Proxy plugin versions:
katello-4.1.3-1.el8.noarch

createrepo_c version:
python3-createrepo_c-0.17.1-1.el8.x86_64

pip-pulp packages:
pulp-ansible (0.9.0)
pulp-certguard (1.4.0)
pulp-container (2.7.1)
pulp-deb (2.14.1)
pulp-file (1.8.2)
pulp-rpm (3.14.3)
pulpcore (3.14.5)

Distribution and version:
Rocky Linux 8.4

Other relevant data:
This machine has 4 cores and 20GiB of memory (unfortunately also the maximum I can throw at it); the highest memory usage I observed for the pulpcore-worker process was 9.8GiB.

It is also interesting that none of the files in the repodata is particularly large.

I hope I have gathered all the information needed to make it easier to track down the real issue here; if anything more is needed, I will of course provide it :slight_smile:

Okay, it has even peaked at 14.4GiB now
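For anyone who wants to capture these peaks more reliably than by watching top, here is a minimal sketch that reads the kernel's own counters for a process. It assumes a Linux host where /proc/<pid>/status exposes the VmRSS (current resident memory) and VmHWM (peak resident memory) fields; the function names are made up for illustration and the PID of the pulpcore-worker has to be looked up separately.

```python
import re
from pathlib import Path


def parse_status_kib(status_text, field):
    """Return the value of a 'Field:  <n> kB' line in KiB, or None if absent."""
    pattern = rf"^{re.escape(field)}:\s+(\d+)\s+kB\s*$"
    match = re.search(pattern, status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def kib_to_gib(kib):
    """Convert a KiB value to GiB."""
    return kib / (1024 * 1024)


def memory_report(pid):
    """Read current (VmRSS) and peak (VmHWM) resident memory of a process in GiB."""
    text = Path(f"/proc/{pid}/status").read_text()
    report = {}
    for field in ("VmRSS", "VmHWM"):
        kib = parse_status_kib(text, field)
        # A field can be missing (for example for kernel threads), so keep None.
        report[field] = None if kib is None else round(kib_to_gib(kib), 2)
    return report
```

Calling memory_report(pid) with the PID of the worker returns both values in GiB. VmHWM keeps the high-water mark even after the usage drops again, which makes it useful for short spikes.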

@lumarel thanks for posting this. Have you considered reducing the “Pulp bulk load size” setting? Check what that value is set to; I think the default may be too high for your setup.

See this:

Thank you for mentioning this setting, I didn’t know about it until now! I will give it a try.

And @viwon, thank you for mentioning your thread as well; I must have overlooked it :slight_smile:

Okay, I nearly made it, but unfortunately, when the OOM-Killer doesn’t do its thing, I get an “invalid memory alloc request size” error (this is with 5% of the default Pulp bulk load size (100/2000)).
It also looks like the high usage only occurs for a small fraction of the time; most of the time the process only uses about 5GiB.

Welp~ looks like I need more RAM :slightly_smiling_face:

One more update: I upgraded to 28GiB. It doesn’t use all of the memory; it stops at around 16GiB (for the single process) and crashes with the “invalid memory alloc request size” error.

Now getting the same thing on 4.1.3 when syncing content on smart proxies.

invalid memory alloc request size 1073741824
invalid memory alloc request size 1073741824
invalid memory alloc request size 1073741824

Hm… interesting…

Fortunately, I found out that I don’t really need this repo anymore for now, as the CentOS repo now also contains the PowerShell packages.

But it would still be great to improve/fix this.

Running 4.1.3 here with:

  • python3-pulp-rpm-3.14.2-1.el7.noarch
  • python3-pulpcore-3.14.5-2.el7.noarch
  • foreman-2.5.3-1.el7.noarch
  • katello-4.1.3-1.el7.noarch

{“smart_proxy_history_id”=>585,
“pulp_tasks”=>
[{“pulp_href”=>"/pulp/api/v3/tasks/4938482f-20c9-44b8-aa8f-477da8d47b39/",
“pulp_created”=>“2021-09-14T15:48:37.920+00:00”,
“state”=>“failed”,
“name”=>“pulp_rpm.app.tasks.synchronizing.synchronize”,
“logging_cid”=>“7703b243-e783-4dac-abe1-7fc65863c117”,
“started_at”=>“2021-09-14T15:49:26.546+00:00”,
“finished_at”=>“2021-09-14T15:56:28.251+00:00”,
“error”=>
{“traceback”=>
" File “/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py”, line 272, in _perform_task\n" +
" result = func(*args, **kwargs)\n" +
" File “/usr/lib/python3.6/site-packages/pulp_rpm/app/tasks/synchronizing.py”, line 475, in synchronize\n" +
" version = dv.create()\n" +
" File “/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py”, line 151, in create\n" +
" loop.run_until_complete(pipeline)\n" +
" File “/usr/lib64/python3.6/asyncio/base_events.py”, line 484, in run_until_complete\n" +
" return future.result()\n" +
" File “/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py”, line 225, in create_pipeline\n" +
" await asyncio.gather(*futures)\n" +
" File “/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/api.py”, line 43, in __call__\n" +
" await self.run()\n" +
" File “/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/content_stages.py”, line 113, in run\n" +
" d_content.content.save()\n" +
" File “/usr/lib/python3.6/site-packages/pulpcore/app/models/base.py”, line 149, in save\n" +
" return super().save(*args, **kwargs)\n" +
" File “/usr/lib/python3.6/site-packages/django_lifecycle/mixins.py”, line 134, in save\n" +
" save(*args, **kwargs)\n" +
" File “/usr/lib/python3.6/site-packages/django/db/models/base.py”, line 744, in save\n" +
" force_update=force_update, update_fields=update_fields)\n" +
" File “/usr/lib/python3.6/site-packages/django/db/models/base.py”, line 782, in save_base\n" +
" force_update, using, update_fields,\n" +
" File “/usr/lib/python3.6/site-packages/django/db/models/base.py”, line 873, in _save_table\n" +
" result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)\n" +
" File “/usr/lib/python3.6/site-packages/django/db/models/base.py”, line 911, in _do_insert\n" +
" using=using, raw=raw)\n" +
" File “/usr/lib/python3.6/site-packages/django/db/models/manager.py”, line 82, in manager_method\n" +
" return getattr(self.get_queryset(), name)(*args, **kwargs)\n" +
" File “/usr/lib/python3.6/site-packages/django/db/models/query.py”, line 1186, in _insert\n" +
" return query.get_compiler(using=using).execute_sql(return_id)\n" +
" File “/usr/lib/python3.6/site-packages/django/db/models/sql/compiler.py”, line 1377, in execute_sql\n" +
" cursor.execute(sql, params)\n" +
" File “/usr/lib/python3.6/site-packages/django/db/backends/utils.py”, line 67, in execute\n" +
" return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)\n" +
" File “/usr/lib/python3.6/site-packages/django/db/backends/utils.py”, line 76, in _execute_with_wrappers\n" +
" return executor(sql, params, many, context)\n" +
" File “/usr/lib/python3.6/site-packages/django/db/backends/utils.py”, line 84, in _execute\n" +
" return self.cursor.execute(sql, params)\n" +
" File “/usr/lib/python3.6/site-packages/django/db/utils.py”, line 89, in __exit__\n" +
" raise dj_exc_value.with_traceback(traceback) from exc_value\n" +
" File “/usr/lib/python3.6/site-packages/django/db/backends/utils.py”, line 84, in _execute\n" +
" return self.cursor.execute(sql, params)\n",
“description”=>“invalid memory alloc request size 1073741824\n”},
“worker”=>"/pulp/api/v3/workers/d165458f-aec9-48e4-8297-4b6ab3776bad/",
“child_tasks”=>[],
“progress_reports”=>
[{“message”=>“Downloading Metadata Files”,
“code”=>“sync.downloading.metadata”,
“state”=>“completed”,
“done”=>8},
{“message”=>“Downloading Artifacts”,
“code”=>“sync.downloading.artifacts”,
“state”=>“completed”,
“done”=>0},
{“message”=>“Associating Content”,
“code”=>“associating.content”,
“state”=>“canceled”,
“done”=>0},
{“message”=>“Parsed Packages”,
“code”=>“sync.parsing.packages”,
“state”=>“completed”,
“total”=>157,
“done”=>157}],
“created_resources”=>[],
“reserved_resources_record”=>
["/pulp/api/v3/repositories/rpm/rpm/42426df8-9ee3-4ce1-b831-2eaef07fc7ff/",
“/pulp/api/v3/remotes/rpm/rpm/79dff40d-4e99-4709-b96b-d1a19ef5cc53/”]}],
“task_groups”=>[],
“poll_attempts”=>{“total”=>45, “failed”=>1}}

Exception:

Katello::Errors::Pulp3Error: invalid memory alloc request size 1073741824
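To make it easier to see where the task stopped, here is a small sketch that lists the progress reports of a Pulp task and picks out the stages that did not complete. The field names (progress_reports, message, state, done, total) and the sample values are taken from the dump above; the function names and the idea of passing the task in as a plain Python dict are assumptions for illustration.

```python
def stage_summary(task):
    """Return (message, state, done, total) for every progress report of a task."""
    summary = []
    for report in task.get("progress_reports", []):
        summary.append(
            (
                report.get("message"),
                report.get("state"),
                report.get("done"),
                report.get("total"),  # not every report carries a total
            )
        )
    return summary


def unfinished_stages(task):
    """Return the messages of all stages whose state is not 'completed'."""
    return [msg for msg, state, _done, _total in stage_summary(task) if state != "completed"]


# Progress reports copied from the failed task above.
task = {
    "state": "failed",
    "progress_reports": [
        {"message": "Downloading Metadata Files", "code": "sync.downloading.metadata",
         "state": "completed", "done": 8},
        {"message": "Downloading Artifacts", "code": "sync.downloading.artifacts",
         "state": "completed", "done": 0},
        {"message": "Associating Content", "code": "associating.content",
         "state": "canceled", "done": 0},
        {"message": "Parsed Packages", "code": "sync.parsing.packages",
         "state": "completed", "total": 157, "done": 157},
    ],
}

for message, state, done, total in stage_summary(task):
    print(f"{state:<10} done={done} total={total}  {message}")
```

For this task the only unfinished stage is “Associating Content”, which fits the traceback: all 157 packages were parsed, and the failure happened while saving content to the database.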

A Google search suggests that this is a PostgreSQL error, caused by Pulp 3 trying to do too large an INSERT.
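A quick bit of arithmetic supports that reading. The sketch below assumes the limit involved is PostgreSQL's MaxAllocSize, which is defined as 0x3fffffff (1 GiB minus 1 byte) and caps the size of a single ordinary memory allocation; the helper name is made up for illustration.

```python
# PostgreSQL's MaxAllocSize: the largest single ordinary allocation, 1 GiB - 1 byte.
MAX_ALLOC_SIZE = 0x3FFFFFFF


def exceeds_max_alloc(request_bytes):
    """Return True if a single allocation request is larger than MaxAllocSize."""
    return request_bytes > MAX_ALLOC_SIZE


requested = 1073741824  # the size reported in the error message

print(requested == 2**30)            # the request is exactly 1 GiB
print(requested - MAX_ALLOC_SIZE)    # bytes over the limit
print(exceeds_max_alloc(requested))  # whether such a request would be rejected
```

So the reported request is exactly 1 GiB, one byte more than the limit. That would also explain why adding RAM did not help: under this assumption the error is a hard per-allocation cap on the database side, not a shortage of memory on the host.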

Hi,

The last time I had issues with the Microsoft repo was about two months ago. They sometimes have a broken CI pipeline that produces new RPMs every few minutes, and the problem is that all of the resulting changelog entries end up in the package metadata. If your machine goes OOM, check the metadata of that repo and unpack it; you will be surprised that these repo files can be 10-20GB in size, which the compression hides.
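A minimal sketch of that check, assuming the metadata file has already been downloaded and is gzip-compressed (other compression formats would need a different module). It streams the file in chunks so the unpacked data is never held in memory or written to disk; the function name and the example path are hypothetical.

```python
import gzip
import os


def gzip_sizes(path, chunk_size=1024 * 1024):
    """Return (compressed_bytes, uncompressed_bytes) for a gzip file.

    The file is decompressed in chunks and only the byte count is kept,
    so even a multi-gigabyte payload does not need to fit into RAM.
    """
    compressed = os.path.getsize(path)
    uncompressed = 0
    with gzip.open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            uncompressed += len(chunk)
    return compressed, uncompressed


def compression_ratio(path):
    """Return how many times larger the unpacked data is than the file on disk."""
    compressed, uncompressed = gzip_sizes(path)
    return uncompressed / compressed if compressed else 0.0
```

For example, gzip_sizes("repodata/other.xml.gz") (a made-up path) returns both sizes in bytes. Highly repetitive changelog text compresses extremely well, so a small file on disk can still expand to a very large document.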

I was able to get past the INSERT memory alloc errors.

I resynced a bunch of stuff from the intertubes manually, rebuilt the indexes on a couple of repos, removed them from the views, created new versions of the views, then readded them to the views and republished new view versions.

Now it’s back to the high memory use, but I am getting no failures and none of the INSERT memory alloc errors on the smart proxy.

I suspect that the Pulp 3 smart proxy sync from Foreman is very sensitive to any errors or problems with any of the repos on the Foreman side. Once the repos were all happy and good on the Foreman server, the smart proxy sync seemed to start working again.

Oh okay o.O
That really doesn’t sound good… I will check that repo again in a few days; maybe something will have changed by then. It is just really… interesting, as the repo files aren’t that big in compressed form.

Fortunately, every other repo comes nowhere near the limit ^^

Hi all.

This particular repo seems to cause some pathological behavior which I’ve not seen before with any other repo. One of these issues was already reported upstream, and I added some details to that ticket about why it occurs. That should probably be an easy-ish fix. As for the overall memory usage, I do not know why it is so high with this particular repo, so that will require further investigation.

Upstream issues:

https://pulp.plan.io/issues/9399
https://pulp.plan.io/issues/9406
