Katello 4.2.x repository synchronisation issues

Problem:
Getting errors synchronising some repositories in Katello 4.2 (now 4.2.1), especially Oracle Linux 8 repositories
Expected outcome:
Repositories synchronised successfully
Foreman and Proxy versions:
foreman-3.0.1-1.el8.noarch
Foreman and Proxy plugin versions:
katello-4.2.1-1.el8.noarch
Distribution and version:
Oracle Linux 8.4
Other relevant data:
See Strange pulp3 error

Synchronising the Oracle Linux 8 “codeready-builder” repository I’m getting:

get() returned more than one UpdateRecord -- it returned 2!

Synchronising the Oracle Linux 8 “appstream” repository is failing in a different way - this morning’s daily sync is still running after 7 hours, with the error below:

Pulp task errorError message: the server returns an error
HTTP status code: 400
Response headers: {"date"=>"Fri, 12 Nov 2021 02:02:24 GMT", "server"=>"gunicorn", "content-type"=>"application/json", "vary"=>"Accept,Cookie", "allow"=>"GET, PUT, PATCH, DELETE, HEAD, OPTIONS", "x-frame-options"=>"SAMEORIGIN", "content-length"=>"62", "correlation-id"=>"0b72591f-6ea3-4b69-847a-15fa7c6b13f5", "access-control-expose-headers"=>"Correlation-ID", "via"=>"1.1 katello.example.com", "connection"=>"close"}
Response body: {"publication":["Invalid hyperlink - Object does not exist."]}

Generating a new CVV in a CV which contains these repositories also leads to the 2nd failure, as do promotions.

Hi @John_Beranek,

I have a couple of suggestions:

  1. If it isn’t a pain, deleting the repo, running orphan cleanup, and then resyncing it should fix the problem.
  2. Applying this patch to the installed pulp-rpm plugin may help: https://github.com/pulp/pulp_rpm/pull/2150
  3. If none of the above suite, then I’ll need to confer with the Pulp team about it.

We can start by taking a look at your Pulp distributions and publications to see if they do exist for your AppStream repo.

Go into the Foreman console and run the following:

::Katello::Repository.find(<id of appstream repo>).backend_service(SmartProxy.pulp_primary).distribution_reference

Take note of the path and the href.

Then, we’ll query the Pulp server:

curl https://`hostname`/pulp/api/v3/distributions/rpm/rpm/?base_path=<path goes here>   --cert /etc/pki/katello/certs/pulp-client.crt  --key /etc/pki/katello/private/pulp-client.key

Look at the returned distribution and its pulp_href. Does the pulp_href match the href from before? Also, is there a value for the publication field? If there is, note the publication field. I’ll refer to it as the appstream publication href.

Go back to the Foreman console and run the following:

::Katello::Repository.find(<id of appstream repo>).publication_href

Does that match the appstream publication href from before (if there was one)?

For all of the ‘appstream’ investigations, it looks to check out OK, Pulp’s ‘publication’ field matches the ‘publication_href’ in Katello.

As for deleting the ‘codeready-builder’ repository and recreating it - this is quite a pain as the repository is used in quite a few published CVs/CVVS, which are used by subscribed hosts - so all those would have to be deleted.

I will take a look at the Pulp patch.

Trouble with the pulp patch, seems to not apply to the currently released pulp_rpm in Katello 4.2.1:

# git apply -v --check 2150.patch
Checking patch CHANGES/9503.bugfix...
Checking patch pulp_rpm/app/advisory.py...
error: while searching for:
        # diff for querysets works fine but the result is not fully functional queryset,
        # e.g. filtering doesn't work
        added_advisories = current_advisories.difference(previous_advisories)
        if len(list(added_advisories)) != len(set(added_advisories)):
            raise AdvisoryConflict(
                _(
                    "It is not possible to add two advisories of the same id to "
                    "a repository version."
                )
            )
        added_advisory_ids = set(adv.id for adv in added_advisories)
        advisory_conflicts = added_advisory_ids.intersection(previous_advisory_ids)

        added_advisory_pks = [adv.pk for adv in added_advisories]
        for advisory_id in advisory_conflicts:
            previous_advisory = previous_advisories.get(id=advisory_id)
            added_advisory = UpdateRecord.objects.get(id=advisory_id, pk__in=added_advisory_pks)
            added_advisory.touch()
            to_add, to_remove, to_exclude = resolve_advisory_conflict(
                previous_advisory, added_advisory
            )
            content_pks_to_add.update(to_add)
            content_pks_to_remove.update(to_remove)
            content_pks_to_exclude.update(to_exclude)

    if content_pks_to_add:
        version.add_content(Content.objects.filter(pk__in=content_pks_to_add))

error: patch failed: pulp_rpm/app/advisory.py:72
error: pulp_rpm/app/advisory.py: patch does not apply
Checking patch pulp_rpm/tests/functional/api/test_advisory_conflict.py...

I found a backport patch for Pulp 3.14: https://patch-diff.githubusercontent.com/raw/pulp/pulp_rpm/pull/2152.patch

@John_Beranek can we see the Dynflow output for your failing sync task that caused {"publication":["Invalid hyperlink - Object does not exist."]} ? Seeing the input and output for the action that failed inside the Dynflow task would be especially great.

Interestingly, it seems to have a few different failure methods. The daily sync I posted above has this:

Actions::Pulp3::Repository::Sync
Input:
{"repo_id"=>41,
 "smart_proxy_id"=>1,
 "options"=>{},
 "remote_user"=>"admin",
 "remote_cp_user"=>"admin",
 "current_request_id"=>"f8aaef5f-820b-412e-a5e3-0e127faba39a",
 "current_timezone"=>"London",
 "current_organization_id"=>1,
 "current_location_id"=>2,
 "current_user_id"=>1}
Output:
{"pulp_tasks"=>
  [{"pulp_href"=>"/pulp/api/v3/tasks/5cb45bf3-354a-47ba-8cdf-5676ca337f60/",
    "pulp_created"=>"2021-11-13T02:01:03.622+00:00",
    "state"=>"failed",
    "name"=>"pulp_rpm.app.tasks.synchronizing.synchronize",
    "logging_cid"=>"f8aaef5f-820b-412e-a5e3-0e127faba39a",
    "started_at"=>"2021-11-13T02:01:03.814+00:00",
    "finished_at"=>"2021-11-13T02:02:58.885+00:00",
    "error"=>{"reason"=>"Worker has gone missing."},
    "worker"=>"/pulp/api/v3/workers/ab0a4daf-5243-46f2-b8f8-9d37d416971a/",
    "child_tasks"=>[],
    "progress_reports"=>
     [{"message"=>"Downloading Metadata Files",
       "code"=>"sync.downloading.metadata",
       "state"=>"completed",
       "done"=>7},
      {"message"=>"Downloading Artifacts",
       "code"=>"sync.downloading.artifacts",
       "state"=>"running",
       "done"=>0},
      {"message"=>"Associating Content",
       "code"=>"associating.content",
       "state"=>"running",
       "done"=>0},
      {"message"=>"Parsed Modulemd",
       "code"=>"sync.parsing.modulemds",
       "state"=>"completed",
       "total"=>375,
       "done"=>375},
      {"message"=>"Parsed Modulemd-defaults",
       "code"=>"sync.parsing.modulemd_defaults",
       "state"=>"completed",
       "total"=>45,
       "done"=>45},
      {"message"=>"Parsed Packages",
       "code"=>"sync.parsing.packages",
       "state"=>"running",
       "done"=>8966}],
    "created_resources"=>[],
    "reserved_resources_record"=>
     ["/pulp/api/v3/repositories/rpm/rpm/c88af0ab-f8eb-4598-8a6c-33ff360748d8/",
      "/pulp/api/v3/remotes/rpm/rpm/ff7b555d-dbff-464e-9bac-136cac7d89cd/"]}],
 "create_version"=>true,
 "task_groups"=>[],
 "poll_attempts"=>{"total"=>27, "failed"=>1}}
Exception:
Katello::Errors::Pulp3Error: Pulp task error
Backtrace:
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/pulp3/abstract_async_task.rb:108:in `block in check_for_errors'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/pulp3/abstract_async_task.rb:106:in `each'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/pulp3/abstract_async_task.rb:106:in `check_for_errors'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/pulp3/abstract_async_task.rb:139:in `poll_external_task'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action/polling.rb:100:in `poll_external_task_with_rescue'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action/polling.rb:22:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action/cancellable.rb:14:in `run'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/pulp3/abstract_async_task.rb:10:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:572:in `block (3 levels) in execute_run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:32:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/middleware/remote_action.rb:16:in `block in run'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/middleware/remote_action.rb:40:in `block in as_remote_user'
/usr/share/gems/gems/katello-4.2.1/app/models/katello/concerns/user_extensions.rb:21:in `cp_config'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/middleware/remote_action.rb:27:in `as_cp_user'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/middleware/remote_action.rb:39:in `as_remote_user'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/middleware/remote_action.rb:16:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/rails_executor_wrap.rb:14:in `block in run'
/usr/share/gems/gems/activesupport-6.0.3.7/lib/active_support/execution_wrapper.rb:88:in `wrap'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/rails_executor_wrap.rb:13:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action/progress.rb:31:in `with_progress_calculation'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action/progress.rb:17:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/load_setting_values.rb:20:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_request_id.rb:15:in `block in run'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_request_id.rb:52:in `restore_current_request_id'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_request_id.rb:15:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_timezone.rb:15:in `block in run'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_timezone.rb:44:in `restore_curent_timezone'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_timezone.rb:15:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_taxonomies.rb:15:in `block in run'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_taxonomies.rb:45:in `restore_current_taxonomies'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_taxonomies.rb:15:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:32:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_user.rb:15:in `block in run'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_user.rb:54:in `restore_curent_user'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_user.rb:15:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/world.rb:31:in `execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:571:in `block (2 levels) in execute_run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:570:in `catch'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:570:in `block in execute_run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:473:in `block in with_error_handling'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:473:in `catch'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:473:in `with_error_handling'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:565:in `execute_run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:286:in `execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:18:in `block (2 levels) in execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/execution_plan/steps/abstract.rb:167:in `with_meta_calculation'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:17:in `block in execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:32:in `open_action'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:16:in `execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/director.rb:94:in `execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors/sidekiq/worker_jobs.rb:11:in `block (2 levels) in perform'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors.rb:18:in `run_user_code'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors/sidekiq/worker_jobs.rb:9:in `block in perform'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors/sidekiq/worker_jobs.rb:25:in `with_telemetry'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors/sidekiq/worker_jobs.rb:8:in `perform'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors/sidekiq/serialization.rb:27:in `perform'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:192:in `execute_job'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:165:in `block (2 levels) in process'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/middleware/chain.rb:128:in `block in invoke'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/middleware/chain.rb:133:in `invoke'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:164:in `block in process'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:137:in `block (6 levels) in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/job_retry.rb:109:in `local'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:136:in `block (5 levels) in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq.rb:37:in `block in <module:Sidekiq>'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:132:in `block (4 levels) in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:250:in `stats'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:127:in `block (3 levels) in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/job_logger.rb:8:in `call'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:126:in `block (2 levels) in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/job_retry.rb:74:in `global'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:125:in `block in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/logging.rb:48:in `with_context'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/logging.rb:42:in `with_job_hash_context'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:124:in `dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:163:in `process'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:83:in `process_one'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:71:in `run'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/util.rb:16:in `watchdog'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/util.rb:25:in `block in safe_thread'
/usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'

and this

Action:
Actions::Pulp3::Repository::Sync
Input:
{"repo_id"=>41,
 "smart_proxy_id"=>1,
 "options"=>{},
 "remote_user"=>"admin",
 "remote_cp_user"=>"admin",
 "current_request_id"=>"f8aaef5f-820b-412e-a5e3-0e127faba39a",
 "current_timezone"=>"London",
 "current_organization_id"=>1,
 "current_location_id"=>2,
 "current_user_id"=>1}
Output:
{"pulp_tasks"=>
  [{"pulp_href"=>"/pulp/api/v3/tasks/5cb45bf3-354a-47ba-8cdf-5676ca337f60/",
    "pulp_created"=>"2021-11-13T02:01:03.622+00:00",
    "state"=>"failed",
    "name"=>"pulp_rpm.app.tasks.synchronizing.synchronize",
    "logging_cid"=>"f8aaef5f-820b-412e-a5e3-0e127faba39a",
    "started_at"=>"2021-11-13T02:01:03.814+00:00",
    "finished_at"=>"2021-11-13T02:02:58.885+00:00",
    "error"=>{"reason"=>"Worker has gone missing."},
    "worker"=>"/pulp/api/v3/workers/ab0a4daf-5243-46f2-b8f8-9d37d416971a/",
    "child_tasks"=>[],
    "progress_reports"=>
     [{"message"=>"Downloading Metadata Files",
       "code"=>"sync.downloading.metadata",
       "state"=>"completed",
       "done"=>7},
      {"message"=>"Downloading Artifacts",
       "code"=>"sync.downloading.artifacts",
       "state"=>"running",
       "done"=>0},
      {"message"=>"Associating Content",
       "code"=>"associating.content",
       "state"=>"running",
       "done"=>0},
      {"message"=>"Parsed Modulemd",
       "code"=>"sync.parsing.modulemds",
       "state"=>"completed",
       "total"=>375,
       "done"=>375},
      {"message"=>"Parsed Modulemd-defaults",
       "code"=>"sync.parsing.modulemd_defaults",
       "state"=>"completed",
       "total"=>45,
       "done"=>45},
      {"message"=>"Parsed Packages",
       "code"=>"sync.parsing.packages",
       "state"=>"running",
       "done"=>8966}],
    "created_resources"=>[],
    "reserved_resources_record"=>
     ["/pulp/api/v3/repositories/rpm/rpm/c88af0ab-f8eb-4598-8a6c-33ff360748d8/",
      "/pulp/api/v3/remotes/rpm/rpm/ff7b555d-dbff-464e-9bac-136cac7d89cd/"]}],
 "create_version"=>true,
 "task_groups"=>[],
 "poll_attempts"=>{"total"=>27, "failed"=>1}}

On a more positive note, after I applied the Pulp patch, I’ve just managed to sync both ‘appstream’ and ‘codeready-builder’ successfully.

Starting to think the 7.8GiB resident ‘rails’ process amongst the 24GiB of RAM used on the 30GiB RAM server account for some of the variation in results…

# journalctl -S "4 days ago" |grep memory
Nov 14 02:02:43 katello.example.com kernel:  out_of_memory+0x1ad/0x50e
Nov 14 02:02:44 katello.example.com kernel: Tasks state (memory values in pages):
Nov 14 02:02:45 katello.example.com kernel: Out of memory: Killed process 2161992 (rails) total-vm:13823700kB, anon-rss:12713360kB, file-rss:0kB, shmem-rss:0kB, UID:994 pgtables:26648kB oom_score_adj:0

I’ll be taking a look into getting that patch into Katello officially.

Ah there it is then, I’ve seen all sorts of strange things pop up when the oom killer comes in.

Is there a particular process that seems to be using more RAM than before?

Not that I could put a finger on, no - I just know that I keep adding RAM to the server, and it keeps running out. A few daily repo syncs failed this morning[1], which corresponded to another OOM event. :frowning_face:

[1]

Action:
Actions::Pulp3::Repository::Sync
Input:
{"repo_id"=>19,
 "smart_proxy_id"=>1,
 "options"=>{},
 "remote_user"=>"admin",
 "remote_cp_user"=>"admin",
 "current_request_id"=>"252f1595-034a-46ce-9ad2-b3f8b8ae73a4",
 "current_timezone"=>"London",
 "current_organization_id"=>1,
 "current_location_id"=>2,
 "current_user_id"=>1}
Output:
{"pulp_tasks"=>
  [{"pulp_href"=>"/pulp/api/v3/tasks/77598a84-9eb1-43b1-b2e5-55a3f46d0a67/",
    "pulp_created"=>"2021-11-16T02:00:29.778+00:00",
    "state"=>"failed",
    "name"=>"pulp_rpm.app.tasks.synchronizing.synchronize",
    "logging_cid"=>"252f1595-034a-46ce-9ad2-b3f8b8ae73a4",
    "started_at"=>"2021-11-16T02:00:29.894+00:00",
    "finished_at"=>"2021-11-16T02:02:15.139+00:00",
    "error"=>
     {"traceback"=>
       "  File \"/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py\", line 317, in _perform_task\n" +
       "    result = func(*args, **kwargs)\n" +
       "  File \"/usr/lib/python3.6/site-packages/pulp_rpm/app/tasks/synchronizing.py\", line 490, in synchronize\n" +
       "    version = dv.create()\n" +
       "  File \"/usr/lib/python3.6/site-packages/pulpcore/plugin/stages/declarative_version.py\", line 151, in create\n" +
       "    loop.run_until_complete(pipeline)\n" +
       "  File \"/usr/lib64/python3.6/tempfile.py\", line 809, in __exit__\n" +
       "    self.cleanup()\n" +
       "  File \"/usr/lib64/python3.6/tempfile.py\", line 813, in cleanup\n" +
       "    _shutil.rmtree(self.name)\n" +
       "  File \"/usr/lib64/python3.6/shutil.py\", line 477, in rmtree\n" +
       "    onerror(os.lstat, path, sys.exc_info())\n" +
       "  File \"/usr/lib64/python3.6/shutil.py\", line 475, in rmtree\n" +
       "    orig_st = os.lstat(path)\n",
      "description"=>"[Errno 2] No such file or directory: './tmpzxxdw2a6'"},
    "worker"=>"/pulp/api/v3/workers/76d05222-1706-4fc9-b852-3ff0fe84495a/",
    "child_tasks"=>[],
    "progress_reports"=>
     [{"message"=>"Downloading Metadata Files",
       "code"=>"sync.downloading.metadata",
       "state"=>"completed",
       "done"=>6},
      {"message"=>"Downloading Artifacts",
       "code"=>"sync.downloading.artifacts",
       "state"=>"canceled",
       "done"=>9},
      {"message"=>"Associating Content",
       "code"=>"associating.content",
       "state"=>"canceled",
       "done"=>8},
      {"message"=>"Parsed Packages",
       "code"=>"sync.parsing.packages",
       "state"=>"canceled",
       "done"=>8009}],
    "created_resources"=>[],
    "reserved_resources_record"=>
     ["/pulp/api/v3/remotes/rpm/rpm/78e37a69-662f-41ba-9f30-bf48ea3f1d56/",
      "/pulp/api/v3/repositories/rpm/rpm/10b9c533-baab-4409-aba6-a0ac815ebeba/"]}],
 "create_version"=>true,
 "task_groups"=>[],
 "poll_attempts"=>{"total"=>25, "failed"=>1}}
Exception:
Katello::Errors::Pulp3Error: [Errno 2] No such file or directory: './tmpzxxdw2a6'
Backtrace:
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/pulp3/abstract_async_task.rb:108:in `block in check_for_errors'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/pulp3/abstract_async_task.rb:106:in `each'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/pulp3/abstract_async_task.rb:106:in `check_for_errors'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/pulp3/abstract_async_task.rb:139:in `poll_external_task'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action/polling.rb:100:in `poll_external_task_with_rescue'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action/polling.rb:22:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action/cancellable.rb:14:in `run'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/pulp3/abstract_async_task.rb:10:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:572:in `block (3 levels) in execute_run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:32:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/middleware/remote_action.rb:16:in `block in run'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/middleware/remote_action.rb:40:in `block in as_remote_user'
/usr/share/gems/gems/katello-4.2.1/app/models/katello/concerns/user_extensions.rb:21:in `cp_config'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/middleware/remote_action.rb:27:in `as_cp_user'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/middleware/remote_action.rb:39:in `as_remote_user'
/usr/share/gems/gems/katello-4.2.1/app/lib/actions/middleware/remote_action.rb:16:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/rails_executor_wrap.rb:14:in `block in run'
/usr/share/gems/gems/activesupport-6.0.3.7/lib/active_support/execution_wrapper.rb:88:in `wrap'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/rails_executor_wrap.rb:13:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action/progress.rb:31:in `with_progress_calculation'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action/progress.rb:17:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/load_setting_values.rb:20:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_request_id.rb:15:in `block in run'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_request_id.rb:52:in `restore_current_request_id'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_request_id.rb:15:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_timezone.rb:15:in `block in run'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_timezone.rb:44:in `restore_curent_timezone'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_timezone.rb:15:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_taxonomies.rb:15:in `block in run'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_taxonomies.rb:45:in `restore_current_taxonomies'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_taxonomies.rb:15:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:32:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:27:in `pass'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware.rb:19:in `pass'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_user.rb:15:in `block in run'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_user.rb:54:in `restore_curent_user'
/usr/share/gems/gems/foreman-tasks-5.1.1/app/lib/actions/middleware/keep_current_user.rb:15:in `run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/stack.rb:23:in `call'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/middleware/world.rb:31:in `execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:571:in `block (2 levels) in execute_run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:570:in `catch'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:570:in `block in execute_run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:473:in `block in with_error_handling'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:473:in `catch'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:473:in `with_error_handling'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:565:in `execute_run'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/action.rb:286:in `execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:18:in `block (2 levels) in execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/execution_plan/steps/abstract.rb:167:in `with_meta_calculation'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:17:in `block in execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:32:in `open_action'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:16:in `execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/director.rb:94:in `execute'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors/sidekiq/worker_jobs.rb:11:in `block (2 levels) in perform'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors.rb:18:in `run_user_code'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors/sidekiq/worker_jobs.rb:9:in `block in perform'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors/sidekiq/worker_jobs.rb:25:in `with_telemetry'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors/sidekiq/worker_jobs.rb:8:in `perform'
/usr/share/gems/gems/dynflow-1.5.0/lib/dynflow/executors/sidekiq/serialization.rb:27:in `perform'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:192:in `execute_job'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:165:in `block (2 levels) in process'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/middleware/chain.rb:128:in `block in invoke'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/middleware/chain.rb:133:in `invoke'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:164:in `block in process'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:137:in `block (6 levels) in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/job_retry.rb:109:in `local'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:136:in `block (5 levels) in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq.rb:37:in `block in <module:Sidekiq>'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:132:in `block (4 levels) in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:250:in `stats'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:127:in `block (3 levels) in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/job_logger.rb:8:in `call'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:126:in `block (2 levels) in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/job_retry.rb:74:in `global'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:125:in `block in dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/logging.rb:48:in `with_context'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/logging.rb:42:in `with_job_hash_context'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:124:in `dispatch'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:163:in `process'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:83:in `process_one'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/processor.rb:71:in `run'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/util.rb:16:in `watchdog'
/usr/share/gems/gems/sidekiq-5.2.7/lib/sidekiq/util.rb:25:in `block in safe_thread'
/usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'

From this morning’s OOM:

Nov 16 02:03:28 katello.example.com kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Nov 16 02:03:28 katello.example.com kernel: [2161284]   994 2161284   207020    68221  1159168    18163             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2161977]   994 2161977   309762   153450  2064384     8249             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2161980]   994 2161980   290325   140859  1798144     5657             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2161985]   994 2161985   266348   134717  1724416     5461             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2161997]   994 2161997   277177   132067  1716224     4991             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2162002]   994 2162002   300108   127251  1908736    15366             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2162008]   994 2162008   271945   135608  1736704     4910             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2162013]   994 2162013   283267   126125  1748992    17311             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2162018]   994 2162018   278145   101597  1769472    44314             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2162023]   994 2162023   300374   136100  1925120     8583             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2162027]   994 2162027   280615   135041  1757184     5103             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2162031]   994 2162031  2836528  2559079 22265856    23582             0 rails
Nov 16 02:03:28 katello.example.com kernel: [2834991]   994 2834991   286049   134608  1892352     3688             0 rails
Nov 16 02:03:28 katello.example.com kernel: [3402740]  1000 3402740   157154    23892   851968      101             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3403177]  1000 3403177   157144    18287   856064     5704             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3403191]  1000 3403191   157122    20649   851968     3348             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3403243]  1000 3403243   157122     8724   856064    15272             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3403347]  1000 3403347   157145     9404   860160    14593             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3403357]  1000 3403357   157144    17278   864256     6716             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3403455]  1000 3403455   157065     9881   856064    14108             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3403987]  1000 3403987   366863   104671  1593344       41             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3404036]  1000 3404036   827691   675621  6070272      544             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3404559]  1000 3404559   366054   194604  2224128      351             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3404684]  1000 3404684   568158   393746  3858432     4572             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3405157]  1000 3405157   157144    23968   860160       23             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3405281]  1000 3405281   306247   134580  1794048       14             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3405295]  1000 3405295   329296   175172  2080768     2450             0 pulpcore-worker
Nov 16 02:03:28 katello.example.com kernel: [3405465]  1000 3405465   157208    12011   831488    12084             0 pulpcore-worker

Still have Oracle Linux 8 repository syncs intermittently failing during the daily sync, even we no corresponding OOM events. :frowning:

Hmm, the very recent release of Oracle Linux 8.5 (and corresponding busyness of yum.oracle.com) could have something to do with it - but some of the task errors don’t look very pleasant to an end-user, I must say.

Getting quite a few of the Pulp task failures, but don’t know what to make of them - I think they’re just spotting a failed Pulp task, but show no evidence of why the task failed, as far as I can tell. They perhaps coincide with some of the sync failures, but I can’t be sure…

Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]: pulp [None]: pulpcore.tasking.pulpcore_worker:INFO: Clean offline worker 25249@katello.example.com.
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]: Traceback (most recent call last):
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/bin/pulpcore-worker", line 11, in <module>
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     load_entry_point('pulpcore==3.14.8', 'console_scripts', 'pulpcore-worker')()
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/click/core.py", line 829, in __call__
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     return self.main(*args, **kwargs)
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/click/core.py", line 782, in main
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     rv = self.invoke(ctx)
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     return ctx.invoke(self.callback, **ctx.params)
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/click/core.py", line 610, in invoke
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     return callback(*args, **kwargs)
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/pulpcore/tasking/entrypoint.py", line 44, in worker
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     NewPulpWorker().run_forever()
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py", line 277, in run_forever
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     self.sleep()
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py", line 187, in sleep
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     self.beat()
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py", line 100, in beat
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     self.worker_cleanup()
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/pulpcore/tasking/pulpcore_worker.py", line 90, in worker_cleanup
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     worker.delete()
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/django_lifecycle/mixins.py", line 146, in delete
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     value = super().delete(*args, **kwargs)
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/django/db/models/base.py", line 921, in delete
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     collector.collect([self], keep_parents=keep_parents)
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/django/db/models/deletion.py", line 224, in collect
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     field.remote_field.on_delete(self, field, sub_objs, self.using)
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/django/db/models/deletion.py", line 16, in CASCADE
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     source_attr=field.name, nullable=field.null)
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/django/db/models/deletion.py", line 224, in collect
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     field.remote_field.on_delete(self, field, sub_objs, self.using)
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:   File "/usr/lib/python3.6/site-packages/django/db/models/deletion.py", line 27, in PROTECT
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]:     sub_objs
Nov 17 09:17:41 katello.example.com pulpcore-worker-7[3775518]: django.db.models.deletion.ProtectedError: ("Cannot delete some instances of model 'ReservedResource' because they are referenced through a protected foreign key: 'TaskReservedResource.resource'",>
Nov 17 09:17:42 katello.example.com pulpcore-api[3277945]: pulp [89e5995c-3b2b-437a-b367-4faf9c981c39]:  - - [17/Nov/2021:09:17:42 +0000] "GET /pulp/api/v3/tasks/039e13b0-5a99-4f8c-9906-14818f48a8d6/ HTTP/1.1" 200 1411 "-" "OpenAPI-Generator/3.14.1/ruby"
Nov 17 09:17:42 katello.example.com systemd[1]: pulpcore-worker@7.service: Main process exited, code=exited, status=1/FAILURE
Nov 17 09:17:42 katello.example.com systemd[1]: pulpcore-worker@7.service: Failed with result 'exit-code'.

User-visible error given on one repository sync this morning:

408, message='Request Time-out', url=URL('http://yum.oracle.com/repo/OracleLinux/OL8/baseos/latest/x86_64/getPackage/glibc-langpack-et-2.28-164.0.1.el8.x86_64.rpm')

- RAM and swap usage over the last 12 hours

Dynflow output for one of the syncs where the end result was “Pulp task error”:

6: Actions::Pulp3::Repository::Sync (skipped) [ 112.67s / 2.45s ]
Queue: default

Started at: 2021-11-17 09:54:29 UTC

Ended at: 2021-11-17 09:56:21 UTC

Real time: 112.67s

Execution time (excluding suspended state): 2.45s

Input:

---
repo_id: 41
smart_proxy_id: 1
options: {}
remote_user: admin
remote_cp_user: admin
current_request_id: d338ca94-5632-4315-9f01-e0e653f6b2d2
current_timezone: London
current_organization_id: 1
current_location_id: 2
current_user_id: 5
Output:

---
pulp_tasks:
- pulp_href: "/pulp/api/v3/tasks/47ed7a46-faae-4844-87f4-323ff269c8de/"
  pulp_created: '2021-11-17T09:54:29.482+00:00'
  state: failed
  name: pulp_rpm.app.tasks.synchronizing.synchronize
  logging_cid: d338ca94-5632-4315-9f01-e0e653f6b2d2
  started_at: '2021-11-17T09:54:29.568+00:00'
  finished_at: '2021-11-17T09:56:18.831+00:00'
  error:
    reason: Worker has gone missing.
  worker: "/pulp/api/v3/workers/675ec557-565a-4772-9a58-10e01f3cb7b0/"
  child_tasks: []
  progress_reports:
  - message: Downloading Metadata Files
    code: sync.downloading.metadata
    state: completed
    done: 7
  - message: Downloading Artifacts
    code: sync.downloading.artifacts
    state: running
    done: 0
  - message: Associating Content
    code: associating.content
    state: running
    done: 0
  - message: Parsed Modulemd
    code: sync.parsing.modulemds
    state: completed
    total: 404
    done: 404
  - message: Parsed Modulemd-defaults
    code: sync.parsing.modulemd_defaults
    state: completed
    total: 45
    done: 45
  - message: Parsed Packages
    code: sync.parsing.packages
    state: running
    done: 11873
  created_resources: []
  reserved_resources_record:
  - "/pulp/api/v3/repositories/rpm/rpm/c88af0ab-f8eb-4598-8a6c-33ff360748d8/"
  - "/pulp/api/v3/remotes/rpm/rpm/ff7b555d-dbff-464e-9bac-136cac7d89cd/"
create_version: true
task_groups: []
poll_attempts:
  total: 27
  failed: 1
Error:

Katello::Errors::Pulp3Error

Pulp task error

That is one of the reason i switched from OL to Alma, i know that OL has some extra stuff in their repos, but the size of these are way to excessive. With OL7/8 i had to run Katello with 32-40 GB to let it survive for around a week until i had to restart the services. This all went away after i switched to another dist for my servers.

Sorry not really helping for this issue, but maybe OL8 or better OL7 is a good repo for the developers to test for the worst expected scenario (repo wise) :wink: