Proxy sync 10x slower with Foreman 3.8/Katello 4.10

In Dynflow, for the task from the log that you posted in (Proxy sync 10x slower with Foreman 3.8/Katello 4.10 - #12 by tedevil), you should see that the Input for Actions::Katello::CapsuleContent::Sync has repository_id: <some id>

For me, that repository ID is propagated to RefreshRepos.

Can you confirm that you see the repository_id in the input for Actions::Katello::CapsuleContent::Sync?


Also I’ll mention that I tested this so far on our freshest code and it’s not immediately reproducible. Taking a look at 4.10 now.

Edit: I’m not reproducing this issue on 4.10 – after uploading a file to my repository, my smart proxy (that is syncing Library) only refreshes the single repository that got the RPM upload.


We did fix a bug in 4.11 that affected RefreshRepos for smart proxy syncs that have no repository ID, environment ID, or content view ID: Fixes #36926 - RefreshRepos called for relevant repos only by pmoravec · Pull Request #10803 · Katello/katello · GitHub

If the logs from above were for an “unscoped” smart proxy sync, this PR could help.

@nixfu, as I asked @tedevil above, can you use Dynflow to see what is taking up the extra time?

Also if anyone feels like trying out a patch: https://github.com/Katello/katello/pull/10803.patch

I’m going to triage this to Katello 4.10 so the next 4.10 release receives this fix at least.

I see no repository ID as input (nor an organization or location):
image

Looking back at sync tasks from 3.7/4.9, however, the input for “Actions::Katello::CapsuleContent::Sync” looks the same.

However, if I look at “Actions::Pulp3::Orchestration::Repository::RefreshRepos”, the repository_id is missing on 3.8/4.10:
3.7/4.9:
image

3.8/4.10:
image


Appreciate the info, there’s one more comparison I’d like to see:

In Dynflow there will also be an Actions::Pulp3::CapsuleContent::Sync. Mind showing whether that receives a repository_id on each Katello version?

repository_id is set on both versions, so there is no issue with “Actions::Pulp3::CapsuleContent::Sync”.


I made an odd discovery about the issue we were having, where smartproxy syncs were taking 5-7 hours. All our smartproxies are on-demand only.

We previously had “restrict composite content view promotion” set to True in our settings.

That means that our individual content views, AS WELL as our composite content views, had to be promoted to the same level, such as “prod”.

Our smartproxies are set to sync everything in a promotion level, such as “prod”.

I disabled that restriction setting, and I removed all individual content views from the promotion level, so they are all just at Library now, and only the composite content views we have are now promoted through the promotion process and sent out to the smartproxies.

Needless to say this has reduced the amount of repos synced to the smartproxies quite a bit.

We went from probably 50 listed on each smartproxy down to fewer than 10 top-level repos on each smartproxy, which make up the much smaller number of composite content views we use to organize different types of systems.

Now a smartproxy “complete sync” takes about 15 minutes, where it used to take 5-7 hours.

So either it had something to do with a difference between syncing content views and syncing composite content views, or it was just the fact that we had so many views being synced out to the smartproxies.

So to summarize this so far, would you say this patch will solve the issue? How would one apply it? Is it needed only on the Foreman server itself, or also on the proxies?

That patch should help, but we have yet to figure out in the code why your “refresh repos” step is not getting a repository_id. It would be a workaround at best.

You’d need to cd into /usr/share/gems/gems/katello-4.10.0/ and then run patch -p1 < 10803.patch. You can skip any patches that fail to find files; they’re likely test-related.

Example:

[root@centos8-stream-katello-4-10 katello-4.10.0]# patch -p1 < 10803.patch 
patching file app/lib/actions/katello/capsule_content/refresh_repos.rb
patching file app/lib/actions/katello/capsule_content/sync.rb
Hunk #1 succeeded at 36 (offset 1 line).
patching file app/lib/actions/katello/capsule_content/sync_capsule.rb
Hunk #1 succeeded at 14 (offset -1 lines).
can't find file to patch at input line 79
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|diff --git a/test/actions/katello/capsule_content_test.rb b/test/actions/katello/capsule_content_test.rb
|index 59fb7690a45..7202cab2258 100644
|--- a/test/actions/katello/capsule_content_test.rb
|+++ b/test/actions/katello/capsule_content_test.rb
--------------------------
File to patch: 
Skip this patch? [y] y
Skipping patch.
7 out of 7 hunks ignored

After you patch it, run foreman-maintain service restart.

If you need to revert the patch, you can just reinstall the rubygem-katello RPM via dnf.
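
For anyone who wants the steps above in one place, here is a rough sketch. The gem path, patch URL, restart command, and revert hint all come from this thread; the apply_patch helper, the --apply switch, and the /tmp download location are assumptions made for illustration. Nothing runs unless the script is called with --apply, and patch will still prompt for the test file that does not exist in the installed gem.

```shell
#!/bin/sh
# Sketch only: adjust GEM_DIR to the Katello version actually installed.
GEM_DIR=/usr/share/gems/gems/katello-4.10.0
PATCH_URL=https://github.com/Katello/katello/pull/10803.patch

# Hypothetical helper: run "patch -p1" inside a directory.
# $1 = directory to patch, $2 = absolute path to the patch file.
apply_patch() {
  ( cd "$1" && patch -p1 < "$2" )
}

# Only touch the system when explicitly asked to.
if [ "${1:-}" = "--apply" ]; then
  curl -fsSL -o /tmp/10803.patch "$PATCH_URL"
  # Expect a "File to patch:" prompt for the test file; skip it as shown above.
  # patch exits non-zero when hunks are skipped, so do not abort on that alone.
  apply_patch "$GEM_DIR" /tmp/10803.patch || echo "patch reported skipped or failed hunks; review the output" >&2
  foreman-maintain service restart
  # To revert later: dnf reinstall rubygem-katello
fi
```

Saved as, say, apply-10803.sh (a made-up name), it would be run as root with: sh apply-10803.sh --apply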

Alternatively, you can wait for Katello 4.10.1 which is slated to receive this fix.

That’s a good point you bring up, @viwon. I can imagine some folks do need to consume from both the “component” and composite content view versions, but if you don’t, it would certainly save time and space.

Did you notice this slowdown after a specific upgrade? There was a separate Pulp issue (mentioned above I believe) that did cause a real slowdown for the syncing portion of smart proxy syncs, so that could be part of it. But here it seems we’re also having an issue where more smart proxies are being updated than there should be.

I performed an OS update on my Foreman server (AlmaLinux 8.8 to 8.9) during the holidays and saw that the issue is now solved. I can see that some Foreman-related packages were upgraded/installed:

Old:
pulpcore-selinux-1.3.3-1.el8.x86_64
rubygem-foreman-tasks-8.2.0-1.fm3_8.el8.noarch
rubygem-foreman_remote_execution-11.1.0-1.fm3_8.el8.noarch
rubygem-smart_proxy_remote_execution_ssh-0.10.1-1.fm3_6.el8.noarch
puppet7-release-7.0.0-14.el8.noarch

New:
pulpcore-selinux-2.0.0-1.el8.x86_64
rubygem-foreman-tasks-8.3.3-1.fm3_8.el8.noarch
rubygem-foreman_remote_execution-11.1.1-1.fm3_8.el8.noarch
rubygem-smart_proxy_remote_execution_ssh-0.10.3-1.fm3_8.el8.noarch
puppet7-release-7.0.0-15.el8.noarch

A complete global sync to all my proxies (one package added to a repo) is now down to ~55 seconds
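
To make the version changes easier to scan, a small sketch like the following prints each package as "name: old -> new". The two lists are copied from the lists above; changed_packages is a hypothetical helper, not part of any Foreman tooling, and it assumes every entry has the name-version-release.arch shape.

```shell
#!/bin/sh
# Package lists copied from the lists above (old = before the OS update).
old_pkgs="pulpcore-selinux-1.3.3-1.el8.x86_64
rubygem-foreman-tasks-8.2.0-1.fm3_8.el8.noarch
rubygem-foreman_remote_execution-11.1.0-1.fm3_8.el8.noarch
rubygem-smart_proxy_remote_execution_ssh-0.10.1-1.fm3_6.el8.noarch
puppet7-release-7.0.0-14.el8.noarch"

new_pkgs="pulpcore-selinux-2.0.0-1.el8.x86_64
rubygem-foreman-tasks-8.3.3-1.fm3_8.el8.noarch
rubygem-foreman_remote_execution-11.1.1-1.fm3_8.el8.noarch
rubygem-smart_proxy_remote_execution_ssh-0.10.3-1.fm3_8.el8.noarch
puppet7-release-7.0.0-15.el8.noarch"

# Hypothetical helper: $1 = old list, $2 = new list (one package per line).
# Prints one line for each package whose version-release changed.
changed_packages() {
  for old in $1; do
    name=${old%-*-*}   # strip the trailing "-version-release.arch"
    for new in $2; do
      if [ "${new%-*-*}" = "$name" ] && [ "$new" != "$old" ]; then
        printf '%s: %s -> %s\n' "$name" "${old#"$name"-}" "${new#"$name"-}"
      fi
    done
  done
}

changed_packages "$old_pkgs" "$new_pkgs"
```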


Appreciate the report back, @tedevil. I can only wonder, then, whether you were hitting an issue with foreman-tasks itself. Perhaps an issue with serialization of inputs? I can only guess. This will be good information for anyone who hits the issue in the future.