Tuning Dynflow's pool_size and its effect on task throughput

We often see 10-100X as many client tasks as other operational calls, e.g. [1]:

 COUNT | TYPE
320668 | Actions::Katello::Host::Update
277923 | Actions::Katello::Host::GenerateApplicability
141833 | Actions::Katello::Host::UploadPackageProfile
  1983 | Actions::Katello::Repository::ScheduledSync
   429 | Actions::Katello::CapsuleContent::Sync
   127 | Actions::Katello::Repository::MetadataGenerate
   118 | Actions::Katello::Repository::CapsuleGenerateAndSync
   109 | Actions::Katello::CapsuleContent::RemoveOrphans
    59 | Actions::Katello::Host::Hypervisors
    53 | Actions::Katello::Repository::BulkMetadataGenerate

There is work underway to move those top 3 task types out of Dynflow, as outlined in the Bugzilla [1] below. But for the immediate horizon, large environments handling 50,000 to 100,000 of these tasks on a daily basis need some help.

In the near term, I wanted to explore an additional tuning avenue available to Dynflow, the setting: foreman-dynflow-pool-size

Our default, which has been in place since Dynflow and Foreman Tasks were first included in Foreman, is 5 workers in the pool. We introduced a parameter that can be changed via the installer:

--foreman-dynflow-pool-size
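For example, applying the pool size of 10 used in the third test below should look something like:

# foreman-installer --foreman-dynflow-pool-size 10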

I set up a test scenario to measure task throughput with various configuration settings applied. The test uses container-based clients that register and then run a facts update in a loop to create Host::Update tasks:

# subscription-manager register
# for i in $(seq 1 1000); do
>   subscription-manager facts --update
>   echo "Completed: [$i] iterations"
> done

The test launches 50 containers in parallel and runs until completion. Visual verification shows that the Foreman Tasks queue fills faster than it can be processed: tasks accumulate in the Planning/Pending states faster than they complete.
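For reference, the parallel launch amounts to something like the sketch below; the container runtime (podman) and image name (client-loop) are assumptions standing in for whatever wraps the registration loop above:

# for c in $(seq 1 50); do
>   podman run -d --name "client-$c" client-loop
> done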

  1. Default settings: 1 dynflow_executor, 5 workers in each pool

Time to process 50,000 tasks: 490.8 minutes (8.1 hours)

  2. Extra executors: 3 dynflow_executors, 5 workers in each pool

Time to process 50,000 tasks: 223.6 minutes (3.7 hours)

  3. Extra executors, double pool size: 3 dynflow_executors, 10 workers in each pool

Time to process 50,000 tasks: 105 minutes (1.7 hours)
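To put the same numbers in terms of throughput (50,000 tasks divided by elapsed time):

Configuration                  | Executors | Pool size | Minutes | Tasks/min
Default                        |     1     |     5     |  490.8  |   ~102
Extra executors                |     3     |     5     |  223.6  |   ~224
Extra executors, 2x pool size  |     3     |    10     |  105.0  |   ~476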

You can see that this is a significant throughput increase, roughly 4.7x over the defaults, and no additional errors or memory consumption issues were observed beyond the expected additional RAM for each dynflow_executor process.

I’d be curious whether there are any known gotchas in utilizing additional worker threads, and whether the default should be increased above 5…

Mike

[1] - https://bugzilla.redhat.com/show_bug.cgi?id=1758285

Hi, thank you for bringing this up.

In this context, “X workers in the pool” means “the Dynflow executor process has X threads which function as workers”. Because of how Ruby handles threads (the GIL and all that), you will start to see diminishing returns if you ramp it up too high. From my observation, 5 is a good default, with 15 being the value above which it just doesn’t really help any more. I’ve seen people trying to set this into the hundreds range, and that’s a no-no.
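If you want to sanity-check what a given pool size actually looks like on a live system, counting the executor process's threads works. This is just a sketch: the process name to match may differ between versions, and the total includes threads beyond the worker pool itself:

# for pid in $(pgrep -f dynflow_executor); do
>   echo "PID $pid: $(ls /proc/$pid/task | wc -l) threads"
> done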

Increasing the number of executors should be a safe thing to do, but it has its own drawbacks, which are described in the upcoming Dynflow changes post[2].

Also, please note that this changes the pool size for the default queue only. Some actions use different queues, so changing this shouldn’t have any impact on processing them. As a relevant example I’d like to mention the hosts_queue, tracked by BZ 1721679 [1].

May I propose one more test? Maybe you have seen my proposal/report about the upcoming changes to Dynflow [2]. I’d be more than happy to work with you to set it up on the testing machine, run the tests there, and explore which scaling options it gives us.

In general I don’t have anything against moving the host actions out of Dynflow as described in BZ 1758285, but maybe it isn’t/won’t be necessary.

[1] - https://bugzilla.redhat.com/show_bug.cgi?id=1721679
[2] - Upcoming changes to Dynflow