We often see 10-100x as many client-generated tasks as other operational calls, e.g. [1]:
 COUNT | TYPE
320668 | Actions::Katello::Host::Update
277923 | Actions::Katello::Host::GenerateApplicability
141833 | Actions::Katello::Host::UploadPackageProfile
  1983 | Actions::Katello::Repository::ScheduledSync
   429 | Actions::Katello::CapsuleContent::Sync
   127 | Actions::Katello::Repository::MetadataGenerate
   118 | Actions::Katello::Repository::CapsuleGenerateAndSync
   109 | Actions::Katello::CapsuleContent::RemoveOrphans
    59 | Actions::Katello::Host::Hypervisors
    53 | Actions::Katello::Repository::BulkMetadataGenerate
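(For reference, a count like the above can be produced with a query along these lines; this is a sketch that assumes the foreman_tasks_tasks table, where the label column holds the task type:)

# su - postgres -c "psql foreman -c 'SELECT count(*), label FROM foreman_tasks_tasks GROUP BY label ORDER BY count(*) DESC;'"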
There is work underway to move those top 3 task types out of Dynflow, as outlined in the Bugzilla [1] below. In the immediate term, though, large environments handling 50,000-100,000 of these tasks per day need some help.
In the near term, I wanted to explore an additional tuning avenue we have available in Dynflow: the foreman-dynflow-pool-size setting.
Our default, which has existed since Dynflow and Foreman Tasks were first included in Foreman, is 5 workers in the pool. We introduced a parameter that can be changed via the installer:
--foreman-dynflow-pool-size
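For example, to double the pool size (assuming the standard foreman-installer invocation; the executor count used in the results below is tuned separately):

# foreman-installer --foreman-dynflow-pool-size 10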
I set up a test scenario to measure task throughput with various configuration settings applied. The test uses container-based clients that register and then run a facts update in a loop to create Host::Update tasks:
# subscription-manager register …
# for i in $(seq 1 1000); do
#   subscription-manager facts --update
#   echo "Completed: [$i] iterations"
# done
The test launches 50 containers in parallel and runs until completion. Visual inspection shows that the Foreman Tasks queue fills faster than it can be processed: tasks accumulate in the Planning/Pending states faster than they complete.
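The parallel launch was along these lines (a sketch; the image name and use of podman are my placeholders for whatever runtime/image you use, with the registration/facts loop above as the entrypoint):

# for n in $(seq 1 50); do
#   podman run -d --name "client-$n" sm-client-image
# done

With that load running, the measured results were: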
- Default settings: 1 dynflow_executor, 5 workers in each pool
  Time to process 50,000 tasks: 490.8 minutes (8.1 hours)
- Extra executors: 3 dynflow_executors, 5 workers in each pool
  Time to process 50,000 tasks: 223.6 minutes (3.7 hours)
- Extra executors, double pool size: 3 dynflow_executors, 10 workers in each pool
  Time to process 50,000 tasks: 105 minutes (1.7 hours)
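In throughput terms, working from the numbers above:

50,000 tasks / 490.8 min ≈ 102 tasks/min (default)
50,000 tasks / 223.6 min ≈ 224 tasks/min (3 executors)
50,000 tasks / 105 min ≈ 476 tasks/min (3 executors, pool size 10)

i.e. roughly a 4.7x improvement over the default configuration.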
This is a significant throughput improvement, and no additional errors or memory consumption issues were observed beyond the expected additional RAM for each dynflow_executor process.
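(For anyone repeating the memory check: I watched per-process RSS with something like the following; the dynflow_executor process title is what ps shows on my systems and may differ by version:)

# watch 'ps -eo pid,rss,args | grep [d]ynflow_executor'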
I’d be curious if there are any known gotchas when utilizing additional worker threads, and whether the default should be increased above 5…
Mike