Foreman with 90,000+ hosts and Salt (no Katello/Pulp)

Hello all. I've been using Foreman for a good 7-8 years and have deployed small instances of it at various shops. I've come across a new project that wants to replace their Salt API UI with Foreman (thanks to the Broadcom buyout of VMware). They have around 90,000 nodes, globally. I have already implemented Foreman with Katello/Pulp for their Linux patching (about 50% of the above hosts).

However, now I need to work on replacing jobs/tasks for the same number of nodes. The 90k nodes are split at about 5-7k nodes per saltmaster. I'll be able to use this to my advantage, but my main concern is how Foreman will handle these requests/jobs, so the focus here is Dynflow and Sidekiq. I'm just starting the planning phase of this architecture and have some questions:

How many jobs and/or tasks can Foreman handle? Let's assume that I will have the following setup:
2 Foreman servers - Foreman UI/API and Foreman ENC/Reports. PostgreSQL Active/Passive backend on a separate server. Should I also have a 3rd Foreman server to just handle jobs/tasks?

Originally I was going to set those up with memcache, but after researching more, it appears I want to set up Redis for this instead, to keep them all in sync?

I imagine that there will be times when they need to run something against all 90k nodes. How will Foreman/Dynflow/Sidekiq handle something like that? If we have 18-20 saltmasters, all running a Foreman Salt/Proxy, that should take care of any bottleneck there, but I just can't work out how to set up the architecture for this on the job/task running/caching side of things.

Any ideas, thoughts or help would be greatly appreciated. Yes, I have read all the blog posts, links and sites about Foreman at scale. Very helpful!

Thanks!

Would the other two instances be somehow tweaked to not run the dynflow/sidekiq services or would they be fully featured, as set up by the installer?

Run against all 90k nodes as in fire off a single job with name ~ * search query or multiple smaller jobs covering all the hosts?

Would the other two instances be somehow tweaked to not run the dynflow/sidekiq services or would they be fully featured, as set up by the installer?

I'm not sure yet. That is part of my confusion about how to set up the backend services to handle this many jobs/tasks.

Run against all 90k nodes as in fire off a single job with name ~ * search query or multiple smaller jobs covering all the hosts?

Both, possibly. It would be rare, but I can see an instance where some emergency is happening and a command needs to be run against all 90k hosts.

I guess the real question here is, how many jobs/tasks can a single instance of dynflow/sidekiq handle, within a reasonable time (sub 10-20 mins)?

I don’t think we have a best practices guide apart from the tuning guide, but you should be able to scale the backend services independently of the rest.

Without any data to back this claim up, I can just say that many smaller jobs should perform better than a single gigantic job due to the way it’s implemented. For details I’ll shamelessly point you to one of my older posts, "Help to find the bottleneck in foreman-task / remote execution" (#2 by aruzicka), in the hopes you haven’t seen that one yet.
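To make the "many smaller jobs" idea concrete, here is a minimal sketch of splitting one big host list into smaller chunks that could each be submitted as its own job. Nothing here talks to Foreman; the host names, the `split_into_jobs` helper and the 5,000-host chunk size are all made up for illustration and are not a recommended value.

```python
from typing import Iterator, List, Sequence


def split_into_jobs(hosts: Sequence[str], max_hosts_per_job: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `hosts`, each at most `max_hosts_per_job` long."""
    if max_hosts_per_job < 1:
        raise ValueError("max_hosts_per_job must be at least 1")
    for start in range(0, len(hosts), max_hosts_per_job):
        yield list(hosts[start:start + max_hosts_per_job])


# Hypothetical inventory: 90,000 made-up host names.
all_hosts = [f"node{i:05d}.example.com" for i in range(90_000)]

# 5,000 per job is an arbitrary illustrative size (roughly one saltmaster's
# worth of hosts in this thread), not a Foreman recommendation.
jobs = list(split_into_jobs(all_hosts, 5_000))

print(f"{len(jobs)} jobs, largest has {max(len(job) for job in jobs)} hosts")
```

In practice the split would more naturally follow an existing boundary, such as the saltmaster each host belongs to, rather than an arbitrary slice of a sorted list.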

That is extremely hard to quantify as not all jobs/tasks are equal and it depends on a lot of factors. I would expect a standard deployment to be able to handle a job on ~8000 hosts (as in a single job with 8k hosts in it) within ~15 minutes. Of course, this is more of an educated guess rather than anything else so ymmv.
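For a rough feel of what that guess implies at 90k hosts, here is a back-of-the-envelope calculation. It assumes throughput scales linearly with host count and with the number of independent backend instances, which is an assumption made purely for the arithmetic and not something measured; `instances` is a hypothetical knob, not a Foreman setting.

```python
import math

# Educated guess quoted above: ~8000 hosts in ~15 minutes on a standard deployment.
GUESS_HOSTS = 8_000
GUESS_MINUTES = 15


def estimated_minutes(total_hosts: int, instances: int = 1) -> float:
    """Linear extrapolation of the guess; real behaviour may differ a lot."""
    return total_hosts * GUESS_MINUTES / (GUESS_HOSTS * instances)


def instances_needed(total_hosts: int, target_minutes: float) -> int:
    """How many equally fast instances the linear model needs to hit the target."""
    return math.ceil(estimated_minutes(total_hosts) / target_minutes)


print(estimated_minutes(90_000))      # minutes for one instance under this model
print(instances_needed(90_000, 20))   # instances needed for a 20-minute target
```

Under this naive model a single instance would need close to three hours for 90k hosts, so the 10-20 minute target would only be reachable if the backend work can really be spread across several workers.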

Thank you very much for your replies and the link (which, no, I have not seen yet). I also agree: I would not suggest running against 90k hosts all at once, and the chances of that happening are close to zero.

I guess my plan is to prepare and scale it the best I can, unleash it into production, and continue to scale out if we have issues.

Now I'm wondering if I can just spin up 3-4 Foreman instances, with all services on each, and put a load balancer in front of them with memcache or Redis behind them. However, I don't think that would help with the use case of a single user logging in and kicking off a run of 30,000 jobs. The single Foreman server they end up on would still be the sole server running those jobs. So it seems that the balancing really needs to be on the Dynflow/Sidekiq side of things. Perhaps I can detach Dynflow/Sidekiq from Foreman itself and somehow cluster the servers running those services, so that regardless of where a user initiates jobs from, the work is always load balanced across multiple Dynflow/Sidekiq instances.

After further discussions, it sounds like they often run Salt queries against all hosts at once. It takes about 10-15 minutes before they start getting results back and 60-90 minutes to finish, and some hosts never return at all.

Here is my initial architectural chart. I don't think I need to type much to explain it, as the diagram should speak for itself. I don't generally like to ping people, but @aruzicka, maybe @TimoGoebel or @lzap - I have seen your names on many threads about scaling and large deployments. Do you have any input?

My two worries of bottlenecks are:

  1. A job being kicked off on 90,000 hosts (this may potentially occur frequently)
  2. The reports and grains (facts) coming back from said jobs

Eventually this will surpass 100k servers, potentially this year.

A couple more thoughts to myself, as I stare at this.

I'm a bit worried about the Postgres side. If large numbers of reports/grains (facts) come in to one server instance, and it needs to replicate them to the other server in order for the UI servers to display them, it feels like that's just the same as running everything on a single PostgreSQL server…

Does anyone know how jobs/tasks are handled in relation to Salt? If a job is kicked off on, say, 1000 servers, is a task created for every single host and then that request sent over to the foreman-proxy on the saltmaster? Or is it sent over in blocks of n hosts, which the saltmaster then queues up?

It’s been a while, but:

On foreman side:

  1. When the job is started a “parent” task starts running to back the entire execution
  2. The parent task spawns per-host sub-tasks, which check permissions, render the job template and so on
  3. The per-host sub tasks are collected into batches of 100 (default, changeable in settings) and sent over to smart proxies

On smart proxy side:

  1. Once a batch is received, it expands the batch into standalone tasks
  2. Each of the tasks from 1) does salt $host state.template_str $rendered_job_template
  3. When 2) finishes, each of the tasks calls back to Foreman
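The flow above can be sketched roughly as follows. This is a toy model of the batching described in the steps, not Foreman or smart proxy code: the function names and host names are invented, and only the batch size of 100 and the shape of the salt command come from the description.

```python
from typing import List

DEFAULT_BATCH_SIZE = 100  # default per the description above; changeable in settings


def make_batches(sub_tasks: List[str], batch_size: int = DEFAULT_BATCH_SIZE) -> List[List[str]]:
    """Foreman side (toy model): group per-host sub-tasks into fixed-size batches."""
    return [sub_tasks[i:i + batch_size] for i in range(0, len(sub_tasks), batch_size)]


def expand_batch(batch: List[str]) -> List[str]:
    """Proxy side (toy model): one standalone task per host in the batch."""
    # The string mirrors the command shape quoted above; the template is a placeholder.
    return [f"salt {host} state.template_str <rendered_job_template>" for host in batch]


# Hypothetical job on 1000 hosts, as in the question.
hosts = [f"node{i:04d}.example.com" for i in range(1000)]

batches = make_batches(hosts)
proxy_tasks = [task for batch in batches for task in expand_batch(batch)]

print(len(batches), "batches sent to the proxy side")
print(len(proxy_tasks), "standalone tasks after expansion")
```

So in this model a 1000-host job means 1000 per-host sub-tasks on the Foreman side, 10 batches on the wire, and 1000 standalone tasks (each with its own callback) on the proxy side.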

Hi! Sorry for the off-topic comment, but I am so impressed by your architecture, especially the part about running Foreman in k8s infrastructure.
Can you please tell me, do you use a custom Foreman image and Helm chart, or some open source solution?
If it’s not under NDA, can you tell us more about your case? There is a huge lack of information on the internet about running Foreman in k8s.
Thank you in advance!

Honestly, I am not sure lol. We are literally just trying to comprehend how to do all this. There is a really good video about Foreman in k8s - https://www.youtube.com/watch?v=mPjUvNAYp1c - but otherwise we are just trying to figure it all out for ourselves. Basically we are breaking Foreman down into its parts and then attempting to containerize the services. The idea is that we want to be able to automatically spin up new containers if and when the Foreman production infrastructure needs it.

For now, that is a long way off. We are just attempting to first replace our VMware (thanks a lot, Broadcom) Salt Config UI, as the cost will be skyrocketing into the millions. I've used Foreman in past shops, but not at this scale. So I think we won't really be diving deep into the k8s side of things for another 6-9 months. Maybe I will have more then :)
