My experience is as follows:
We did 3-4 "separate" foreman infrastructures with 1k hosts each (split by datacenter). These were "one host" foremans with onboard databases. Worked fine - performance was fine; IIRC they were 4 core / 16GB RAM boxes. Back then we had <100 puppet classes total and maybe 6 environments to load them into.
When we brought everything together, we split our database, puppetmasters, CA, UI, and proxies across datacenters, with load balancers (F5 LTM and GTM) in front of everything. That has worked better.
As we add more managed nodes/environments/code (currently 35 environments, 1088 puppet classes, 18k managed servers), we just add more puppet masters. More API calls from other automated systems? Beef up the front end or add more nodes. Database getting bogged down? Increase the specs of the database tier.
A lot more overhead/management for sure, compared to a single node, but also a lot more manageable. Not quite microservices, but having each role subdivided does help. Our database (MySQL) is up around ~80GB, and we clean reports after 7 days and audits after 30 days.
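For reference, the report/audit cleanup above is typically done with Foreman's built-in rake tasks on a cron schedule. A minimal sketch - the paths and run times are assumptions for an RPM-style install, adjust for yours:

```shell
# /etc/cron.d/foreman-expire -- example schedule (assumed paths/times)
# Expire reports older than 7 days, nightly at 00:30
30 0 * * * foreman /usr/sbin/foreman-rake reports:expire days=7 >/dev/null 2>&1
# Expire audit records older than 30 days, nightly at 01:00
0 1 * * * foreman /usr/sbin/foreman-rake audits:expire days=30 >/dev/null 2>&1
```

Running these off-peak matters at this scale - with 18k hosts checking in, the reports table gets large fast, and the delete pass is noticeable on the database tier.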
If you were to keep it on a single server, scaling to 6k nodes:
- Foreman under Passenger is memory intensive - lots of RAM.
- Keeping the database working set in memory is key.
- Puppet Server RAM usage can be huge.
- Puppet classes x environment count x JRuby max-active-instances means stupid levels of JVM heap.
- Most other Smart Proxy roles don't need much RAM or CPU.
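To put rough numbers on the heap math: the relevant knobs are `max-active-instances` in puppetserver.conf and the JVM heap in the service's JAVA_ARGS. A sketch below - the 512MB-per-JRuby floor is the usual rule of thumb from Puppet Server's tuning guidance, and the specific values and file paths are assumptions for a RHEL-style install, not a recommendation:

```
# /etc/puppetlabs/puppetserver/conf.d/puppetserver.conf (excerpt)
jruby-puppet: {
    # Each active JRuby instance loads code for the environments it
    # serves, so heap pressure grows roughly with classes x environments
    # x instances - the multiplication mentioned above.
    max-active-instances: 4

    # Cache environment class info so instances aren't re-parsing
    # every class on each catalog compile.
    environment-class-cache-enabled: true
}

# /etc/sysconfig/puppetserver (excerpt)
# Rule of thumb: at least 512MB of heap per JRuby instance, and
# substantially more with many environments/classes. Sized for 4
# instances here; tune against actual GC behavior.
JAVA_ARGS="-Xms6g -Xmx6g"
```

With 35 environments and 1000+ classes, you can see why cranking `max-active-instances` up for catalog throughput drives the heap requirement to "stupid levels" on a single box.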
I'd vote for splitting it up. Or, at minimum, stick everything behind load balancers, so that when you do need to split and scale horizontally you aren't headed back to the drawing board - it's just easy.