RFC: Run Foreman with Puma and an Apache Proxy in Production

ehelms · October 14, 2018, 9:17pm

This RFC proposes to switch the production application server for Foreman to use Puma and Apache to reverse proxy to the Puma server by default. This has the following benefits:

Aligns production and development environments
Aligns proposed container deployment methods with standard service deployment
Provides easier path to running Foreman as a systemd container
Separates SSL handling and load balancing from application server

As part of the design, changes will need to be made to the way puppet-foreman handles deployment. Some of the open questions that will heavily influence this design:

Do we keep Passenger as an option or keep opinionated and stick to Puma + Apache?
Do we allow switching between Apache + Passenger and Puma/Systemd + Apache for deployments?
Do we make the choice of application server configurable (e.g. use of gunicorn, etc.)?

ekohl · October 14, 2018, 9:33pm

Initially I’d propose to add an option to configure as a reverse proxy instead of using passenger. We already have an option to use a (systemd) daemon. AFAIK we need to export some extra variables to use puma.

Another application server could be done by making the service name a parameter but I’d not focus on it initially.

Lachlan_Musicman · October 14, 2018, 10:42pm

As an observer (“someone not fully understanding every technology in the Foreman stack”) can I ask what it’s replacing? I presume it’s tomcat - but is there any more than that?

ekohl · October 14, 2018, 11:03pm

As I see it: before it’s Apache + mod_passenger to serve the Ruby on Rails application. After it’s Apache (with mod_proxy) + puma to serve the RoR app. It could easily be any other webserver to proxy (like nginx) but I’d prefer to stick with Apache in our default installs for a while.

tbrisker · October 15, 2018, 5:07am

Would this approach allow us to finally use websockets?

ik5 · October 15, 2018, 5:34am

In theory yes, this is from an hour’s work yesterday on the matter: https://github.com/theforeman/smart-proxy/pull/613

It’s not working yet, because I placed it just as a “backup” for my work, and today the PR will be ready for full review.

TimoGoebel · October 15, 2018, 7:10am

Clear +1 from me, I think if this unlocks websocket support it’s worth the effort. That’s also why I would be opinionated about this and drop mod_passenger support as we won’t be able to get websockets with mod_passenger + apache.
I’m curious if there is any performance impact to this. That’s why we could still offer a fallback to mod_passenger. We could tell users that if they’re having issues they can fall back to passenger but should let us know to investigate as we’ll eventually drop passenger for good.

How do you want to handle Client Certificate Handling? I currently know three places where we use this (Foreman, Katello and Foreman DLM Plugin). Do you want to configure apache to pass ssl headers to puma?

Would we gain anything from that? I would try to keep the stack streamlined and just offer puma unless there is a valid reason to provide anything else.

tbrisker · October 15, 2018, 7:23am

Only offering and supporting one stack to me seems like a much better approach, otherwise I’m afraid we’ll be stuck in the middle supporting both for a long time.

If I understand correctly, the main concerns are around performance and certificate handling. If we can show that puma is capable of handling these issues just as well as passenger, I’d be all for replacing one with the other even just for enabling websocket support, not to mention the other motivations Eric mentioned in the OP.

aruzicka · October 15, 2018, 10:32am

As in “a container created using systemd-nspawn”?

ekohl · October 15, 2018, 11:13am

I had a quick stab at refactoring the puppet-foreman module to make the Apache config support passenger optionally. Unit tests pass but the current setup probably doesn’t work yet:

https://github.com/theforeman/puppet-foreman/pull/677

One goal I already had is to make https://github.com/theforeman/puppet-katello_devel/blob/master/manifests/apache.pp obsolete. We can probably refactor it in such a way that it uses the foreman::config::apache class, which makes using the puppet-katello class a lot easier and thus greatly reduces the code needed in puppet-katello_devel. This goes a long way in that direction.

Another benefit is that the SCL situation is a lot easier when you don’t have passenger in the mix. Building RPMs for it is complex.

ekohl · October 15, 2018, 12:01pm

Update: the above PR does mostly work, it configures Apache as a reverse proxy and launches the standalone foreman service. The acceptance test can retrieve the login page but is failing because it can’t properly detect whether the service is running and enabled for some unknown reason.

What’s left to do is modify the standalone service to use puma. Not sure what’s the cleanest place to do so: packaging or puppet-foreman.

lzap · October 15, 2018, 1:25pm

Big +1 from me, maintaining SELinux policy for passenger was a HUGE pain. Also scaling was more difficult since we could not easily configure Passenger FOSS to recycle processes after reaching memory limits.

Also, initial work towards Puma on the proxy is pending initial review, so this is good timing: having everything on a single stack is ideal.

I vote for being opinionated about Puma here and taking advantage of the good timing (post 1.20 branching). If possible, in cooperation with the Red Hat Satellite Performance team, who could help us scale up 1.21 a bit and provide feedback to the community so we can decide or fix bugs.

lzap · October 15, 2018, 3:07pm

It looks like Puma supports either app preloading or rolling restarts (they call this phased restart). Both are quite useful features; which one will we configure by default? To me it looks like the former is more useful.

To partially work around the lack of rolling restarts, Puma offers systemd socket activation, which nicely allows restarts of the daemon without disturbing clients. Shall we configure this by default? https://github.com/puma/puma/blob/master/docs/systemd.md#socket-activation

Shall we use a local UNIX socket instead of TCP between Apache and Puma? It looks like Apache 2.4 supports it. This should be faster.
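A minimal sketch of the Puma side of that idea, assuming a Puma config file is used for the Foreman service. The socket path is hypothetical and not taken from any existing packaging. This is a config fragment evaluated by Puma itself, not a standalone script.

```ruby
# Hypothetical fragment of a Puma config file (e.g. config/puma.rb).
# Listen on a local UNIX socket instead of a TCP port; the path below is
# an assumption made for illustration only.
bind 'unix:///run/foreman/puma.sock'
```

Apache would then need its reverse proxy pointed at the same socket path, and if systemd socket activation is used, the socket unit would presumably have to listen on that same path as well.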

A couple of work item ideas to be added into Redmine once we start with implementation:

Add support for monitoring via PCP and Prometheus of the Puma process
Replace passenger-recycler with https://github.com/schneems/puma_worker_killer
Write SELinux policy for Puma and Apache
Test log rotation and signal handling

ehelms · October 15, 2018, 4:31pm

Not to me. I offered it as a question to get the response you gave. I think as a community we’ve leaned towards choice rather than being prescriptive, so I left this question here to gauge thoughts. Generally, from this thread I get the sense that we should pick one way and go down that path rather than providing a matrix of options to users, which I generally like as it reduces testing, support, etc.

ehelms · October 15, 2018, 4:32pm

As in running a systemd service, like foreman, where instead of running puma natively as a process, we’d run a container as the process.

TimoGoebel · October 18, 2018, 7:49am

I know @ekohl can speak for himself, but I’m so excited that I want to share some progress.

@ekohl has a working prototype running. The only real obstacle we faced was how the client certificate is passed to the rails app. We have to use headers that are set in the Apache reverse proxy, but that replaces the newlines in the certificate with spaces. When we then try to load the certificate in ruby, it fails.

So basically we have to teach our rails app to handle this. I guess it would be best to introduce some middleware that handles this throughout the app.
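A rough sketch of what such a middleware could look like, assuming the proxy forwards the PEM certificate in a request header with its newlines turned into spaces. The header key `HTTP_SSL_CLIENT_CERT`, the class name `ClientCertNormalizer` and the helper `normalize_pem` are all hypothetical names chosen for illustration, not existing Foreman code.

```ruby
require 'openssl'

PEM_HEADER = '-----BEGIN CERTIFICATE-----'
PEM_FOOTER = '-----END CERTIFICATE-----'

# Rebuild a PEM certificate whose line breaks were replaced by spaces.
# Returns nil when there is nothing usable in the input.
def normalize_pem(raw)
  return nil if raw.nil? || raw.strip.empty?

  # Drop the armor lines first (they contain spaces themselves), then
  # remove all whitespace from the base64 body.
  body = raw.sub(PEM_HEADER, '').sub(PEM_FOOTER, '').gsub(/\s+/, '')
  return nil if body.empty?

  # Re-wrap the body at 64 characters per line and restore the armor.
  lines = body.scan(/.{1,64}/)
  [PEM_HEADER, *lines, PEM_FOOTER].join("\n") + "\n"
end

# Rack-style middleware (assumed header key) that repairs the certificate
# before the application sees it.
class ClientCertNormalizer
  def initialize(app, header: 'HTTP_SSL_CLIENT_CERT')
    @app = app
    @header = header
  end

  def call(env)
    pem = normalize_pem(env[@header])
    env[@header] = pem if pem
    @app.call(env)
  end
end
```

With something like this in the middleware stack, code further down could pass the header value to `OpenSSL::X509::Certificate.new` without each caller repairing it separately.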

ohadlevy · October 18, 2018, 8:17am

Did anyone look into memory usage over time with Puma? The main benefit with passenger was the ability to scale workers and recycle them after xxx requests, which ensures no memory abuse happens over time.

Puma does have a cluster mode, but afair, we are not planning to use it… In the old days I used to use a tool like monit or god to enforce memory thresholds for web processes; imho we would probably need to introduce such a tool along with the change…?

Lastly, regarding websockets, imho we should not default to the ruby stack for websockets, as it’s known to be non-ideal. Imho we should consider exposing ws through apache to another process (we can use the same port and route WS/WSS traffic to another local daemon).

ekohl · October 18, 2018, 11:15am

To replicate my effort you can use a forklift config. Add the following to vagrant/boxes.d/99-local.yaml:

centos7-foreman-reverse-proxy:
  box: centos7-foreman-nightly
  ansible:
    variables:
      foreman_installer_options:
        - --foreman-passenger false
      foreman_installer_module_prs:
        - foreman/foreman/677

Then it’s vagrant up centos7-foreman-reverse-proxy.

What isn’t done:

Pass the SSL cert headers
Configure Foreman to use the right (proxied) headers

The easiest way to test client certificates is /etc/puppetlabs/puppet/node.rb $(hostname -f).

lzap · October 18, 2018, 12:00pm

Puma does exactly the same, you need a configuration option to enable this. On top of that, it can also do rolling restarts or app preloading (these two features are exclusive - you need to pick one or another).

Oh we do plan to use it, except for proxy where we likely do this later (per module). Ohad, please keep in mind there are currently two concurrent threads about Puma:

bringing Puma onboard for Foreman Rails app (this one)
migrating foreman-proxy from webrick to Puma (POC PR already exists)

So to recap - this is the plan at least how I understand it for now:

foreman rails - puma in cluster mode, single-threaded
foreman proxy - puma in one-process mode, multi-threaded
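As an illustration of the first item in that recap, a Puma config for the Rails app might look roughly like the fragment below. The worker count and the `FOREMAN_PUMA_WORKERS` environment variable are assumptions made for this sketch. The fragment is evaluated by Puma as its config file and is not a standalone script.

```ruby
# Hypothetical config/puma.rb for the Foreman Rails app.

# Cluster mode: several forked worker processes. The environment variable
# name and the default of 2 are illustrative assumptions.
workers Integer(ENV.fetch('FOREMAN_PUMA_WORKERS', '2'))

# Single-threaded workers: minimum and maximum of one thread each.
threads 1, 1

# Load the application before forking workers. As noted above, this is
# exclusive with phased (rolling) restarts, so only one can be chosen.
preload_app!

# For the proxy, the recap implies the opposite shape: no cluster workers
# (workers 0) and a larger thread range, for example `threads 0, 16`.
```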

Puma has a nice plugin architecture (very simplified tho) and this kind of tool already exists, for example puma_worker_killer (https://github.com/zombocom/puma_worker_killer), which automatically restarts Puma cluster workers based on max RAM available, and I believe we should be doing this from day one.
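For reference, wiring puma_worker_killer into a Puma config could look something like the sketch below, based on the options the gem documents in its README. All numeric values are placeholders, not recommendations, and the fragment only works inside a Puma config file with the gem installed.

```ruby
# Hypothetical addition to config/puma.rb; requires the puma_worker_killer gem.
before_fork do
  require 'puma_worker_killer'

  PumaWorkerKiller.config do |config|
    config.ram           = 2048 # total RAM budget in MB (placeholder value)
    config.frequency     = 10   # seconds between memory checks (placeholder)
    config.percent_usage = 0.9  # act once usage exceeds 90% of the budget
    # Also restart workers on a fixed schedule, similar in spirit to
    # recycling workers after a number of requests.
    config.rolling_restart_frequency = 12 * 3600
  end

  PumaWorkerKiller.start
end
```

Whether this should live in packaging or be managed by puppet-foreman is the same open question as for the rest of the Puma service configuration.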

ekohl · October 18, 2018, 12:11pm

To be clear: I fully agree we need those kinds of tests, but my current focus is to get the crucial functionality working.