Intro
In the early days of remote execution, we wanted to get dynflow to the smart proxy to be able to reuse some code we had on Foreman’s side and to have the same engine on both sides. Sadly at the time, smart proxy wasn’t SCL’d and Dynflow required newer ruby than what was available outside of SCL.
To address this situation, we created smart_proxy_dynflow
and smart_proxy_dynflow_core
gems. smart_proxy_dynflow_core
is a sidecar service to the smart proxy, essentially just Dynflow runtime with an API. smart_proxy_dynflow
is a smart proxy plugin which rewrites requests coming to /dynflow/*
and proxies them to smart_proxy_dynflow_core
.
From this point onwards, I’ll refer to smart_proxy_dynflow_core
as SPDC
.
To make things more interesting, SPDC
was written in a way that it could still act as a regular smart proxy plugin, meaning that if all dependencies could be satisfied, it could be loaded into smart proxy and not require the sidecar service. On debian based platforms, there are no SCLs and therefore we had no issues with things being or not being available in a SCL, and so we deployed SPDC
as a smart proxy plugin, as opposed to EL* based platforms, where it was a standalone service.
The catch(es)
Having this solved the immediate issue, but created a couple of new ones:
- different deployments on different platforms
- higher network traffic when
SPDC
runs as a standalone service due to the request proxying between smart proxy andSPDC
- code duplication (
SPDC
reimplements selected parts of smart proxy) - weird gems existing due to us wanting to be able to load the same code into Foreman, smart proxy and
SPDC
(foreman_{ansible,tasks,remote_execution}_core
and friends) - getting SSL certs right is rather hard, installer does that for us but people are struggling when using custom certs
- certain things became hard (client-cert based auth on
SPDC
's side, since we don’t get the “real” client’s cert)
Fast forward to today, smart proxy is SCL’d, the original issue is gone and we are left with just the downsides. To address this, I suggest getting rid of SPDC
(the service, we would keep the gem for a bit more) for good.
The proposal
1) Patch RPM packaging to deploy bundler.d files to both smart proxy and SPDC
Currently when installing smart proxy plugins, they drop bundler.d files into locations where SPDC
is able to pick them up. However, those are not files for the gems which are providing them, but for their *_core
counterparts, so for example smart_proxy_ansible
drops a file to have foreman_ansible_core
loaded in SPDC
, smart_proxy_remote_execution_ssh
sets up foreman_remote_execution_core
and so on.
The first step would change this to have the smart proxy plugins set up the bundler.d files to have them loaded into both SPDC
and smart proxy. Until we change external_core
setting for smart_proxy_dynflow_core
, the gems would be loaded by both services, but only used by SPDC
.
This step doesn’t really add value on its own, apart from making the following step safer by not having to coordinate gem, puppet module and installer releases precisely.
2) Toggle the setting
Once we have the gems loaded in both services, we can start phasing out SPDC
. This step would be relatively straightforward, it would boil down just to flipping a single parameter in puppet-foreman_proxy, deprecating the standalone mode and optionally stopping SPDC
service either by installer or from some packaging hook.
This should follow shortly after the first step is done. I’d be glad if we could get this done soon, ideally targeting Foreman 2.5.
3) Code cleanup
As far as SPDC
(the gem) is concerned, we could get rid of most of the code we copied from smart proxy to be able to make it run on its own. We could also drop all the request rewriting and proxying code from smart_proxy_dynflow
. Eventually, we should be able to get rid of SPDC
completely by merging it into smart_proxy_dynflow
.
There are many gems which are currently arguable at best and wouldn’t make any sense at all once we get rid of SPDC
. This would be the stage where we move most of the existing code from *_core
plugins to their smart_proxy_*
counterparts. After this is done, whatever remains of the *_core
gems can be dropped completely.
This is probably the biggest step of the three, but since the gems in question can be used as-is in both deployments, it is not urgent. In any case, I’d probably leave one release for the deployment change to settle in and then get to refactoring the plugins. If we manage to get the previous steps into 2.5, then it would suggest the refactored plugins landing in 2.6.
Finishing thoughts
While the split deployment model has some downsides, it is not all bad. The upside of it is that even if the things running on SPDC
get CPU intensive, it won’t affect the smart proxy process.
Another thing worth considering is that in the split deployment, if you push SPDC
hard enough, SPDC
may not be able too keep up with incoming requests. If we run dynflow inside regular smart proxy as per this proposal, it might start dropping other, non-dynflow, requests when under load. On the other hand, smart-proxy uses puma (as opposed to webrick on SPDC
) which should perform better. Sadly I didn’t try this at a large enough scale to have any meaningful data, so I cannot really confirm nor refute this. What keeps me optimistic is not recalling any debian users complaining about it happening.
We have a chance to simplify things quite a bit. This proposal should allow us to address all the catches described above and I believe it is a step in the right direction.