On spawning threads in our Rails app

lzap · March 27, 2018, 2:48pm

Hello,

this is just a friendly reminder, although Rails 4.x and higher is considered to be thread safe, this isn’t case for our own code and our dependencies. Our application is not thread safe, until we explicitly test, agree and say so. Therefore never ever create new thread in any of our code, unless it’s some very isolated case with proper synchronization.

As much as I would love to, we can’t simply say “we are thread safe now” or “let’s just try it and see what happens”, until we see decently-sized deployments with multi-threaded servers without any concurrency issues. Things can go really bad and debugging multi-threaded applications is way more difficult.

Recently, a multi-threaded code sneaked into Rails app - that’s Dynflow. I believe without proper testing of the core Rails application that was not a good approach to take. I’d like to propose extracting this back to Procfile. There was already an attempt:

https://github.com/theforeman/foreman/pull/4871

This is not a rant, I’ve heard it’s coming but I haven’t realized consequences until today. I know how background processing is important not only to Katello plugin and although I’d rather appreciate moving off from Dynflow to something more simple, I appreciate all the effort and work around it. It’s indeed a challenging piece of our application.

TL;DR

Don’t allow Thread.new in our codebase for now.
Let’s start talking about moving towards multi-threaded environment with proper testing plan if we need to.

iNecas · March 27, 2018, 8:00pm

I’m sorry, but I don’t agree with the multi-threading part. Dynflow has been used vastly in the Foreman’s environment for over 5 years, and nothing dramatic even happened since then regarding the mutli-threading (therefore I’m quite surprised by this anti-multi-threading campaign now

Given the growing demand for running in larger scale, I don’t see us actually moving from this model now. See also Dev server hangs on code change.

And the link you’re sending, is not about achieving one thread per process. It’s only about moving the Dynflow’s thread pool outside of web process in development, and therefore getting more closer to production, but again, this is not about always having one thread per process.

I would be rather interested into places where you think our code is not thread safe and address those, as going the other direction, as it’s really huge step backwards.

lzap · March 28, 2018, 10:23am

I apologize for being harsh, I am little surprised and scared. Let me explain my thinking. Our Rails codebase and dependencies were never considered to be thread safe, there was no discussion, no extensive testing. Now, I believe that Dynflow at some point loads Rails environment with all our code and deps, at least in one thread. We had just one thread, now we have two concurrent threads at least (maybe more) in our Rails app.

You said that Dynflow works multi-threaded now for over 5 years with nothing dramatic, sure but on the other hand we have Dynflow execution workflow and Rails execution workflow in one process and that creates vastly different environment. It used to be separate process with different execution behaviour. We don’t know what to expect, that’s my main point here.

My concern is all about complexity - we already started using multi-threaded web server in development starting with Rails 4.x, this already created some pains, for example I was debugging concurrency problem around logging for a week (https://github.com/theforeman/foreman/pull/5197). If we don’t have any plans on how to move on to multi-threading in production yet, so it makes little sense to me to spend our cycles on debugging concurrency issues when we don’t know if we are actually going to use this in the end. The fact Rails + Dynflow now works for few months without major issues in development setup does not tell anything about behaviour in production - that’s totally different workload and story.

We need a plan, discussion and proper production-setup testing to identify possible concurrent issues. Now, I think things will go bad, but maybe I am wrong and our tests will show that our application executes just fine. But we need to be sure before we start playing around with multiple threads in our development setups, for this reason I used the “no-go” term because I was always told (and I experienced myself) that debugging concurrency issues is much more difficult. Whatever the outcome of the testing is, I’d rather be debugging concurrency issues in two separate processes - Rails app (web requests) and Dynflow (background jobs) both in production or development - just to make things easier and more consistent.

Here is my proposal to workaround my concerns:

move Dynflow to Procfile
perform production multi-threaded tests of various scenarios
analyze the results

Then we can decide to leave this status quo properly:

Start moving towards multi-threading in production (move away from Passenger I guess).
Keep using single thread and also disable multi-threading in development to avoid further pains.

This is the reason why I think putting Dynflow into Rails was a bad idea in the first place - it brings nothing but possible issues, unless we decide to go towards multi-threading. I’d appreciate stepping back a bit in this case, waiting is also an option, maybe the Ruby 3x3 will solve concurrency for us, maybe single-threaded containers are the answer, I don’t know. I’d like to hear opinion from folks about moving Dynflow to Procfile and future of deployment in terms of threading and deployment.

iNecas · March 28, 2018, 1:16pm

I think what needs to be clarified is the level of multi-threading we are talking about.

It’s one thing having the whole web request handling behind a multi-threaded pool (which is something we don’t have right now) and we probably don’t need in close future, as passenger has been working for us quite well so far.

Different thing is Dynflow using thread pool internally to process the async actions: this is something that we’ve been doing for long time, has limited scope of using active record and http libraries (no request processing or assets rendering etc.) and moving away from this model would be very painful and would actually lead to worse resources utilization: There are other areas I would like to focus in the Dynflow area, than doing this.

And finally, different thing is having the dynflow thread pool inside the web process in development setup. The reason we went this path in the beginning was purely pragmatic (it was before webpack era, where one was sufficient with just one process to run the whole process). Things have changed and as I mentioned, I’m ok with doing this change. I also prefer running things closer to production. The motivation at the beginning was easier development setup, but it seems there is no point in continuing this path any longer.

lzap · March 28, 2018, 5:18pm

Agreed.
Sure, I never questioned Dynflow design in this tread. As long as this is an isolated process, I don’t care how many threads it spawns.
Agreed.