Foreman Cache Store


#1

When scaling Foreman, a cache store is needed to ensure that multiple application instances reference the same set of cached data (e.g. user sessions). Today, as I understand it, this is mostly handled by including the foreman_memcache plugin to configure and use a memcached server. This leads me to raise the following design questions:

Merge foreman_memcache into core

Should we deprecate foreman_memcache and move its simple functionality directly into Foreman core? This plugin is dead simple and seems like unneeded overhead for a core platform concept.

Preferred Cache Store Backend

If we go with bringing this functionality into core, we can open it up to other cache stores such as Redis. This means users can control their own destiny here. However, my question is: what should our preferred backend be? The Pulp project, for version 3, has switched its async task engine to RQ, which by default uses Redis. Aligning on that would have benefits for content users. And while I think choice can be good for users, being opinionated has a lot of value for deployments, testing and developer environments. (Plus I’d like to centralize on one by default for container-based deployments.)
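For illustration, a minimal sketch of what a core-owned, backend-selectable cache configuration could look like in Rails. To be clear, the FOREMAN_CACHE_STORE setting and the host names below are hypothetical, not existing Foreman options:

```ruby
# config/environments/production.rb -- a sketch only; the setting name
# and hosts are made up for illustration.
case ENV.fetch("FOREMAN_CACHE_STORE", "file")
when "memcache"
  # Roughly what foreman_memcache wires up today (needs the dalli gem).
  config.cache_store = :mem_cache_store, "memcached.example.com:11211"
when "redis"
  # Built into Rails 5.2+ as :redis_cache_store (needs the redis gem).
  config.cache_store = :redis_cache_store, { url: "redis://redis.example.com:6379/0" }
else
  # Today's default: files on local disk, private to each host.
  config.cache_store = :file_store, "/var/cache/foreman"
end
```

Because all three are ActiveSupport cache stores, application code that goes through Rails.cache is unaffected by which backend is chosen.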


#2

Love the idea.

I think that redis is overkill for both Foreman (Rails) use and Pulp tasking. Both use data serialization, which is an excellent fit for type-less memcached. I did a little research, and it looks like memcached is simpler feature-wise but scales better on a single host (it is multi-threaded), while redis has more features (which we won’t utilize much) and better multi-host scaling.

Can you elaborate on the benefits other than a unified caching/session service, which is obvious? Just curious if there is something I don’t see.

I just think that the default implementation we have today (writing files on disk) is a terrible default. Speaking about what to ship by default, I think we should configure memcached by default for Foreman without Katello, and redis by default for the Katello scenario.

Let me explain: we have many memcached users today and we want to make sure it keeps working in the future. Having two options is not necessarily bad; it’s actually good to know that we have two default options which work out of the box and which we all use. Memory-only caching servers are not components that would be a pain to support, I think.


#3

I like the idea, and would support redis as well, as it can help with future UI work (such as live notifications with pub/sub).


#4

But memcached does not support anything like that. I’d rather avoid tight integration with a specific broker; Foreman is not the kind of app where you have tens of thousands of active users at one time with millions of notifications per minute. I think storing them in an RDBMS is just fine for our use case.


#5

Redis is fairly lightweight. For Pulp tasking they chose to use python-rq with Pulp 3. That requires Redis, and in exchange you don’t have to reinvent queueing. Memcache is just a key-value store with no queueing functionality.

Also note that Redis offers features like persistent caching, so you don’t lose all your user sessions after a reboot.
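For reference, persistence is a server-side Redis setting; a minimal redis.conf sketch (the values are illustrative, not a recommendation):

```conf
# Snapshot the dataset to disk (RDB) if at least 1 key changed in 900s.
save 900 1
# Append-only file (AOF): log every write so data survives a reboot.
appendonly yes
appendfsync everysec
```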

I think that it shouldn’t be hard to support both Memcache and Redis. Giving the users choice would be a good thing. However, we can prefer Redis. In the Katello case it leads to a simpler stack because of Pulp. For vanilla Foreman there’s little difference but it’s good to align the stacks.


#6

I would prefer just to use redis, here are my reasons:

  1. let’s be opinionated and reduce the size of the stack (why would we want to add more pain to supporting our stack?)
  2. using redis, you can get a simplified message bus; it has a lot of benefits imho, ranging from web UI live notifications (without the need for continuous polling, and without the need to increase the db pool size)
  3. it’s a common / reliable software stack; we could someday have non-Ruby code putting data on redis and have some Ruby respond to it
  4. async job processing tools such as sidekiq rely on redis; maybe we can use that to reduce db/maintenance load
  5. we know the developers :wink:
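To illustrate point 2, a minimal sketch of the pub/sub pattern with the redis-rb gem. This is not runnable standalone: it assumes the redis gem, a local Redis server, and a made-up channel name:

```ruby
require "redis"

# Publisher side: e.g. a backend process announcing a new notification.
Redis.new.publish("foreman:notifications",
                  { user_id: 1, text: "Host built" }.to_json)

# Subscriber side: e.g. a bridge process pushing events to browsers.
Redis.new.subscribe("foreman:notifications") do |on|
  on.message do |_channel, message|
    # forward `message` to connected web clients here
    puts message
  end
end
```

The subscriber blocks waiting for messages, which is why this pattern avoids the polling and db-pool pressure mentioned above.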

#7

First of all, I am not strictly against sticking with Redis, I just want to discuss this properly before we make the decision. I think your examples actually show my biggest concern: let’s not get a hard dependency on the caching component. Finding the common denominator between memcached and Redis is what I would like us to do, but what you actually propose is going full-throttle Redis, which is quite the opposite.

Because we already have foreman_memcache on theforeman.org? I have to admit that migration should be pretty easy (well, users will need to drop sessions/cache, I guess). But this is by far the least significant point in the discussion IMHO.

As always, there are drawbacks. Polling is one way of doing this; many TCP connections is the other way, which is exactly what Redis Pub/Sub does. There are many articles in the wild saying that connecting directly from JS is a bad idea and WS should be used instead. That’s another component in our stack, since Passenger/Ruby/Rails is not good at WS.

Beware! Redis messaging was built for background processing and UI. It has no notion of durability or guaranteed delivery, and it is far from being a full-blown messaging broker. Let’s not build features beyond background processing or simple UI messaging on top of it, or we will hit walls very soon.

Not something you wrote, so I will just leave a warning here: nobody should think about using Redis as an enterprise service bus between (containerized) components (e.g. smart-proxy and foreman). That would be a terrible design mistake to make.

What you are saying is “let’s have a more complex environment in the future”. I actually treat this as an anti-argument :wink: A unified stack with an RDBMS is actually a good thing!

And we don’t use sidekiq, so this is a weird statement. We have foreman-tasks, and we won’t be switching to sidekiq just because it uses Redis, which we could have anyway. That does not make sense to me.


#8

My point is: why support memcache and redis (and the no-memcache setup we have today)? IMHO we should pick one way and move forward (similar to @Gwmngilfen’s comment about multi-distro support).

I would support a WS solution going forward (in fact, that’s what I was imagining from the beginning), and you are right, we would need another WS server that is most likely not Ruby (maybe something like anycable).
The truth is that many times we have wished we had that kind of infrastructure, but we often skip it because of limitations in our current architecture.

Further, this would allow us to stop using the Python WS server just for VNC/SPICE, and it can be a generic solution for all WS traffic.

True; having said that, it can still be used in many ways without the need for a full message broker.

It is no secret that I’m in favor of using an off-the-shelf solution for background processing, hence this comment.

In the end, if we are considering adding a required additional service, and redis is as simple as memcache but offers more flexibility and features (while being stable), I’m in favor of redis.


#9

I think my preferences are mostly captured in this: off-the-shelf solutions in general, and the more flexible solution.

With that in mind, is there no generic Rails caching solution out there? Currently foreman_memcache implies it’s very Foreman-specific, but http://guides.rubyonrails.org/caching_with_rails.html tells me there are out-of-the-box Memcache and Redis cache stores. That would give the user full flexibility in their caching solution while we use an off-the-shelf solution.
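For reference, the benefit of the generic Rails layer is that application code goes through the backend-agnostic Rails.cache API, so swapping memcached for Redis is purely a configuration change. A sketch inside a Rails app (the cache key and block body are illustrative, not actual Foreman code):

```ruby
# The same code works against :mem_cache_store, :redis_cache_store or
# :file_store; only config.cache_store decides where this data lives.
facts_json = Rails.cache.fetch("host/#{host.id}/facts", expires_in: 1.hour) do
  host.facts.to_json # computed only on a cache miss
end
```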


#10

Thanks for the replies and digging into this thus far. Recapping where I believe this has landed so far:

Key points:

  • Pulp 3 is using RQ, and its default backend is Redis; therefore we get stack re-usability if we defer to Redis when content services are enabled
  • Redis as a default provides other potential use benefits and is a well-maintained off the shelf solution
  • This RFC does not intend to drop memcached support
  • This RFC does not intend to require a cache store by default

This is where I see things currently:

  • General agreement to move configuration of cache store into Foreman core
  • When the Foreman project is deploying Foreman and requires a cache store (e.g. Forklift environments, blog posts, containers, etc.), use Redis
  • Maintain support of memcached through configuration options

#11

My proposal was to deprecate memcache support, and introduce in core support for redis as a required service: initially for cache, later on for UI WS and other services.


#12

I think this is a bit premature. For pure caching I don’t see a difference between memcache and redis. The case of UI WS is still quite complex and I think we need to do this in a separate RFC.


#13

While I fully agree, my motivation here is that we should be opinionated to reduce the permutations we have for our users. Of course, if someone really wants to keep memcache, there is a way (or they can even keep the memcache plugin), but by default we should not enable that (e.g. in the installer).


#14

I am sensing two separate questions that are worth breaking out specifically to answer:

  1. Redis vs. memcache in general for caching, and which we should prefer
  2. Whether the Foreman application should, by default, configure and use a Redis backend for caching, rather than the current local disk caching.

#15

This just makes it harder to get Katello running on an existing Foreman. If we can, we should unify on one solution that fits all, unless there is a really big reason for not doing so.


#16

I am fine with that as long as we support switching cache backends back and forth via our installer.


#17

We already talked about introducing a message broker for dynflow? Could we use redis for that? @iNecas?

In general, both memcached and redis are liked by admins. We have had good experiences with both. I believe that redis gives us more features, and I think we should make use of them and introduce it into the default stack.

Can we get rid of qpid completely in favour of redis?


#18

Currently both Pulp (2) and Candlepin use qpid. With Pulp 3 moving to Redis we still have Candlepin using it to notify Katello about {entitlement,pool}.{created,deleted} + compliance.created events. I do not know if this can easily be converted to Redis or some other technology as well.

Then there is the whole qdrouter/gofer stack for clients that I know nothing about, but I believe that there is an alternative of using a yum/dnf tracer plugin for reporting and using REX for active actions.

IMHO this is a good simplification in the stack and something we should aim for.


#19

This is one question of cross-process communication we need to figure out before we move to a multi-worker setup.

@lzap raised his concerns that redis was not built to provide enterprise-level messaging, due to limitations in durability and guaranteed delivery. On the other hand, since we serialize most of the data to the database, using pub/sub just for poking the workers when needed might be a potential way to go.

The main issue with message brokers (at least in my experience) is that they work nicely until they don’t. And they require additional skills from those who maintain the infrastructure to keep the broker happy. It might just be our experience with a specific implementation, though. Not having something we can rely on in terms of durability and guaranteed delivery, and handling that at higher levels ourselves, might be one way to go forward.

However, this requires separate investigation. What we tend to do in Dynflow is leverage existing infrastructure to do our job, rather than introducing a new stack just for our purposes. Therefore, the presence of Redis that we could rely on might influence how we think about cross-process communication and how much we expect from it. If nothing else, it would allow us to look at Sidekiq for task processing.

Anyway, I don’t want to derail this conversation too far.


#20

Big caution here: Redis messaging is not persistent at all, and the protocol does not allow client-to-client reliability. Dynflow was built on a notification concept: fire and forget. As soon as the broker confirms, the dynflow developer assumes the work will be done (now or later).

Redis is not the right tool for dynflow’s job.
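To make the fire-and-forget point concrete: with Redis Pub/Sub, a message published to a channel nobody is subscribed to is simply dropped, and the only feedback the publisher gets is how many clients received it. A sketch with the redis-rb gem (assumes a local Redis server; the channel name is made up):

```ruby
require "redis"

redis = Redis.new
# PUBLISH returns the number of subscribers that received the message.
# With nobody listening it returns 0 and the message is gone:
# no queueing, no replay, no acknowledgement from a consumer.
receivers = redis.publish("dynflow:events", "work_available")
puts "delivered to #{receivers} subscriber(s)"
```

That is fine for poking workers that also persist state elsewhere, but it is exactly why Redis alone cannot provide guaranteed delivery for dynflow.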