Foreman Cache Store

lzap · July 3, 2018, 11:00am

Yet it can be useful for Foreman. Some of my ideas include offloading facts from the RDBMS or aggregating telemetry data for the Prometheus Ruby client (currently single-process only).

Of course Dynflow could work around the lack of durability/reliability, although I think it’s a bad idea to reinvent the wheel. Let’s not build another messaging system. We already built our own tasking system, and if you peek into the Dynflow code, you will see a lot of code there.

ohadlevy · July 3, 2018, 11:46am

lzap:

Redis is not the right tool for dynflow’s job.

Yet it can be useful for Foreman. Some of my ideas include offloading
facts from the RDBMS or aggregating telemetry data for the Prometheus Ruby
client (currently single-process only). And Dynflow could work around that,
although I think it’s a bad idea to reinvent the wheel.

With regard to facts, you could have a Rails web process receive the
facts, store them in Redis (as strings), and pass the UUID to a remote
worker without needing to serialize/deserialize them or store them in a DB.
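
The hand-off described above could look roughly like the sketch below. It is written in Python only to stay self-contained, and a plain in-memory class stands in for Redis string operations so that it runs without a server. The names `FakeRedis`, `receive_facts`, `process_facts` and the `facts:` key prefix are hypothetical and are not Foreman or Redis APIs.

```python
import uuid


class FakeRedis:
    """In-memory stand-in for a Redis string store (assumption for this sketch)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


def receive_facts(store, raw_body):
    """Web process: keep the upload as an opaque string and return only its key."""
    fact_id = str(uuid.uuid4())
    store.set(f"facts:{fact_id}", raw_body)
    return fact_id  # this small id is all that gets passed to the worker


def process_facts(store, fact_id):
    """Worker: fetch the payload by id and remove it once it has been picked up."""
    key = f"facts:{fact_id}"
    raw_body = store.get(key)
    if raw_body is None:
        # Payload is gone (for example the store was restarted);
        # the facts can be requested from the host again.
        return None
    store.delete(key)
    return raw_body


store = FakeRedis()
job_id = receive_facts(store, '{"hostname": "web01.example.com", "memory_mb": 2048}')
payload = process_facts(store, job_id)
```

The web process never parses the body in this sketch; it only stores the string and hands over the id, which is the point of the idea. A missing payload is treated as a normal case rather than an error.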

You are right that it’s not great if you lose that kind of data, but IMHO
it’s OK in some cases (e.g. for facts you can request them again).

Regarding whether it is a good fit for Dynflow, IMHO if it’s good enough for
large community projects such as Sidekiq, it should probably be good enough
for our use cases too.

Bryan_Kearney · July 3, 2018, 11:58am

The Candlepin folks are aware of the change, and moving to Redis for that communication stack would be easy for them to do. They are moving to Artemis for their own messaging, but the broker will be internal to Candlepin, so it will not affect the service architecture.

lzap · July 3, 2018, 6:24pm

One more question: if we decide to configure Redis by default, how much memory do we allocate/reserve? I am assuming that the daemon allocates some pool + swap. It is a new component in our stack and memory is a precious resource.

Shall we gather feedback from users who have memcached turned on? I’d be very interested in how much memory our sessions and cache actually take on small, mid-size or large-scale deployments so we can set reasonable defaults.

lzap · August 21, 2018, 1:06pm

Are we taking any action in this regard (having Redis installed by default)?

One idea I have - if we do this for Smart Proxy as well, this would enable us to ship rack-attack DoS protection plugin by default: https://github.com/kickstarter/rack-attack
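
To illustrate why a shared store matters for this kind of protection, here is a minimal sketch of fixed-window request throttling, where every process increments the same per-client counter. It shows the general technique only and is not rack-attack's code. It is written in Python to stay self-contained, with an in-memory class standing in for a shared counter store such as Redis. `CounterStore`, `allowed` and the `throttle:` key layout are hypothetical names.

```python
import time


class CounterStore:
    """In-memory stand-in for a shared counter store such as Redis (assumption)."""

    def __init__(self):
        self._counts = {}

    def incr(self, key):
        # Increment the counter for key and return the new value.
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key]


def allowed(store, client_ip, limit, period, now=None):
    """Return True while client_ip has made at most `limit` requests in the current window."""
    now = time.time() if now is None else now
    window = int(now // period)  # all requests in the same period share one counter
    key = f"throttle:{client_ip}:{window}"
    return store.incr(key) <= limit


store = CounterStore()
# Five requests from one client inside the same 60-second window, limit of three.
results = [allowed(store, "192.0.2.10", limit=3, period=60, now=1000.0) for _ in range(5)]
# A request in the following window starts from a fresh counter.
next_window = allowed(store, "192.0.2.10", limit=3, period=60, now=1060.0)
```

In a real deployment the counters would carry an expiry so that old windows disappear; that part is omitted here.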

ohadlevy · August 22, 2018, 5:45am

sadly I just saw https://redislabs.com/community/commons-clause/
AFAIU - it means we should not use redis, but either find a fork or some other tool instead.

Gwmngilfen · August 22, 2018, 7:01am

I was just about to post this too (more thoughts at https://twitter.com/webmink/status/1032016976170967040) but @ohadlevy beat me to it. Will be interesting to watch this shake down - seems a shame they didn’t go for AGPL.

Redis is used a lot, I’ll be very surprised if someone doesn’t fork it…

hartmel · August 22, 2018, 8:02am

I have a very naive question: could the jsonb engine (though not in memory, and assuming Linux buffer/cache management does its job) and the pub/sub mechanisms provided by PostgreSQL meet our needs?

ohadlevy · August 22, 2018, 8:12am

It might, but we are still supporting non-PostgreSQL setups, plus you would
have potential issues with the connection pool to the DB.

sean797 · August 22, 2018, 8:31am

By the looks of things, Redis itself isn’t actually covered by the Commons Clause; only Redisearch, ReJSON,
Redis-ML, Redis Graph and rebloom are, though I’m not sure we’ll need any of those.
On the other hand, it may not be a good idea to settle on a technology when we’re not really confident that the license will continue to be open going forward.

See https://redislabs.com/community/oss-projects/ and the license in their respective git repos.

Gwmngilfen · August 22, 2018, 9:14am

Yeah, that’s the centre of it, for me. This feels like a classic bait’n’switch open-core move. Will be interesting to see what other projects do (e.g. Discourse uses Redis)

ekohl · August 22, 2018, 11:08am

And that’s why I previously argued against being opinionated about the cache store in. If it’s just key-value, Redis is just one implementation.

ohadlevy · August 22, 2018, 11:16am

you might be right about the cache store, but my aim with redis was a wider
scope of pub/sub infrastructure for UI websocket notifications.

ekohl · August 22, 2018, 11:28am

Yes, but this discussion was about a cache store. IMHO those two are separate issues and pub/sub + websockets is a much bigger topic. Tying them together only results in the cache store not being solved because it’s too big of an issue to tackle. Experience tells us we can expect the same to happen here.

Bryan_Kearney · August 22, 2018, 12:08pm

Internal discussions show that the Commons Clause applies only to certain modules which are developed in-house by Redis Labs.

lzap · August 22, 2018, 3:16pm

PostgreSQL cannot be used as a cache AFAIK, because even JSONB documents are subject to transactions and the system ensures they are properly stored before it confirms the write. In other words, it’s very slow compared to Redis or other memory-only implementations.

lzap · October 22, 2018, 3:33pm

Apologies for resurrecting such an old thread, but what is the conclusion @ehelms? Are you going to propose one or another?

I am asking because I am playing with the idea of a simple callback system, and if we are going to introduce Redis, it would be a good service I could use to publish events from Foreman: Feature #25247: Simple callback system for users - Foreman

ehelms · October 22, 2018, 3:48pm

The apology is all mine for letting this thread lie. Due to the concerns around Redis, I somewhat intentionally let it rest to see what the fallout of the Redis Commons Clause was. Given that it seems to only affect certain modules (and those modules have already been forked by a group of developers), I think we can proceed with Redis. The Pulp project has no intention of moving away from Redis, so Katello users, at least, would already have access to a Redis instance once Pulp 3 is available.

I think it would be good to hear from those who were worried about the licensing again. We also had an ongoing discussion in this thread about a cache store vs. pub-sub technology. Redis provides both of these in one package. @ekohl argued we should allow any configured cache store to work, which is fair, but we do need an opinionated default in my opinion.

That is to say, I think we should hold this discussion open for a week at most to hear back about any concerns and then proceed based on that.

ehelms · May 30, 2019, 3:24pm

Adding an update based on work @TimoGoebel started regarding this topic:

TimoGoebel · May 31, 2019, 1:55pm

Let me briefly explain why I filed the PR.
We were discussing that introducing redis to the stack would help us in several places.

Pulp 3 will depend on redis, Pulp 3 will be part of Katello. We’ve also discussed some use cases of Pulp in core.
Fact storage in Foreman is a challenge in large environments. We could introduce a caching layer so that we don’t have to access the DB so much. (#26923)
Websockets for UI updates without polling can use redis.
A rails cache backend for HA setups.
As a backend for a new background tasks engine for activejob.

There is probably more.
The costs of running (standalone) redis are pretty low.
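
As a rough illustration of the fact caching layer from the list above, here is a minimal read-through cache with an expiry. It is similar in spirit to what `Rails.cache.fetch` with `expires_in` does, but it is a Python sketch with an in-memory dict in place of Redis. `TTLCache`, `load_facts_from_db` and the 300-second lifetime are made-up names and values for illustration.

```python
import time


class TTLCache:
    """Read-through cache: return a fresh cached value or call the loader once."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be shown without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def fetch(self, key, loader):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit, the database is not touched
        value = loader(key)  # cache miss or expired entry
        self._entries[key] = (now + self._ttl, value)
        return value


db_reads = []


def load_facts_from_db(host):
    """Hypothetical stand-in for an expensive fact query against the database."""
    db_reads.append(host)
    return {"host": host, "os": "CentOS 7"}


fake_now = [0.0]
cache = TTLCache(ttl_seconds=300, clock=lambda: fake_now[0])

first = cache.fetch("web01.example.com", load_facts_from_db)   # miss, reads the DB
second = cache.fetch("web01.example.com", load_facts_from_db)  # hit, no DB read
fake_now[0] = 301.0                                             # entry has expired
third = cache.fetch("web01.example.com", load_facts_from_db)   # miss again
```

Three lookups result in two database reads here. How long facts may be served from the cache before they count as stale is a design decision this sketch does not answer.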