Understanding Katello event listeners

lzap · December 3, 2021, 11:00am

Hey,

as I was digging into increased memory usage for my VM and I need help to explain a design of Katello event listeners.

I am trying to understand how these Event Monitor and Event Daemon threads are supposed to work in production and what they do. There is a global (singleton) registry that is created and initialized in config.to_prepare block.

      Katello::EventDaemon::Runner.initialize
      Katello::EventDaemon::Runner.register_service(:candlepin_events, Katello::CandlepinEventListener)
      Katello::EventDaemon::Runner.register_service(:katello_events, Katello::EventMonitor::PollerThread)
      Katello::EventDaemon::Runner.register_service(:katello_agent_events, Katello::EventDaemon::Services::AgentEventReceiver) if ::Katello.with_katello_agent?

Let me explain my concerns. I noticed that these services/runners/monitors spawn several threads:

[lzap@nuc katello]$ ag Thread.new
app/services/katello/event_monitor/poller_thread.rb
86:        @thread = Thread.new do

app/lib/katello/event_daemon/runner.rb
77:          @monitor_thread = Thread.new do

app/lib/katello/event_daemon/services/agent_event_receiver.rb
33:          @thread = Thread.new do

Everytime I see a new thread in a web application, it rings a bell for me. This is not something that should be done, web application servers are not only designed with concurrency model that usually does not count with additional threads, but more importantly for Ruby environment: we run multiple instances (processes) of the application.

I do not understand Ruby on Rails initialization and Puma well enough to see where the thread should end up. But it is obviously not deterministic and I fail to understand what is the intended deployment behavior of these services. Are these threads meant to run in the master process (the one from workers are forked)? Or all of the workers?

See, in most operating systems including Linux, when a process with multiple threads is forked, only the thread that called fork() carries over to the child process. All the data created by other threads do carry over too, however, they are no longer active and will never run. So having this in the master process could work, however, data created by the threads would be wasted. But this is not the case as you will see.

Worse scenario would be that all workers will spawn these threads and only one will “win” the PID file writing “fight”. So it might appear that only one process handles the events while actually all of them would be processing the requests (or queues or whatever these are doing). If this was the intended design, then we should probably write PIDs in a correct way, but it feels weird.

This is my instance after restart:

foreman   9282  0.0  2.9 947964 293884 ?       Ssl  Dec02   0:49 puma 5.3.2 (unix:///run/foreman.sock) [foreman]
foreman   9308  0.0  4.6 992672 463960 ?       Sl   Dec02   0:42 puma: cluster worker 0: 9282 [foreman]
foreman   9314  0.1  5.3 1144712 537400 ?      Sl   Dec02   1:23 puma: cluster worker 1: 9282 [foreman]

What we see is puma app server with two worker processes. So far so good. Now, let’s find out which process is handling those events:

[root@zzzap ~]# foreman-rake console
Loading production environment (Rails 6.0.3.7)
irb(main):001:0> Katello::EventDaemon::Runner.pid
=> 9314

That is puma cluster worker 1, that feels incorrect. I would either expect this to be the master process or (preferably) a totally separate process that is not a webserver (sidekiq perhaps). Now, the problem why this is I think incorrect is that Puma can possibly recycle worker processes (during rolling restart). We do not use this as it needs some further configuration changes and luckily Puma currently does not support recycling processes after X requests like Passenger does, so worker processes can be only recycled manually. Let’s test what happens:

[root@zzzap ~]# kill 9314

[root@zzzap ~]# ps axuwww | grep puma
foreman   9282  0.0  2.9 947964 293884 ?       Ssl  Dec02   0:49 puma 5.3.2 (unix:///run/foreman.sock) [foreman]
foreman   9308  0.0  4.6 992672 463960 ?       Sl   Dec02   0:42 puma: cluster worker 0: 9282 [foreman]
foreman  24466  0.0  4.4 976840 449840 ?       Sl   11:24   0:00 puma: cluster worker 1: 9282 [foreman]

[root@zzzap ~]# foreman-rake console
Loading production environment (Rails 6.0.3.7)
irb(main):002:0> Katello::EventDaemon::Runner.pid
=> nil

As you can see, puma immediately restarted worker 1, but Katello event daemon is gone. There is a Katello event daemon runner class which is some kind of weird monitor that monitors the event daemon thread. I am not sure what this is supposed to do, both event daemon and event monitor share the same process, so if the process is killed there will be no monitor to resume anything. And monitoring threads is weird, thread can only terminate when it ends the block or exception is raised that is not cought. Both can be handled via code pretty easily.

As I said, this behavior is not deterministic, there is a report (Katello 4.2.x repository synchronisation issues - #32 by John_Beranek) when worker number 6 is the one that runs the events out of 12 workers. In my case it was always the second worker.

This design is not only bad from the administrative perspective, this cannot scale well. Puma effectively uses round-robin to distribute the load on the workers, since one of the workers need to handle those Katello events, it can be overloaded causing slower processing of events.

So my question and the purpose of this post is: What are these event listeners doing and can we move them out of the webserver process into a separate process(es)?

ekohl · December 6, 2021, 12:27pm

I certainly agree that this is much more like a backend task. For us, sidekiq is mostly filling that role and Puma is really only intended to serve the frontend. On the other hand, sidekiq is also more task based than a continuous process. This has also been on my mind, but I haven’t found any time to dive into this.

You’re also right that in the future this may be an even bigger issue. With the latest version Puma it is possible to fork off worker 0 which could then lead to increased memory usage for us. Today we don’t use the fork_worker directive, but this has been on my mind.

Looking at the documentation, in the short term the best place is probably before_fork but this would mandate the use of clustered mode.

I have not looked at Sidekiq’s options in this area.

lzap · December 6, 2021, 1:28pm

Yeah, I think even a rake task could do the job, the implementation looks like some kind of background listener. So this would be simply a rake task managed by systemd which can offer a lot of services like watchdog, logging, environment setting or even configuring multiple instances (if needed).

ekohl · December 6, 2021, 4:02pm

I’m a bit hesitant to introduce yet another daemon. We do have an orchestrator in Sidekiq. Perhaps we can somehow design something to combine the two to reduce memory overhead.

On the other hand, I think I’ve heard @aruzicka mention he’d like to avoid loading Rails in the orchestrator if possible.

It’s also a risk to introduce a bottleneck in the orchestrator. Maybe that can be reduced by forking off support services. Essentially making a special entry point that forks off a number of constant background workers where the orchestrator is just one.

In case it wasn’t clear: Katello’s event listener listens to Candlepin’s events. These are needed to keep Katello and Candlepin in sync. In the past this was implemented in a Dynflow job but it was changed to a daemon. In https://github.com/Katello/katello/pull/8366 there are a bunch of considerations. I’m sure @Jonathon_Turel can enlighten you if you have more questions.

Jonathon_Turel · December 6, 2021, 5:31pm

Hello! I had a hand in the design you’re speaking about. Let me start by explaining a bit about the EventDaemon::Runner and EventDaemon::Monitor:

EventDaemon::Runner is invoked by a request middleware. If the daemon is not running (no PID file written / invalid PID in the file) it starts the Monitor. If the PID file exists, then the fact that the daemon is running is cached for a few minutes. Whichever worker receives the first request will start the system up. If that worker dies another will take its place once the aforementioned cache expires, making it resilient to someone coming along and killing the given process as demonstrated in the first post. Stopping the Runner (at_exit) closes the Monitor thread, stops the listeners, etc.

EventDaemon::Monitor is responsible for starting each of the registered event listeners, periodically checking (and caching) their status, and starting them back up again if necessary. Today there are three listeners - Katello, Candlepin, and Katello Agent.

As far as the design goes with respect to starting threads in a web server processes: it wasn’t my first choice. I did look into a rake task and iirc it used about 1GB of memory which seemed too much for something that may be sitting idle for some period. Not to mention the extra bits and pieces needed to be created to add another standalone service to our deployment. I think we should have fewer than more of those. Since Katello 3.14 (that’s several downstream versions ago) it’s been working very well. To be clear I have no real affinity to the design as it is, but I think changing it requires that the current limits be quantified so that we can fully understand the benefits of doing things differently.

aruzicka · December 7, 2021, 9:01am

That’s right. It should only be concerned with Dynflow’s inner workings and for that it shouldn’t need to have all the user code available. I’d also like to avoid adding not really related stuff into services we already have “just” to save memory.

In general it also feels like we’d be going against the current trends and we would be making things harder for ourselves if we ever decide to have kubernetes as supported deployment platform.

I’m not sure what it exactly does so this might be a silly question, but does it actually need to have the whole application loaded?

lzap · December 7, 2021, 1:29pm

Hey, thanks for elaborate reply.

I had no idea the Runner is being called each request, it makes sense. When I initially tested this and killed the process, I did not wait long enough so the late-initialization code did not kick in.

I see one flaw, but it is minor. Rails cache is not thread safe, the default implementation (we actually use for production) is FileStore. Monitor writes to cache and worker threads read from it, they can actually read non-sense, however, I can only see the only use - in the ping controller. On top of that, status cache entry is probably just a one block on disk which is atomic on Linux. Other than that, everything looks fine when it comes to handling processes and proper shutdown.

My biggest concern is that Puma does not recycle processes at all at the moment. Passenger did this. Therefore one process (selected randomly which is a bit weird - I would expect Puma to start with the first worker) can grow bigger as it will be doing more work and it w on’t be recycled.

I agree that we need to have less components, however, I also like UNIX design - small components each doing small things and doing them well. Okay, this might sound odd in the Foreman with Katello context, but you know what I mean.

Absolutely, now I understand how this works I would only suggest to implement some kind of Puma worker recycling. Personally, I would prefer having this off the web process, but I am not the one who maintains this codebase.

Jonathon_Turel · December 7, 2021, 3:39pm

The listeners require connection to backend services including the DB, and the handling of events hits application code all over so I think that’s a definite ‘yes’

Shouldn’t cause an issue since the writing of the cache only ever happens in one thread, and the cache should have been written by the time it’s read even if the /api/v2/ping is the first request to the server! We did fix a bug there at one point iirc…

At a lower level I’d like to audit the memory consumption of the three event handlers we currently have, to identify any quick wins to reduce memory. We might get a lot of value there - it’s never been closely analyzed. And the good news is that the Katello Agent listener won’t be around for many more releases.

Jonathon_Turel · December 7, 2021, 3:43pm

In the long run, if we found that the current implementation doesn’t scale, extraction into an external service could be considered. One way to avoid loading the whole application would be to write a small app that just receives the events from the various queues. And that app could in turn call a new/special/internal Katello API(s) to handle those events. That new app then becomes very lightweight as it’s really just forwarding the handling to the Rails app. There could even be some sort of intelligent valve which throttles the processing if events of the Rails server is very busy.

ekohl · December 7, 2021, 3:48pm

The Smart Proxy does have facilities to run daemon threads. That could send REST API calls back to Foreman if desired.

Jonathon_Turel · December 7, 2021, 4:07pm

Interesting! Any pointers - docs or code examples from an existing plugin?

ehelms · December 7, 2021, 4:10pm

There is a plugin to Puma that can do worker recycling - Puma Worker Killer. I would encourage carefully reading their README.

ekohl · December 7, 2021, 4:17pm

We have two daemons in the dhcp_isc module:

github.com

theforeman/smart-proxy/blob/a2393b841d32986a6a6e774c078a2f6d6c9b7498/modules/dhcp_isc/dhcp_isc_plugin.rb#L18

    
      
              default_settings :config => '/etc/dhcp/dhcpd.conf', :leases => '/var/lib/dhcpd/dhcpd.leases',
                               :omapi_port => '7911', :blacklist_duration_minutes => 30 * 60
          
          
    requires :dhcp, ::Proxy::VERSION
              validate_readable :config, :leases
          
          
    load_classes ::Proxy::DHCP::ISC::PluginConfiguration
              load_programmable_settings ::Proxy::DHCP::ISC::PluginConfiguration
              load_dependency_injection_wirings ::Proxy::DHCP::ISC::PluginConfiguration
          
          
    start_services :leases_observer, :free_ips
            end
          end

These are defined in load_dependency_injection_wirings (which can actually be a proc in the plugin definition itself):

github.com

theforeman/smart-proxy/blob/a2393b841d32986a6a6e774c078a2f6d6c9b7498/modules/dhcp_isc/configuration_loader.rb#L28

    
      
            ::Proxy::DHCP::SubnetService.new(container.get_dependency(:memory_store),
                                             container.get_dependency(:memory_store), container.get_dependency(:memory_store),
                                             container.get_dependency(:memory_store), container.get_dependency(:memory_store))
          end)
          container.dependency :parser, -> { ::Proxy::DHCP::CommonISC::ConfigurationParser.new }
          container.dependency :service_initialization, -> { ::Proxy::DHCP::CommonISC::IscSubnetServiceInitialization.new(container.get_dependency(:subnet_service), container.get_dependency(:parser)) }
          container.dependency :state_changes_observer, (lambda do
            ::Proxy::DHCP::ISC::IscStateChangesObserver.new(settings[:config], settings[:leases], container.get_dependency(:subnet_service), container.get_dependency(:service_initialization))
          end)
          
          
container.singleton_dependency :free_ips, -> { ::Proxy::DHCP::FreeIps.new(settings[:blacklist_duration_minutes]) }
          
          
if settings[:leases_file_observer] == :inotify_leases_file_observer
            require 'dhcp_isc/inotify_leases_file_observer'
            container.singleton_dependency :leases_observer, (lambda do
              ::Proxy::DHCP::ISC::InotifyLeasesFileObserver.new(container.get_dependency(:state_changes_observer), settings[:leases])
            end)
          elsif settings[:leases_file_observer] == :kqueue_leases_file_observer
            require 'dhcp_isc/kqueue_leases_file_observer'
            container.singleton_dependency :leases_observer, (lambda do
              ::Proxy::DHCP::ISC::KqueueLeasesFileObserver.new(container.get_dependency(:state_changes_observer), settings[:leases])

And the lease observer actually has two different implementations, one for Linux and one for BSD:

github.com

theforeman/smart-proxy/blob/a2393b841d32986a6a6e774c078a2f6d6c9b7498/modules/dhcp_isc/configuration_loader.rb#L30-L40

    
      
          if settings[:leases_file_observer] == :inotify_leases_file_observer
            require 'dhcp_isc/inotify_leases_file_observer'
            container.singleton_dependency :leases_observer, (lambda do
              ::Proxy::DHCP::ISC::InotifyLeasesFileObserver.new(container.get_dependency(:state_changes_observer), settings[:leases])
            end)
          elsif settings[:leases_file_observer] == :kqueue_leases_file_observer
            require 'dhcp_isc/kqueue_leases_file_observer'
            container.singleton_dependency :leases_observer, (lambda do
              ::Proxy::DHCP::ISC::KqueueLeasesFileObserver.new(container.get_dependency(:state_changes_observer), settings[:leases])
            end)
          end

You then have an implementation which has a constructor (as used in the dependency wiring, so you’re free to define it) and start / stop:

github.com

theforeman/smart-proxy/blob/develop/modules/dhcp_isc/inotify_leases_file_observer.rb

require 'rb-inotify'

module ::Proxy::DHCP::ISC
  class InotifyLeasesFileObserver
    include ::Proxy::Log

    attr_reader :observer, :leases_filename

    def initialize(state_changes_observer, leases_path)
      @observer = state_changes_observer
      @leases_filename = File.expand_path(leases_path)
    end

    def monitor_leases
      @notifier = INotify::Notifier.new
      @notifier.watch(File.dirname(leases_filename), :modify, :moved_to) do |event|
        if event.absolute_name == leases_filename
          event.flags.each do |flag|
            case flag
            when :modify

This file has been truncated. show original

lzap · December 7, 2021, 6:42pm

But then it is writing to the cache every 15 seconds as other workers are busy handling other events. Isn’t that a (possible) issue?

I’ve seen that, what about it? We are actively working on finding all the bugs, but I believe proper recycling of worker processes is something that Ruby apps need thanks to the Ruby VM memory fragmentation. Passenger have this feature (recycles after 10000 requests), Puma don’t.

But I was not aiming at this rubygem at all. I was more thinking about its documented phased restarting or re-forking once we upgrade configuration to the fork-worker mode.

I concur. Utilizing Smart Proxies (Capsules) is key in order to achieve a good scalability.