Katello Nightly Breakage, Dynflow, Foreman Tasks and Database

Currently Katello nightly is breaking due to the MonitorEventQueue not processing properly which is affecting errata applicability calculations. The error messages I am seeing in the logs are related to database connection issues. I have not dug into this other than some stack overflow posts that seem to indicate Rails 5 and concurrency has issues with database connections. If @iNecas or @aruzicka or anyone else with some knowledge in this area could take a look. I have attempted to up the pool numbers to 20 with the same error output. If I restart dynflowd, the event queue is processed, then enters an error state and processes no more. The CandlepinEventListener shows some of the same errors as well as some of the notification background processing for the UI. Here are some errors I see, these come from:

vagrant up centos7-katello-bats-ci

2018-03-09T03:30:49  [background] [E] could not obtain a connection from the pool within 5.000 seconds (waited 5.003 seconds); all pooled connections were in use (ActiveRecord::ConnectionTimeoutError)
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:202:in `block in wait_poll'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:193:in `loop'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:193:in `wait_poll'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:154:in `internal_poll'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:278:in `internal_poll'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:148:in `block in poll'
 | /opt/rh/rh-ruby24/root/usr/share/ruby/monitor.rb:214:in `mon_synchronize'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:158:in `synchronize'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:148:in `poll'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:747:in `acquire_connection'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:500:in `checkout'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:374:in `connection'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:931:in `retrieve_connection'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_handling.rb:116:in `retrieve_connection'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_handling.rb:88:in `connection'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/associations/join_dependency.rb:96:in `initialize'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/relation/query_methods.rb:1016:in `new'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/relation/query_methods.rb:1016:in `build_join_query'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/relation/query_methods.rb:1003:in `build_joins'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/relation/query_methods.rb:942:in `build_arel'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/relation/query_methods.rb:918:in `arel'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/relation.rb:667:in `exec_queries'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/relation.rb:546:in `load'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/relation.rb:255:in `records'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/relation/delegation.rb:39:in `each'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0/app/services/katello/ui_notifications/pulp/proxy_disk_space.rb:7:in `deliver!'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0/app/jobs/create_pulp_disk_space_notifications.rb:7:in `perform'
2018-03-09T03:30:54  [dynflow] [E] could not obtain a connection from the pool within 5.000 seconds (waited 5.000 seconds); all pooled connections were in use
2018-03-09T03:30:54  [dynflow] [E] /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:202:in `block in wait_poll'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:193:in `loop'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:193:in `wait_poll'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:154:in `internal_poll'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:278:in `internal_poll'
2018-03-09T03:30:54  [dynflow] [W] Error on on_execution_plan_save event
 | ActiveRecord::ConnectionNotEstablished: No connection pool with 'primary' found.
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_adapters/abstract/connection_pool.rb:930:in `retrieve_connection'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_handling.rb:116:in `retrieve_connection'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_handling.rb:88:in `connection'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/transactions.rb:210:in `transaction'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/transactions.rb:381:in `with_transaction_returning_status'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/transactions.rb:313:in `save!'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/suppressor.rb:46:in `save!'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/foreman-tasks-0.12.0/app/models/foreman_tasks/task/dynflow_task.rb:24:in `update_from_dynflow'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/foreman-tasks-0.12.0/lib/foreman_tasks/dynflow/persistence.rb:43:in `on_execution_plan_save'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/foreman-tasks-0.12.0/lib/foreman_tasks/dynflow/persistence.rb:12:in `block in save_execution_plan'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/foreman-tasks-0.12.0/lib/foreman_tasks/dynflow/persistence.rb:10:in `tap'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/foreman-tasks-0.12.0/lib/foreman_tasks/dynflow/persistence.rb:10:in `save_execution_plan'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/persistence.rb:73:in `save_execution_plan'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/execution_plan.rb:423:in `save'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/execution_plan.rb:132:in `update_state'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/director/execution_plan_manager.rb:101:in `finish'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/director/execution_plan_manager.rb:74:in `compute_next_from_step'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/director/execution_plan_manager.rb:42:in `what_is_next'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/director.rb:107:in `work_finished'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/executors/parallel/core.rb:35:in `work_finished'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/actor.rb:6:in `on_message'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/executors/parallel/core.rb:67:in `on_message'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-edge-0.2.4/lib/concurrent/actor/context.rb:46:in `on_envelope'
2018-03-09T03:30:54  [background] [E] ActiveRecord::ConnectionNotEstablished (ActiveRecord::ConnectionNotEstablished)
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/connection_handling.rb:112:in `connection_pool'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activerecord-5.1.4/lib/active_record/query_cache.rb:27:in `run'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/execution_wrapper.rb:25:in `before'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/callbacks.rb:413:in `block in make_lambda'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/callbacks.rb:197:in `block (2 levels) in halting'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/callbacks.rb:601:in `block (2 levels) in default_terminator'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/callbacks.rb:600:in `catch'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/callbacks.rb:600:in `block in default_terminator'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/callbacks.rb:198:in `block in halting'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/callbacks.rb:507:in `block in invoke_before'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/callbacks.rb:507:in `each'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/callbacks.rb:507:in `invoke_before'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/callbacks.rb:130:in `run_callbacks'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/execution_wrapper.rb:108:in `run!'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/execution_wrapper.rb:70:in `block in run!'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/execution_wrapper.rb:67:in `tap'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/execution_wrapper.rb:67:in `run!'
 | /opt/rh/tfm-ror51/root/usr/share/gems/gems/activesupport-5.1.4/lib/active_support/execution_wrapper.rb:83:in `wrap'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/foreman-tasks-0.12.0/app/lib/actions/middleware/rails_executor_wrap.rb:13:in `run'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware/stack.rb:22:in `call'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware/stack.rb:26:in `pass'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware.rb:18:in `pass'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/action/progress.rb:30:in `with_progress_calculation'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/action/progress.rb:16:in `run'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware/stack.rb:22:in `call'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware/stack.rb:26:in `pass'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware.rb:18:in `pass'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0/app/lib/actions/middleware/keep_locale.rb:11:in `block in run'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0/app/lib/actions/middleware/keep_locale.rb:22:in `with_locale'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.7.0/app/lib/actions/middleware/keep_locale.rb:11:in `run'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware/stack.rb:22:in `call'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware/stack.rb:26:in `pass'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware.rb:18:in `pass'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/foreman-tasks-0.12.0/app/lib/actions/middleware/keep_current_user.rb:28:in `block in run'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/foreman-tasks-0.12.0/app/lib/actions/middleware/keep_current_user.rb:51:in `restore_curent_user'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/foreman-tasks-0.12.0/app/lib/actions/middleware/keep_current_user.rb:28:in `run'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware/stack.rb:22:in `call'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware/stack.rb:26:in `pass'
 | /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-0.8.36/lib/dynflow/middleware.rb:18:in `pass'

There hasn’t been any changes in dynflow/tasks recently, that should influence this. We have been doing this for a long time:

This issue usually shows up, when there are some new threads spawned, that use active record object, and don’t call the cleanup methods at the end. I wonder if there is a chance a change like this was done recently in Katello.

While I’m waiting for the reproducer on my setup, do we have a link to a last job that was passing and the first that was failing, to see what were the changes that could influence this (e.g. some change in foreman, katello, Rails?)

Maybe with the recent column type changes a db access error somewhere earlier “eats up” the connections? I think the fix is not in katello’s nightlies yet.

This was the last successful run:

http://ci.theforeman.org/job/katello-nightly-release/90/

We were seeing a similar behavior and failure that we are seeing now a month or so back as well though. At the time, dropping the test and development database deinfitions seemed to fix the issue. But I question if that just got lucky or not.

https://github.com/theforeman/puppet-foreman/commit/afa2392fa1fd1a393345bcae4b58ae7f57c44222

Good thing is I’m able to reproduce it locally. The bad thing is I still don’t know why it happens, but it seems like it has something to do with the active job. Will keep the thread updated.

Ok, so I’ve spent an afternoon trying to nail the issues, and ended up with not one, not two, but three potential bugs causing this issues:

For all of the issues, I’ve opened PRs. Tested partially in my setup, but the system was already quite patched at the end, so I need to respin the environment. Any additional testing welcome as always.

1 Like

Brilliant! Thanks so much for digging into this @iNecas. I will clean room test these patches today.

Success! I ran the following scenario with all tests passing:

  1. Install all packages
  2. Patch dynflow and foreman-tasks
  3. Ran installer
  4. Ran bats tests

ACKs on all PRs for me.

On a dev server using locally modified versions of github, is it just the
server that I need to run or are there some other services that are
impacted by the changes?

@iNecas Sooo, we are seeing these same issues on Katello 3.6 RC2 testing. Could we please get an expedited foreman-tasks and dynflow release for the 1.17 release repositories with these fixes in them?

Here you are:

https://github.com/theforeman/foreman-packaging/pull/2306
https://github.com/theforeman/foreman-packaging/pull/2307

The dynflow 0.8.37, that is released in nightlies should be backward compatible an ready to be pulled to 1.17 to address the issues fixed in that part.

1 Like