May 28, 2020, 1:39pm
This is now the third time in the last couple of days that two or more repository sync tasks have hung and never finished. It affects different repositories at different times; currently it’s EPEL 8 and CentOS 7 Plus. Syncing worked fine many times before that. After a reboot or a restart of katello, a manual sync works fine.
Tasks are pending, delayed.
Running action is Actions::Pulp::Repository::Sync in state suspended.
Dynflow console says “waiting for Pulp to finish the task”.
It will stay like this until a restart of katello/foreman.
How do I fix this? Thx
Expected outcome: the sync finishes within seconds or minutes, as usual.
Foreman and Proxy versions:
Foreman and Proxy plugin versions:
Distribution and version:
May 30, 2020, 5:12am
It’s really bad: it takes 1-2 days after a restart of katello until the next repo sync hangs. That means the sync plan hangs. That means it doesn’t sync any of those repositories. That task just sits there “running”…
June 2, 2020, 2:50pm
Does anyone have any insight into how to troubleshoot this issue?
Can you share the foreman-debug output when you get this error? Do you see any traces in
June 2, 2020, 4:35pm
I see one error in production.log at about the time when I would expect the sync task to finish:
2020-05-30T06:54:34 [E|dyn|] PersistenceError in executor
2020-05-30T06:54:34 [E|dyn|] caused by Sequel::PoolTimeout: timeout: 5.0, elapsed: 5.000103840007796 (Dynflow::Errors::PersistenceError)
…:in `raise_pool_timeout'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/connection_pool/threaded.rb:149:in `acquire'
…:in `hold'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/database/connecting.rb:269:in `synchronize'
…:in `literal_string_append'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/dataset/sql.rb:82:in `literal_append'
…:in `complex_expression_sql_append'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/adapters/shared/postgres.rb:1320:in `complex_expression_sql_append'
…:in `to_s_append'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/dataset/sql.rb:1161:in `literal_expression_append'
…:in `literal_append'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/dataset/sql.rb:414:in `block in complex_expression_sql_append'
…:in `each'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/dataset/sql.rb:412:in `complex_expression_sql_append'
…:in `complex_expression_sql_append'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/sql.rb:96:in `to_s_append'
…:in `literal_expression_append'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/dataset/sql.rb:89:in `literal_append'
…:in `block in sql'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/dataset/placeholder_literalizer.rb:173:in `each'
…:in `sql'
/opt/theforeman/tfm/root/usr/share/gems/gems/sequel-5.7.1/lib/sequel/dataset/placeholder_literalizer.rb:159:in `first'
…:in `first'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.3/lib/dynflow/persistence_adapters/sequel.rb:312:in `block in load_record'
…:in `with_retry'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.3/lib/dynflow/persistence_adapters/sequel.rb:312:in `load_record'
…:in `load_action'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.3/lib/dynflow/persistence.rb:22:in `load_action'
…:in `open_action'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.3/lib/dynflow/execution_plan/steps/abstract_flow_step.rb:16:in `execute'
…:in `execute'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.3/lib/dynflow/executors/sidekiq/worker_jobs.rb:11:in `block (2 levels) in perform'
…:in `run_user_code'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.3/lib/dynflow/executors/sidekiq/worker_jobs.rb:9:in `block in perform'
…:in `with_telemetry'
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.3/lib/dynflow/executors/sidekiq/worker_jobs.rb:8:in `perform'
…:in `perform'
[ sidekiq ]
[ concurrent-ruby ]
/opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.3/lib/dynflow/executors/sidekiq/orchestrator_jobs.rb:40:in `perform'
[ sidekiq ]
[ concurrent-ruby ]
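For readers unfamiliar with the error: the `Sequel::PoolTimeout` line reports that a worker asked the connection pool for a database connection, none was free, and the wait gave up after the configured timeout (5.0 seconds here). The following is a minimal Python sketch of that behaviour. It is not Sequel's or Dynflow's code; `TinyPool`, `PoolTimeout` and `demo` are hypothetical names, and the pool size of 2 and the 0.2-second timeout are assumptions chosen only to keep the demo fast.

```python
import queue
import time


class PoolTimeout(Exception):
    """Raised when no connection becomes free within the timeout."""


class TinyPool:
    """Toy fixed-size connection pool (illustration only, not Sequel's code)."""

    def __init__(self, size, timeout):
        self.timeout = timeout
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")  # stand-ins for real DB connections

    def acquire(self):
        start = time.monotonic()
        try:
            # Block until a connection is returned or the timeout expires.
            return self._free.get(timeout=self.timeout)
        except queue.Empty:
            elapsed = time.monotonic() - start
            raise PoolTimeout(
                f"timeout: {self.timeout}, elapsed: {elapsed}"
            ) from None

    def release(self, conn):
        self._free.put(conn)


def demo():
    pool = TinyPool(size=2, timeout=0.2)
    held = [pool.acquire(), pool.acquire()]  # two workers hold both connections
    try:
        pool.acquire()  # a third worker has to wait and eventually gives up
    except PoolTimeout as exc:
        outcome = str(exc)
    else:
        outcome = "acquired"
    for conn in held:
        pool.release(conn)  # once connections come back, acquiring works again
    return outcome, pool.acquire()


if __name__ == "__main__":
    print(demo())
```

In this toy model the third `acquire` fails only because every connection is checked out for longer than the timeout; a larger pool or shorter hold times would let it succeed.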
Hmm. Not enough context to troubleshoot one way or the other. Could you email me the foreman debug output (
June 18, 2020, 5:21am
It happened again yesterday. I looked through the logs and the journal, tracked the progress of the yum importer (which seems to be the hanging step in dynflow), and compared it to a successful task: it actually succeeds and also seems to start the repo.publish.publish directly after that, which also succeeds. The hanging task has actually finished successfully, but dynflow/foreman doesn’t seem to pick that up…
Try increasing the db-pool size. We’ve been noticing some issues with the
June 22, 2020, 8:38am
So far no hanging tasks. Of course, I can’t be sure as I don’t have a way to force the issue.
And even if increasing the db pool helps for now, I think the whole problem, whenever it occurs, should be handled better than by leaving a task hanging that blocks the sync plan and that cannot be cancelled in the web GUI, only by a foreman/katello restart…
There were some fixes around the connection pool handling which went into dynflow 1.4.4 (2.0 repos currently have 1.4.3). We could probably get the newer version into 2.0 repos as well.
June 24, 2020, 4:59am
It happened again yesterday. I have increased the db pool to 15. Yet, the sync of one repository is hanging again.
Can you send me the
journalctl -u dynflow-sidekiq@*
output the next time you hit this issue?
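When collecting logs for a report like this, it can help to list every pool timeout in production.log so the timestamps can be compared with the hanging sync tasks. Below is a small Python sketch that does this. It assumes the line format matches the two `[E|dyn|]` lines quoted earlier in this thread; `find_pool_timeouts` is a hypothetical helper name, and the log path is whatever is passed on the command line (on a default install this is typically /var/log/foreman/production.log, which is an assumption, not something stated in the thread).

```python
import re
import sys

# Matches lines such as:
# 2020-05-30T06:54:34 [E|dyn|] caused by Sequel::PoolTimeout: timeout: 5.0, elapsed: 5.0001 (...)
POOL_TIMEOUT = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}) \[E\|dyn\|\] .*"
    r"Sequel::PoolTimeout: timeout: (?P<timeout>[\d.]+), elapsed: (?P<elapsed>[\d.]+)"
)


def find_pool_timeouts(lines):
    """Return (timestamp, timeout, elapsed) for every pool-timeout log line."""
    hits = []
    for line in lines:
        match = POOL_TIMEOUT.search(line)
        if match:
            hits.append(
                (match["ts"], float(match["timeout"]), float(match["elapsed"]))
            )
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_pool_timeouts.py <path to production.log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for ts, timeout, elapsed in find_pool_timeouts(log):
            print(f"{ts} pool timeout after {elapsed:.3f}s (limit {timeout}s)")
```

The script only reads the log file; it does not touch Foreman, Dynflow, or the database.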
July 14, 2020, 4:59am
After a couple of days without issues, it is now happening every day again. As of tonight there are even two sync tasks hanging…
Apologies for the delay in my response. Can you send me the logs again? I think the ones you uploaded to Nextcloud have expired.
July 26, 2020, 5:08am
Do you have any insight into this issue? It is getting really annoying. I have to restart foreman almost every day now… -Gerald
Would you be able to try to upgrade your dynflow package:
yum update http://koji.katello.org/kojifiles/packages/tfm-rubygem-dynflow/1.4.6/1.fm2_0.el7/noarch/tfm-rubygem-dynflow-1.4.6-1.fm2_0.el7.noarch.rpm
This is the build that @aruzicka mentioned previously.
We rebuilt the dynflow package because I think there were some bugs with respect to pool timeouts (especially yours). Could you upgrade the dynflow package and then restart foreman-tasks and httpd?
July 29, 2020, 4:46am
O.K. I have just installed the new version. I’ll check if it happens again or not… Thanks.
I’ve been facing the same problem lately. After upgrading to dynflow 1.4.6 everything seems OK and all sync tasks are running again without problems. I will keep an eye on repo sync for the next few days to see whether the hanging problem appears again.
July 30, 2020, 4:57am
The new version doesn’t seem to help: my daily sync tonight has four hanging sync tasks. Never had that many before…