Dynflow Orchestrator Will Not Become Active

Thanks for the reply @aruzicka - was hoping to hear from you :slight_smile:

So after 12 painful hours yesterday trying to get this working, I gave up and went to bed. I came back this morning and everything is working. We have seen this with Dynflow many, many times: we wait 5 mins, 20 mins, 2 hours, or some other random amount of time, and then things all of a sudden just work. So this raises the question: is there some sort of strange wait or delay? In this case, based on my logs, it looks like everything started working 16 mins after my last FLUSH/restart of all services.

> Do you have any notion how you got into this situation?

Yep, I know exactly how we got there:

Nov 09 06:08:27 10-222-206-158.ssnc-corp.cloud dynflow-sidekiq@orchestrator[1568914]: E, [2025-11-09T06:08:27.640064 #1568914] ERROR -- /connector-database-core: Receiving envelopes failed on PG::UnableToSend: no connection to the server

Nov 09 06:03:43 10-222-206-158.ssnc-corp.cloud dynflow-sidekiq@orchestrator[1568914]: 2025-11-09T06:03:43.139Z pid=1568914 tid=xnaa WARN: ActiveRecord::StatementInvalid: PG::ConnectionBad: PQconsumeInput() SSL connection has been closed unexpectedly
Nov 09 06:03:43 10-222-206-158.ssnc-corp.cloud dynflow-sidekiq@orchestrator[1568914]: 2025-11-09T06:03:43.139Z pid=1568914 tid=xnaa WARN: /usr/share/gems/gems/activerecord-7.0.8.7/lib/active_record/connection_adapters/postgresql/database_statements.rb:48:in `exec'

We had a network outage that lasted about 4 mins, which took down the connection from the Foreman servers to the Postgres k8s cluster.

Thanks again for the reply! Time for me to write up a simple service that reads the logs, looks for a certain state, and restarts the orch/workers.
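For anyone who wants to try the same thing, here is a rough sketch of what I have in mind, in Python. The two error strings and the `dynflow-sidekiq@orchestrator` unit name come straight from the logs above; the worker unit name, the 5 minute window, and the function names are my own assumptions and will need adjusting per install. I also have not confirmed that a restart reliably unsticks the orchestrator, so treat this as a starting point, not a fix.

```python
import re
import subprocess

# Error signatures copied from the log excerpts in this post.
ERROR_PATTERNS = [
    re.compile(r"PG::UnableToSend"),
    re.compile(r"PG::ConnectionBad"),
]

# Unit name as it appears in the journal output above.
ORCHESTRATOR_UNIT = "dynflow-sidekiq@orchestrator"
# Assumption: worker unit names differ between installs; edit this list.
WORKER_UNITS = ["dynflow-sidekiq@worker-1"]


def needs_restart(lines):
    """Return True if any log line contains a known DB connection error."""
    return any(p.search(line) for line in lines for p in ERROR_PATTERNS)


def recent_log_lines(unit, since="5 minutes ago"):
    """Read recent journal messages for a systemd unit via journalctl."""
    result = subprocess.run(
        ["journalctl", "-u", unit, "--since", since, "--no-pager", "-o", "cat"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


def check_and_restart():
    """Restart the orchestrator, then the workers, if recent errors were logged."""
    if not needs_restart(recent_log_lines(ORCHESTRATOR_UNIT)):
        return False
    for unit in [ORCHESTRATOR_UNIT, *WORKER_UNITS]:
        subprocess.run(["systemctl", "restart", unit], check=True)
    return True
```

The idea would be to call `check_and_restart()` from a systemd timer or cron job every few minutes, with the `since` window matched to that interval so old errors do not trigger repeated restarts.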