Problem: When I run REX jobs on a group of servers, some fail with an error while others succeed. If I rerun the job on all of the servers, some of the previously failed servers connect and some of the servers that previously connected fail. The failures appear random, with no pattern.
Error:
1:
Error initializing command: RuntimeError - Could not establish connection to remote host using any available authentication method, tried password, publickey
2:
Exit status: EXCEPTION
3:
StandardError: Job execution failed
Here we can see the failed connection:
Then, if I rerun the job, it connects, even though I have made no changes:
Expected outcome: REX works every time
Foreman and Proxy versions:
3.5.3
Foreman and Proxy plugin versions:
Foreman 3.5.3, Katello 4.7.6
Distribution and version:
Rocky 8
Other relevant data:
I cleared the entire known hosts file, /var/lib/foreman-proxy/ssh/known_hosts, in case there was a mismatch with an IP or hostname, but this did not resolve the random failures. REX is configured to use the IP address as the connection method.
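One way to narrow this down might be to check whether plain TCP reachability of the SSH port is also flaky when tested from the proxy, independent of REX. Below is a minimal sketch, assuming Python 3 on the proxy and SSH on port 22. `tcp_probe` and `classify_hosts` are hypothetical helper names, not part of Foreman or REX, and this only tests that the port opens, not authentication.

```python
import socket
from collections import Counter


def tcp_probe(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port opens within timeout.

    Port 22 is an assumption; adjust it if sshd listens elsewhere.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def classify_hosts(hosts, rounds=5, probe=tcp_probe):
    """Probe every host `rounds` times and label each one by its failure count."""
    failures = Counter()
    for _ in range(rounds):
        for host in hosts:
            if not probe(host):
                failures[host] += 1

    labels = {}
    for host in hosts:
        if failures[host] == 0:
            labels[host] = "stable"
        elif failures[host] == rounds:
            labels[host] = "always failing"
        else:
            labels[host] = "intermittent"
    return labels
```

Calling `classify_hosts` with the list of IPs that the REX job targets returns a dict mapping each host to a label. If every host comes back "stable" while REX still fails at random, the network path is probably not the cause and the problem is more likely in the SSH session setup or authentication step.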