REX fails randomly on random servers with error "Could not establish connection to remote host using any available authentication method, tried password, publickey"

Problem: When running REX jobs on a group of servers, some fail with the error below while others succeed. If I rerun the job on all of the servers, some of the previously failed servers connect and some of the servers that connected previously fail. It is random, with no pattern I can see.

Error:

Error initializing command: RuntimeError - Could not establish connection to remote host using any available authentication method, tried password, publickey
Exit status: EXCEPTION
StandardError: Job execution failed

Here we can see the failed connection:

Then, if I rerun the job, it connects, even though I have made no changes:

Expected outcome: REX works every time

Foreman and Proxy versions:
3.5.3

Foreman and Proxy plugin versions:
Foreman 3.5.3, Katello 4.7.6

Distribution and version:
Rocky 8

Other relevant data:
I have cleared the entire known hosts file /var/lib/foreman-proxy/ssh/known_hosts in the hope that the cause was a mismatch with the IP or hostname, but this did not fix the randomness of the error. The REX connection settings are set to use the IP address as the connection method.

That’s somewhat odd, what do the logs on the client (sshd logs, /var/log/secure maybe) say?
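To make the sshd log easier to read across several job runs, here is a minimal Python sketch that tallies which authentication method sshd accepted or rejected for one user. It assumes the usual OpenSSH "Accepted <method> for <user> from <ip>" and "Failed <method> for <user> from <ip>" log lines; the function names, the default path /var/log/secure and the default user ansible are illustrative assumptions, and reading the log normally needs root.

```python
import re
from collections import Counter

# Assumed OpenSSH log line shapes, for example:
#   "... sshd[1234]: Accepted password for ansible from 10.0.0.5 port 51234 ssh2"
#   "... sshd[1234]: Failed publickey for ansible from 10.0.0.5 port 51234 ssh2: ..."
AUTH_RE = re.compile(
    r"sshd\[\d+\]: (?P<result>Accepted|Failed) (?P<method>\S+) for "
    r"(?:invalid user )?(?P<user>\S+) from (?P<ip>\S+)"
)


def summarize_auth(lines, user):
    """Count (result, method) pairs logged by sshd for the given user."""
    counts = Counter()
    for line in lines:
        match = AUTH_RE.search(line)
        if match and match.group("user") == user:
            counts[(match.group("result"), match.group("method"))] += 1
    return counts


def print_summary(path="/var/log/secure", user="ansible"):
    """Print one line per (result, method) pair found in the log file."""
    with open(path, errors="replace") as log:
        counts = summarize_auth(log, user)
    for (result, method), count in sorted(counts.items()):
        print(f"{result:8} {method:12} {count}")
```

Calling print_summary() on a client after a failed run and again after a successful run shows whether the successful logins were accepted with password or with publickey.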

Hi,

On the server side I can see that it fails authentication. We use ansible as the SSH user, and the SSH keys are stored on our IdM server.

This is when the job ran and failed on this server:

Then it was rerun and it connected:

If I’m reading that right, the one that succeeded got in with a password?

You know, that is correct. It is authenticating with a password, when it should be using the SSH key. Let me take a deeper look into why.
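One way to check the key on its own is to force a key-only login from the proxy, so that a password fallback cannot hide a publickey failure. The Python sketch below only builds and prints such an ssh command. The ssh options used are standard OpenSSH client options, but the function name, the key path /var/lib/foreman-proxy/ssh/id_rsa_foreman_proxy and the address 192.0.2.10 are assumptions to adjust for the actual setup.

```python
import shlex


def key_only_ssh_command(
    host,
    user="ansible",
    # Assumed location of the proxy's REX key; adjust if yours differs.
    key_path="/var/lib/foreman-proxy/ssh/id_rsa_foreman_proxy",
):
    """Build an ssh command that may only authenticate with the given key."""
    return [
        "ssh", "-v",
        "-i", key_path,
        "-o", "IdentitiesOnly=yes",                  # offer only this key
        "-o", "PreferredAuthentications=publickey",  # do not try password
        "-o", "PasswordAuthentication=no",
        "-o", "BatchMode=yes",                       # fail instead of prompting
        f"{user}@{host}",
        "true",                                      # harmless remote command
    ]


# 192.0.2.10 is a placeholder address, not a real host from this thread.
print(shlex.join(key_only_ssh_command("192.0.2.10")))
```

Running the printed command as the foreman-proxy user against a host that failed should show in the -v output whether the key is offered and whether the server rejects it.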