Ansible: Always get Failed status for jobs and roles

Problem:
Good day, Guys!

I’ve been using Foreman for a couple of months.
In Foreman I use Ansible roles as well as Ansible playbooks:
=> Host => Schedule remote job => Ansible playbook => Ansible - Run playbook

Each time I do this, the job status shows “Failed”,
even if the Ansible role/playbook itself runs successfully.

Could you please help me solve this issue?

Also, Guys, is there any way to run playbooks that are already present as yml files on the same server (e.g. located in the ~/Ansible/ folder)?

Expected outcome:
Fix the problem with the Ansible job status.

Foreman and Proxy versions:
Version 1.24.3 © 2009-2020 Paul Kelly and Ohad Levy
Instance 1ff98334-1042-49a4-ae5c-dc141c65ccf5

Foreman and Proxy plugin versions:
Ansible - Version 3.0.1
DHCP - Version 1.24.3
Dynflow - Version 0.2.4
HTTPBoot - Version 1.24.3
SSH - Version 0.3.0
TFTP - Version 1.24.3

Distribution and version:
CentOS Linux release 7.8.2003 (Core)

Other relevant data:

Hi,
when you run a playbook, there’s a table with hosts at the bottom of the page. If you click on a host’s name, it will take you to a page showing the output of the job on that host. It is really hard to guess what went wrong just from what you said, but that page should give you more insight into it.

Hi,

According to the job output, everything is OK.
The tasks run correctly, but the job status is always the same…

Is there any way to capture detailed logs for such tasks, so I can investigate and fix this?

Ah, for some reason I thought that playbooks/roles run fine outside of foreman.

In the job details there’s a “job task” button. On the page it takes you to, click “sub tasks”, then pick a single task and look around; hopefully you’ll find something useful there.

On a side note, does it take like 10 minutes for the job to run even if it does almost nothing?

No, the jobs work well. Quite fast.

I will run a simple task and take a screenshot of it. I hope it helps.

Guys,

I want to provide a small update regarding the problem with the Ansible job status:
I have the same problem with Ansible roles as well.

A new Ansible role was created and assigned to a host.
Then, in the host menu, I ran the “Run Ansible Role” command.
In the output window everything is OK (Failed=0, Skipped=0).

But after several minutes, the job status shows Run Ansible Role as Failed.

Also, I get the following message:
Failed to initialize: NoMethodError - undefined method `’ for nil:NilClass
There is also a bunch of errors in the job status.

Could you please help me solve this problem?

Any more details around this? A stacktrace maybe? Where does it come from?

The other screenshots don’t really tell us anything, as this error is exactly the same every time running something on any host in the job fails.

Unfortunately, I don’t quite understand.
I go to Monitor => Jobs => click the job description => click the hostname (see below):

Also, I’ve found a description of a similar problem here:
https://projects.theforeman.org/issues/29028

According to this issue, the problem was solved in foreman-tasks-1.1.0.
If I understand correctly, that version is for Foreman 2.0+, but I use 1.24.3 and have a different version of foreman-tasks…

From the issue you linked:

  1. About 10 minutes later, the RunHostJob should failed with the above error.

From what you stated earlier:

No, the jobs work well. Quite fast.

So which is it?

I go to Monitor => Jobs => click the job description => click the hostname (see below):

In there, press task details, then press dynflow console and try clicking around in there. Also, production.log from around that time could be useful.

In my case, the role was applied quickly (I saw the progress in the output window),
but yes, the job status information updates slowly.

Dynflow - see below:

Sure, I will check production.log.

Ah, so that’s what I originally had in mind.

It is usually a misconfiguration, either a wrong hostname or wrong SSL certs in /etc/smart_proxy_dynflow_core/settings.yml.

What is happening is:

  1. A job is run
  2. Job gets delegated to the smart proxy and smart proxy dynflow core
  3. Smart proxy dynflow core runs the job (runs the actual ansible command)
  4. When the job is done, smart proxy dynflow core tries to call back to foreman and this request fails
  5. After 10 or so minutes, foreman checks the status on the smart proxy, sees the task there is failed and fails the job

Please note that the issue you linked is only a symptom. Even if you had that patch, the jobs would still fail at step 4; you would just get a different error after 10 minutes.

Also, isn’t 1.24.3 already EOL?

It looks like I’ve found one issue.
Smart-Proxy tries to connect to localhost:3000, but gets “Connection refused”… simply because nothing is listening on this port:

Failed to open TCP connection to kh0dl1000000075.dtc.dish.corp:3000 (Connection refused - connect(2) for “kh0dl1000000075.dtc.dish.corp” port 3000) (Errno::ECONNREFUSED)

netstat -ant | grep 3000
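
For reference, the same check can be done from Python. This is only a minimal sketch, not a Foreman tool: the helper name `port_is_open` is made up for illustration, and all it tells you is whether something accepts TCP connections on the given host and port.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        # create_connection raises OSError (for example
        # ConnectionRefusedError) when nothing is listening
        # or the host cannot be reached.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

With the "Connection refused" error shown above, a call such as `port_is_open("localhost", 3000)` on the Foreman host would be expected to return False.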

Could you please tell me which service should listen on port :3000 and how to enable it?

I’ve found that two services are disabled:

  • foreman-cockpit.service
  • foreman.service

As I understand it, Foreman-Cockpit listens on :3000, and it was enabled.
Do I need to enable foreman.service as well?

The problem with port :3000 was solved.
But there is still no progress with the job status.
Currently I have the following error messages:

  • 403 Forbidden (RestClient::Forbidden) - during the following:
    127.0.0.1 - - [01/Oct/2020:20:30:46 EEST] “GET /tasks/542d07ff-26a8-4a10-861c-4465f9bde13c/status? HTTP/1.1” 200 7316
    403 Forbidden (RestClient::Forbidden)

The problem with port :3000 was solved.

How?

Smart-Proxy tries to connect to localhost:3000, but gets “Connection refused”… simply because nothing is listening on this port:

localhost:3000 is the default; most likely it is not configured to fit your environment. If it is a production deployment, then it should be the FQDN of the Foreman machine and port 443.
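
A quick way to spot this is to look at the URL that smart proxy dynflow core reports back to. The sketch below assumes the value is stored under a `:foreman_url:` key in /etc/smart_proxy_dynflow_core/settings.yml; that key name is an assumption, so check your own file. The helper `check_foreman_url` is a made-up name that simply flags values still pointing at localhost or not using https.

```python
import re
from urllib.parse import urlparse

# Assumption: the callback URL is stored under a ':foreman_url:' key.
# Commented-out lines (starting with '#') are ignored by the pattern.
FOREMAN_URL_RE = re.compile(
    r"^\s*:foreman_url:\s*['\"]?([^'\"\s#]+)", re.MULTILINE
)
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def check_foreman_url(settings_text):
    """Return (url, warning); warning is None when nothing looks off."""
    match = FOREMAN_URL_RE.search(settings_text)
    if match is None:
        return None, "no :foreman_url: entry found"
    url = match.group(1)
    parsed = urlparse(url)
    if parsed.hostname in LOCAL_HOSTS:
        return url, "points at localhost; expected the Foreman FQDN"
    if parsed.scheme != "https":
        return url, "not https; a production Foreman is expected on port 443"
    return url, None
```

To use it, read the settings file into a string and pass it in, for example `check_foreman_url(open("/etc/smart_proxy_dynflow_core/settings.yml").read())`. It does not validate the SSL certificate settings, which are the other usual suspect.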

Currently I have the following error messages:

Where?

Good day!

  1. The problem with port :3000 was solved by starting foreman.service:
    systemctl enable --now foreman.service
  2. I see “403 Forbidden (RestClient::Forbidden)” in the following log:
    /var/log/foreman-proxy/smart_proxy_dynflow_core.log
  3. I will try updating /etc/smart_proxy_dynflow_core/settings.yml with FQDN:443 and test again.

I’ve changed the settings in /etc/smart_proxy_dynflow_core/settings.yml to FQDN:443 and ran the task again.
Now I get the following in /var/log/foreman-proxy/smart_proxy_dynflow_core.log right after the task starts:
127.0.0.1 - - [02/Oct/2020:11:58:25 EEST] “POST /tasks/launch? HTTP/1.1” 200 110
127.0.0.1 - - [02/Oct/2020:11:58:26 EEST] “GET /tasks/7a857992-c4e4-4255-a763-f76c2e106f83/status? HTTP/1.1” 200 6216
127.0.0.1 - - [02/Oct/2020:11:58:26 EEST] “GET /tasks/7a857992-c4e4-4255-a763-f76c2e106f83/status? HTTP/1.1” 200 6216
127.0.0.1 - - [02/Oct/2020:11:58:28 EEST] “GET /tasks/7a857992-c4e4-4255-a763-f76c2e106f83/status? HTTP/1.1” 200 6663
127.0.0.1 - - [02/Oct/2020:11:58:29 EEST] “GET /tasks/7a857992-c4e4-4255-a763-f76c2e106f83/status? HTTP/1.1” 200 6768
127.0.0.1 - - [02/Oct/2020:11:58:30 EEST] “GET /tasks/7a857992-c4e4-4255-a763-f76c2e106f83/status? HTTP/1.1” 200 7158
127.0.0.1 - - [02/Oct/2020:11:58:31 EEST] “GET /tasks/7a857992-c4e4-4255-a763-f76c2e106f83/status? HTTP/1.1” 200 7307
127.0.0.1 - - [02/Oct/2020:11:58:33 EEST] “GET /tasks/7a857992-c4e4-4255-a763-f76c2e106f83/status? HTTP/1.1” 200 7307
127.0.0.1 - - [02/Oct/2020:11:58:34 EEST] “GET /tasks/7a857992-c4e4-4255-a763-f76c2e106f83/status? HTTP/1.1” 200 7307
400 Bad Request (RestClient::BadRequest)
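
When the log gets long, it can help to separate the request entries from everything else. The sketch below is only an illustration: the pattern is inferred from the lines pasted above, not from any documentation of the log format, and `split_log` is a made-up helper name. In the excerpt above, every request entry has status 200, and the RestClient error is the only line that is not a request entry.

```python
import re

# Matches request entries such as:
# 127.0.0.1 - - [02/Oct/2020:11:58:25 EEST] "POST /tasks/launch? HTTP/1.1" 200 110
# Curly quotes are accepted too, since pasted forum text often contains them.
ACCESS_RE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'["“](?P<method>[A-Z]+) (?P<path>\S+) [^"”]*["”] '
    r'(?P<status>\d{3}) (?P<size>\d+)$'
)


def split_log(lines):
    """Separate parsed request entries from all other lines (e.g. errors)."""
    requests, other = [], []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        match = ACCESS_RE.match(line)
        if match:
            requests.append(match.groupdict())
        else:
            other.append(line)
    return requests, other
```

Calling `split_log(open(path))` on the log file returns the request entries as dictionaries (client, time, method, path, status, size) plus a list of the remaining lines, which is where messages like the RestClient errors end up.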

On 1.24 you shouldn’t have that running.

Guess you’ll have to check production.log, it might tell you what was wrong with the request.

Also, how did you deploy this instance? Usually the installer sets everything up so it works out of the box.