Problem:
Remote execution fails on a single host/task out of the batch of tasks in a job. Smart Proxy and production.log (included below) show “RestClient::Forbidden 403 Forbidden” and “Completed 403 Forbidden” respectively. Only occasionally - once or twice a month.
We have had this issue since we started using Foreman over a year ago and it has continued through at least 1 upgrade. It recently became an actual problem because we had to start using remote execution for an unfortunately tightly orchestrated process that fails if 1 element fails.
Expected outcome:
Remote execution does not fail randomly.
Foreman and Proxy versions:
Foreman 2.3.3-1
Katello 3.18.2-1
Distribution and version:
RHEL 7.9
What I have tried/found so far
I found a number of posts that all reference “RestClient::Forbidden 403 Forbidden” but they seemed to be persistent problems, not intermittent like mine:
https://access.redhat.com/solutions/4819051
https://foreman-users.narkive.com/OyUcEqV3/new-foreman-installation-new-proxy-won-t-connect-to-one-another-why
https://github.com/theforeman/puppet-foreman_proxy/issues/127
The solution in each of these cases was to add any CNAMEs in the ecosystem to the trusted_hosts lists on each proxy. We only have one CNAME in the ecosystem for our main Foreman server. I added the CNAME to :trusted_hosts: in /etc/foreman/settings.yml on our main Foreman server and 3 Smart Proxies.
I thought this fixed the issue, but unfortunately it did not. I also found this post, which again seems like it was persistent, not intermittent:
https://community.theforeman.org/t/remote-execution-failing-with-403-forbidden/19270/4
That post mentioned the Trusted Hosts lists in the UI, which I located at Settings->Authentication->Trusted Hosts. I have added the one CNAME for our main Foreman server there. However, I just did that today and I am not very confident it will correct the issue.
Other relevant data:
Smart Proxy Log: (Snipped, complete log can be supplied)
2021-07-05T10:30:33 a06beb0d [E] <RestClient::Forbidden> 403 Forbidden
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/abstract_response.rb:223:in `exception_with_response'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/abstract_response.rb:103:in `return!'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:809:in `process_result'
/opt/theforeman/tfm/root/usr/share/gems/gems/rest-client-2.0.2/lib/restclient/request.rb:725:in `block in transmit'
/opt/rh/rh-ruby25/root/usr/share/ruby/net/http.rb:910:in `start'
<SNIP>
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.1.6/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:324:in `catch'
/opt/theforeman/tfm/root/usr/share/gems/gems/concurrent-ruby-1.1.6/lib/concurrent-ruby/concurrent/executor/ruby_thread_pool_executor.rb:324:in `block in create_worker'
/opt/theforeman/tfm/root/usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'
production.log:
2021-07-05T10:30:33 [I|app|d60ba5dc] Started POST "/foreman_tasks/api/tasks/callback" for 127.0.0.1 at 2021-07-05 10:30:33 -0500
2021-07-05T10:30:33 [I|app|741b0775] Started POST "/foreman_tasks/api/tasks/callback" for 127.0.0.1 at 2021-07-05 10:30:33 -0500
2021-07-05T10:30:33 [I|app|d60ba5dc] Processing by ForemanTasks::Api::TasksController#callback as */*
2021-07-05T10:30:33 [I|app|d60ba5dc] Parameters: {"callback"=>{"task_id"=>"885ec5d8-b3a2-4185-8407-97aebabb39a6", "step_id"=>3}, "data"=>{"result"=>[{"output_type"=>"stdout", "output"=>"rex login: ", "timestamp"=>1625499032.5666928}, {"output_type"=>"stdout", "output"=>"\r\n", "timestamp"=>1625499033.5729036}, {"output_type"=>"stdout", "output"=>"127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4\r\n::1 localhost localhost.localdomain localhost6 localhost6.localdomain6\r\n\r\n172.18.5.20\tadms0006.tm.dc02.mgicint.net adms0006\r\n", "timestamp"=>1625499033.5731907}], "runner_id"=>"1ae5af2a-fb0b-4543-9136-da7d29031114", "exit_status"=>0}, "task"=>{}}
2021-07-05T10:30:33 [I|app|741b0775] Processing by ForemanTasks::Api::TasksController#callback as */*
2021-07-05T10:30:33 [I|app|741b0775] Parameters: {"callback"=>{"task_id"=>"429e184d-8dc2-4d29-b125-c67c29dbb06f", "step_id"=>3}, "data"=>{"result"=>[{"output_type"=>"stdout", "output"=>"rex login: ", "timestamp"=>1625499032.5674927}, {"output_type"=>"stdout", "output"=>"\r\n", "timestamp"=>1625499033.5762162}, {"output_type"=>"stdout", "output"=>"127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4\r\n::1 localhost localhost.localdomain localhost6 localhost6.localdomain6\r\n\r\n172.18.4.20\tadmt0002.ts.dc02.mgicint.net admt0002\r\n", "timestamp"=>1625499033.5764387}], "runner_id"=>"d73ff876-e6ac-4581-a4e3-c7eb974f01fe", "exit_status"=>0}, "task"=>{}}
2021-07-05T10:30:33 [I|bac|] Task {label: Actions::RemoteExecution::RunHostJob, id: 31e80bfb-73cd-42a9-a8e3-26bfcf427923, execution_plan_id: 63cc422b-96c7-4f5e-ba2f-704637198a18} state changed: planned
2021-07-05T10:30:33 [W|app|741b0775] No SSL cert with CN supplied - request from 172.18.1.64
2021-07-05T10:30:33 [I|app|d60ba5dc] Completed 200 OK in 11ms (Views: 0.2ms | ActiveRecord: 2.2ms | Allocations: 3349)
2021-07-05T10:30:33 [I|app|741b0775] Rendering api/v2/errors/access_denied.json.rabl within api/v2/layouts/error_layout
2021-07-05T10:30:33 [I|app|741b0775] Rendered api/v2/errors/access_denied.json.rabl within api/v2/layouts/error_layout (Duration: 1.2ms | Allocations: 332)
2021-07-05T10:30:33 [I|app|741b0775] Filter chain halted as #<Proc:0x000000000a9de6f8@/usr/share/foreman/app/controllers/concerns/foreman/controller/smart_proxy_auth.rb:14> rendered or redirected
2021-07-05T10:30:33 [I|app|741b0775] Completed 403 Forbidden in 14ms (Views: 5.0ms | ActiveRecord: 2.2ms | Allocations: 3187)
production.log from 10 minutes later when the task actually fails, I believe due to timeout:
2021-07-05T10:40:11 [E|bac|] A sub task failed (RuntimeError)
| /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/action/with_sub_plans.rb:231:in `check_for_errors!'
| /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/action/with_sub_plans.rb:137:in `try_to_finish'
| /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/action/with_polling_sub_plans.rb:19:in `poll'
| /opt/theforeman/tfm/root/usr/share/gems/gems/dynflow-1.4.7/lib/dynflow/action/with_polling_sub_plans.rb:11:in `run'
<SNIP>
2021-07-05T10:40:11 [I|bac|] Task {label: Actions::RemoteExecution::RunHostsJob, id: f4b391d5-e579-4fd4-89d8-d41916a1b4cf, execution_plan_id: af8789f8-3fc5-488e-bad7-734979032539} state changed: stopped result: warning
2021-07-05T10:40:11 [I|bac|] Task {label: Actions::RemoteExecution::RunHostsJob, id: f4b391d5-e579-4fd4-89d8-d41916a1b4cf, execution_plan_id: af8789f8-3fc5-488e-bad7-734979032539} state changed: stopped result: warning