Help debugging strange occasional proxy connection issues

Problem:

Hi there! I’ve been trying to get to the bottom of a strange issue that I’ve had with Foreman and Foreman Proxy for quite a while now. Sadly I can’t remember how long it’s been going on anymore.

Couple of “the state of things” before we get into the issue:

  1. I install and configure foreman through puppet modules
  2. I run in a very spread out separate server setup to help keep things up at all times – 3 puppet servers, 2 foreman frontend servers, 2 tftp servers, 2 puppetca servers, and 2 puppetdb servers. Load balancing and VIPS involved. However, foreman itself doesn’t -really- know about any of that. It speaks to each backend server individually through foreman proxies.
  3. I’ve been running foreman like this for quite some time now with much success =)

So – one of the easiest ways I can see this behavior is to go look at the Smart Proxies section in Foreman. Right now, every last one of them is green. No problem what-so-ever. However, if I let it sit for (unknown about of time, I think it’s 5 minutes?) and reload the page, I can almost guarantee I will see a few go red. An immediate reload puts them all green again.

The other, more direct, symptom is that if I go to do an action requiring one of the proxies, it will fail saying “no route to host”. If I simply try again, it will succeed, guaranteed.

I have performed all manner of DNS related tests on the foreman servers to make sure the names always resolve and I have yet to see them fail. I’ve even put the names in /etc/hosts to eliminate DNS as an issue.

This leads me to think that foreman has opened a connection to X proxy, that connection went stale, attempting to use it again triggered things to realize the connection was dead, and next attempt causes a fresh connection to the proxy.

So far I haven’t seen anything in any logs that tells me more than “no route to host”. =/

Any ideas? =) It’s obviously not game breaking, but it does get frustrating sometimes.

Also after said (5 minute?) wait, none fail if I do a bunch of actions rapid fire. It definitely only seems to trigger after some amount of idle.

Expected outcome:

No weird timeouts. =)

Foreman and Proxy versions:

Currently 1.20.1, but it’s been going on for quite a few versions at this point.

Foreman and Proxy plugin versions:

11.0.0 and 10.0.0 (but again, has been going on for quite a while)

Other relevant data:

One other thing that might be worth mentioning – I am using the memcached plugin for foreman.

[e.g. logs from Foreman and/or the Proxy, modified templates, commands issued, etc]
(for logs, surround with three back-ticks to get proper formatting, e.g.)

2019-01-17T09:49:09 [W|app|17ba9] Unable to execute puppet run
ProxyAPI::ProxyException: ERF12-4252 [ProxyAPI::ProxyException]: Unable to execute Puppet run ([Errno::EHOSTUNREACH]: Failed to open TCP connection to XXXHOSTNAMEXXX:8443 (No route to host - connect(2) for "XXXHOSTNAMEXXX" port 8443)) for proxy https://XXXHOSTNAMEXXX:8443/puppet
/usr/share/foreman/lib/proxy_api/puppet.rb:41:in `rescue in run'
/usr/share/foreman/lib/proxy_api/puppet.rb:38:in `run'
/usr/share/foreman/app/models/host/managed.rb:610:in `puppetrun!'
/usr/share/foreman/app/controllers/hosts_controller.rb:217:in `puppetrun'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_controller/metal/basic_implicit_render.rb:6:in `send_action'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/abstract_controller/base.rb:194:in `process_action'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_controller/metal/rendering.rb:30:in `process_action'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/abstract_controller/callbacks.rb:42:in `block in process_action'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:109:in `block in run_callbacks'
/usr/share/foreman/app/controllers/concerns/application_shared.rb:15:in `set_timezone'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:118:in `block in run_callbacks'
/usr/share/foreman/app/models/concerns/foreman/thread_session.rb:32:in `clear_thread'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:118:in `block in run_callbacks'
/usr/share/foreman/app/controllers/concerns/foreman/controller/topbar_sweeper.rb:12:in `set_topbar_sweeper_controller'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:118:in `block in run_callbacks'
/opt/theforeman/tfm/root/usr/share/gems/gems/audited-4.7.1/lib/audited/sweeper.rb:14:in `around'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:118:in `block in run_callbacks'
/opt/theforeman/tfm/root/usr/share/gems/gems/audited-4.7.1/lib/audited/sweeper.rb:14:in `around'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:118:in `block in run_callbacks'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/activ
e_support/callbacks.rb:136:in `run_callbacks'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/abstract_controller/callbacks.rb:41:in `process_action'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_controller/metal/rescue.rb:22:in `process_action'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_controller/metal/instrumentation.rb:34:in `block in process_action'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/notifications.rb:168:in `block in instrument'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/notifications/instrumenter.rb:23:in `instrument'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/notifications.rb:168:in `instrument'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_controller/metal/instrumentation.rb:32:in `process_action'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_controller/metal/params_wrapper.rb:256:in `process_action'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activerecord-5.2.1/lib/active_record/railties/controller_runtime.rb:24:in `process_action'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/abstract_controller/base.rb:134:in `process'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionview-5.2.1/lib/action_view/rendering.rb:32:in `process'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_controller/metal.rb:191:in `dispatch'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_controller/metal.rb:252:in `dispatch'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/routing/route_set.rb:52:in `dispatch'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/routing/route_set.rb:34:in `serve'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/journey/router.rb:52:in `block in serve'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/journey/router.rb:35:in `each'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/journey/router.rb:35:in `serve'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/routing/route_set.rb:840:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/apipie-rails-0.5.14/lib/apipie/static_dispatcher.rb:65:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/apipie-rails-0.5.14/lib/apipie/extractor/recorder.rb:137:in `call'
/usr/share/foreman/lib/middleware/telemetry.rb:9:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/apipie-rails-0.5.14/lib/apipie/middleware/checksum_in_headers.rb:27:in `call'
/usr/share/foreman/lib/middleware/catch_json_parse_errors.rb:8:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/tempfile_reaper.rb:15:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/etag.rb:25:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/conditional_get.rb:25:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/head.rb:12:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/http/content_security_policy.rb:18:in `call'
/usr/share/foreman/lib/middleware/logging_context.rb:23:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/session/abstract/id.rb:232:in `context'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/session/abstract/id.rb:226:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/middleware/cookies.rb:670:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/middleware/callbacks.rb:28:in `block in call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/callbacks.rb:98:in `run_callbacks'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/middleware/callbacks.rb:26:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/middleware/debug_exceptions.rb:61:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/middleware/show_exceptions.rb:33:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/railties-5.2.1/lib/rails/rack/logger.rb:38:in `call_app'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/railties-5.2.1/lib/rails/rack/logger.rb:28:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/sprockets-rails-3.2.1/lib/sprockets/rails/quiet_assets.rb:13:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/middleware/remote_ip.rb:81:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/middleware/request_id.rb:27:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/method_override.rb:22:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/runtime.rb:22:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/activesupport-5.2.1/lib/active_support/cache/strategy/local_cache_middleware.rb:29:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/middleware/executor.rb:14:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/actionpack-5.2.1/lib/action_dispatch/middleware/static.rb:127:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/sendfile.rb:111:in `call'
/opt/theforeman/tfm/root/usr/share/gems/gems/secure_headers-6.0.0/lib/secure_headers/middleware.rb:13:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/railties-5.2.1/lib/rails/engine.rb:524:in `call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/railties-5.2.1/lib/rails/railtie.rb:190:in `public_send'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/railties-5.2.1/lib/rails/railtie.rb:190:in `method_missing'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/urlmap.rb:68:in `block in call'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/urlmap.rb:53:in `each'
/opt/theforeman/tfm-ror52/root/usr/share/gems/gems/rack-2.0.5/lib/rack/urlmap.rb:53:in `call'
/usr/share/ruby/vendor_ruby/phusion_passenger/rack/thread_handler_extension.rb:97:in `process_request'
/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:149:in `accept_and_process_next_request'
/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler/thread_handler.rb:110:in `main_loop'
/usr/share/ruby/vendor_ruby/phusion_passenger/request_handler.rb:415:in `block (3 levels) in start_threads'
/usr/share/ruby/vendor_ruby/phusion_passenger/utils.rb:113:in `block in create_thread_and_abort_on_exception'
/opt/theforeman/tfm/root/usr/share/gems/gems/logging-2.2.2/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'
2019-01-17T09:49:09 [I|app|17ba9] Redirected to https://foreman.chass.ncsu.edu/hosts/lich.chass.ncsu.edu
2019-01-17T09:49:09 [I|app|17ba9] Completed 302 Found in 89ms (ActiveRecord: 24.1ms)

Sounds like quite an interesting setup!
Are the smart proxies themselves also load-balanced? If so, one thought that comes to mind is that possibly the load balancer times out the idle connection if it isn’t used for a certain amount of time, then when the next connection attempt comes in, the connection maybe fails while the load balancer is choosing which proxy to route it to?

This might be relevant:

https://projects.theforeman.org/issues/24634

Long story short: Webrick 1.3 seems to have problems with timeouting HTTPS endpoints and it is vulnerable to slow-attacks. If you try HTTP port it works fine (timeout 30 seconds - webrick’s default setting). Looks like 1.4.2 version (current stable) does work correctly, connection is closed after 30 seconds.

Unfortunately Webrick 1.4.x requires Ruby 2.3 and we are still on RHEL Ruby 2.0. There are discussions in SCLing smart-proxy, that would solve this. Until then, there is no easy solution - according to git log in webrick there’s been huge amount of patches in regard to timeouts, concurrency, waits and synchronization.

I have tried to identify the commit in webrick which fixed this so we could backport it but I gave up as there are so many changes. Two ways out:

  • We are thinking SCLing our proxy so we can bump Ruby version and all the deps to latest and greatest versions.
  • We finalize our efforts supporting Puma as a secondary (opt-in) web server next to Webrick. A PR is pending more reviews, but this is WIP and there can be dragons.

I’m not actually load balancing the proxies themselves. I effectively followed this setup:
https://theforeman.org/2015/12/journey_to_high_availability.html
=)

(the load balancers are used for client connections to the various services [foreman’s front end, puppet, etc], but foreman speaks directly to each server for proxy purposes)

I think what Izap pointed at is exactly what’s going on though!

That does sound exactly like what I’m seeing. =) “About 5 months ago” even lines up with when I vaguely remember this all starting. If you get either option to a testable state, I’d be willing to try them out and report back – in the meantime I think I may switch to HTTP. It’s all on a single server dedicated LAN and I’m not especially concerned about someone seeing what’s going on. Thanks!!

Please report back if switching to http helped.

Do you happen to know if there’s an easy way to tell foreman-proxy to stop doing IPv6 lookups when attempting to connect via HTTP? I’m basically getting failures because it tries to look up the address.

2019-01-23T17:47:33 7568f90d [E] unable to resolve hostname for ip address ::ffff:10.143.246.77
no name for ::ffff:10.143.246.77

(since it’s an internal IP, I’m not real worried about leaving that ip in there =) )

I’m guessing it’s attempting that lookup because it doesn’t have an SSL cert to use for identity?

Note: I have no control over our DNS services. Apparently they don’t resolve IPv6 addresses of this format.

I do not see a setting that lets me tell the proxy “stop doing ipv6”. I also tried disabling ipv6 support at the kernel level and that didn’t seem to change anything. Note if I explicitly put that ::ffff:10.143.246.77 in /etc/hosts and map it to the proper hostname, it works like a charm. (though I can’t say yet whether it resolves the timeout issue, will have to observe that one over time)

Any ideas? I’m working with another issue at the moment and will roll back around to this shortly, but thought y’all might have an idea off the top of your head. =)

Ok I found a decent workaround that I think I’m going to keep. Basically I blacked out the trusted hosts list, thereby allowing anyone. I’ll basically let the firewall do it’s job instead of having foreman proxy “also” do it. =)

I’ll keep an eye on this for a while and be back in touch.

This is not configurable unfortunately, file a patch please!