Cannot delete host: "Dont know how to rollback drop_from_known_hosts" error

Problem:

An Ansible playbook misconfiguration, combined with the Foreman callback, created a host entry with an invalid name.
The host cannot be deleted.

Expected outcome:

Host delete works.

Foreman and Proxy versions:

Foreman 2.3.2-1.el7
Katello 3.18.1-1.el7
pulp-server 2.21.5-1.el7

Distribution and version:

RHEL 7.9

Other relevant data:

It looks like the literal {{ }} in the name is the problem. Is there a way to completely remove the host? The host does not exist and never did.
I am able to delete other hosts (e.g. id 242).

ID   NAME                             OPERATING SYSTEM   HOST GROUP   CONTENT VIEW   LIFECYCLE ENVIRONMENT
---- -------------------------------- ------------------ ------------ -------------- ----------------------
371  defr4app40.{{ joindomain }}
242  foreman.de.testworld.test.com
[root@foreman ~]# hammer host delete --id 371
Host deleted.

production.log

 2021-04-20T18:40:52 [W|app|df75178e] Rolling back due to a problem: [#<Orchestration::Task:0x000000000e2a27c8 @name="Remove SSH known hosts for defr4app40.{{ joindomain }}", @id="ssh_remove_known_hosts_host_defr4app40.{{ joindomain }}_3", @status="failed", @priority=200, @action=[#<Host::Managed id: 371, name: "defr4app40.{{ joindomain }}", last_compile: nil, last_report: [FILTERED], updated_at: "2021-03-25 22:03:07", created_at: "2021-03-25 22:03:07", root_pass: nil, architecture_id: nil, operatingsystem_id: nil, environment_id: nil, ptable_id: nil, medium_id: nil, build: false, comment: nil, disk: nil, installed_at: nil, model_id: nil, hostgroup_id: nil, owner_id: nil, owner_type: nil, enabled: true, puppet_ca_proxy_id: nil, managed: false, use_image: nil, image_file: nil, uuid: nil, compute_resource_id: nil, puppet_proxy_id: nil, certname: nil, image_id: nil, organization_id: nil, location_id: nil, type: "Host::Managed", otp: nil, realm_id: nil, compute_profile_id: nil, provision_method: nil, grub_pass: "", global_status: 2, lookup_value_matcher: [FILTERED], pxe_loader: nil, initiated_at: nil, build_errors: nil, openscap_proxy_id: nil>, :drop_from_known_hosts, 3], @created=1618936852.5261989, @timestamp=2021-04-20 16:40:52 UTC>]
    2021-04-20T18:40:52 [W|app|df75178e] Failed to perform rollback on Remove SSH known hosts for defr4app40.{{ joindomain }} - Dont know how to rollback drop_from_known_hosts
    2021-04-20T18:40:52 [I|app|df75178e] Backtrace for 'Failed to perform rollback on Remove SSH known hosts for defr4app40.{{ joindomain }} - Dont know how to rollback drop_from_known_hosts' error (RuntimeError): Dont know how to rollback drop_from_known_hosts

  df75178e | /opt/theforeman/tfm/root/usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:474:in `block in create_with_logging_context'

Hi @plbln,

I’m curious what API you used to create that host. In the Foreman console, at least, I was blocked from using those characters.

Here’s something to try in the Foreman console (foreman-rake console):

Host.find(371).subscription_facet.destroy
Host.find(371).content_facet.destroy
Host.find(371).destroy

Hi @iballou,

not via an API.
I created some Ansible templates with a variable called “joindomain” which appends the corresponding domain to the hostname. The variable was not set correctly, so Ansible tried to run

“ssh user@defr4app40.{{ joindomain }}”.

This obviously fails, but after the playbook gathered facts, the callback plugin sent the literal hostname to Foreman.

I will try it later today.

Thanks

With

Host.find(371).destroy

I get the same error:

 2021-04-22T10:38:13 [W|app|] Rolling back due to a problem: [#<Orchestration::Task:0x000000000a3b0ab0 @name="Remove SSH known hosts for defr4app40.{{ joindomain }}", @id="ssh_remove_known_hosts_host_defr4app40.{{ joindomain }}_3", @status="failed", @priority=200, @action=[#<Host::Managed id: 371, name: "defr4app40.{{ joindomain }}", last_compile: nil, last_report: [FILTERED], updated_at: "2021-03-25 22:03:07", created_at: "2021-03-25 22:03:07", root_pass: nil, architecture_id: nil, operatingsystem_id: nil, environment_id: nil, ptable_id: nil, medium_id: nil, build: false, comment: nil, disk: nil, installed_at: nil, model_id: nil, hostgroup_id: nil, owner_id: nil, owner_type: nil, enabled: true, puppet_ca_proxy_id: nil, managed: false, use_image: nil, image_file: nil, uuid: nil, compute_resource_id: nil, puppet_proxy_id: nil, certname: nil, image_id: nil, organization_id: nil, location_id: nil, type: "Host::Managed", otp: nil, realm_id: nil, compute_profile_id: nil, provision_method: nil, grub_pass: "", global_status: 2, lookup_value_matcher: [FILTERED], pxe_loader: nil, initiated_at: nil, build_errors: nil, openscap_proxy_id: nil>, :drop_from_known_hosts, 3], @created=1619080693.1048982, @timestamp=2021-04-22 08:38:13 UTC>]
2021-04-22T10:38:13 [W|app|] Failed to perform rollback on Remove SSH known hosts for defr4app40.{{ joindomain }} - Dont know how to rollback drop_from_known_hosts
2021-04-22T10:38:13 [I|app|] Backtrace for 'Failed to perform rollback on Remove SSH known hosts for defr4app40.{{ joindomain }} - Dont know how to rollback drop_from_known_hosts' error (RuntimeError): Dont know how to rollback drop_from_known_hosts


irb(main):001:0> Host.find(371).destroy
=> nil
irb(main):002:0> Host.find(371)
=> #<Host::Managed id: 371, name: "defr4app40.{{ joindomain }}", last_compile: nil, last_report: [FILTERED], updated_at: "2021-03-25 22:03:07", created_at: "2021-03-25 22:03:07", root_pass: nil, architecture_id: nil, operatingsystem_id: nil, environment_id: nil, ptable_id: nil, medium_id: nil, build: false, comment: nil, disk: nil, installed_at: nil, model_id: nil, hostgroup_id: nil, owner_id: nil, owner_type: nil, enabled: true, puppet_ca_proxy_id: nil, managed: false, use_image: nil, image_file: nil, uuid: nil, compute_resource_id: nil, puppet_proxy_id: nil, certname: nil, image_id: nil, organization_id: nil, location_id: nil, type: "Host::Managed", otp: nil, realm_id: nil, compute_profile_id: nil, provision_method: nil, grub_pass: "", global_status: 2, lookup_value_matcher: [FILTERED], pxe_loader: nil, initiated_at: nil, build_errors: nil, openscap_proxy_id: nil>

When you remove a host, Foreman reaches out to all the smart proxies that could have been used for remote execution against the host and tries to remove the host's public key from their known_hosts files. If a proxy cannot be reached (or fails hard otherwise), the task fails and halts the entire removal. Are all your proxies up? Try checking their logs.
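To illustrate the error message itself: each orchestration step is meant to carry a paired rollback action, and a step without one produces exactly the "Dont know how to rollback ..." warning from your log. A minimal sketch of the idea (hypothetical, not Foreman's actual implementation):

```ruby
# Hypothetical sketch of an orchestration queue, NOT Foreman's real code.
# Each step may carry an inverse; on failure, every attempted step is
# rolled back in reverse order, and a step without an inverse yields the
# "Dont know how to rollback ..." warning seen in the log above.
class MiniQueue
  Step = Struct.new(:name, :action, :inverse)

  attr_reader :warnings

  def initialize
    @steps = []
    @attempted = []
    @warnings = []
  end

  def add(name, action, inverse = nil)
    @steps << Step.new(name, action, inverse)
  end

  def run
    @steps.each do |step|
      @attempted << step
      begin
        step.action.call
      rescue StandardError
        rollback
        return false
      end
    end
    true
  end

  def rollback
    @attempted.reverse_each do |step|
      if step.inverse
        step.inverse.call
      else
        @warnings << "Dont know how to rollback #{step.name}"
      end
    end
  end
end

queue = MiniQueue.new
queue.add("set_dns_record", -> {}, -> { puts "removing DNS record" })
queue.add("drop_from_known_hosts", -> { raise "smart proxy unreachable" })
ok = queue.run            # rollback fires; ok == false
puts queue.warnings.first # "Dont know how to rollback drop_from_known_hosts"
```

In your case the failing step is the known_hosts cleanup itself, which is why the removal is rolled back and the host record survives.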

@plbln after you’ve checked whether @aruzicka's post helped, are you able to share the relevant bits of your Ansible playbook(s)? I’d like to find out how that host got created in Foreman, since the character checks should have blocked it.
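For reference, the kind of check I mean is plain RFC 1123 hostname validation. A rough sketch of it (my own approximation, not Foreman's actual validator):

```ruby
# Rough RFC 1123 style hostname check -- an approximation of the kind of
# validation Foreman's UI/API applies, not Foreman's actual code.
# Each dot-separated label: 1-63 alphanumeric/hyphen characters, with no
# leading or trailing hyphen; the whole name at most 253 characters.
LABEL = /\A[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?\z/

def valid_hostname?(name)
  return false if name.empty? || name.length > 253
  name.split(".", -1).all? { |label| label.match?(LABEL) }
end

puts valid_hostname?("foreman.de.testworld.test.com") # true
puts valid_hostname?("defr4app40.{{ joindomain }}")   # false -- braces and space
```

A name like "defr4app40.{{ joindomain }}" fails this check, which is why I suspect it could only have arrived via a fact-upload path that bypasses the form validation.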

Hi,

sorry for the late response, but I managed to delete the host.

@aruzicka
it was indeed a certificate issue on one smart proxy, which was hard to discover because the log output for some errors is currently not the best… maybe I should update to a newer version :slight_smile:

@iballou
the Ansible playbook is long gone… I tried to figure it out, but had no luck.
I remember having the issue in 2.2/3.17.

Best
