Failed to delete test-vm-5.kt21c.net [] (issues with remote execution / ssh keys stuff?)

Problem:

When I tried to delete one test VM I had created earlier with foreman, it gives the error "Failed to delete test-vm-5.kt21c.net ":

That is a VM on a libvirt hypervisor, installations done with PXE and kickstart.

As far as I can tell, that this VM was created before I installed the remote-execution feature (yesterday, June 4th):

foreman-installer --enable-foreman-plugin-remote-execution --enable-foreman-proxy-plugin-remote-execution-ssh

[root@fuji3 qemu]# ll /etc/libvirt/qemu/test-vm-5.kt21c.net.xml
-rw------- 1 root root 2734 Jun 3 23:19 /etc/libvirt/qemu/test-vm-5.kt21c.net.xml
[root@fuji3 qemu]# date
Fri Jun 5 20:27:39 EEST 2020
[root@fuji3 qemu]#

I mention this, because I suspect this might have something to do with it, because the log reports soemthing related to ssh keys failing:

   2020-06-05T20:10:23 aa7e5f15 warning app ERF12-6886 [ProxyAPI::ProxyException]: Unable to remove host from known hosts ([TypeError]: no implicit conversion of nil into String) for proxy https://katello.kt
    21c.net:9090/ssh
    2020-06-05T20:10:23 aa7e5f15 warning app Rolling back due to a problem: [#<Orchestration::Task:0x00007fcfc08c0428 @name="Remove SSH known hosts for test-vm-5.kt21c.net", @id="ssh_remove_known_hosts_interface__1", @status="failed", @priority=200, @action=[#<Nic::Managed id: 12, mac: "52:54:00:ca:9e:bd", ip: nil, type: "Nic::Managed", name: "test-vm-5.kt21c.net", host_id: 12, subnet_id: 1, domain_id: 1, attrs: {}, created_at: "2020-06-03 20:18:50", updated_at: "2020-06-03 20:18:50", provider: nil, username: nil, password: nil, virtual: false, link: true, identifier: "", tag: "", attached_to: "", managed: true, mode: "balance-rr", attached_devices: "", bond_options: "", primary: true, provision: true, compute_attributes: {"type"=>"bridge", "bridge"=>"br-kt21c", "model"=>"virtio"}, ip6: "", subnet6_id: nil, execution: true>, :drop_from_known_hosts, [1, nil]], @created=1591377023.478005, @timestamp=2020-06-05 17:10:23 UTC>]

Expected outcome:

VM should be deleted in libvirt hypervisor and foreman.

Foreman and Proxy versions:

2.0.0

Foreman and Proxy plugin versions:

Distribution and version:

Both katello VM and the hypervisor (fuji3) are:

[root@fuji3 vms]# cat /etc/redhat-release 
CentOS Linux release 7.8.2003 (Core)
[root@fuji3 vms]# 

Other relevant data:

production.log contains this:

2020-06-05T20:10:23 aa7e5f15 info app Started DELETE "/hosts/test-vm-5.kt21c.net" for 192.168.1.148 at 2020-06-05 20:10:23 +0300
2020-06-05T20:10:23 aa7e5f15 info app Processing by HostsController#destroy as HTML
2020-06-05T20:10:23 aa7e5f15 info app   Parameters: {"authenticity_token"=>"Ahg7iib63olZtInZZSvNZTYZlj9tk2HAi/RKaEhviXXUsWA62wYEUd/R6lbevjQdwhjNaFUrVWa61yFIwEulHg==", "id"=>"test-vm-5.kt21c.net"}
2020-06-05T20:10:23 aa7e5f15 info app Delete the TFTP configuration for test-vm-5.kt21c.net
2020-06-05T20:10:23 aa7e5f15 info app Delete the TFTP configuration for test-vm-5.kt21c.net
2020-06-05T20:10:23 aa7e5f15 info app Delete the TFTP configuration for test-vm-5.kt21c.net
2020-06-05T20:10:23 aa7e5f15 info app Delete the TFTP configuration for test-vm-5.kt21c.net
2020-06-05T20:10:23 aa7e5f15 warning app ERF12-6886 [ProxyAPI::ProxyException]: Unable to remove host from known hosts ([TypeError]: no implicit conversion of nil into String) for proxy https://katello.kt
21c.net:9090/ssh
2020-06-05T20:10:23 aa7e5f15 warning app Rolling back due to a problem: [#<Orchestration::Task:0x00007fcfc08c0428 @name="Remove SSH known hosts for test-vm-5.kt21c.net", @id="ssh_remove_known_hosts_interface__1", @status="failed", @priority=200, @action=[#<Nic::Managed id: 12, mac: "52:54:00:ca:9e:bd", ip: nil, type: "Nic::Managed", name: "test-vm-5.kt21c.net", host_id: 12, subnet_id: 1, domain_id: 1, attrs: {}, created_at: "2020-06-03 20:18:50", updated_at: "2020-06-03 20:18:50", provider: nil, username: nil, password: nil, virtual: false, link: true, identifier: "", tag: "", attached_to: "", managed: true, mode: "balance-rr", attached_devices: "", bond_options: "", primary: true, provision: true, compute_attributes: {"type"=>"bridge", "bridge"=>"br-kt21c", "model"=>"virtio"}, ip6: "", subnet6_id: nil, execution: true>, :drop_from_known_hosts, [1, nil]], @created=1591377023.478005, @timestamp=2020-06-05 17:10:23 UTC>]
2020-06-05T20:10:23 aa7e5f15 info app Deploying TFTP iPXE configuration for test-vm-5.kt21c.net
2020-06-05T20:10:23 aa7e5f15 info app Deploying TFTP PXEGrub configuration for test-vm-5.kt21c.net
2020-06-05T20:10:24 aa7e5f15 info app Deploying TFTP PXEGrub2 configuration for test-vm-5.kt21c.net
2020-06-05T20:10:24 aa7e5f15 info app Deploying TFTP PXELinux configuration for test-vm-5.kt21c.net
2020-06-05T20:10:24 aa7e5f15 info app Processed 5 tasks from queue 'Host::Managed Main', completed 0/5
2020-06-05T20:10:24 aa7e5f15 err app Task 'Delete TFTP PXELinux config for test-vm-5.kt21c.net' *rollbacked*
2020-06-05T20:10:24 aa7e5f15 err app Task 'Delete TFTP PXEGrub2 config for test-vm-5.kt21c.net' *rollbacked*
2020-06-05T20:10:24 aa7e5f15 err app Task 'Delete TFTP PXEGrub config for test-vm-5.kt21c.net' *rollbacked*
2020-06-05T20:10:24 aa7e5f15 err app Task 'Delete TFTP iPXE config for test-vm-5.kt21c.net' *rollbacked*
2020-06-05T20:10:24 aa7e5f15 err app Task 'Remove SSH known hosts for test-vm-5.kt21c.net' *failed*
2020-06-05T20:10:24 aa7e5f15 err app Failed to save: 
2020-06-05T20:10:24 aa7e5f15 info app Redirected to https://katello.kt21c.net/hosts/test-vm-5.kt21c.net
2020-06-05T20:10:24 aa7e5f15 info app Completed 302 Found in 951ms (ActiveRecord: 133.6ms)
2020-06-05T20:10:24 709cefc7 info app Started GET "/hosts/test-vm-5.kt21c.net" for 192.168.1.148 at 2020-06-05 20:10:24 +0300
2020-06-05T20:10:24 709cefc7 info app Processing by HostsController#show as HTML
2020-06-05T20:10:24 709cefc7 info app   Parameters: {"id"=>"test-vm-5.kt21c.net"}
2020-06-05T20:10:24 709cefc7 info app   Rendering hosts/show.html.erb within layouts/application
2020-06-05T20:10:24 709cefc7 info app   Rendered hosts/show.html.erb within layouts/application (23.3ms)
2020-06-05T20:10:24 709cefc7 info app   Rendered layouts/_application_content.html.erb (0.5ms)
2020-06-05T20:10:24 709cefc7 info app   Rendering layouts/base.html.erb
2020-06-05T20:10:24 709cefc7 info app   Rendered layouts/base.html.erb (26.4ms)
2020-06-05T20:10:24 709cefc7 info app Completed 200 OK in 71ms (Views: 47.4ms | ActiveRecord: 7.9ms)

I still don’t know why this happens. This host had indeed a host-parameter for the remote_execution ssh keys configured. So I had indeed added a key as param, as I said “I fainly remember I did …”. I removed that host parameter, that did not change anything.

I rebuild the host once, and later marked it as unmanaged, deleted the host directly in the libvirt hypervisor, and at some point deleting it in foreman then also worked.

Not exactly sure why/what caused the hiccup, but it’s now gone so I mark this here as resolved.