Wait for "HOST" to come online

Problem:
In some cases Foreman waits (seemingly forever) for the host to come online after image-based provisioning.

Expected outcome:
Successful connection to the new machine.

Foreman and Proxy versions: 1.17

Foreman and Proxy plugin versions:

Other relevant data:
I’m using image-based provisioning (KVM, libvirt, basic Ubuntu server image) to create new machines. I have installed foreman-dhcp and foreman-dns. I’ve configured a subnet with Static boot mode and IPAM set to randomDB (I tried DHCP as well), which I assign to the new host. After I assign that subnet, Foreman picks an IP. However, when it creates the new machine, that IP SOMETIMES differs, and it seems Foreman then can’t connect back (sometimes the IP is the same, and then the machine is provisioned correctly). Any ideas why this happens and what I can do to fix it?

So the first answer is the easy part. There is a timeout, but it’s 600s (10 mins), so unless you waited that long, there’s probably not an issue with the timeout code.
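To make the timeout behaviour concrete, here is a minimal sketch of a "wait for host to come online" loop with a 600 second limit. This is not Foreman's actual code: the function name `wait_for_port`, the choice of SSH port 22 as the readiness signal, and the 5 second retry interval are all assumptions for illustration.

```python
import socket
import time


def wait_for_port(host, port=22, timeout=600.0, interval=5.0):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper: the port and retry interval are assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        try:
            # Cap each attempt so a hung connect cannot outlive the deadline.
            attempt = min(interval, remaining)
            with socket.create_connection((host, port), timeout=attempt):
                return True
        except OSError:
            # Refused, unreachable, or timed out: pause briefly, then retry.
            pause = min(interval, max(0.0, deadline - time.monotonic()))
            time.sleep(pause)
```

If the new machine boots with a different IP than the one being polled, a loop like this can never succeed and only ends when the timeout expires, which from the outside looks like waiting forever.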

Picking an IP at random from the DB is likely to fail, yes. You want to use DHCP and configure it to talk to libvirt.

The ideal workflow is that Foreman contacts the hypervisor and requests a new host (which replies with a MAC), and then requests an IP for that MAC. So the things to check are:

  • has the hypervisor actually created a DHCP record for this IP/MAC combo?
  • if yes, why is the booting host not getting the right IP?
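The two checks above can be sketched as a small comparison between the IP/MAC pair Foreman expects and whatever records the DHCP service actually holds. How you obtain those records depends on your setup and is left out here; the function name `check_reservation` and the result labels are hypothetical.

```python
def check_reservation(records, mac, ip):
    """Classify how the expected MAC/IP pair relates to known DHCP records.

    records: iterable of (mac, ip) string pairs, however they were obtained.
    Returns one of: "match", "mac-has-other-ip",
    "ip-held-by-other-mac", "no-record".
    """
    records = list(records)  # allow generators; we iterate twice
    mac = mac.lower()        # MAC addresses compare case-insensitively
    by_mac = {m.lower(): i for m, i in records}
    by_ip = {i: m.lower() for m, i in records}

    if by_mac.get(mac) == ip:
        return "match"
    if mac in by_mac:
        return "mac-has-other-ip"
    if ip in by_ip:
        return "ip-held-by-other-mac"
    return "no-record"


# Example data only; these addresses are made up.
records = [("52:54:00:AA:BB:01", "192.168.122.10")]
print(check_reservation(records, "52:54:00:aa:bb:01", "192.168.122.10"))
```

A "no-record" result points at the first question (the record was never created), while a "match" with a host that still boots on another address points at the second (something else on the network is answering the DHCP request).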

Posting your proxy dns* and dhcp* config files would help so we can confirm the libvirt connection looks correct.

There was a discussion and a PR on allowing users to use DNS instead of an IP address, but that stalled.

Thanks for the help. I think I figured it out… I forgot that KVM/libvirt runs its own DHCP server, so basically I had two DHCP services on the same network, which doesn’t do anything good :P So I turned off the DHCP on the hypervisor and it’s all good so far.
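For anyone hitting the same thing, a quick way to spot the second DHCP service is to look at the libvirt network definition: as far as I know, a libvirt-managed network serves DHCP when its XML contains a `<dhcp>` element inside `<ip>`. The sketch below checks for that; the function name `network_serves_dhcp` is hypothetical, and the sample XML is a made-up example of what `virsh net-dumpxml <network>` typically prints.

```python
import xml.etree.ElementTree as ET


def network_serves_dhcp(network_xml):
    """Return True if a libvirt network definition contains a <dhcp> block."""
    root = ET.fromstring(network_xml)
    # <dhcp> lives under <ip>, which is a direct child of <network>.
    return root.find("./ip/dhcp") is not None


# Illustrative definition only; names and addresses are examples.
SAMPLE = """
<network>
  <name>default</name>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>
"""

print(network_serves_dhcp(SAMPLE))
```

If this reports a DHCP block on the network your Foreman-managed subnet uses, both services will answer requests and the new host may take a lease from either one.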


We see this issue hitting users literally every month.

Is this documented anywhere?


We assume some degree of basic networking awareness. In this case, it was a misconfiguration, though.

We can put it into the manual, but I’d rather have a proactive check, which works better (e.g. the FQDN check). For DHCP, though, that’s not feasible.

If it’s not feasible (and I agree it isn’t, we can’t control people’s networks) then it should be documented, as that’s at least something. Not perfect, but something :)
