There’s been a lot of confusion around the DHCP next-server option. Let me explain. Foreman manages DHCP reservations through the DHCP smart-proxy module and by default it explicitly sets (overrides) both the next-server and filename options. The next-server option defines the IP address of the TFTP server to boot from, and filename defines the full path to the file that will be downloaded and executed. Typically next-server is set to the IP address of the smart-proxy associated with the host’s subnet, and filename to something like pxelinux.0 when the PXELinux loader is set. When using iPXE or UEFI HTTP boot, next-server can be ignored and the filename option can hold a whole URL like http://server/path/to/ipxe.efi or similar.
Let’s start with the easier one, the filename option. This is defined by the PXE loader option for a host or a hostgroup. It’s a simple one-to-one mapping, so let me present a few examples:
- PXELinux BIOS =>
- PXELinux UEFI =>
- iPXE Chain BIOS =>
- PXEGrub2 UEFI => (x64 can differ per architecture)
- iPXE UEFI HTTP =>
- Grub2 UEFI HTTP =>
Some PXE loader options, like the last two in the list above, require the Httpboot feature to be turned on and render the filename option as a full URL, where smart_proxy is the hostname of the smart-proxy as defined in the Foreman database and the port (8448, 8000 or 9090) is added as reported by the smart proxy via the features API. Older versions of Foreman have a bug where the port was set incorrectly; this should change in Foreman 1.24.
When the PXE loader is set to None, Foreman will put neither the filename nor the next-server option into the DHCP record. Note there is a bug where Foreman was still adding the next-server option; this will hopefully be fixed in Foreman 1.24 as well.
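To make the mapping above a bit more concrete, here is a minimal Python sketch of the idea, not Foreman’s actual code. The function name `dhcp_boot_options` and the table `LOADER_FILENAMES` are made up for illustration. Only pxelinux.0 comes from this post; the HTTP entry and its path are placeholders that just show how the smart-proxy hostname and port end up in the URL.

```python
from typing import Dict, Optional

# Illustrative table only. "pxelinux.0" is the value named in the post;
# the HTTP entry and its path are hypothetical placeholders.
LOADER_FILENAMES = {
    "PXELinux BIOS": "pxelinux.0",
    "Example UEFI HTTP": "http://{host}:{port}/example/loader.efi",
}


def dhcp_boot_options(
    loader: Optional[str],
    next_server: str,
    proxy_host: str = "",
    proxy_port: int = 0,
) -> Dict[str, str]:
    """Return the DHCP options for a host, or an empty dict for loader None."""
    if loader is None or loader == "None":
        # PXE loader "None": neither filename nor next-server is written.
        return {}
    template = LOADER_FILENAMES[loader]
    # Plain filenames contain no placeholders, so format() leaves them as-is.
    filename = template.format(host=proxy_host, port=proxy_port)
    return {"filename": filename, "next-server": next_server}
```

For example, `dhcp_boot_options("PXELinux BIOS", "192.0.2.1")` yields both options, while `dhcp_boot_options("None", "192.0.2.1")` yields an empty record.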
Now the more complicated (or confusing) part: the next-server option. This must be set to the IP address or hostname of the TFTP server. Our installer is able to set the configuration value, which is defined in /etc/foreman-proxy/settings.d/tftp.yml (the option name is --foreman-proxy-tftp-servername). Each TFTP smart-proxy then reports this setting via its API and Foreman can grab it when the DHCP record is being created. What’s confusing is that the installer does not set this by default; it must be set explicitly, otherwise the setting is blank.
When the setting is blank, Foreman tries to make an “educated guess”. It takes the smart-proxy hostname as defined in the Foreman database and performs a reverse DNS (PTR) query against the authoritative DNS server (usually the one that Foreman and the installer manage). Authoritative nameservers are defined in a Subnet; if that’s blank, Foreman will use the OS resolver (/etc/resolv.conf etc). Foreman performs the same DNS query if the TFTP servername setting reported by the TFTP proxy is set to a hostname instead of an IP address.
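The decision flow described above can be sketched roughly as follows. This is a simplified Python illustration, not the real Foreman implementation: `legacy_next_server` and the `lookup` callable are hypothetical names, and the actual DNS query is left to whatever `lookup` does. Passing `None` as the nameserver list stands for "use the OS resolver".

```python
import ipaddress
from typing import Callable, List, Optional, Sequence


def is_ip_address(value: str) -> bool:
    """True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def legacy_next_server(
    tftp_servername: str,
    proxy_hostname: str,
    subnet_nameservers: Sequence[str],
    lookup: Callable[[str, Optional[List[str]]], str],
) -> str:
    """Pick next-server the way the pre-1.24 behaviour is described."""
    # Blank setting: fall back to the smart-proxy hostname from the database.
    name = tftp_servername or proxy_hostname
    if is_ip_address(name):
        # An IP address is used as-is, no DNS involved.
        return name
    # Subnet nameservers win; an empty list means the OS resolver (None).
    nameservers = list(subnet_nameservers) or None
    return lookup(name, nameservers)
```

Every non-IP value goes through a DNS query at record-creation time, which is where the timeouts and wrong-server problems listed below come from.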
This has caused a lot of confusion, and users are constantly running into issues:
- DNS timeouts during provisioning
- querying incorrect DNS server (authoritative vs caching)
- incorrect IP address of the TFTP server (PTR record was invalid)
Therefore, after some research, we are changing this for Foreman 1.24. The reverse lookup will be removed from Foreman completely; Foreman will send either the hostname (or IP address) reported by the TFTP proxy or the smart proxy hostname. It’s now up to the TFTP module on the proxy to do the reverse IP query against the system resolver if and only if it’s necessary.
I’ve found that ISC DHCP does support a hostname in the next-server option, although I thought that TFTP requires next-server to be an IP address. I haven’t tried it myself and there will probably be some legacy TFTP clients which will not work with hostnames. Therefore the TFTP module in the proxy will always attempt the PTR query, and only if that fails will it leave a warning in the logs and use the hostname. So if a user has problems with DNS (e.g. a missing PTR record) it will still work, and in the worst case the user will find out when a legacy TFTP client won’t boot. I think that’s sane behavior; at least we’ve tried our best.
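Here is a rough Python sketch of that "try to resolve, otherwise warn and keep the hostname" behavior. It is only an illustration of the idea under my own assumptions, not the smart-proxy code: `proxy_next_server` is a made-up name, and `socket.gethostbyname` is simply the system resolver call I picked for the default.

```python
import ipaddress
import logging
import socket
from typing import Callable

logger = logging.getLogger("tftp-sketch")


def proxy_next_server(
    servername: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> str:
    """Return an IP address when possible, otherwise the hostname itself."""
    try:
        ipaddress.ip_address(servername)
        return servername  # already an IP address, nothing to resolve
    except ValueError:
        pass
    try:
        return resolve(servername)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        # DNS is broken or the record is missing: warn and keep the hostname.
        logger.warning(
            "could not resolve %s (%s), using the hostname as next-server",
            servername,
            exc,
        )
        return servername
```

With working DNS the client gets an IP address as before; with broken DNS the record is still created and only clients that cannot handle a hostname are affected.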
While I was working on these changes, I cleaned up DNS resolving in both Foreman and the Smart Proxy. It will be possible to define the DNS query timeout as an array, with a sane default of 50 seconds over 4 tries. This is now used across all resolvers in Foreman core (it did not use to be like that), and there is now rich logging, with warnings in the logs if something either does not resolve or times out.
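To illustrate what a timeout array means in practice, here is a small Python sketch. The names `resolve_with_timeouts` and `query` are hypothetical, and the split (5, 10, 15, 20) is just one assumed way to get 4 tries that add up to 50 seconds; treat it as an example value rather than a documented default.

```python
import logging
from typing import Callable, Sequence

logger = logging.getLogger("dns-sketch")


def resolve_with_timeouts(
    name: str,
    query: Callable[[str, float], str],
    timeouts: Sequence[float] = (5, 10, 15, 20),  # assumed split, sums to 50
) -> str:
    """Call query(name, timeout) once per timeout until one attempt succeeds."""
    last_error = None
    for attempt, timeout in enumerate(timeouts, start=1):
        try:
            return query(name, timeout)
        except TimeoutError as exc:
            # Log every failed attempt so slow DNS is visible in the logs.
            logger.warning(
                "DNS query for %s timed out after %ss (try %d of %d)",
                name, timeout, attempt, len(timeouts),
            )
            last_error = exc
    raise TimeoutError(
        f"{name} did not resolve in {len(timeouts)} tries"
    ) from last_error
```

Each try waits longer than the previous one, so a quick retry handles a dropped packet while the total wait stays bounded.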
All in all, 1.24 will have much better handling of the DHCP filename and next-server options, which will enable more workflows and iron out the confusion around this functionality. There are several patches that still need to go through, but I am really hoping to hit 1.24 with these and remove some major pains around TFTP, DHCP and PXE.
Give this a heart if you read it to the end; I am curious how many people actually read lengthy net-booting posts. If you find an error, I am making this a wiki, so feel free to correct me. Cheers!
Also if you made it through here, visit this patch and do a short review: