DHCP timeout definitely sounds like a PXE attempt. Check the logs on the DHCP server, you should see DHCPDISCOVER from the host’s MAC, hopefully followed by why no DHCPOFFER is made.
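If the log is busy, filtering it by the host's MAC makes the conversation easier to follow. Here is a minimal Python sketch, assuming ISC dhcpd-style lines such as `DHCPDISCOVER from <mac> via <iface>`; the function name and the sample lines are invented for illustration and are not taken from any real system.

```python
import re

# DHCP message type names as they typically appear in dhcpd log lines.
DHCP_MSG = re.compile(
    r"\b(DHCP(?:DISCOVER|OFFER|REQUEST|ACK|NAK|INFORM|DECLINE|RELEASE))\b"
)

def dhcp_events_for_mac(lines, mac):
    """Return the DHCP message types found on log lines that mention `mac`."""
    mac = mac.lower()
    events = []
    for line in lines:
        if mac not in line.lower():
            continue  # line belongs to some other client
        match = DHCP_MSG.search(line)
        if match:
            events.append(match.group(1))
    return events

# Hypothetical sample lines (assumed format, made-up addresses).
sample = [
    "dhcpd: DHCPDISCOVER from 52:54:00:1a:ca:61 via eth0",
    "dhcpd: DHCPOFFER on 172.20.10.22 to 52:54:00:1a:ca:61 via eth0",
    "dhcpd: DHCPINFORM from 172.20.10.50 via eth0: not authoritative for subnet 172.20.10.0",
]
events = dhcp_events_for_mac(sample, "52:54:00:1A:CA:61")
print(events)
```

A DISCOVER with no OFFER after it points at the server declining to answer; no DISCOVER at all points at the request never reaching the server.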
I’m not seeing a DHCPDISCOVER; I am seeing a DHCPINFORM which says that <IP address> is not authoritative for the subnet.
I checked this via journalctl
I’m also a little confused. Won’t Foreman handle the boot sequence for me, assuming that the host is set to use PXE? It seems a bit cumbersome to have to reset the boot order on the host.
(Side note, consider single replies with all your points in, it makes for nicer reading and is less spammy to our email-based users)
From RFC 2131:
DHCPINFORM - Client to server, asking only for local configuration
parameters; client already has externally configured
network address.
My reading of the RFC suggests that issuing a DHCPINFORM requires the NIC to already have an IP, so either it’s statically configured within the NIC itself (BIOS config option, I would guess), or the DHCPDISCOVER/OFFER conversation happened further up the log. Worth investigating.
As to the boot order, the answer is yes-and-no. I’m assuming we’re talking about a physical machine here, not a VM - in this case, no, Foreman cannot directly control the BIOS of your host. Many physical hosts don’t even have the option to set the BIOS from the OS, remotely or locally, so we cannot build a workflow around that.
What we do recommend is that the machine is permanently configured to boot from PXE first, and then local disk second. As you’ve already seen for yourself, when the host is not in Build mode, the file on the TFTP server contains a LOCALBOOT directive. Thus a PXE request will result in the host skipping to the next device, the local disk. When in Build mode, the file is rewritten for reinstallation - this way a host will always boot correctly, based on its Build state, and you don’t have to keep altering the BIOS settings.
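To make the Build/non-Build switch concrete, here is a rough Python sketch of the idea. It is not Foreman's actual template code; the function name and the kernel/initrd paths are placeholders I made up, and a real install entry would carry more APPEND options.

```python
def render_pxe_config(build: bool,
                      kernel: str = "boot/vmlinuz",
                      initrd: str = "boot/initrd.img") -> str:
    """Return PXELinux config text for a host, depending on its Build flag."""
    if build:
        # Build mode: point the host at an installer kernel and initrd
        # (placeholder paths, assumed for illustration).
        lines = [
            "DEFAULT install",
            "LABEL install",
            f"  KERNEL {kernel}",
            f"  APPEND initrd={initrd}",
        ]
    else:
        # Not in Build mode: hand control back to the BIOS boot order,
        # which falls through to the local disk.
        lines = [
            "DEFAULT local",
            "LABEL local",
            "  LOCALBOOT 0",
        ]
    return "\n".join(lines) + "\n"

print(render_pxe_config(build=False))
```

Either way the BIOS boot order never changes; only the contents of the file served over TFTP do.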
Thanks for your reply. So when I hit Build via Foreman, it should go out and build the server, correct? What should I see in Foreman when I hit the Build button? As it is now, it just changes from Build to Cancel Build.
If I look at the production logs on the Foreman server I do see quite a bit of activity.
What complicates matters is that you’ve not said what features are enabled on Foreman - it can manage PuppetCA, DHCP, DNS, and TFTP, or just a subset of these - and of course expectations depend on configuration.
Assuming a complete configuration of controlling everything, then I would expect clicking “Build” to cause just the changes to the PXE file, but that’s because all the other stuff is done at host creation time. If this were a brand new host, you’d see:
- A DHCP reservation created for the MAC/IP combination
- A DNS A-record and PTR-record created for the IP/name combination
- A PXElinux config file created for the MAC
This would mean that when the host boots for the first time, it can get an IP from the DHCP server, and be told where the TFTP server is (‘nextserver’ option in the lease). It then queries said IP for TFTP/PXE, and is given a PXE file, which it then uses to load an initrd/kernel over the network.
Again, I’m describing generics here; for example, if your provisioning network has not got Foreman managing DHCP, then you’d be responsible for ensuring the leases give out the right nextserver IP, and so forth.
This is host creation rather than just flipping the Build flag, but you get the idea - here you see it creating a DHCP lease, creating the PXElinux cfg files, and checking if the initrd/vmlinuz files need downloading to the proxy. This is the kind of output you’re looking for in production.log. I encourage you to read that thread as there are other log examples that may help you make sense of what you’re seeing, since you can’t share it for us to see.
To answer your question, yes, you’d only expect to see the button change from Build to Cancel Build in the UI - the rest happens behind the scenes. You’d then go reboot the server at your leisure, and it should pick up the changed PXElinux cfg file.
To try to help a little, here’s a shot of one of my VMs booting TFTP - I stopped my TFTP server so it would hang while I got the shot
You can see that it records the MAC, confirms that it got IP 172.20.10.22, and that the “nextserver” is 172.20.10.1 (which is correct). It then loads pxelinux.0 (which fails as I stopped the TFTP server), and would then go on to load pxelinux.cfg/01-52-54-00-1a-ca-61, which corresponds to the MAC. Since this host is not in Build mode, that file contains LOCALBOOT 0, and the host would then boot from disk. Hope it helps.
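If you want to work out which file to look for on your own TFTP server, the name is derived from the MAC: PXELinux prefixes it with 01 (the ARP hardware type for Ethernet) and writes the octets in lowercase, separated by dashes. A small Python sketch (the function name is mine):

```python
def pxelinux_cfg_name(mac: str) -> str:
    """Return the per-host PXELinux config path for an Ethernet MAC address."""
    octets = mac.strip().lower().replace("-", ":").split(":")
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a MAC address: {mac!r}")
    for octet in octets:
        int(octet, 16)  # raises ValueError if an octet is not hexadecimal
    # "01" is the ARP hardware type for Ethernet.
    return "pxelinux.cfg/01-" + "-".join(octets)

name = pxelinux_cfg_name("52:54:00:1A:CA:61")
print(name)
```

For the MAC in my example this gives pxelinux.cfg/01-52-54-00-1a-ca-61, the same file named above.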
Ok, so now after getting our network engineer to work on a switch I’m getting messages as follows:
However on the agent, I’m getting a PXE-M01 (I think) no existing boot agent.
Are you sure you are getting a DHCP answer from the correct DHCP server? We’ve seen many cases of users running multiple DHCP servers on one segment, which leads to incorrect behavior.
We only have one DHCP server. I was wondering how to tell if the boot agent exists on the DHCP server. The messages seem to indicate it’s communicating.
We found that the TFTP daemon was not running. Now the error is that it is exiting the intel boot agent with a PXE-MOF error.
Not sure what you mean by MOF, a screengrab might help.
Unfortunately I’m not allowed to do that on these systems. "PXE-M0F: Exiting Intel Boot Agent." is the error that comes up after a DHCP timeout.
It does look like DHCP is sending the IP/MAC pair to the host, as near as I can tell from the logs. I think it’s breaking at the next step.
TFTP server by default logs requests, do you see any activity there?
Just to be clear, does Foreman control the DHCP server? I.e. is it creating the DHCP reservations for this IP/MAC pair? Can you confirm the next-server IP is correct (that is, it points to the correct TFTP IP)?
Hi, hope you had a nice Memorial Day weekend. The DHCP server is on a separate box from the Foreman server. I see the next-server line in the lease file. It appears to be part of a MAC address (two fields short). I was looking through the host information and the numbers do not appear to match anything I can find. The next-server should point back to the server where the boot file is stored (in this case the Foreman server???)
It looks like TFTP on the foreman server is trying to grab the boot file but is not getting to the host.
smart-proxy ------ /tfpt/fetch-boot-file
Is there a NAT or firewall in between them? TFTP is a stateless UDP protocol which won’t work without special care.
Not that I know of. Network guy is not here today so I can’t check the switch. Are there some tftpd logs I could check? Something appears to be killing the tftp service.
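The reason for the special care: per RFC 1350 the server answers a read request from a freshly chosen port rather than port 69, so a firewall or NAT that only tracks the original port pair can drop the reply. To check whether replies make it back, here is a small Python probe, run from a machine on the same segment as the host. It is only a sketch; the function names and the default filename are my own choices.

```python
import socket
import struct

def build_rrq(filename: str, mode: str = "octet") -> bytes:
    """Build a TFTP read request: opcode 1, filename, NUL, mode, NUL."""
    return (struct.pack("!H", 1)
            + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")

def probe_tftp(host: str, filename: str = "pxelinux.0", timeout: float = 3.0):
    """Send one RRQ to port 69 and wait for the first reply.

    Returns (opcode, source_port) of the reply, or None on timeout.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_rrq(filename), (host, 69))
        try:
            data, (_addr, port) = sock.recvfrom(1024)
        except socket.timeout:
            return None
    if len(data) < 2:
        return None  # too short to carry a TFTP opcode
    return struct.unpack("!H", data[:2])[0], port

packet = build_rrq("pxelinux.0")
print(packet)
```

Calling `probe_tftp("<your TFTP server IP>")` and getting opcode 3 (DATA) back means the file is being served and the reply path works; opcode 5 (ERROR) means the server is reachable but refused the request, for example because the file is missing; None suggests the request or the reply is being dropped somewhere in between.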
By default TFTP runs via Xinetd, so there will be no TFTP process running on a long term basis.