Getting TFTP error during PXE boot through Foreman

We have configured a Foreman server and it has been working fine for multiple subnets/VLANs. Now I am trying to deploy a few VMs in a new subnet through Foreman and have configured the subnet and host groups, but it is failing at TFTP boot with this error:

I checked that it is creating the PXE kickstart files under /var/lib/tftpboot/pxelinux.cfg, and I also checked the TFTP service, which is running fine.

pxelinux.cfg]# netstat -tulpn | grep 69
udp        0      0    *                           25032/xinetd        
udp        0      0    *                           1/systemd 

I see the errors below in the messages file:

Error code 0: TFTP Aborted
RRQ from 10.xx.xx.xx filename pxelinux.0
Client 10.xx.xx.xx finished pxelinux.0
Client 10.xx.xx.xx timed out

On the client console I see this error:

PXE-E32: TFTP open timeout

When I run Wireshark with a tftp filter on the client-side VLAN, it shows TFTP traffic on high-numbered ports (2071, 2072) instead of 69.

Do you happen to have a firewall that blocks the incoming traffic? If you use a tftp client manually, can you retrieve that file?
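For the manual test, a minimal sketch using curl (which can speak TFTP when built with TFTP support) might look like the following. The function names tftp_url and fetch_boot_file and the host foreman.example.com are hypothetical placeholders, not anything from your setup.

```shell
#!/bin/sh
# Hypothetical helper: build the tftp:// URL for a file on a TFTP server.
# $1 = TFTP server (name or IP), $2 = file path relative to the TFTP root
tftp_url() {
    printf 'tftp://%s/%s\n' "$1" "$2"
}

# Hypothetical helper: download one file over TFTP with a hard timeout,
# so a blocked or NATed path fails quickly instead of hanging.
# $1 = TFTP server, $2 = remote file, $3 = local destination
fetch_boot_file() {
    curl --silent --show-error --max-time 15 \
         --output "$3" "$(tftp_url "$1" "$2")"
}
```

Run from a host in the affected subnet, for example `fetch_boot_file foreman.example.com pxelinux.0 /tmp/pxelinux.0 && ls -l /tmp/pxelinux.0`. A non-zero exit status means the transfer did not complete.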

I also had this checked with the network team, and the traffic is clear; no firewall is blocking it. Also, when I tried tftp from a host in the target subnet, I was able to download pxelinux.0 manually.

Just a wild guess, but maybe your new VM is using the wrong VLAN? The only cases where I have seen this kind of error was when either

  • the firewall blocked the traffic or
  • the hosts had no working network connectivity at all due to using the wrong VLAN

In VMware environments, the typical error is to have the wrong VLAN selected for the interface; for physical servers, this usually means the hardware uses no native/default VLAN ID or an incorrect one.

We checked the firewall and it is disabled. We also ensured SELinux is disabled.
When I use hammer commands to create the VM, it is created with the right config (VLAN, disk, and memory/CPUs). An IP is also assigned, so DHCP is working fine at this point, but after that stage it gives up at TFTP boot.

Do I need to do any additional configuration on the Foreman side? The only difference between the existing VLANs and the new one is that all existing VLANs are set up in one region (US) along with Foreman, but the new VLAN and vCenter are in a different region (UK).

You should not need to configure anything additional on the Foreman side if you configured the subnet analogously to the working ones. You could double-check that the correct server is set as TFTP Proxy on the affected subnet.
If that is correct, you could check the TFTP server logs to see whether there are any errors, or whether you can at least see the hosts in question connecting. AFAIK, tftp is handled by xinetd by default, so journalctl -u xinetd should give you those logs.
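To narrow those logs down to one host, a small sketch like this could help. The function name filter_client is hypothetical, and whether the tftp lines show up under the xinetd unit on your system is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: keep only the log lines on stdin that mention a
# given client IP. -F treats the IP as a fixed string (dots are literal)
# and -w requires a whole-word match, so 192.0.2.10 does not also match
# 192.0.2.100.
# $1 = client IP address
filter_client() {
    grep -wF -- "$1"
}
```

For example: `journalctl -u xinetd --since "15 minutes ago" --no-pager | filter_client 10.xx.xx.xx`, with the real client address substituted.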

I think I’ve heard @lzap complain that tftp only works well in local networks and can have trouble with high latency. Perhaps that’s a problem here?

Only on EL7. I remember finding out around EL 7.2 or 7.3 that there’s also a systemd socket activated daemon but never found the time for a migration. All other platforms already use a separate service. When we drop EL7, we’ll get rid of xinetd altogether.

High latency can lead to transfer errors. What is perhaps more relevant is NAT. Do you happen to have NAT? TFTP is UDP-based and will not pass through NAT on its own: the request gets through, but the response never reaches the client because the NAT filters it out. You need to have the connection tracker enabled for that on your firewall.
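The high ports seen in the capture fit how TFTP works: only the initial request goes to UDP port 69, and the server answers from a newly chosen port, which is why a plain NAT cannot match the reply to the request. To watch the whole exchange, it helps to filter on the two hosts rather than on port 69. A minimal sketch, where the function name capture_filter and the interface eth0 are assumptions:

```shell
#!/bin/sh
# Hypothetical helper: build a tcpdump/pcap filter that matches all UDP
# traffic between a PXE client and the TFTP server, regardless of port.
# $1 = client IP, $2 = TFTP server IP
capture_filter() {
    printf 'udp and host %s and host %s\n' "$1" "$2"
}
```

Usage on the server or on the NAT router: `tcpdump -ni eth0 "$(capture_filter 10.xx.xx.xx 192.0.2.1)"`, with real addresses substituted. If the request on port 69 shows up but the data packets from the server's high port never appear on the client side, the replies are being dropped on the way back.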

Thank you @lzap and @areyus for your suggestions.
Do I need to do this on the Foreman host side, or should I work with the network team to enable these NAT settings on the target subnet/VLAN?

You need to enable TFTP connection tracking on your NAT router. If that is a Linux box, then something like:

# cat /etc/sysconfig/modules/foreman.modules
modprobe nf_nat_tftp
modprobe nf_conntrack_tftp
exit 0
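After loading the modules, you can verify that both are present. This is only a sketch; the function name helpers_loaded is hypothetical, and the iptables rule mentioned in the comment is an assumption about your router (it applies to kernels where automatic helper assignment is switched off).

```shell
#!/bin/sh
# Hypothetical helper: read `lsmod`-style output on stdin and succeed
# only if both TFTP connection tracking modules are listed.
helpers_loaded() {
    input=$(cat)
    for mod in nf_conntrack_tftp nf_nat_tftp; do
        # lsmod prints the module name at the start of the line,
        # followed by whitespace.
        printf '%s\n' "$input" | grep -q "^$mod " || return 1
    done
    return 0
}

# Assumption: on kernels where conntrack helpers are not assigned
# automatically, an explicit rule may also be needed, for example:
#   iptables -t raw -A PREROUTING -p udp --dport 69 -j CT --helper tftp
```

Usage: `lsmod | helpers_loaded && echo "TFTP conntrack helpers loaded"`.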