While we are working on improving naming conventions for TFTP boot files (https://github.com/theforeman/foreman/pull/5244) I have a complementary proposal. I hit this again when I was testing Atomic provisioning with 1.18 RC1 to verify one issue we have in kickstart repos.
Problems:
- The OS installer can end up downloading a file that wget is still writing in parallel, which leads to corruption (symptoms range from “Pane is dead” to the XFS module failing to load and kernel panics)
- The `-c` flag corrupts files when the installation media URL changes or there is an update (kickstart repo “respin”)
Solution to both problems:
The proposed workflow of the `/tftp/fetch_boot_file` proxy API endpoint, with example parameters `http://server/blah/7.5/vmlinuz` and `Redhat-7.5-vmlinuz`:
- Delete the existing file/symlink to prevent a race condition: `rm Redhat-7.5-vmlinuz`
- Download into a separate directory: `mkdir -p Redhat-7.5-vmlinuz-orig; wget -N -P Redhat-7.5-vmlinuz-orig http://server/blah/7.5/vmlinuz`
- Create a symlink after the download is done: `ln -s Redhat-7.5-vmlinuz-orig/vmlinuz Redhat-7.5-vmlinuz`
- If the download is slower than the booting host, PXELinux will keep retrying until the file appears
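The steps above could be sketched as a small shell function. This is only an illustration, not the smart proxy implementation: `fetch_boot_file` is a hypothetical helper name, and the sketch assumes it runs inside the TFTP boot directory, that the target is a bare filename, and that wget saves the download under the last path component of the URL.

```shell
# Hypothetical helper illustrating the proposed workflow.
# Usage: fetch_boot_file URL TARGET  (run from inside the TFTP boot directory)
fetch_boot_file() {
    url=$1
    target=$2
    workdir="${target}-orig"   # per-file working directory with -orig suffix

    # 1. Remove the published file/symlink first, so a booting host never
    #    picks up a file that is still being rewritten.
    rm -f "$target"

    # 2. Download into the working directory. -N skips the transfer when the
    #    local copy is already up to date; -P sets the directory to save into.
    mkdir -p "$workdir" || return 1
    wget -N -P "$workdir" "$url" || return 1

    # 3. Publish through a relative symlink only after the download succeeded.
    ln -s "$workdir/$(basename "$url")" "$target"
}
```

With the example parameters from this post, the call would be `fetch_boot_file http://server/blah/7.5/vmlinuz Redhat-7.5-vmlinuz`. If the download fails, no symlink is created, so the booting host sees a missing file rather than a partial one.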
There are two changes worth noting. First, I propose to drop `-c` and replace it with `-N`. The former means “continue a partially downloaded file”, which is the source of the problem; the latter turns on “don’t download a file with the same timestamp”, which is what we wanted from the beginning. Kernel and initramdisk files are relatively small, so the continue flag makes little sense compared to what it can break. On the other hand, we should not download them over and over again when nothing has changed, and `-N` delivers exactly that.
Second, the `-N` flag cannot be used with `-O` (output file), so to get the correct behavior from wget the proposal creates an extra directory with an `-orig` suffix for each file and uses it as the working directory for the download. I looked for other tools, such as `curl`, but they all seem to behave similarly. Creating an extra directory for each TFTP file is not a big issue.
The final file is a relative symlink into that directory. We need to delete it before each redownload attempt so we can be sure that PXELinux won’t pick the file up while it is being redownloaded.
The important thing is that nothing changes in the API. Everything works with or without the changes that are currently being worked on, and it will also work if we decide to refactor PXE file handling completely in core.
Finally, this will work out of the box with the current Foreman implementation as well as with @Shimon_Shtein’s PR, which adds the ability to override TFTP filename conventions.