Hey Guys,
we are trying to set up a diskless compute setup, where a portion of our servers would netboot a complete OS on every single boot.
I found an older thread with a similar question (There’s a way to provision diskless hosts?) - @lzap commented this should be doable by modifying PXE templates. I would assume the template to modify would be the “iPXE default local boot”, as I guess the host wouldn’t be in build-mode all the time. Can you give any further hints on this? My initial thought would be creating a similar template, checking if the host in question belongs to our netbooted clusters, then just doing the usual iPXE netboot + CloudInit magic, does that sound half-correct?
I assume your problem is that the default PXE template boots from the local disk, but you want some group of servers to boot from the network with some extra arguments.
Well, I looked into the codebase and the @host variable should be available in the local boot templates, so you can basically use the hostgroup name or some host/hostgroup parameter to find out whether a host should boot from the network. Modify the PXELinux, Grub2 or iPXE template accordingly.
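For illustration, a minimal sketch of what such a modified iPXE local boot template might look like. This is not the snippet originally posted: the parameter name `netboot_diskless`, the file paths and the kernel arguments are all made-up placeholders, and `exit` stands in for whatever local boot command your existing template uses.

```erb
#!ipxe
<%# "netboot_diskless" is a hypothetical host/hostgroup parameter, pick your own name -%>
<% if host_param_true?('netboot_diskless') -%>
# Diskless host: fetch kernel and initrd over the network on every boot.
# Paths and kernel arguments are placeholders, not Foreman defaults.
kernel tftp://${next-server}/diskless/vmlinuz initrd=initrd.img ip=dhcp
initrd tftp://${next-server}/diskless/initrd.img
boot
<% else -%>
# Every other host: leave iPXE and let the firmware continue with the local disk.
exit
<% end -%>
```

Setting the parameter on a hostgroup rather than on individual hosts should make every member of that group netboot, since host parameters are inherited.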
Then all hosts which exit build mode (or never enter it) are deployed with these templates and will boot from HDD or network depending on how you set it up.
Alternatively, you can create some rules that will be executed in the bootloader based on hostname, IP/subnet, BIOS/EFI serial names or MAC addresses and boot either from HDD or network. You can’t do this with PXELinux, but both Grub2 and iPXE have pretty powerful scripting languages you can use.
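As a rough sketch of the bootloader-side variant in plain iPXE scripting: the MAC address, the hostname and the chained script URL below are invented examples, so treat this as a starting point under those assumptions rather than a tested setup.

```ipxe
#!ipxe
# Hypothetical MAC address and hostname; replace with values from your own hosts.
# The trailing "||" keeps the script running when a comparison fails.
iseq ${net0/mac} 52:54:00:aa:bb:cc && goto netboot ||
iseq ${hostname} gpu-node-01 && goto netboot ||
goto localboot

:netboot
# Chain into a (hypothetical) script that loads the diskless kernel and initrd;
# fall back to the local disk if it cannot be fetched.
chain tftp://${next-server}/diskless/boot.ipxe || goto localboot

:localboot
# Leave iPXE and let the firmware continue with the next boot device.
exit
```

The drawback compared to the template approach is that the rules live in the script itself, so Foreman has no idea which hosts are netbooted.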
In any case, get back to us with your solution so others can leverage what you find out!
If you find a way to properly do this, please report it back here. I wasn’t able to make it work in the past. I confess that I didn’t try that hard because I didn’t have the time required, but as @lzap stated it should be doable.
So, I can confirm that @lzap is right and you can just modify this file and set it as a new default. I used <% if host_param_true?('..') %> to check if my host is to be netbooted. To work around Foreman wanting to proxy all OS images and such, we currently just copy them to the tftp-directory and hard-link them in the template, but this is probably not the most elegant solution.
Feel free to make a PR into the template, find a nicely named parameter and provide a good example so other users can draw inspiration from it.
Can you elaborate?
Foreman currently downloads PXE images after a host is created. We could probably create a webhook that also initiates the download when an OS is modified and when Katello synchronizes or publishes content.
So you basically have a switch statement on the host name that decides which image to pull?
The server already has to be provisioned, but in the boot process it might end up with a totally different OS?
My plan is basically this:
Let's say I have both AMD and NVIDIA GPUs.
This image might then come with kernel headers for the drivers, but without the persistent storage that holds those drivers we wouldn't be able to use them? → So you have to select the right snapshot/storage along with the image and kernel headers?
I switch from NVIDIA to AMD, select a kernel with the proper headers and the storage that comes with the drivers.
On another machine I also want to switch, but this machine requires PyTorch with CUDA etc…
iSCSI would possibly work, but it would take some time to make the snapshots and copy them (so I guess the m2 approach is not really for me).
But can't we just use the diskless approach mentioned in this thread and combine it with ZFS as storage, including repository storage for large repos?
I think I'll start by implementing ZFS repo storage on a freshly provisioned machine.
If that works, I'll try to implement diskless boot, and then I'll try to pass some arguments to select the repos that are used through ZFS repo storage.
But I still don't understand how to get the image from the TFTP server using the PXE template.
I guess I have to add something like this in my dhcp.conf?