I would like to propose a proof-of-concept for better SecureBoot support in Foreman.
SecureBoot expects to follow a chain of trust from the start of the host to the loading of Linux kernel modules. The very first shim that is loaded basically determines which distribution is allowed to be booted or kexec’ed until next reboot.
We assume host systems with enabled SecureBoot (user mode) and default MS certificates in db.
The existing “Grub2 UEFI SecureBoot” loader is not sufficient, as it limits provisioning to the vendor of the Foreman host system. Simply put, if your Foreman/Smart Proxy host operating system is, for example, CentOS, you can only provision CentOS to your hosts (see https://github.com/theforeman/puppet-foreman_proxy/blob/master/manifests/tftp/netboot.pp).
Example (as of today):
Foreman Smart Proxy host OS is CentOS7
shim in (/var/lib/tftpboot/grub2/shimx64.efi) is signed by MS and contains CentOS certificate (vendor key)
GRUB2 in (/var/lib/tftpboot/grub2/grubx64.efi) is signed by CentOS
a host fetches shim, shim fetches GRUB2, GRUB2 fetches its configuration instructing GRUB2 to boot Ubuntu 22.04 installation kernel
GRUB2 fetches kernel+initrd
GRUB2 calls shim for signature verification
booting of Ubuntu 22.04 installation kernel fails with “Invalid signature”
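To make the failure above concrete, here is a deliberately simplified toy model of the check, not how shim is actually implemented. It assumes shim accepts only binaries signed by its embedded vendor key; real shim also consults the firmware db and MOK entries. The function name and the string labels are made up for illustration.

```python
def kernel_boots(shim_vendor_cert, grub_signer, kernel_signer):
    """Toy model: everything loaded after shim must match shim's vendor key."""
    trusted = {shim_vendor_cert}  # assumption: db and MOK entries are ignored here
    if grub_signer not in trusted:
        return False  # shim would refuse to load GRUB2 at all
    # GRUB2 hands the kernel back to shim for verification
    return kernel_signer in trusted


# CentOS shim + CentOS GRUB2 trying to boot an Ubuntu-signed kernel
print(kernel_boots("CentOS", "CentOS", "Ubuntu"))
```

With a CentOS shim the Ubuntu kernel is rejected, which is the “Invalid signature” case from the list above.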
We could solve this with a more fine-grained way to specify a host’s Network Bootstrap Program (NBP) and its provisioning. Foreman knows the to-be-installed OS at the moment a host goes into build mode. With this information, Foreman can prepare and provide all required files on the TFTP server and create the corresponding DHCP configuration.
The following patchset is a proof-of-concept:
To get this POC working, you need to provide all shim and GRUB2 binaries manually under /usr/local/share/bootloader-universe/<os>/. Don’t forget to set read permissions for the foreman-proxy user. You also need to set SELinux to permissive when testing.
Example (working):
Foreman host OS is CentOS7
Start Ubuntu 22.04 provisioning for a host
host enters build mode, shim+GRUB2 are copied from /usr/local/share/bootloader-universe/ubuntu/ to /var/lib/tftpboot/grub2/00-11-22-33-44-55, shim is signed by MS and contains Ubuntu certificate (vendor key)
GRUB2 in (/var/lib/tftpboot/grub2/00-11-22-33-44-55/grubx64.efi) is signed by Ubuntu
host fetches shim, shim fetches GRUB2 (from same directory path), GRUB2 fetches its configuration (from same directory path: $prefix) instructing GRUB2 to boot Ubuntu 22.04 installation kernel
GRUB2 fetches kernel+initrd
GRUB2 calls shim for signature verification
booting of Ubuntu 22.04 installation kernel succeeds
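The copy step in the working example could be sketched roughly like this. It is only an illustration in Python, not the code from the patchset (the Smart Proxy itself is Ruby). The function name and its parameters are hypothetical, and the dash-separated MAC directory name simply follows the path shown in the example.

```python
import shutil
from pathlib import Path


def stage_bootloaders(os_name, mac, source_root, tftp_root):
    """Copy all bootloader files for os_name into a per-host TFTP directory."""
    src = Path(source_root) / os_name
    # 00:11:22:33:44:55 -> 00-11-22-33-44-55, as in the example path
    host_dir = Path(tftp_root) / "grub2" / mac.lower().replace(":", "-")
    host_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for binary in sorted(src.iterdir()):
        if binary.is_file():
            shutil.copy2(binary, host_dir / binary.name)
            copied.append(binary.name)
    return host_dir, copied
```

Called with `"ubuntu"`, the host MAC, `/usr/local/share/bootloader-universe` and `/var/lib/tftpboot`, this would place shim and GRUB2 side by side so that shim finds GRUB2 in its own directory.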
To-be-implemented:
auto-extract of binaries out of corresponding installation media
SELinux policy to allow file creation
DELETE method for Smart Proxy
various checks and code improvements
…
Considerations:
always boot from network: shim chainloading does not work (network shim chainloads disk shim), but network shim chainloading GRUB2 from disk works
some distributions’ GRUB2 builds only try to fetch grub.cfg (and not grub.cfg-<mac>) from $prefix (here TFTP), therefore we provide the same GRUB2 configuration in three different files
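As a rough sketch of the last point, the same configuration text can be written under several file names in the host directory. The helper below is hypothetical, and the exact file names are left to the caller on purpose, since the MAC format in grub.cfg-<mac> differs between GRUB2 builds.

```python
from pathlib import Path


def write_grub_configs(host_dir, config_text, names):
    """Write identical GRUB2 configuration under every given file name."""
    host_dir = Path(host_dir)
    host_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name in names:
        target = host_dir / name
        target.write_text(config_text)
        written.append(target)
    return written
```

For example, `names` could be `["grub.cfg", "grub.cfg-00-11-22-33-44-55"]` plus whatever third variant the target GRUB2 asks for; the dash-separated MAC here is only an assumption.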
What do you guys think? Would this be a useful feature for Foreman?
Understandable. I am also not a fan of hardware I own that might stop booting some day just because MS dictates it. However, having more control over what a particular host initially gets when booting from network could also help for setups using different sets of custom keys.
Hi,
Thanks for the RFC. The summary with examples, POC PRs and lists of things to consider made it really easy to read and follow. I’m still kinda new to provisioning, so my knowledge is still limited in some areas, but I will happily assist you with testing or code reviews, feel free to ping me.
A few days back we had a discussion with other devs about the status of loaders in Foreman.
The list of all available loaders is quite long:
PXELinux BIOS
PXELinux UEFI
Grub UEFI
Grub2 BIOS
Grub2 ELF
Grub2 UEFI
Grub2 UEFI SecureBoot
Grub2 UEFI HTTP
Grub2 UEFI HTTPS
Grub2 UEFI HTTPS SecureBoot
iPXE Embedded
iPXE UEFI HTTP
iPXE Chain BIOS
iPXE Chain UEFI
As you can see, there are a lot of them. This brings several issues:
Nobody knows which ones actually work (by default, without additional configuration)
Some combinations are not recommended and maybe should not be there at all
Some of them are not used (?)
No resources to maintain them all
So, back to your RFC. I’m not saying we shouldn’t try your approach, or that we should skip it for now, but I would rather first clean up the current loaders and then think about adding new ones, including yours.
I’m planning to create an RFC where I’ll ask the community which loaders they are using, so I have some data to decide on. Then we will do the cleanup (with templates too).
Thanks for your offer regarding testing and code reviews.
Let me add my thoughts to the mentioned issues:
Nobody knows which ones actually work (by default, without additional configuration)
That’s right, at least not for all of them. For me (as an engineer) things become clearer when looking at the loader mapping in the code, but that’s nothing an end user does.
Some combinations are not recommended and maybe should not be there at all
Yes and no. Typically you set the loader in the host group. Doing that, you already know where a host will be deployed (compute resource + compute profile) and which OS is used. This alone could already limit the selection:
Example 1: Create a host group using VMware (compute resource) + EFI firmware (compute profile) + Ubuntu 22.04 (OS) → only Grub2 UEFI (HTTP/HTTPS) would make sense here.
But that’s not always the case:
Example 2: Use the host group from above, but use it for a bare-metal host → Grub2 UEFI (HTTP/HTTPS) and Grub2 BIOS and PXELinux BIOS would be valid options. At this point Foreman does not know the actual firmware of the host.
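The two examples could be expressed as a small filter. This is only a sketch of the idea, not existing Foreman code. The mapping covers just the five loaders named in the examples, reading “Grub2 UEFI (HTTP/HTTPS)” as the three Grub2 UEFI variants, and the function name is made up.

```python
# Loaders mentioned in the two examples, keyed by the firmware they need.
LOADER_FIRMWARE = {
    "PXELinux BIOS": "bios",
    "Grub2 BIOS": "bios",
    "Grub2 UEFI": "uefi",
    "Grub2 UEFI HTTP": "uefi",
    "Grub2 UEFI HTTPS": "uefi",
}


def candidate_loaders(firmware=None):
    """Return loaders valid for the firmware, or all of them if it is unknown."""
    if firmware is None:
        # bare-metal case: Foreman does not know the firmware yet
        return sorted(LOADER_FIRMWARE)
    return sorted(name for name, fw in LOADER_FIRMWARE.items() if fw == firmware)


print(candidate_loaders("uefi"))  # Example 1: firmware known from the compute profile
print(candidate_loaders())        # Example 2: firmware unknown
```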
Some of them are not used (?)
From my professional experience and the netboot manifest I could directly do without these:
Grub UEFI (deprecated)
Grub2 ELF (deprecated)
Grub2 UEFI SecureBoot (incl. HTTPS) (of only very limited use)
No resources to maintain them all
Are the existing ones that expensive to maintain? Most of the stuff has existed for 10+ years and is quite settled, isn’t it? Only this SecureBoot stuff is a bit more challenging.
As always, it comes down to how much complexity you want to hide from the end user and how much technical knowledge you assume. And we need to take care not to exclude options that work for someone we do not have in mind.
Sure, please go for it and let me know if I can support here! I think my colleagues and I can also provide some useful data here.
Created a Redmine issue for it and added it to our backlog; this is something that we should show to users.
Yes and no. Typically you set the loader in the host group. Doing that, you already know where a host will be deployed (compute resource + compute profile) and which OS is used.
Good point, this needs to be taken into consideration.
Are the existing ones that expensive to maintain? Most of the stuff has existed for 10+ years and is quite settled, isn’t it? Only this SecureBoot stuff is a bit more challenging.
Yeah, I didn’t phrase it properly. Maintenance is almost none, but bug fixing and investigating customer cases is expensive and takes most of the time.
And we need to take care not to exclude options that work for someone we do not have in mind.
For that I was thinking about *sigh* another RFC, where I would like to suggest a new approach to deprecating templates and boot loaders. Instead of deleting them, we could move them to a new folder called deprecated and just remove them from the UI and the seed file. Or we could add a new deprecated flag and just hide them in the UI.
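The flag variant could look roughly like this. It is a sketch only; the `Loader` class and `visible_loaders` are hypothetical names and not part of Foreman, and which loaders get the flag is just taken from the suggestions earlier in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loader:
    name: str
    deprecated: bool = False


def visible_loaders(loaders, include_deprecated=False):
    """Names to show in the UI; deprecated loaders stay stored but hidden."""
    return [l.name for l in loaders if include_deprecated or not l.deprecated]


loaders = [
    Loader("Grub2 UEFI"),
    Loader("Grub UEFI", deprecated=True),
    Loader("Grub2 ELF", deprecated=True),
]
print(visible_loaders(loaders))
```

Hosts that already use a hidden loader would keep working, since nothing is deleted.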
Sure, please go for it and let me know if I can support here! I think my colleagues and I can also provide some useful data here.
In the upcoming weeks I’m flying to India to meet our colleagues and talk with them about Foreman and provisioning, so I won’t be available much. I will also have some PTO after that, so I apologize in advance if my responses take some time. It’s not that I forgot, I just have a lot of stuff to do right now.
I really like your approach. I had a similar issue with SLES and secureboot with no good solution so far.
From what I understand this RFC would be exactly what I need to make some progress with my issue.