Add SecureBoot support for arbitrary distributions

Hi folks,

I would like to propose a proof-of-concept for better SecureBoot support in Foreman.

SecureBoot expects to follow a chain of trust from the start of the host to the loading of Linux kernel modules. The very first shim that is loaded basically determines which distribution is allowed to be booted or kexec’ed until next reboot.

We assume host systems with enabled SecureBoot (user mode) and default MS certificates in db.

The existing “Grub2 UEFI SecureBoot” loader is not sufficient, as it limits provisioning to the vendor of the Foreman host system. Simply speaking, if your Foreman/Smart Proxy host operating system is, for example, CentOS, you can only provision CentOS to your hosts (see https://github.com/theforeman/puppet-foreman_proxy/blob/master/manifests/tftp/netboot.pp).

Example (as of today):

  • Foreman Smart Proxy host OS is CentOS7
  • shim in (/var/lib/tftpboot/grub2/shimx64.efi) is signed by MS and contains CentOS certificate (vendor key)
  • GRUB2 in (/var/lib/tftpboot/grub2/grubx64.efi) is signed by CentOS
  • a host fetches shim, shim fetches GRUB2, GRUB2 fetches its configuration instructing GRUB2 to boot Ubuntu 22.04 installation kernel
  • GRUB2 fetches kernel+initrd
  • GRUB2 calls shim for signature verification
  • booting of Ubuntu 22.04 installation kernel fails with “Invalid signature”

We could solve this with a more fine-grained way to specify a host’s Network Bootstrap Program (NBP) and its provisioning. Foreman knows the to-be-installed OS at the moment a host goes into build mode. With this knowledge, Foreman can prepare and provide all required files on the TFTP server and create the corresponding DHCP configuration.

The following patchset is a proof-of-concept:

To make this POC work, you need to provide all shim and GRUB2 binaries manually under /usr/local/share/bootloader-universe/<os>/. Don’t forget to set read permissions for the foreman-proxy user. You need to set SELinux to permissive when testing.

Example (working):

  • Foreman host OS is CentOS7
  • Start Ubuntu 22.04 provisioning for a host
  • host enters build mode, shim+GRUB2 are copied from /usr/local/share/bootloader-universe/ubuntu/ to /var/lib/tftpboot/grub2/00-11-22-33-44-55, shim is signed by MS and contains Ubuntu certificate (vendor key)
  • GRUB2 in (/var/lib/tftpboot/grub2/00-11-22-33-44-55/grubx64.efi) is signed by Ubuntu
  • host fetches shim, shim fetches GRUB2 (from same directory path), GRUB2 fetches its configuration (from same directory path: $prefix) instructing GRUB2 to boot Ubuntu 22.04 installation kernel
  • GRUB2 fetches kernel+initrd
  • GRUB2 calls shim for signature verification
  • booting of Ubuntu 22.04 installation kernel succeeds
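
The copy step in the working example can be sketched roughly as follows. This is only an illustration, not the code from the patchset: the function names are hypothetical, the two default paths are the ones quoted in the example, and upper/lower-case handling of the MAC is deliberately left open.

```python
import shutil
from pathlib import Path

# Default locations taken from the example above.
BOOTLOADER_UNIVERSE = Path("/usr/local/share/bootloader-universe")
TFTP_GRUB2_ROOT = Path("/var/lib/tftpboot/grub2")


def mac_to_dirname(mac: str) -> str:
    """Turn 00:11:22:33:44:55 into 00-11-22-33-44-55 (hypothetical helper)."""
    return mac.strip().replace(":", "-")


def stage_bootloader(os_name: str, mac: str,
                     universe: Path = BOOTLOADER_UNIVERSE,
                     tftp_root: Path = TFTP_GRUB2_ROOT) -> Path:
    """Copy all files of <universe>/<os_name>/ into <tftp_root>/<mac-dir>/."""
    source = universe / os_name
    if not source.is_dir():
        raise FileNotFoundError(f"no bootloader set for {os_name!r} in {universe}")
    target = tftp_root / mac_to_dirname(mac)
    target.mkdir(parents=True, exist_ok=True)
    for binary in sorted(source.iterdir()):
        if binary.is_file():
            # copy2 keeps the mode bits, so files readable in the source
            # directory stay readable in the TFTP directory
            shutil.copy2(binary, target / binary.name)
    return target
```

Calling `stage_bootloader("ubuntu", "00:11:22:33:44:55")` would then place shim and GRUB2 side by side in the host’s own directory, which is what the shim expects when it looks for GRUB2.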

To-be-implemented:

  • auto-extract of binaries out of corresponding installation media
  • SELinux policy to allow file creation
  • DELETE method for Smart Proxy
  • various checks and code improvements

Considerations:

  • always boot from network: shim chainloading does not work (network shim chainloads disk shim), but network shim chainloading GRUB2 from disk works
  • the GRUB2 of some distributions only tries to fetch grub.cfg (and not grub.cfg-<mac>) from $prefix (here TFTP), therefore we provide the same GRUB2 configuration in three different files

What do you guys think? Would this be a useful feature for Foreman?

is signed by MS

Why I refuse to support SecureBoot at all at our org.

Understandable. I am also not a friend of hardware I own that might not boot anymore at some day just because MS dictates it. However, having more control over what a particular host initially gets when booting from network could also help for setups using different sets of custom keys.

I haven’t had the requirement yet, but sounds like a great way forward if the manual steps can be solved.

Hi,
thanks for the RFC, the summary with examples, POC PRs and lists with things to consider made it really easy to read and follow. I’m still kinda new to the provisioning, so my knowledge is still limited in some areas, but I will happily assist you with testing or code reviews, feel free to ping me.

A few days back we had a discussion with other devs about the status of loaders in Foreman.
The list of all available loaders is quite long:

  • PXELinux BIOS
  • PXELinux UEFI
  • Grub UEFI
  • Grub2 BIOS
  • Grub2 ELF
  • Grub2 UEFI
  • Grub2 UEFI SecureBoot
  • Grub2 UEFI HTTP
  • Grub2 UEFI HTTPS
  • Grub2 UEFI HTTPS SecureBoot
  • iPXE Embedded
  • iPXE UEFI HTTP
  • iPXE Chain BIOS
  • iPXE Chain UEFI

As you can see, there are a lot of them. This brings several issues:

  • Nobody knows which ones actually work (by default, without additional configuration)
  • Some combinations are not recommended and maybe should not be there at all
  • Some of them are not used (?)
  • No resources to maintain them all

So, back to your RFC. I’m not saying we shouldn’t try your approach, or skip it for now, but I would rather first do the cleanup of current loaders and then think about adding new ones - yours.

I’m planning to create an RFC where I’ll ask the community which loaders they are using, so I have some data to base the decision on. Then we will do the cleanup (including templates).

Thanks for your offer regarding testing and code reviews.

Let me add my thoughts to the mentioned issues:

  • Nobody knows which ones actually work (by default, without additional configuration)

That’s right, at least not for all of them. For me (as an engineer) things become clearer when looking at the loader mapping in the code, but that is not something an end user does.

  • Some combinations are not recommended and maybe should not be there at all

Yes and no. Typically you set the loader in the host group. Doing that, you already know where a host will be deployed (compute resource + compute profile) and which OS is used. Having this, this could already limit the selection:

Example 1: Create a host group using VMware (compute resource) + EFI firmware (compute profile) + Ubuntu 22.04 (OS) → only Grub2 UEFI (HTTP/HTTPS) would make sense here.

But that’s not always the case:

Example 2: Use the host group from above, but use it for a bare-metal host → Grub2 UEFI (HTTP/HTTPS) and Grub2 BIOS and PXELinux BIOS would be valid options. At this point Foreman does not know the actual firmware of the host.

  • Some of them are not used (?)

From my professional experience and the netboot manifest I could directly do without these:

  • Grub UEFI (deprecated)
  • Grub2 ELF (deprecated)
  • Grub2 UEFI SecureBoot (incl. HTTPS) (only very limited usable)

  • No resources to maintain them all

Are the existing ones that expensive in maintenance? Most of the stuff exists since 10+ years and is quite settled, isn’t it? Only this SecureBoot stuff is a bit more challenging.

It’s like always: how much complexity you want to hide from the end-user and how much technical knowledge is assumed from the user. And we need to take care to not exclude options which are working for the one or other that is not on our mind.

Sure, please go for it and let me know if I can support here! I think my colleagues and I can also provide some useful data here.

Created redmine for it & added to our backlog, this is something that we should show to users.

Yes and no. Typically you set the loader in the host group. Doing that, you already know where a host will be deployed (compute resource + compute profile) and which OS is used.

Good point, this needs to be taken into consideration.

Are the existing ones that expensive in maintenance? Most of the stuff exists since 10+ years and is quite settled, isn’t it? Only this SecureBoot stuff is a bit more challenging.

Yeah, I didn’t phrase it properly: maintenance is almost none, but bug fixing and the investigation of customer cases are time-consuming and take most of the time.

And we need to take care to not exclude options which are working for the one or other that is not on our mind.

For that I was thinking about *sigh* another RFC, where I would like to suggest a new approach to deprecating templates and boot loaders. Instead of deleting them, we could move them to a new folder called deprecated and just remove them from the UI & seed file. Or we could add a new flag, deprecated, and just hide them in the UI.

Sure, please go for it and let me know if I can support here! I think my colleagues and I can also provide some useful data here.

In the upcoming weeks I’m flying to India to meet our colleagues and talk with them about Foreman and provisioning, so I won’t be available much, and I will have some PTO after that, so I apologize in advance if my responses take some time. It’s not that I forgot, I just have a lot of stuff to do right now :smiley:

I really like your approach. I had a similar issue with SLES and secureboot with no good solution so far.
From what I understand this RFC would be exactly what I need to make some progress with my issue.

Update:

PRs are open with refactored code (thanks to @goarsna) and ready for review:

Hey, I understand that SecureBoot support for non-RH systems is poor to non-existent in Foreman. This workaround seems a little too complex tho, there is no need to chainboot grub2 twice when DHCP is under Foreman’s control. When the DHCP reservation is created, why don’t you modify the “filename” option to already contain the MAC within the path? Then all you need is to put the signed bootloader into the directory.

Unfortunately, this does not solve the huge pain of acquiring the bootloader files themselves. In case of Red Hat, they are not part of the kickstart repository which is required in order to download PXE files. They are, however, present in a RPM file (grub2-signed.rpm I think) so there could be a way to download them and extract them. And I believe something similar would be possible for Debian/Ubuntu.

This is an extremely hard problem to crack, a real pain point in the Foreman provisioning PXE workflow. An ideal case would be:

  • When a new host is created, shim+bootloader, kernel/initramdisk are all downloaded onto TFTP server.
  • Each host has its own directory unique by MAC name.
  • There is some kind of cache implemented so downloads are fast once downloaded for the first time.
  • DHCP entry is created pointing to the correct directory/bootloader.
  • Deleting a host also deletes the entry.
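
A rough sketch of how the cache, the per-host directory and the cleanup from this list could fit together. All function names are hypothetical and this is not taken from any existing Foreman or Smart Proxy code; the `grub2/<mac>/` layout simply follows the paths used earlier in this thread.

```python
import os
import shutil
from pathlib import Path


def link_from_cache(cached_file: Path, host_dir: Path) -> Path:
    """Make an already downloaded file available in a host directory."""
    host_dir.mkdir(parents=True, exist_ok=True)
    target = host_dir / cached_file.name
    target.unlink(missing_ok=True)         # replace a stale entry, if any
    try:
        os.link(cached_file, target)       # hard link: no second copy on disk
    except OSError:
        shutil.copy2(cached_file, target)  # fallback, e.g. different filesystem
    return target


def dhcp_filename(mac_dir: str, loader: str = "shimx64.efi") -> str:
    """Value for the DHCP "filename" option, relative to the TFTP root."""
    return f"grub2/{mac_dir}/{loader}"


def remove_host_dir(host_dir: Path) -> None:
    """Delete everything staged for a host; a missing directory is fine."""
    shutil.rmtree(host_dir, ignore_errors=True)
```

Hard links keep the first download as the only copy on disk, while every host still gets a complete directory of its own that can be removed without touching the cache.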

The problem with this approach, however, is IPv6, where host configuration cannot be pre-allocated by MAC. I hope that once the HTTP UEFI Boot API is finally implemented for real by multiple vendors, Foreman could use Redfish to boot a unique URL that already contains the MAC address.

This is exactly what I was exploring with my provisioning service called Forester during the last year: https://foresterorg.github.io/ and I do not have a definitive answer. I am happy to see someone taking a stab at this.

Hi,

thanks for commenting but I need to clarify some things.

I don’t see where we are going to chainboot (chainload?) grub2 twice. The planned changes do exactly what you proposed: point the “filename” in DHCP for a particular host to shim binary path of the distribution which is going to be installed.

We decided to separate everything into per-host subdirectories (the directory name is the MAC) because of the subsequent loading of the GRUB2 binary and its config file(s). This varies a bit from distribution to distribution. However, a shim referenced as grub2/foobar/shimx64.efi will try to load the GRUB2 binary grub2/foobar/grubx64.efi (something hardcoded like dirname + /grubx64.efi)

Using the “filename” grub2/shimx64.efi-00:11:22:33:44:55 and putting grub2/grubx64.efi-00:11:22:33:44:55 next to it won’t work.
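
To illustrate why the suffix variant fails, here is a tiny model of the lookup as described above (directory of the shim plus a fixed GRUB2 file name). The function is hypothetical and only mirrors that description, it is not derived from the shim source code.

```python
import posixpath


def grub_path_for_shim(shim_path: str, grub_name: str = "grubx64.efi") -> str:
    """Path the shim is expected to request for GRUB2: same directory, fixed name."""
    return posixpath.join(posixpath.dirname(shim_path), grub_name)


# Per-host directory: GRUB2 is found next to the shim.
per_host = grub_path_for_shim("grub2/foobar/shimx64.efi")

# MAC as file name suffix: the suffix is lost, the shim asks for the shared file.
suffixed = grub_path_for_shim("grub2/shimx64.efi-00:11:22:33:44:55")
```

In the second case the request goes to grub2/grubx64.efi, so a file named grub2/grubx64.efi-00:11:22:33:44:55 would never be fetched.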

True. We deliberately decided to skip the automation for now and to provide a handy description of how to do this manually. I mean, the whole topic is quite experimental right now, and doing this once by hand would be OK for now. But that’s definitely something we can work on once initial SecureBoot support has been accepted upstream.

Your “ideal case” is already implemented (besides some minors like deletion and the automatic acquisition of the binaries).

Regarding IPv6 I have to admit that I’ve no experience at all ATM. Is Redfish still alive? I thought this was a stillborn (fun fact: ~10y ago I did some proof-of-concept regarding Redfish + libvirt) . Even if major HW vendors implement this, we would also need this on all supported hypervisors.

I will definitely have a look at this forester.

Anyway, what’s left to discuss? According to my understanding of your post we are on a good track (besides this IPv6 stuff where I need to catch up).

Ah, just a misunderstanding. The filename option on DHCP is a good way indeed.

This is a pretty big change, but with the automatic acquisition it would all make sense. It would help to solve one big problem: currently Foreman installs grub from the underlying OS, which is sometimes incompatible with the Red Hat kernel being deployed. This new feature would help to mitigate this.

Well, yeah, DHCPv6 is a problem. There is no way to create host reservation via MAC address, only via DUID which is random. There are some extensions for MAC-based uids but I never got them working.

Alive indeed, problem is that most vendors support minimum relevant features. Most of the fancy stuff, like “boot from HTTP just once” or “boot this ISO file” are only implemented by few. As much as I would love to, Redfish (and EFI) does not really help to solve problems we are facing in datacenters. It is somewhat similar to IPv6. Things are utterly complex.

Well, we haven’t touched how the PXE workflow works since the very beginning. If you want to pursue this idea, which I believe is on a good track, it is important to make sure this works 100% on all major distributions including Fedora/RH, Debian/Ubuntu and SUSE, to name a few. A good amount of tests would be necessary.

There is one more thing: when a host is created, the smart proxy downloads PXE files via curl. Every time those PXE files change (e.g. when you use Fedora Rawhide), curl corrupts those files. Since the bootloader will no longer be shared (it is unique per MAC), it would make a lot of sense to also refactor the PXE file download so that files are provided per host. To do this efficiently, there must be some kind of cache so files are only downloaded once and then reused. This would be useful for bootloader files as well.
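
One common way to keep a changed file from being served half-written is to write it to a temporary file in the same directory and rename it into place. The sketch below shows that pattern in general; it is an assumption about how such a refactoring could look, not a description of how Smart Proxy downloads files today, and the function name is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def write_atomically(target: Path, data: bytes, mode: int = 0o644) -> None:
    """Write data next to target, then rename it over target in one step."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # Same directory as the target, so the final rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, prefix=target.name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.chmod(tmp_name, mode)       # mkstemp creates the file with mode 0600
        os.replace(tmp_name, target)   # readers see the old or the new file
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A cache built on top of this would write each downloaded kernel, initrd or bootloader once and then link it into the per-host directories.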

One thing I just realised is BIOS: it would make sense to make this workflow universal for both BIOS and EFI. What I would like to see is us making provisioning simpler. Just adding another PXE loader option on top of what we have is exactly the opposite.

So the downloader would need to download PXE linux binary from the upstream repository, that is possible for sure in case of Red Hat compatible system:

$ rpm -ql syslinux-tftpboot | grep pxelinux.0
/tftpboot/pxelinux.0

Thing is, PXE linux is getting removed from Fedora ISO files and what I expect is that it could be also dropped completely from there. Fedora devs state grub2 is the replacement going forward. Problem is, grub2 cannot be easily extracted from the upstream repo:

$ rpm -ql grub2-pc-modules | grep core.0
/usr/lib/grub/i386-pc/core.0

It needs to be built via grub2-mknetdir or grub2-mkimage, and that cannot be done with the different version that is installed on the TFTP Smart Proxy. In that case, it would be better to simply keep using pxelinux.0 or grub.0 respectively and copy/hardlink/symlink those files into the MAC-based subdirectories.

At least, the MAC-based directory should be “complete” and work for both BIOS and EFI systems if possible.

For completeness, here is the grub and shim for EFI nodes:

$ rpm -ql grub2-efi-x64 | grep grubx64.efi
/boot/efi/EFI/fedora/grubx64.efi

$ rpm -ql shim-x64 | grep shim.efi
/boot/efi/EFI/fedora/shim.efi

Not completely random, but a bit odd. According to RFC 3315 - Dynamic Host Configuration Protocol for IPv6 (DHCPv6), the following two (out of three) DUID types might be used in a data center context (baremetal & VM server):

  1        Link-layer address plus time
  2        Vendor-assigned unique ID based on Enterprise Number

With focus on VMs (which can be created & provisioned) I would expect the first type. Looks like I need to start investigating a bit how this works with Proxmox/libvirt (Qemu) and VMware. In the best case you can even set the DUID via API when the VM gets created.
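
For reference, a small sketch of how the link-layer based DUIDs from RFC 3315 are put together for an Ethernet MAC. Type 1 (DUID-LLT) embeds a timestamp in seconds since 2000-01-01 UTC, so it cannot be derived from the MAC alone; type 3 (DUID-LL, the third type of the RFC) can. The helper names are hypothetical, and whether a given PXE firmware or hypervisor actually uses these types is exactly what still needs to be investigated.

```python
import struct
from datetime import datetime, timezone

DUID_EPOCH = datetime(2000, 1, 1, tzinfo=timezone.utc)  # time base of DUID-LLT
HW_TYPE_ETHERNET = 1                                     # IANA hardware type


def mac_bytes(mac: str) -> bytes:
    """Parse 00:11:22:33:44:55 or 00-11-22-33-44-55 into 6 raw bytes."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError(f"not an Ethernet MAC address: {mac!r}")
    return raw


def duid_llt(mac: str, when: datetime) -> bytes:
    """DUID type 1: link-layer address plus time (when must be timezone-aware)."""
    seconds = int((when - DUID_EPOCH).total_seconds()) % 2**32
    return struct.pack("!HHI", 1, HW_TYPE_ETHERNET, seconds) + mac_bytes(mac)


def duid_ll(mac: str) -> bytes:
    """DUID type 3: link-layer address only, predictable from the MAC."""
    return struct.pack("!HH", 3, HW_TYPE_ETHERNET) + mac_bytes(mac)
```

If a client really sends a DUID-LL, a reservation could be prepared from the known MAC; with DUID-LLT the generation time would have to be known or set at VM creation.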

And regarding baremetal: as of today you need to find out the MAC anyway before you can add any host specific boot instructions. This would be same for DUID.


I see. So not dead, but also not really alive :slight_smile: Back then it was touted as the successor to IPMI. But it looks like IPMI is still the de facto standard for remote control of baremetal today (at least a basic implementation by most vendors).


With PXE files you mean kernel/initrd? Aren’t they already stored in a distinct way in /var/lib/tftpboot/boot/? Don’t see any benefit to change this at the moment. These files are still referenced and used, independently of the actual bootloader.


Yeah, the current list of PXE loaders is quite huge. I personally would be totally fine with supporting only GRUB2 for EFI (with SecureBoot support) and BIOS. And maybe their HTTPboot versions in addition.

We focus on EFI at the moment because there is no SecureBoot for BIOS. But yes, we should make it as uniform as possible. And as easy as possible.

Important: The SecureBoot enabled boot path also works on systems with disabled SecureBoot!
This means that if this works reliably, we can remove the “normal” GRUB2/EFI PXE loaders anyway, at least for all distributions providing a signed GRUB2 binary with network boot support.

I mean, having a working solution for SecureBoot now might help us in reaching our long term goals:

  1. Independent of the PXE loader (GRUB2, pxelinux, ipxe) and firmware (BIOS, EFI), we should rework provisioning process in a way that host specific PXE files are used in every case using DHCP filename option and e.g. MAC separated directories.

  2. We should implement automatic acquiring of all required PXE files directly from the target OS’ packages. Optimally only doing it once and reusing them wherever it’s needed (e.g. by using symlinks). Cleanup would also be nice after host deletion.

The last time I tried it over PXE there were no LL-DUIDs supported by the PXE firmware, I am curious how this actually works today.

Well, PXE provisioning should always be designed for bare-metal, VM provisioning over PXE is just a bonus.

Yeah, but then the DHCP server must be able to identify the client. I am just little bit worried that DHCPv6 implementations in PXE stacks are poor and do not generally support LL-DUIDs. Once some decent bootloader is loaded, things get usually better, but that is too late as we want to actually decide on the bootloader (shim) beforehand.

But this needs to be tested, I just briefly tried it years ago.

Also there is one hack, but that is just misery: https://dhcpy6d.de/ (this special DHCP server uses MAC address tables to find the client address and will not work over routed networks). I don’t like that at all.

Yeah it is unrelated to the feature you want, yet, I believe this part is worth some refactoring. I believe both files should be downloaded at the same time as the shim/grub pair.

Yeah, thanks for sharing that. We will discuss this within the core team this week.

I outlined an idea that would actually make distribution of SecureBoot-signed bootloader files much easier:

Hey folks.

There have been a lot of discussions lately regarding provisioning processes and I love to see that. But we need to work on these step by step and try to separate them into several parts, without breaking existing functionality.

Given that, we would like to focus again on the current approach we (ATIX) have been working on for some time and we would like to see our changes being accepted upstream.

I don’t want to pre-empt the discussions in the actual PRs, but @goarsna and I have already agreed that we will implement a few more details in order to make the current PR complete.***

At first glance, this again adds another PXELoader to Foreman, and we all agree that we would actually like to have fewer. But with these changes we would:

a) have a first support for provisioning Linux on SecureBoot enabled hosts and
b) we are paving the way for a generally more flexible design of PXE boot for individual hosts.

Once this works reliably, we can then think about (the order isn’t fixed):

  1. reducing the list of PXELoaders*
  2. refactoring/automating the way of providing all required PXE boot files**
  3. extending the host-specific NBPs (=DHCP filename) to other PXELoaders as well
  4. starting to look at IPv6 PXE boot

Please speak up @lzap @ekohl @lstejska if this works for you guys or if we are wrong here.

*) e.g. having only one “Grub2 UEFI” PXELoader which corresponds to the new one
**) e.g. RFC: Distribution of netboot files via OCI registry
***) clean up MAC-directories after host deletion, check for new PXELoader support utilizing SP capabilities, manual testing the other distributions RHEL/Oracle/SLES

PRs:
https://github.com/theforeman/foreman/pull/9864
https://github.com/theforeman/smart-proxy/pull/877

Hi Jan,
we want to collab on this with you and

+1

Given that, we would like to focus again on the current approach we (ATIX) have been working on for some time and we would like to see our changes being accepted upstream.

Yes, that’s our plan. Right now we are just having discussions and trying to think about all the cases and problems, but for sure we don’t want to reinvent the wheel.

I agree with the design and direction you are taking, tho, I am not the one to decide as I am no longer involved in active development or testing.

One thing that I would like to propose is the PXE loader name tho: the proposed “Grub2 UEFI SecureBoot (target OS)” assumes that the target OS uses grub. This might be true for current Red Hat and Debian based systems, however, Fedora developers are already experimenting with unikernels based on systemd-boot. While the target for today is local booting in cloud environments, it might be possible that in the future there will be no grub.

Edit: Note for The Register - I am not saying anyone is actively working on removing grub2 from the distribution, this is just my intuition that in the far future a different bootloader might be in use.

Additionally, I don’t think “SecureBoot” needs to be in the name directly; it might repel users who do not want to use SecureBoot, while this workflow will work perfectly fine without SecureBoot. And in the future with BIOS too. If you want to stress that it works in SB environments, let’s add a help text or UI element to say so.

Therefore, let’s be bold and come up with a more generic name. Something like:

  • Target OS EFI - loads AA-BB-CC-DD-EE-FF/boot.efi
  • Target OS BIOS - loads AA-BB-CC-DD-EE-FF/boot.0

Both files will be just a (relative) symlink/hardlink to the correct files which are, this is important, architecture dependent (different link destination for an Intel system than for arm64). Since every MAC-based directory is host-unique, there is no need to introduce boot-architecture.efi naming patterns because the architecture is already known.
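
A minimal sketch of the relative-symlink idea, assuming a hypothetical helper name. Where the architecture-specific loaders live is left to the caller, and whether the TFTP server follows symlinks depends on its configuration.

```python
import os
from pathlib import Path


def link_boot_entry(host_dir: Path, loader: Path, entry_name: str = "boot.efi") -> Path:
    """Create <host_dir>/<entry_name> as a relative symlink to the real loader."""
    host_dir.mkdir(parents=True, exist_ok=True)
    entry = host_dir / entry_name
    entry.unlink(missing_ok=True)  # re-point an existing entry on rebuild
    # A relative target keeps the link valid if the whole tree is moved.
    entry.symlink_to(os.path.relpath(loader, start=host_dir))
    return entry
```

The caller picks the loader for the host’s architecture once, so the DHCP filename can always be AA-BB-CC-DD-EE-FF/boot.efi regardless of whether the link points at an x86_64 or an arm64 binary.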

What this patchset does not explain or solve is the distribution of the boot files, i.e. the contents of /usr/local/share/bootloader-universe. This is an open question, but I think it is a good POC and manual copying is fine; maybe foreman-installer could populate the directory at least for the most popular distributions. What I am still missing is a full specification of the filesystem layout in that directory, perhaps a README with that would be great - specifically what are the expected architecture suffixes (is that aarch64 or arm64 or just arm, and world-famous x86_64 vs x64 vs amd64). Stuff like that.

Oh, one more thing: I am excited about this work because it solves one particular problem we faced several times before: not using the shim/bootloader from the target OS. In the examples, you show that the bootloader universe contains one bootloader per OS:

[root@vm ~]# tree /usr/local/share/bootloader-universe/
/usr/local/share/bootloader-universe/
|-- centos
|   |-- grubx64.efi
|   `-- shimx64.efi
`-- ubuntu
    |-- grubx64.efi
    `-- shimx64.efi

This will only help if there is one bootloader per OS version:

[root@vm ~]# tree /usr/local/share/bootloader-universe/
/usr/local/share/bootloader-universe/
|-- centos-9.0
|   |-- grubx64.efi
|   `-- shimx64.efi
|-- centos-9.1
|   |-- grubx64.efi
|   `-- shimx64.efi
|-- centos-9.2
|   |-- grubx64.efi
|   `-- shimx64.efi
|-- centos-9.3
|   |-- grubx64.efi
|   `-- shimx64.efi
`-- ubuntu-XYZ
    |-- grubx64.efi
    `-- shimx64.efi

Only with this naming convention will it solve the painful problem of the shim from 8.3 not being able to load RHEL 8.0-8.2 because keys were revoked due to a CVE. So I would suggest making this part of the initial version from day one.
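
A lookup following the suggested `<os>-<major>.<minor>` naming could look roughly like this. The function is hypothetical; the important design choice is that it refuses to fall back to a different version, since a shim from another minor release is exactly the case that breaks.

```python
from pathlib import Path


def bootloader_dir(universe: Path, os_name: str, major: int, minor: int) -> Path:
    """Return <universe>/<os_name>-<major>.<minor> or fail loudly."""
    candidate = universe / f"{os_name}-{major}.{minor}"
    if not candidate.is_dir():
        raise FileNotFoundError(
            f"no bootloader set for {os_name} {major}.{minor} in {universe}"
        )
    return candidate
```

With the layout shown above, `bootloader_dir(Path("/usr/local/share/bootloader-universe"), "centos", 9, 2)` would resolve to the centos-9.2 directory, while a request for a version without its own directory raises an error instead of silently picking a neighbour.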
