RFC: Rearchitecting how PXE files are deployed

Hello,

I am being hammered with questions from our users about Ubuntu Focal (20.04) and later. In case you have not followed this: Canonical made a breaking change in this release. debian-installer is no longer used for installations; a LiveCD must be booted instead, with cloud-init used to do the provisioning.

This alone would not be a huge issue for Foreman, but there is another change that comes with it. The boot files required to boot such a LiveCD (kernel/initramdisk) are not published directly via HTTP(S). Instead, users are asked to download the whole LiveCD ISO (1.2GB), mount it and extract it into the TFTP/HTTP directories.

This does not work well with our current design, in which Foreman “understands” the directory structures of various Linux distributions and hands over URLs to smart proxy when an OS needs to be installed. Smart proxy then downloads both files into the TFTP/HTTPBoot directory.

My initial solution to the Ubuntu problem was to implement the same behavior for the ISO file: it could be downloaded and extracted (we cannot use mount though, as smart proxy runs rootless), so technically this is possible. However, Foreman performs this download on every host creation (or build operation). Smart proxy actually spawns curl (previously wget) with parameters that make it skip the download if the remote file is not newer than the existing one, but the mirror the file is downloaded from needs to support some specific HTTP headers and this is not guaranteed. In the worst case, 1.2GB would be downloaded for every host entering build mode. Also, I don’t like spawning many commands or extracting big tarballs on an API request.
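
To make the "skip if not newer" condition concrete, here is a minimal Python sketch of the decision. It is not the smart proxy's actual code. It assumes the header in question is Last-Modified (the post does not name it), and `needs_download` is a hypothetical helper.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def needs_download(local_mtime: Optional[datetime], last_modified: Optional[str]) -> bool:
    """Return True when the file has to be fetched again.

    local_mtime   -- timezone-aware mtime of the cached copy, None if absent
    last_modified -- raw Last-Modified header from the mirror, None if not sent
                     (assumption: this is the header the mirror must support)
    """
    if local_mtime is None:
        return True  # nothing cached yet
    if not last_modified:
        return True  # no timestamp from the mirror, freshness cannot be proven
    try:
        remote = parsedate_to_datetime(last_modified)
    except (TypeError, ValueError):
        return True  # unparsable header, treat it like a missing one
    if remote.tzinfo is None:
        remote = remote.replace(tzinfo=timezone.utc)
    return remote > local_mtime
```

With a mirror that omits or mangles the header, every call returns True, which is the worst case described above: the full ISO is fetched on each build.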

Foreman PXE files downloading has always been problematic in its design. There are several flaws:

  • Download process is asynchronous and we very often see systems booting a kernel/initramdisk that is still being downloaded.
  • Re-download of PXE files is initiated on every new build/host. This is not needed; in reality these PXE files almost never change. Typically they only change on every minor RHEL release (e.g. 8.1, 8.2). Granted, CentOS Stream has changed this: that OS actually rebuilds the PXE files quite often.
  • When Katello is used and kickstart files are being promoted, the filenames, which are derived from the source URLs, change all the time, creating many unnecessary files in the PXE/HTTPBoot directory. Those files are almost always the same and neither Foreman nor Smart Proxy ever cleans them up.

For these reasons, I propose to radically change how PXE files are being deployed to the TFTP/HTTPBoot smart proxies. I believe that Foreman should NOT download them at the time when a host is created, but when an Operating System is created. Here is what I propose:

  • Operating System model will have a new Template kind associated: Boot files (find a better name).
  • The new template will use Foreman ERB system and will render to a shell script.
  • This shell script will contain the necessary commands to set up the PXE/HTTPBoot environment (typically wget or curl the boot files from the upstream repo).
  • Since ERB can be used, information provided by Foreman (Installation Media / Katello repository, OS Name, Release, Version…) can be used to construct the proper URL.
  • When an OS is created or updated (or even deleted), the template is rendered and passed to a new Smart Proxy API call.
  • Smart Proxy executes the script and sets up the environment.
  • All TFTP orchestration code can be now dropped from the host creation as files should be in-place before any host is even created.
  • The call could support both synchronous and async modes. When triggered via OS edit, it would run in the background. When triggered via a “Re-download boot files” button, the UI could actually show STDOUT/STDERR for easier debugging.
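
As a rough illustration of the rendering step, the sketch below turns OS attributes into a shell script made of two curl commands. It uses Python's `string.Template` purely as a stand-in for Foreman's ERB; the template text, the `render_boot_script` helper, the mirror URL and the destination directory are all made up for this example.

```python
from string import Template

# Hypothetical template body. In the proposal this would be an ERB template
# associated with the Operating System and rendered by Foreman.
BOOT_FILES_TEMPLATE = Template("""\
#!/bin/sh
set -e
mkdir -p "$dest_dir"
curl -fsSL -o "$dest_dir/vmlinuz" "$media_url/$kernel_path"
curl -fsSL -o "$dest_dir/initrd.img" "$media_url/$initrd_path"
""")


def render_boot_script(media_url: str, kernel_path: str, initrd_path: str, dest_dir: str) -> str:
    """Render the shell script a smart proxy would be asked to execute."""
    return BOOT_FILES_TEMPLATE.substitute(
        media_url=media_url.rstrip("/"),  # avoid a double slash in the URLs
        kernel_path=kernel_path,
        initrd_path=initrd_path,
        dest_dir=dest_dir,
    )


if __name__ == "__main__":
    # Example values only; the mirror host and target directory are invented.
    print(render_boot_script(
        "http://mirror.example.com/centos/8/BaseOS/x86_64/os/",
        "images/pxeboot/vmlinuz",
        "images/pxeboot/initrd.img",
        "/var/lib/tftpboot/boot/centos-8",
    ))
```

Keeping the output a plain script means the Ubuntu case only needs a different template (download the ISO, extract two files), not different Foreman or Smart Proxy code.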

We could also leverage Remote Execution or Ansible for this, but I think it is overkill: 90% of all cases will only need two curl commands.

Alternatively, we could hardcode this into Foreman or Smart Proxy or both. A new field for Ubuntu OS would be probably required in this case: LiveCD URL. That’s because Ubuntu LiveCD URL does not look very stable, they appear to delete old minor versions from the site:

Currently it’s https://releases.ubuntu.com/20.04/ubuntu-20.04.3-live-server-amd64.iso and .2 version is no longer available. I think this will break very often and that’s the reason why I want the new solution to be open and flexible and easy to change.

In general, I think this is a good direction; it solves many long-standing issues. Ideally, the OS should also be able to tell whether provisioning is ready or not (files present, non-zero size, etc.).
I like the fact that the templates would allow constructing more complex installation medium URLs. I suppose that the ERB template for Ubuntu 20+ would download the ISO and extract the files. It seems pretty universal, so it could easily support any OS.
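
The readiness check mentioned here (files present, non-zero size) could look roughly like this. It is only a sketch; `missing_boot_files` is a hypothetical helper, not an existing Foreman or Smart Proxy function.

```python
import os
from typing import Iterable, List


def missing_boot_files(directory: str, names: Iterable[str]) -> List[str]:
    """Return the names that are absent or empty in `directory`.

    An empty result means the OS could be reported as ready for provisioning.
    """
    missing = []
    for name in names:
        path = os.path.join(directory, name)
        # A zero-byte file usually means an interrupted or failed download.
        if not os.path.isfile(path) or os.path.getsize(path) == 0:
            missing.append(name)
    return missing
```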

One additional comment: when Katello is installed, sync/CV operations can lead to an update of kickstart files, although it is rare. Foreman could provide a simple API (OperatingSystem.rebuild_boot_files) which Katello could call each time content with a kickstart repo is synced, promoted or published. Ideally, Katello would detect whether the sha-sums of the PXE files have changed; in the worst case, Katello would trigger a boot files update on each CV operation with kickstarts (today we do this on each host install, so that is still better). Or we can simply decide that boot files are not subject to CVs and only do this on Library sync.
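
A sketch of the sha-sum comparison, assuming the checksums of the previously deployed files are kept somewhere as a path-to-digest mapping. Both the mapping and the helper names are illustrative assumptions, not Katello code.

```python
import hashlib
from typing import Dict, Iterable, List


def sha256_of(path: str) -> str:
    """Hash a file in chunks so a large initramdisk is never loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(paths: Iterable[str], known: Dict[str, str]) -> List[str]:
    """Return the paths whose current digest differs from the recorded one.

    Paths with no recorded digest count as changed.
    """
    return [path for path in paths if known.get(path) != sha256_of(path)]
```

A boot files rebuild would then be triggered only when `changed_files` returns a non-empty list.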

What I like about the template design is that Katello could associate a different template named “Boot files Red Hat Katello” that could be more complex, implementing the sync/CV handling, while other templates like “Boot files CentOS” or “Boot files Debian” would remain very simple (just curl the files, done).

When an OS is updated or deleted, the template context needs to know the previous record state so it can also clean up the old files. This is an important feature specifically for Katello, where many CVs can be promoted, creating many PXE files.
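
The cleanup step amounts to a set difference between what the previous record deployed and what the updated one needs. A minimal sketch follows; the `in_use_elsewhere` guard for files shared with other OS records is an assumption added for safety, and all names are hypothetical.

```python
from typing import Iterable, List


def stale_boot_files(
    previous: Iterable[str],
    current: Iterable[str],
    in_use_elsewhere: Iterable[str] = (),
) -> List[str]:
    """Return files deployed for the previous record state that can be removed.

    previous         -- files deployed before the OS was updated
    current          -- files the updated OS needs (empty when it is deleted)
    in_use_elsewhere -- files still referenced by other records (assumption)
    """
    return sorted(set(previous) - set(current) - set(in_use_elsewhere))
```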

Hi,

Yesterday I was able to pxe-boot a host with the autoinstall (cloud-init) mode from the live-server iso.
I also have templates for preseed-pxe-linux, user-data (autoinstall), netplan-config and partitioning table.

Later today I will share what I have done in order to make it possible.

But it is certain that additional work is needed in Foreman to make the necessary files available in order to perform the full autoinstall.

Besides this, I still need to find out how to apply the install/config of puppet, subscription key, …

All this is done in Foreman 3.0.1 Katello 4.2.1

A post was split to a new topic: Autoinstalling ubuntu-server 20.04.3 from the live-iso

I think creating the files when the OS is created is an appropriate mechanism. A few scenarios should be considered:

  1. When a Smart Proxy is added to Foreman configuration after the OS is defined
  2. Have redundant Smart Proxies to support provisioning (probably a larger topic in itself)
  3. Impact on UEFI HTTP/HTTPS boot scenario without PXE

Great comment, there would be a redeploy button, we could even take it to the proxy page or automatically trigger it after a proxy with such feature is added.

PXE and UEFIHttp features share the same directory, so no problem there.

API or config support for headless usage would be great as well

Update: Ubuntu dev confirmed that PXE files are heading back to Ubuntu 22.04:

So we do not need to rush this refactoring, however, it looks like versions 20.04 and 21.10 don’t have these files and users will need to download them manually.

Index of /ubuntu/ubuntu/dists/focal/main/installer-amd64/current/legacy-images/netboot

Aren’t these the files for 20.04? 21.xx doesn’t have them. Or are there different files for the new installer?

Yeah, right, only 21.10 (Impish) is affected, not 20.04, sorry. That one is a different story (it breaks provisioning scripts because of the new installer). This is 404:

Here is a solution:

Currently it’s https://releases.ubuntu.com/20.04/ubuntu-20.04.3-live-server-amd64.iso and .2 version is no longer available. I think this will break very often and that’s the reason why I want the new solution to be open and flexible and easy to change.

Just wanted to comment on this. I believe you can still download the older ISO versions, even though they don’t appear explicitly on https://releases.ubuntu.com.
But who knows if this will always be the case. For my environment, I rely on this to pull down scripts when building custom installers.

For example, you can run wget https://releases.ubuntu.com/20.04.2/ubuntu-20.04.2-live-server-amd64.iso and it will download the 20.04.2 ISO despite not appearing on the web-page.

I suppose this could just as easily be ‘template-ed’ i.e. something like
wget https://releases.ubuntu.com/<%= $ISO_VERSION %>/ubuntu-<%= $ISO_VERSION %>-live-server-amd64.iso
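
The same idea as a small Python sketch, following the URL pattern of the wget example above. `live_server_iso_url` is a hypothetical helper, and nothing guarantees that Ubuntu keeps this layout.

```python
def live_server_iso_url(version: str, arch: str = "amd64") -> str:
    """Build the live-server ISO URL for a point release such as "20.04.2".

    The pattern is copied from the wget example in this thread; treat it as
    an observation about the current site, not a stable contract.
    """
    return (
        f"https://releases.ubuntu.com/{version}/"
        f"ubuntu-{version}-live-server-{arch}.iso"
    )
```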

Thanks for the heads-up.

I suggest waiting until the next release is out (22.04); if the PXE files are back, then using a temporary Foreman URL might be a solution. There are concerns about copyright though (see the PR).

Looks like legacy images are not available for 22.04.

All right, our priority must be to make deployment via the new installer fully supported (documented, tested). People are already sending patches upstream. A manual step of extracting the ISO is currently needed, but we can document that. Distributing those files on our site was not well received across the community and there might be copyright issues.