We’ve recently completed our migration to 3.3 and are now validating all of our workflows. This includes re-running provisioning on various types of hosts. While validating provisioning on a bare metal host, we hit a snag with the default templates.
Our bare metal hosts are dual homed and bonded (802.3ad). Kicking off a provisioning job now adds the bond parameters to the kernel line, due to the changes from https://github.com/theforeman/community-templates/pull/685 . In general I think those changes make sense. However, with the bond, what seems to be happening is that em1 (member 0) comes up and grabs an address from DHCP. This interface is configured as “force up” on the switch so that it can still be used for PXE booting when LACP isn’t running. As dracut continues configuring the network, it sets up bond0 and adds em1 and em2 to the bond. When it adds em1, it does not drop or release the address already configured on it. The result is a DHCP-obtained address on both em1 and bond0, which confuses the server and causes dracut to time out, resulting in a failed OS installation. Without the bond parameters on the kernel line, the rest of the install works as intended and still creates the bond in the installed OS. Have others experienced this? How are you handling this?
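For reference, the bond-related arguments on the kernel line look roughly like the following. This is only an illustrative sketch that assumes the syntax documented in dracut.cmdline(7), using the interface names from our hosts; I have not copied it from the rendered template, so the exact options the snippet emits may differ.

```text
# Illustrative only: dracut-style bond arguments, not the template's verbatim output
bond=bond0:em1,em2:mode=802.3ad ip=bond0:dhcp
```

With arguments like these, dracut is asked to build bond0 from em1 and em2 and run DHCP on the bond, which matches the point in the boot where em1 already holds its own lease.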
OS: CentOS 7.9
TFM: 3.3
For now we are working around this by unlocking the kickstart_kernel_options snippet and removing the bond directive. This is undesirable for a couple of reasons, primarily because we’ve now edited a stock template that can (will?) get replaced after an update. We could clone this template, but then we would also need to clone every template that uses it so that they pick up the clone (thankfully we only use kickstart today). We are trying to minimize the number of truly custom templates we maintain in an effort to make future upgrades easier. See RFC for template management (next post).