I deal with various orchestration problems pretty regularly, and today I found out that Foreman breaks orchestration via the "Update subnet/domain from facts" feature.
When a managed interface with DHCP/DNS orchestration is updated from facts, no orchestration is actually triggered. I do not think this is a bug: triggering orchestration on every Puppet fact upload would be performance suicide. However, when the Subnet, Domain or IP address is updated this way, the inventory database is out of sync with reality from that moment on. All subsequent actions on that host will ultimately fail with conflicts and hard-to-troubleshoot errors that have only problematic workarounds.
I have thought for a very long time that this is bad design: provisioning and inventory information should be kept separate. Networking information is not the only thing causing conflicts; the same story applies to the Operating System (e.g. the recent CentOS 8 vs CentOS Stream problem). However, I do not think it is the right time to reengineer this from scratch.
I would like to explore the options we have to solve this, because dealing with orchestration errors is very time consuming for both users and us. One solution that comes to my mind is radical, but it makes sense: when a fact upload would change the Subnet, Domain or IP on a managed interface, Foreman would simply refuse to apply the change, even when the update-from-facts settings are turned on. We would advise users to uncheck the managed flag if they want facts to override what the host was provisioned with.
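To make the proposed rule concrete, here is a minimal sketch in Python. It is not Foreman code: the `Nic` class, the `apply_facts` function and the `update_from_facts` flag are hypothetical names invented for illustration, and the facts are assumed to arrive as a plain dictionary.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional, Tuple

# Attributes that facts must never overwrite on a managed interface.
GUARDED = ("ip", "subnet", "domain")


@dataclass(frozen=True)
class Nic:
    """Hypothetical, simplified stand-in for a host interface record."""
    managed: bool
    ip: Optional[str] = None
    subnet: Optional[str] = None
    domain: Optional[str] = None


def apply_facts(nic: Nic, facts: Dict[str, Optional[str]],
                update_from_facts: bool = True) -> Tuple[Nic, List[str]]:
    """Return the resulting NIC and the names of the changes that were skipped.

    Changes are skipped when the NIC is managed (the proposed rule) or when
    the hypothetical global setting is turned off.
    """
    # Only guarded attributes that facts actually report and that differ.
    changes = {
        key: value
        for key, value in facts.items()
        if key in GUARDED and value is not None and getattr(nic, key) != value
    }
    if nic.managed or not update_from_facts:
        return nic, sorted(changes)
    return replace(nic, **changes), []


provisioned = Nic(managed=True, ip="10.0.0.5",
                  subnet="10.0.0.0/24", domain="example.com")
reported = {"ip": "10.0.1.9", "subnet": "10.0.1.0/24", "domain": "example.com"}

# Managed NIC: nothing is written, the differences are only reported back.
unchanged, skipped = apply_facts(provisioned, reported)

# Unmanaged NIC: behaves like a plain inventory and follows the facts.
updated, _ = apply_facts(replace(provisioned, managed=False), reported)
```

The skipped list is what a host status could be derived from, so the refusal is visible to the user rather than silent.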
From the user's perspective, we could present this via a new Host Status field (I have no idea for a name yet):
- OK - host subnet/domain/IP is in sync with facts
- IP out of sync - change it manually or change NIC to unmanaged
- Subnet out of sync - change it manually or change NIC to unmanaged
- Domain out of sync - change it manually or change NIC to unmanaged
- Unknown - no relevant facts were reported
These statuses could nicely explain what just happened and what users need to do in order to fix the issue. We would keep the current Administer - Settings for users who want to ignore information from facts for all hosts as well, but the default behavior for managed NICs would be to ignore the changes and only update the overall Host status.
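A rough sketch of how such a status could be computed, again in Python and not taken from Foreman: `NetSyncStatus` and `net_sync_status` are hypothetical names, both the interface and the facts are assumed to be plain dictionaries, and reporting only the first mismatch (IP, then Subnet, then Domain) is an arbitrary choice made here.

```python
from enum import Enum
from typing import Mapping, Optional


class NetSyncStatus(Enum):
    """Hypothetical status values mirroring the list above."""
    OK = "OK"
    IP_OUT_OF_SYNC = "IP out of sync"
    SUBNET_OUT_OF_SYNC = "Subnet out of sync"
    DOMAIN_OUT_OF_SYNC = "Domain out of sync"
    UNKNOWN = "Unknown"


# Order of checks; only the first mismatch is reported (an assumption).
_CHECKS = (
    ("ip", NetSyncStatus.IP_OUT_OF_SYNC),
    ("subnet", NetSyncStatus.SUBNET_OUT_OF_SYNC),
    ("domain", NetSyncStatus.DOMAIN_OUT_OF_SYNC),
)


def net_sync_status(nic: Mapping[str, Optional[str]],
                    facts: Mapping[str, Optional[str]]) -> NetSyncStatus:
    """Compare the provisioned NIC attributes with what facts report."""
    reported = {key: facts.get(key) for key, _ in _CHECKS
                if facts.get(key) is not None}
    if not reported:
        # No relevant facts were uploaded, so nothing can be compared.
        return NetSyncStatus.UNKNOWN
    for key, status in _CHECKS:
        if key in reported and reported[key] != nic.get(key):
            return status
    return NetSyncStatus.OK


nic = {"ip": "10.0.0.5", "subnet": "10.0.0.0/24", "domain": "example.com"}

in_sync = net_sync_status(nic, dict(nic))
moved = net_sync_status(nic, {"ip": "10.0.0.5", "subnet": "10.0.1.0/24"})
no_facts = net_sync_status(nic, {})
```

A real implementation might prefer to report every mismatching attribute at once instead of only the first one.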
This feels like a good compromise. What was very often seen as “mystery inventory changes” would now be well defined and visible through the UI and API/CLI. We would not affect unmanaged hosts: users who like to use Foreman as a plain inventory would see no difference. Only users with managed hosts would be affected, and they would benefit from better usability and no orchestration errors.