Rethinking host groups

Hello,

Following my call to simplify Foreman, I think that host groups pose one of the main points of confusion and complexity today.

Host groups are complicated objects that serve (at least) two very different workflows:

  1. As a “template” for a new host, used during provisioning to pre-define some or all of the attributes of the new host. These attributes are either copied to the host or retrieved from the host group during provisioning. To make things more complicated, some of them are later overridden in the host object by attributes reported during fact processing, which makes rebuilding a host with its original configuration difficult and confusing. In most cases, changing these attributes after a host is built has no effect on the hosts already in the host group.
  2. As a block of configuration management options, such as puppet classes to apply, ansible roles to execute and various variables and parameters related to them. These are applied during runtime and changes to these on the host group level usually apply to hosts in the host group.
  3. Are there other use cases I’m not thinking of? Please comment below if you have others.

My proposal is to consider breaking them up into two separate entities (better naming suggestions welcome):

  1. Host template: this will contain all the settings needed for provisioning to work. It will remain attached to the host after it’s built, so that when rebuilding, the user will be able to select whether to rebuild the host from the template or from its current attributes. The host attributes will only reflect those reported by the host and will not copy values from the template, eliminating the need for multiple settings for ignoring certain facts. It may also make a host wizard (such as the one proposed here, for example) much easier to implement.
  2. Host configuration group: this will contain a set of configuration-management-related attributes - such as puppet classes and their respective parameters, or ansible roles and their respective variables. Changes to these will impact all hosts assigned to the group. Perhaps they can also be connected to host templates so that a new host gets these applied after provisioning? Maybe we can leverage the existing config groups object for this? A rough sketch of both entities follows below.
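To make the proposal a bit more concrete, here is a rough sketch of what the two entities could carry. All names and fields are hypothetical and only meant to illustrate the split, not to propose a schema:

    # Hypothetical shapes for the two proposed entities (illustration only).
    # Host template: provisioning-time data, snapshotted onto the host at build time.
    HostTemplate = Struct.new(
      :name,
      :operatingsystem,
      :ptable,
      :pxe_loader,
      :provisioning_templates,
      keyword_init: true
    )

    # Host configuration group: runtime configuration management,
    # applied to all member hosts whenever it changes.
    HostConfigurationGroup = Struct.new(
      :name,
      :puppet_classes,
      :ansible_roles,
      :parameters,
      :hosts,
      keyword_init: true
    )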

I didn’t go into too much detail on purpose, because I would first like to get some feedback from both users and developers regarding this direction. If and when we agree on the direction, we can start breaking it up into stages that will eventually lead to the desired outcome without (hopefully) disrupting existing workflows too much. If you think this is not something we should invest any effort in, please also say so below, and I’ll think of something else to do in my spare time :slight_smile:


I like the idea of breaking hostgroup into more manageable pieces. :+1: for that.

I think hostgroup facets can help us here - they already break up the hostgroup itself by subjects. What do you say about that?

I think you should either use them or refactor things in such a way that there’s only one thing to use. Having two very similar solutions is confusing at best.

And to add to the confusion Katello adds Host collections. :wink:

I have no problem with the hostgroup being used for different workflows/use cases. I only struggle when the behavior differs between them. I would prefer a behavior where everything not overridden on the host is taken from the hostgroup, instead of some attributes being copied over and some not.

As an example: one big problem I have at a customer is that subscription-manager register sets the Lifecycle Environment and Content View from the activation key, but does not set the Content Smart Proxy. It is also not possible to set this at the hostgroup level, because hosts will not pick it up from there. So it seems my task will be setting it via hammer, because without it hosts will not install from synced content.

Yes. To group hosts and have one parent hostgroup where parameters are defined that are inherited by all hosts.
In provisioning templates we use Hostgroups like this @host.hostgroup.hosts.map(&:ip) or @host.hostgroup.parent.name.
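For example, a stripped-down snippet of the kind of template where we use those lookups (the @host.hostgroup calls are the real ones; the output around them is made up for illustration):

    <%# Hypothetical example: build a peer list from all hosts in the same hostgroup -%>
    # cluster <%= @host.hostgroup.parent.name %>/<%= @host.hostgroup.name %>
    <% @host.hostgroup.hosts.each do |peer| -%>
    peer <%= peer.name %> <%= peer.ip %>
    <% end -%>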
We actually don’t use hostgroups as a host template. The added complexity is not worth it.
As an alternative to group hosts, I could see labels (like on Github issues) or tags that you can assign to a host.

What do you think about compute profiles? I think they also play a big role in increasing the confusion level. If it were up to me, we’d get rid of them as soon as possible. We could create a new hardware object that defines PXELoader, Bonding, Boot Order, … And a network template that defines compute resource network, interfaces, subnets, …
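Purely to make it easier to discuss, a hypothetical sketch of what those two objects might carry (nothing like this exists today; all names and fields are invented):

    # Hypothetical sketch of the two proposed objects (illustration only).
    HardwareProfile = Struct.new(
      :name,
      :pxe_loader,    # e.g. "Grub2 UEFI"
      :boot_order,
      :bond_options,  # bonding mode, slave interfaces, ...
      keyword_init: true
    )

    NetworkTemplate = Struct.new(
      :name,
      :compute_resource_network,
      :interfaces,    # NIC definitions
      :subnets,
      keyword_init: true
    )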


Yes! A thousand times yes!
It really should be closer to hardware! Or should models be “upgraded”? I particularly love the network interfaces aspect, as this is exactly one of our key use cases: we need to set up multiple network interfaces on our bare-metal servers, which is currently only doable using compute profiles (which is a shame, considering we rock!)

I’m also a very big fan of having a “compute pool” instead, from which resources can be consumed. This would allow us to, for instance, create one pool of pure virtualization machines and another pool for running a kubernetes cluster. Compute pools would at least allow me to assign admin rights to my “virtualization experts” while only giving them read-only rights for kubernetes nodes (provided that these pools can be assigned to locations / organizations / usergroups / …)

Also, if we want to take “host templates” to the next level, we should keep in mind that a template is something where things can not only be added but also removed. Currently it is “annoying” to have a hostgroup with a few ansible roles assigned: if I, for some interesting reason, don’t want one of those roles, I need to either create a new hostgroup (there goes my effort to taxonomize my resources :frowning: ) or maintain a whole dependency tree of parent-child hostgroups.

Great topic!

I definitely support the idea of breaking up the hostgroup into several pieces. I think the key point will be separating the information that comes from the outside world from what the user has configured. Just doing that will make things much better.

Since we already have facets, it makes sense to me to leverage them; that will allow us to move in the direction we want to go in the long term (decoupling from Puppet, better pluggability). As a next step, it could open the door for plugins to use facets to extend host(group)s, rather than continuing the way we do it now.
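Just to illustrate the direction (this is not the real facet API, only a hypothetical decomposition): the hostgroup could keep little more than its identity, with every other concern living in a facet that core or a plugin attaches independently.

    # Hypothetical decomposition, not Foreman's actual facet registration API.
    ProvisioningFacet  = Struct.new(:operatingsystem, :ptable, :pxe_loader, keyword_init: true)
    ConfigurationFacet = Struct.new(:puppet_classes, :ansible_roles, :parameters, keyword_init: true)

    class HostgroupSketch
      attr_reader :name, :facets

      def initialize(name)
        @name = name
        @facets = {}
      end

      # Core or a plugin attaches a facet under its own key.
      def add_facet(key, facet)
        facets[key] = facet
      end
    end

    hg = HostgroupSketch.new('web servers')
    hg.add_facet(:provisioning, ProvisioningFacet.new(operatingsystem: 'CentOS 7', ptable: 'Kickstart default', pxe_loader: 'PXELinux BIOS'))
    hg.add_facet(:configuration, ConfigurationFacet.new(puppet_classes: ['nginx'], ansible_roles: [], parameters: {}))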

Compute profiles need a cleanup too, but I’ll save it for another thread…

Similar/Same discussion:

Fully agree! Thanks for bringing this up again.


Huh, I completely forgot we already had this discussion last year :laughing:

Whatever will lead to cleaner code and less confusion for users I fully support. Let me drop some random thoughts on this.

  • Host templates must have a copy-only policy; users must be prevented from overriding fields. This can simplify our code quite a lot and, as you mention, gives users flexibility when reprovisioning. We could also easily show the difference between a template and a host, which could be a nice feature (see the sketch after this list). This hasn’t been possible with nesting: the moment a user changed something in the hostgroup, everything was gone (well, an audit record was created).

  • Host templates should not have nesting; I don’t believe it creates huge value for provisioning. Again, this simplifies things at both the code and user level.

  • Is a copy-only policy also applicable to host configuration templates? If we keep puppet/ansible parameters, which can be nested separately, what is the added value of inheritance for modules/classes? When a template is updated, Foreman could ask whether or not to update all hosts in the group. Foreman could also show the differences.

  • Is inheritance needed for host configuration templates too? If we allow assigning hosts to multiple configuration templates (i.e. a flat structure), this should be pretty flexible. Nesting will not be a hot feature once we split provisioning and configuration items, because the way people use nesting is usually BaseOS/AdditionalOSData/ServerRole.

  • Naming is hard - let’s pick carefully to avoid confusion with provisioning templates, partition tables, ReX templates, hostgroup combinations, virtualization templates (VM templates) and Katello’s host collections. I’d vote for something completely new like blueprint or templet.
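Here is the kind of copy-only / diff behavior I mean, as a minimal self-contained sketch (names are hypothetical, not Foreman’s real models):

    # Minimal sketch of the copy-only idea: host attributes are snapshotted from
    # the template at build time and never resolved through it afterwards, which
    # also makes "show me the drift against the template" a cheap comparison.
    HostRecord = Struct.new(:name, :operatingsystem, :ptable, :pxe_loader, keyword_init: true)

    class HostTemplate
      ATTRS = %i[operatingsystem ptable pxe_loader].freeze

      def initialize(attrs)
        @attrs = attrs.slice(*ATTRS)
      end

      # Copy-only: the new host gets a one-time snapshot of the template's values.
      def build_host(name)
        HostRecord.new(name: name, **@attrs)
      end

      # Attributes that drifted on the host since it was built from this template.
      def diff(host)
        ATTRS.each_with_object({}) do |attr, drift|
          drift[attr] = { template: @attrs[attr], host: host[attr] } if host[attr] != @attrs[attr]
        end
      end
    end

    template = HostTemplate.new(operatingsystem: 'CentOS 7', ptable: 'Kickstart default', pxe_loader: 'PXELinux BIOS')
    host = template.build_host('web01.example.com')
    host.operatingsystem = 'CentOS 7.7'
    template.diff(host)  # => { operatingsystem: { template: "CentOS 7", host: "CentOS 7.7" } }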

Nesting provisioning configuration is very useful to my organization. Our hostgroup structure is currently laid out as CMS/role/level, for a total of about 900 fully qualified groups. CMS determines whether puppet or salt is in control and carries our corporate default OS (CentOS 7). For cases where we need it, we override the OS in the role, leaving the PXE loader, provisioning templates, partition tables, etc. alone.

Provisioning configuration in Foreman isn’t exactly the most straightforward thing for those who don’t spend their entire day looking at it, since there are so many pieces to it. Misconfigurations would be very likely to occur if we had to duplicate that configuration in each of our 900 hostgroups.

There is no doubt nesting is powerful; what we are aiming at is something actually better than that:

  • ability to compare what’s set in group/profile/name_it vs what’s set for hosts
  • far fewer bugs related to nesting
  • cleaner interface for users (Inherit button vs blank value vs nil value)
  • cleaner code we need to maintain

Please don’t disregard ideas just because this will be a painful change. Elaborate on what exactly is wrong with the current proposal(s).

I think breaking these concepts apart makes sense for all the reasons listed above. The only thing I have a concern about is the removal of nesting. Since this is a design in progress, all I was asking is that you consider this use case while going through the design process.

Looking back at my previous message, it seems that our hostgroup layout got “removed” somehow - it is CMS/role/level - where CMS specifies our configuration management and default OS (centos7). We override the OS in the role, because the role is the reason a host needs a different OS (greenplum servers require redhat7). Out of our 900 hostgroups, only 55 override the OS.

If it would help, I’d be happy to discuss this “offline” (IRC, slack, direct email, phone, etc).

How do you anticipate auto-provision rules adjusting to the split?

What exactly is a level? It looks like this is pretty much aligned with the split:

  • CMS - configuration group
  • role - provisioning group
  • level - ?

The natural approach would be to split those groups as well, so a rule would be associated with both.

Level is our machine’s environment (dev, test, prod). We called it level to reduce confusion with the overloaded term “environment”. It’s mostly for monitoring and occasionally for puppet/salt code, but it is not a puppet/salt environment or anything that foreman has an attribute for. It is extremely rare that we put anything in there (dev, test, and prod should be as similar as possible). If we do configure anything there, it’s temporary - dev, then test, then back to the parent role as we move through the development cycle.

We use Foreman’s hostgroup for our role classification (puppet roles/profiles) and integrate that with most of the rest of our infrastructure - monitoring, server access, etc. So the role really defines both the “configuration group” and the “provisioning group”, as I understand the proposal.

For example, greenplum servers need to be provisioned as Redhat 7 (provisioning group), using a specific partition table (provisioning group), and install the greenplum software (configuration group). Hadoop servers, on the other hand, need to be provisioned as CentOS 7, using their partition table, and install the hadoop software.

I’m tempted to suggest that we could configure provisioning groups by naming them for the different pieces that comprise them - puppet-centos7, puppet-redhat7, salt-centos7, salt-redhat7. That seems to get unwieldy when you extend this to more OS options (7.5 vs 7.6, centos/redhat 8), partition tables, provisioning templates, salt/puppet masters, network configuration, etc.


Thanks for info, this is useful.

But what exactly is different between “puppet-centos7” (flat structure) and “puppet/centos7” (inherited structure) if we give you tools to easily create new groups based on others (e.g. create a group from a group, copying all the flags) and tools to compare them easily? In the end, the number of entries you need to deal with is the same, because the main idea is still to split configuration and provisioning into two separate groups.

That’s a fair point.

The main advantage of the inheritance model isn’t about creation, it’s about updates. When I update a role, all of its levels get changed as well. When I update the corporate defaults, there are only 2 CMS hostgroups we need to update.

If we find that we need to maintain a duplicate hierarchy between the provisioning and configuration groups, inheritance dramatically simplifies things. When we update our default OS to CentOS 7.7, that’s 850 provisioning groups that would have to get changed.
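To give a feel for what a flat structure would force on us, this is roughly the kind of one-off script we’d end up writing against the hostgroup API (endpoint and parameter names as I understand the current v2 API; the hostname, search filter and IDs are made up):

    require 'net/http'
    require 'json'
    require 'uri'

    FOREMAN   = 'https://foreman.example.com'  # hypothetical
    NEW_OS_ID = 42                             # hypothetical id of the "CentOS 7.7" entry

    def api(method, path, body = nil)
      uri = URI("#{FOREMAN}/api/v2#{path}")
      req = (method == :put ? Net::HTTP::Put : Net::HTTP::Get).new(uri, 'Content-Type' => 'application/json')
      req.basic_auth(ENV['FOREMAN_USER'], ENV['FOREMAN_PASS'])
      req.body = body.to_json if body
      res = Net::HTTP.start(uri.host, uri.port, use_ssl: true) { |http| http.request(req) }
      JSON.parse(res.body)
    end

    # Update the OS on every matching provisioning group, one call per group.
    groups = api(:get, '/hostgroups?search=name~puppet-&per_page=1000')['results']
    groups.each do |hg|
      api(:put, "/hostgroups/#{hg['id']}", hostgroup: { operatingsystem_id: NEW_OS_ID })
    end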

Agreed. If we ever decide to remove hierarchy (and so far I’ve only proposed it in this thread - it’s just the two of us thinking about this), then it must go hand in hand with the ability to “propagate” changes to hosts or other groups.