Foreman Host Groups - integration into Puppet Hiera Data

I’ve spoken about this with people and tried, with limited success, to integrate Foreman host groups into the Hiera ecosystem. I know people have done it; some speak positively about it and others very negatively, so it’s all personal preference, but I’d really like to give this a go. I’ve posted here in Community as it’s not really a support problem; it’s more that I can’t find any reliable working documents on how to actually set this up, so I’d like to try to build one (or be pointed at an existing one that I’ve not found).

I’ve now got a fully automatable crash and burn lab, so I’m happy to test things to progress this.

My current test setup is a Rocky 9.5 x86_64 host, running a single-node Foreman 3.13 full-component install.

The Puppet master is open-source Perforce Puppet 8.7, with a Puppet 8.10 agent.

My Hiera config for a Puppet environment is pretty generic:

# Hiera 5 Global configuration file

version: 5

defaults:
  data_hash: yaml_data
  datadir: data
# hierarchy:
#  - name: Common
#    data_hash: yaml_data
hierarchy:
  - name: "Per-node data"                   # Human-readable name.
    path: "nodes/%{trusted.certname}.yaml"  # File path, relative to datadir.

  - name: "Per-OS defaults"
    path: "os/%{facts.os.family}.yaml"

  - name: "Per-OS Version Specific defaults"
    path: "os/version/%{facts.os.name}-%{facts.os.release.major}.yaml"

  - name: "Common data"
    path: "common.yaml"

Really basic: if Hiera finds something specific to the node, use that; next, anything specific to the OS family; then anything specific to the OS version; otherwise everyone gets everything in common.yaml.
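To illustrate that first-match behaviour with a made-up key and node name (both hypothetical):

```yaml
# data/nodes/web01.example.com.yaml — most specific layer, wins for this node
ntp::servers:
  - ntp1.example.com

# data/common.yaml — fallback for every node with no more specific match
ntp::servers:
  - 0.pool.ntp.org
```

A lookup of `ntp::servers` on web01 returns the per-node value; every other node falls through to common.yaml.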

What I’d really like to do is emulate the roles/profiles model somewhere in between this hierarchy.

node → OS → OS Specific → App Major (role) → App Minor (profile) → Common

I’d like to use, or at least to try to use, host groups to fill that app major/minor data.

eg:

hostgroup/everything/bind (major)/{master/slave}

so I could use the bind host group to ensure a common BIND install/configuration, and the child host group of master/slave to configure the BIND service as a master or a slave.

This works for a lot of use cases for me:

hostgroup/webserver/{apache/nginx}/{sitename}

hostgroup/database/maria/{master/replica}

How do I go about approaching this so that the Puppet master can use Foreman host groups as a Hiera data source?
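Concretely, and purely as a sketch (the host group interpolation is the bit I don’t know how to wire up), I imagine something like:

```yaml
hierarchy:
  - name: "Per-node data"
    path: "nodes/%{trusted.certname}.yaml"

  - name: "Per-OS defaults"
    path: "os/%{facts.os.family}.yaml"

  - name: "Per-OS Version Specific defaults"
    path: "os/version/%{facts.os.name}-%{facts.os.release.major}.yaml"

  # The new layer: app major/minor from the host group, somehow.
  # e.g. a bind/master node would resolve hostgroups/bind/master.yaml
  - name: "App role/profile from host group"
    path: "hostgroups/%{hostgroup}.yaml"

  - name: "Common data"
    path: "common.yaml"
```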


As Foreman automatically adds the host group to the top-level variables of the Puppet ENC, you should be able to simply use it in Hiera as %{::hostgroup}. I have not used this for a while, but I think even getting something like CentOS/CentOS 9 here, with slashes and whitespace, should not cause problems.


This may be the part of the puzzle I was always missing or never really understood.
Do you have any docs or more information on Foreman adding the host group to the variables? That would explain how Hiera becomes aware of it.

That one sentence, all these years, the missing link.

thank you

Sorry, no documentation, but at least a short explanation.

If you just want to see what Foreman gives Puppet as the ENC, you can easily see it in the UI: on the host, the Puppet tab has an ENC Preview sub-tab showing it. Parameters are passed to Puppet for use in code, classes assigns classes, and environment enforces the use of that Puppet environment, overriding configuration and the command line.

If you want to know how it is technically integrated: the Foreman installer drops a node.rb at /etc/puppetlabs/puppet and configures Puppet to use it as the ENC. This script takes the FQDN and returns the same YAML structure you see in the UI.

For more details on how this works, “Puppet ENC” or “External Node Classifier” is perhaps a good search term.
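For reference, the YAML structure an ENC returns looks roughly like this (the class name and parameter values are invented for illustration):

```yaml
---
classes:
  - ntp
environment: production
parameters:
  hostgroup: base/core/development/mariadb
  domainname: example.com
```

classes and environment drive classification; everything under parameters becomes a top-level variable on the node, which is why %{::hostgroup} is visible to Hiera.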


priceless, thanks Dirk, that’s really useful

Looking at what Foreman adds to the parameters, if I pick one of my real test cases, I see

 hostgroup: base/core/development/mariadb

so from the point of view of a Hiera reference, am I referencing the fully qualified host group path as a single file name, eg:

 base\/core\/development\/mariadb.yaml

or am I referencing a hierarchy of files stored in directories (I don’t think so):

 base/
   base.yaml
   core/
     core.yaml
     development/
       development.yaml
 etc.

or finally, is the host group path a directory structure to get to the right file:

 base/
   core/
     development/
       mariadb.yaml

From looking at the code (I’m scanning it very simply), it looks like it uses the literal string of the host group as a file name;

however, your example (CentOS / 9 / etc.) suggests it would follow the directory structure.

Classifying nodes is probably the best starting point. What the installer does is the “Connect an ENC” step, with the node.rb script that @Dirk mentioned.

I would suggest prefixing the layer, so you have a hierarchy with path hostgroup/%{hostgroup}.yaml, similar to how the example also has os/%{facts.os.family}.yaml. That way, when you look at the directory structure, you understand where it’s coming from.


The link and suggestion make a lot of sense, but it is referencing the fully expanded path as a single file name though?

Managed to spend some time on this over the weekend in the home lab and got it working great; I managed to build a basic pattern that emulates roles and profiles really well. I know a few production environments that would really benefit from this now that I’ve spent some time really working it.

For the clarity of this thread and anyone who reads it:

The following hiera config

  - name: "Hostgroup aligned parameters"
    path: "hostgroups/%{::hostgroup}.yaml"

sets up a mapping to a directory structure that follows the host group hierarchy, and as Dirk said you can see this from the host → Puppet → ENC Preview screen.

In the example I’m using here, the hostgroup is

hostgroup: base/core/development/dnsmaster

which maps to the file on the file system

/data/hostgroups/base/core/development/dnsmaster.yaml

data is the root of the Hiera data; hostgroups is the ‘hostgroups’ directory in hiera.yaml.
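For anyone wanting to sanity-check that mapping: the interpolation is plain string substitution, so the slashes in the host group value simply become directory separators. A trivial Python sketch (the function name is mine; this is an illustration, not Hiera code):

```python
def hiera_hostgroup_path(hostgroup: str, datadir: str = "data") -> str:
    """Expand hostgroups/%{hostgroup}.yaml the way Hiera interpolation does:
    the host group string is dropped into the path as-is, so its slashes
    become directory separators."""
    return f"{datadir}/hostgroups/{hostgroup}.yaml"

print(hiera_hostgroup_path("base/core/development/dnsmaster"))
# data/hostgroups/base/core/development/dnsmaster.yaml
```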

I’m going to try to write a proper document on this based on what I’ve learnt, because while it’s a niche setup, it’s got a lot of potential, especially for horizontally scaling services and the tried-and-tested Puppet roles-and-profiles pattern.


I should probably raise this in a new thread in Support now, as it’s getting more specific and support-related as I dig a little more.

I’ve tried rolling this out in a bigger test environment with more diverse hostgroups and I’ve hit a challenge.

In my earlier testing, I was testing nodes in specific host groups to try out roles-and-profiles patterns, using technology that aligns to that: webserver/website, database/{master,slave}, etc.

In the wider test environment there are nodes that just sit there for general use/testing.

So in my isolated tests detailed in this thread, I’d been using a node that was aligned to the host group

hostgroup: base/core/development/dnsmaster

which mapped to

data/hostgroups/base/core/development/dnsmaster.yaml

However, there are nodes in the test env that sit further up that hierarchy, or nodes that don’t even have specific host group parameters,

eg:

hostgroup: base/core

would be a node that has no profile, just the base classes that every node gets, and its parameters are picked up from common.yaml as they are common to all.

In this case, I get an error in the Puppet run:

Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Puppet::Parser::Compiler failed with error Errno::EISDIR: Is a directory - /etc/puppetlabs/code/environments/production/data/hostgroups/base/core on node picard.no-dns.co.uk

It’s not happy because, to satisfy

data/hostgroups/base/core/development/dnsmaster.yaml

from the earlier example, the directory

data/hostgroups/base/core

needs to exist.

So it’s unhappy because it’s expecting a YAML file but it’s got a directory.
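For what it’s worth, the failure is reproducible entirely outside Puppet: it’s just what the OS returns when you try to read a directory as a file. A small Python sketch of the same situation (paths made up to mirror my layout):

```python
import os
import tempfile

# Recreate the shape of the data: base/core must be a directory because
# deeper host groups (base/core/development/dnsmaster) live under it.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "hostgroups", "base", "core", "development"))

try:
    # A node in host group base/core makes Hiera try to read this path:
    open(os.path.join(root, "hostgroups", "base", "core")).read()
except IsADirectoryError as exc:
    # Same errno 21 (EISDIR) that Puppet surfaces in the 500 error
    print("EISDIR:", exc.errno)
```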

My expectation would be behaviour similar to non-host-group-based parameters, with an if-exists type setup. Eg: I have

data/os/RedHat.yaml

the Debian nodes don’t error because there is no Debian.yaml; they just ignore it. I’d expect the ‘core’ nodes not to error because there is no core.yaml, and to move on because there are no parameters.

To try to work around this, I included the empty file

data/hostgroups/base/core.yaml 

so there is now a directory called core and a YAML file called core.yaml in the ‘base’ directory.

Same error.

To get the Puppet run to work, I had to delete the core directory (and everything under it), which breaks the hierarchy of my test use case of the dnsmaster/slave setup;

to get it working, I needed just the empty file

data/hostgroups/base/core.yaml

Is this the expected behaviour when using host groups to define parameters? Have I missed something in the test setup that is causing this behaviour?

Any advice or workarounds I could consider?

After some useful help from the Foreman and Puppet communities, I think I’ve got a better understanding of how this works: pros, cons, limitations.

The summary to my last question is: ‘you can’t have a node aligned to a host group that is a directory, and you can’t have a host group that is both a directory and a YAML definition.’

So, to work around this limitation in my working example, a host group of

hostgroup: base/core

which was aligned to a directory, had an additional level added to it:

hostgroup: base/core/default

which on disk moved away from

data/hostgroups/base/core

to

data/hostgroups/base/core/default.yaml
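So the resulting layout on disk, sketched out, keeps the default leaf and the deeper profiles side by side:

```
data/hostgroups/base/core/
  default.yaml       # host group base/core/default
  development/
    dnsmaster.yaml   # host group base/core/development/dnsmaster
```

Every level that has children only ever contains directories plus leaf .yaml files, never a file and a directory with the same name.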

This still makes the roles-and-profiles pattern workable, and depending on your needs there are some clever people in the Puppet community who have done some clever things to overcome this limitation, but that is overkill for my needs.

I’ll try to write up a proper post on what I’ve learnt on this, as I believe there is a lot of value in this pattern for people.
