Single Organization on resources to simplify taxonomies

ezr-ondrej · September 23, 2021, 11:35am

The current taxonomy model is very hard to understand for both Users and Developers, which in turn complicates life for both.

Even during the primary design [1], there were many arguments for a simpler model. There was even a follow-up RFC to change this to that simpler model, best summarized in Ivan’s comment on it [2]. We even have user feedback on the topic, e.g. [3].

Every time we try to introduce a new resource/entity, we have a hard time figuring out how it fits with our existing multi-tenancy models. It’s further complicated by the fact that some resources belong to a single Organization and Location (Host) while others belong to multiple (Subnet, Domain). Some support only single-Organization multi-tenancy and no Location at all (all Katello resources). Some ignore multi-tenancy completely (Operating System). A more complex object that combines other resources causes a lot of problems (e.g. a Host living in org 1 assigned to an org-less OS, assigned to a subnet available in 3 orgs, assigned an LCE from a single Org but no Loc, etc.). Also, with Katello, orgs can no longer be nested, while in non-Katello installations Organizations can be nested in a tree structure, so a Host in fact belongs to multiple organizations.

I’d like to propose a way out of that. In my current effort I need to have Ansible roles taxable at least by Organization so I’m proposing to start implementing this new model in Foreman Ansible, but I’d love to take it into the Foreman core for all the resources.

The proposal: Organization should be the only taxonomy, with a strict single-organization-per-resource restriction, and Location would be only a classification, not a multi-tenancy source.

This is a simple proposal that would have to be properly designed and thought through. The steps I have in mind, though, are:

Identify resources that can have multiple organizations assigned for no reason and simplify that by forcing a single organization (Hostgroup would be the first in Foreman core)
Identify resources that need to be shared across organizations and figure out a way to do so (keeping the current model would be the easiest), but limit this to only those resources that have to be shared.
Consider dropping Organization nesting, to get rid of multi-tenancy of models completely
Make Location only a classification - remove Location from RBAC, stop enforcing its selection and stop filtering by it in default scope and permission checks
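To make the end state of these steps concrete, here is a minimal sketch in Python. It is not Foreman code: the `Resource` class, the `visible_to` and `in_location` helpers, and the host and location names are all hypothetical and only illustrate the idea that Organization is the single tenancy boundary while Location is an opt-in label.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Resource:
    """Hypothetical resource: exactly one organization, location is only a label."""
    name: str
    organization: str               # mandatory single owner, the tenancy boundary
    location: Optional[str] = None  # optional classification, not used for access


def visible_to(resources, user_organization):
    """Default scope under the proposal: filter by organization only."""
    return [r for r in resources if r.organization == user_organization]


def in_location(resources, location):
    """Location remains available as an explicit, opt-in filter."""
    return [r for r in resources if r.location == location]


# Illustrative data only.
inventory = [
    Resource("web01", organization="A", location="Brno"),
    Resource("web02", organization="A", location="Zurich"),
    Resource("db01", organization="B", location="Brno"),
]

org_a = visible_to(inventory, "A")       # tenancy check
org_a_brno = in_location(org_a, "Brno")  # classification applied on top
```

In this sketch a user in organization A never sees `db01`, even though it shares a location with `web01`, because location takes no part in the visibility check.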

As this was discussed many times, I’d appreciate it if you raised blockers and thoughts on this plan in general and tried not to deep dive into the implementation design yet. I’ll open a thread once we start the design of each step, where we can discuss implementation details.

[1] Organization/Location
[2] https://community.theforeman.org/t/rfc-replace-taxonomy-with-true-relationships-for-organizations-and-locations/4946/5
[3] Organizations and locations

Dirk · September 23, 2021, 1:36pm

I will reach out to a Red Hat customer, where I only do Icinga work, who probably has the most complicated scenario here. Let’s see if this can identify blockers for simplified taxonomies.

From other customers and environments:

Locations as classification should totally be fine; nesting is nice here but not necessary
Organisations are great for different customers or departments; if you need both, nesting is great, but it could surely be done in other ways

So I think the idea is good, as it simplifies development and perhaps also the user workflows, and at least it does not limit the user.

Marek_Hulan · September 23, 2021, 3:09pm

For the record, I agree with the proposed end goal. I can see how hard it may be to get there, but I can find enough examples to justify the change. I see you posted this in Development; would it make sense to cross-post or change the tag so that users also see it? I think their use cases will be crucial when we enter the design phase and talk about what/how to share some resources in the future.

ekohl · September 23, 2021, 6:40pm

I suspect it will depend on the specific model.

For example, a host group is a concept that really represents organization within Foreman - there is nothing “out there” that it reflects. That means it’s probably easy to have one organization per host group.

However, a subnet reflects something “real” and can be shared. Let’s say you have an organization where there’s a shared 192.0.2.0/24 where all customers without their own subnet run their machines. If you have an organization for each customer, then you can’t model this. Also remember that on the Smart Proxy side there are no taxonomies and that free IP checking relies on a correct view.

Similarly, a Smart Proxy can also be shared. For example, a Puppet CA may be shared between customers in the same way as a subnet.

There may be some implications, such as that now the organization + name become unique rather than just the name. In some places our API allows querying by name (also in the URL).
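The name-uniqueness implication can be sketched as follows. This is a hypothetical in-memory `Registry` in Python, not Foreman’s API; it only shows that once the key becomes (organization, name), a lookup by bare name can become ambiguous and needs an organization to disambiguate.

```python
class AmbiguousName(Exception):
    """Raised when a bare name matches resources in several organizations."""


class Registry:
    """Hypothetical store where uniqueness is (organization, name), not name."""

    def __init__(self):
        self._items = {}  # (organization, name) -> payload

    def create(self, organization, name, payload):
        key = (organization, name)
        if key in self._items:
            raise ValueError(f"{name!r} already exists in organization {organization!r}")
        self._items[key] = payload

    def find_by_name(self, name, organization=None):
        """Name-only lookup works only while the name is unambiguous."""
        matches = [
            (org, payload)
            for (org, item_name), payload in self._items.items()
            if item_name == name and organization in (None, org)
        ]
        if not matches:
            raise KeyError(name)
        if len(matches) > 1:
            raise AmbiguousName(f"{name!r} exists in {sorted(o for o, _ in matches)}")
        return matches[0][1]


# Illustrative data only: two organizations create the same domain name.
registry = Registry()
registry.create("A", "example.com", {"dns": "proxy-a"})
registry.create("B", "example.com", {"dns": "proxy-b"})
```

Here `registry.find_by_name("example.com", organization="A")` resolves, while `registry.find_by_name("example.com")` raises `AmbiguousName`, which is the situation a name-based URL would run into.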

So it’ll really depend on each model.

I also agree that this is more like an RFC. Any objections to moving it there?

lzap · September 24, 2021, 1:50pm

I would definitely take it further:

Make organization a single taxonomy for all resources
Make location only a classification
All resources must have exactly one organization assigned (no exception like OS or Subnet)
No Any Organization context - users have to select an organization after login

In the new model, nothing would be shared. Meaning, more than one organization could create the very same domain, subnet or operating system. It makes a lot of sense: say there are two organizations, A and B, and both want to create the example.com domain and the subnet 192.168.1.0. We have this concept of “sharing” in current Foreman and I think it is a bad design.

An administrator should be able to deploy smart proxies for both A and B organizations and register them, and org admins/users should be able to define their own networks, domains or OSes in an independent manner.

See, the two orgs can have the same domains, but they need to have two separate DNS backend systems to manage them. Same for subnets. These would be two independent networks. Now, I understand that there might be a use case where two organizations would share the same infrastructure - but then they need to operate under a single organization and only use location to classify resources.

This will work well with Katello as well; I believe this is exactly what Katello does - every Product, Repository and Content View must have an organization assigned. If anything is shared, then it’s on the backend level (Pulp will not redownload the same packages).

Dirk · October 4, 2021, 12:41pm

I got an answer today, and it looks like there is not that much complexity; it was just the high number of organizations that “scared” the consultant who built it. There are about 40 organizations used to separate the different business units, and it seems to slow down the system.

So let’s assume simplifying the taxonomies will result in the same number of organizations or only a few more. I would also assume simpler means better performance, so it should be fine, but this number can give us a nice edge case for testing.

ekohl · October 4, 2021, 1:09pm

How would that work? Unless you do weird things, there can only be a single nameserver that serves example.com. The same with a subnet.

lzap · October 4, 2021, 4:46pm

A single DNS/DHCP server can be managed by two or more smart proxies with some limitations.

Duncan_Innes · November 24, 2021, 8:57am

While Locations might be hard to understand, they can be crucial in some areas of business.

We have jurisdictional restrictions where any server in a certain country can only be administered by admins from that same country (Switzerland). They are still servers which belong to the whole company however, so need to use exactly the same CV/CCVs as the rest of the estate. Splitting this into a separate Org for Switzerland might be acceptable for some users, but not all.

We also need to report data on all systems registered to the Satellite. Pulling some servers into a different Org complicates the task of reporting.

If Location drops to a classification, we would need to find a way to restrict modifications to any host in that Location to a specific set of users. Would this sit well with your “remove Location from RBAC” statement?

mhjacks · November 28, 2021, 5:40pm

I use organization a bit differently, perhaps. I’m running Foreman/Katello in my home lab, where I serve and manage Fedora, CentOS Stream, and proper RHEL content. I use one organization for the Fedora and CentOS content, and another org for the RHEL content. (This is because simple content access is incompatible with the non-RHEL content.) I would much, much rather have a single organization, since I don’t believe I need this complexity.

Duncan_Innes · November 29, 2021, 4:13am

Incompatible? In what way is it incompatible?

It’s all just content at the end of the day. Are you using counted subscriptions for the CentOS/Fedora content, but not for the RHEL?

This would be a deal breaker for large Satellite customers who use SCA for their Red Hat content, but also push their own RPMs into repos. I’m not sure this is the case though.

Dirk · November 29, 2021, 8:09am

Simple content access gives you access to all repositories, but as some could be incompatible, it allows restricting the provided repositories by OS version and architecture. The OS version part is only available for RHEL, so if another OS is in the organisation with Simple content access, you cannot limit by version. So here it is incompatible.

What you can do is restrict every repository, so repositories from other vendors, including your own, are perfectly fine.

And you can possibly create content views which limit the available repositories to mitigate the incompatibility, but this would remove the simple from Simple content access.

mhjacks · November 29, 2021, 1:58pm

“Incompatible” might be the wrong word here. The rub here is managing content from CentOS, Fedora, and RHEL - the Venn diagram is pretty complex even for fairly small use cases (VScode and Chrome repos should be used by all 3, EPEL is for CentOS and RHEL, RPMfusion for Fedora only) - and the easiest way to manage it that I’ve seen is with products and “subscriptions” (even though those things don’t really apply to CentOS/Fedora, Candlepin makes some things a bit easier when done that way).

I don’t mean to imply that I’m thrilled with my current setup, and if there’s a better way I’ll definitely follow it. I understand there’s a lot of work happening to improve the SCA experience; I just want to make sure that the content management experience for stuff that doesn’t have “magic” stays good too.

markdv77 · December 1, 2021, 4:09pm

Personally, with our use-cases in mind, I don’t see a reason NOT to share Hostgroup between Organizations. If there is a hard definition or prescribed use-case for Hostgroup I’ve missed it. The way we use Hostgroups is to group hosts with similar/same functionality. The fact that you can associate puppet classes with a hostgroup (which we do) seems to support this way of using them. Why should you not be able to use the same type of host in multiple Organizations?

We share literally EVERYTHING (that can be shared). We have different organizations in some ways and on some levels. But from an IT perspective it’s one single global effort that shares as much as possible. This is reflected in how we use Foreman and its resources.

Specifically, in our case, the hostgroup determines what kind of server Puppet will configure the host as, while the Organization determines certain Organization-specific configuration of such a server for the organization it’s assigned to. (The Organization is a variable in the Hiera layer definition.)

If you do end up making Hostgroups non-shareable, would the names have to be unique globally or only per Org? If it’s the latter, we could get by by duplicating them across all Organizations: ugly and cumbersome, but doable. If they have to be globally unique, though, we’d have to find workarounds for our Puppet manifests and/or Hiera data too.

ezr-ondrej · December 3, 2021, 10:12pm

So what are you actually gaining by splitting into organizations if you share everything? Is it just to label the resources as belonging to a specific group? Would that be supported by some simple tagging of the resources? Or do you benefit from the Organizations in some other way?

We would force uniqueness only per Org, but that should not mean duplicating resources into Orgs; that would be a bad experience and I’d like to avoid it. I’d like to shift the mindset here to Organizations meaning a hard split of resources. We might need to define what Organization means first, though, and then align all the resources across the project to that definition.

markdv77 · December 14, 2021, 9:45am

Both the Organization & Location are used in hiera. We have more, but somewhere in the stack defined in hiera.yaml there are:

  - name: "Foreman Location"
    mapped_paths: [ location_paths, path, "locations/%{path}.yaml" ]
  - name: "Foreman Organization"
    path: "organizations/%{::organization}.yaml"

So lookups from puppet will yield Location and/or Organization specific values.

(‘location_paths’ is an array created from $location_title in site.pp. If $location_title is e.g. ‘top/sub/leaf’, the array will contain ‘top/sub/leaf’, ‘top/sub’ & ‘top’, so we end up looking in three files, from most to least specific.)
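For readers unfamiliar with the pattern, the expansion described above can be sketched in Python. This is only an illustration of the logic, not the Puppet code from our site.pp; the function names `location_paths` and `hiera_files` are made up for this example, and the file pattern follows the `locations/%{path}.yaml` entry shown in the hierarchy.

```python
def location_paths(location_title):
    """Expand 'top/sub/leaf' into its ancestors, most specific first."""
    parts = location_title.split("/")
    # Drop one trailing segment per step: full path, then parent, then root.
    return ["/".join(parts[:i]) for i in range(len(parts), 0, -1)]


def hiera_files(location_title):
    """File names consulted, following the 'locations/%{path}.yaml' pattern."""
    return [f"locations/{path}.yaml" for path in location_paths(location_title)]
```

Calling `hiera_files("top/sub/leaf")` gives the three file names in lookup order, which is why a value set for ‘top’ is overridden by one set for ‘top/sub/leaf’.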

I doubt we could use tags for the same purpose (with the same ease).

Regards,
Mark.