Why don't we have unique index on host names?

lzap · June 18, 2021, 6:22am

Hello,

recent bugreport reminded me that we don’t have unique index on hostnames:

https://projects.theforeman.org/issues/32832

Foreman has unique RoR validator, but these validators are pure evil - unless you do not do those checks in SQL transactions, which you can’t even do, they are not consistent. I briefly remember that in RoR world, database constraints are not handled correctly, but isn’t this something that has been solved in recent RoR versions?

Jean in the issue makes a good point that Foreman positions itself as the infrastructure inventory, single point of truth. Well, when these things happen, you can’t really rely on it, that’s not great at all.

What it takes to enable unique index contraint on hostnames? We currently do have 40 of them in the database, so technically this is possible.

Marek_Hulan · June 18, 2021, 8:10am

I’m not aware of any issues with RoR and DB constraints. I believe they play nicely with validators.

I think the reason was, you can have two hosts with the same name in two different orgs. We never fully supported this model though and there would be probably other places where that does not work. So I’d say nothing prevents us to add such constraint.

A bit of off topic:
Validators are not pure evil, their purpose is to make it easy to validate user input. However the unique validator does not protect DB from duplicits. That’s why there are constraints. It’s like you’d say client side validators are pure evil They play a role in user experience but will never protect the server-side storage reliably. The same with backend -> DB.

ekohl · June 18, 2021, 8:23am

To add to that: AFAIK we do not uniquely identify a host by its hostname and never assume that. There’s a host UUID and a certname that is unique. At least, that’s how Puppet works. However, as @Marek_Hulan said, it probably is broken because not everyone was aware of this.

Recently I had a chat with @ehelms and I think we need some design principles that every developer/maintainer should know. Those are fundamental things about the Foreman design.

The example I then gave was that all info for a host to be configured should be in the ENC. That means the hostname, FQDN, network interfaces etc. But in the case of Katello it should also be the RHSM URL and content URL. Katello should add that info.

Taking this case, I can imagine we could add “hostname are not unique, do NOT assume that in your code; if you need it, use certname or uuid”.

Marek_Hulan · June 18, 2021, 8:34am

The problem with the not-unique hostname (or better to say fqdn) is, people typically use that in REX/Ansible without realizing, it’s the foreman proxy box that actually converts that to IP. Image based provisioning has the same problem I think. I’m almost sure smart class parameters and ansible variable matchers don’t support that.

If I’m not mistaken, certname is only set if we have it from Puppet. That means it’s not even avaialble during the provisioning (or when the host gets saved for the first time, before the Puppet CA orchestration is done). UUID is also based on incoming facts (this is I think more adopted in other fact sources).

I agree we need to make a decision and spread the word. But I’m not sure at this point we can say we can’t rely on unique host names.

ekohl · June 18, 2021, 8:46am

You’re right that it probably regressed to the point where we must make a choice. Do we really want to support non-unique or really make it unique. Now we probably have the worst of both worlds: can’t rely on it being unique but also can’t really use it non-unique.

Marek_Hulan · June 18, 2021, 8:49am

lzap · June 18, 2021, 9:38am

Yeah this was not what I meant with the rant. Unique validators are bad, because they simply don’t work

I am afraid we already made a choice:

validates :name, :presence => true, :uniqueness => true, :format => {:with => Net::Validations::HOST_REGEXP, :message => _(Net::Validations::              +HOST_REGEXP_ERR_MSG)}

Except it does not work in environments with some concurrent access.

What I am asking is do you see any roadblocks in making this a database constraint? Obviously migration can be problematic as there are many hosts with same hostnames out there. We would need to rename them somehow in the migration.

As much as I would love to have a real unique identifier across all our workflows, I think in reality this is not feasible. Having said that, I am not against it and if we choose to remove both the validator and the constraint in some future when everything is ready - let’s do it.

Marek_Hulan · June 18, 2021, 1:54pm

Ok, given the validator is already there for a long time, I’d officially say we don’t support multitenancy/full isolation and we can rely on hostname to be a unique identifier of the host. Not because it’s better, but simply because it’s reality today with no change planned soon. If you think it’s worth a bigger discussion, let’s do that in separate thread. Introducing a constraint is probably ok for now. Regarding existing duplicit records, IMHO it’s an edgecase we may fix by providing a rake script that would ask user whether to rename (changes DNS potentially) or delete such records. I’d expect the migration to fail in such case asking user to run the rake script first.

ehelms · June 18, 2021, 2:03pm

I agree with this. To my knowledge we have shared that with users in discussions many times and aimed code at the notion that hosts by way of hostname are unique for Foreman and not unique per organization. I think this has made sense to users but always skewed the lines of our various multi-tenancy ~~models~~ rules. And then let’s make sure we write this down clearly for users.

So if we can make the situation more concrete and less flimsy for users let’s make that change. If we want to persue changing this, let’s do as suggested and start up a separate conversation thread to weigh the pros and cons.

ekohl · June 18, 2021, 5:25pm

At first I wondered what name was. It looks like the managed host has shortname as well so name is the FQDN?

For example, having a db01.prod.example.com and db01.test.example.com should be supported, but if name is only db01 then it’ll be a problem. However, this doesn’t appear to be the case.

If I read that right, then I agree the uniqueness constraint should be enforced at the database. IMHO frontend only uniqueness enforcement is a bad idea. Both is probably ok.

Marek_Hulan · June 18, 2021, 8:45pm

If only it was that easy But in most cases, you right. Host model has name, hostname which are aliases and in most case equals to FQDN (despite their name). There’s fqdn which looks at fact with name fqdn and fallbacks to name. There’s also shortname which is the same without the domain. Domain can be found by domain_name. There’s a bit more magic in what exacly is stored where if we have or don’t have the respective domain defined. The best thing is probably to use hostname or its alias name.

Btw when I was refreshing my memory, I saw certname also fallbacks to name when unknown, so it wouldn’t be a good unique identifier.

lzap · June 23, 2021, 8:59am

Can you show some references to that? Because our codebase does not allow unique names across orgs. I am asking because:

I think we can just do this and schedule it into 3.0 as one of the bigger changes. Probably with the change, we can improve documentation on this and provide some guidance on how to approach this in multi-tennant environment:

https://projects.theforeman.org/issues/32857

I have this bad feeling that this change will break thousands of tests tho

lzap · June 23, 2021, 9:02am

For the record, the original author of the issue provided more details:

In case it helps, we have noted that in the hosts table, the conflicting data record is always (obviously) the ones with a greater auto-increment ID:

foreman=# select id,name from hosts where name = 'XXXXXXXX.XXXXX.XX.XX.XXXXXX.XXX';
id | name |
4211 | XXXXXXXX.XXXXX.XX.XX.XXXXXX.XXX |
3046 | XXXXXXXX.XXXXX.XX.XX.XXXXXX.XXX |
(2 rows)

It is a bit crude, but we deleted the records directly prior to applying the index like this:

foreman=# BEGIN;
BEGIN
foreman=*# DELETE from fact_values where host_id = 4211;
DELETE 829
foreman=*# DELETE from host_facets_reported_data_facets where host_id = 4211;
DELETE 1
foreman=*# DELETE from host_classes where host_id = 4211;
DELETE 0
foreman=*# DELETE from host_status where host_id = 4211;
DELETE 1
foreman=*# DELETE from nics where host_id = 4211;
DELETE 13
foreman=*# DELETE from reports where host_id = 4211;
DELETE 1
foreman=*# DELETE from tokens where host_id = 4211;
DELETE 0
foreman=*# DELETE from hosts where id=4211;
DELETE 1
foreman=# COMMIT;

Out of several thousand hosts, we only had 3 inconsistencies accros organizations and 1 inconsistency within the same organization. In all cases, the correct entry was always following logic (e.g: the oldest entry was the correct one).