Making Facts & Reports collecting optional

Idea from task #28825:

Storing lots of fact and/or report data can quickly go out of control and our research shows that not everyone uses these Foreman features. We would like to make fact and report data storing in Foreman database opt-in via Administer setting. For new installations this will be turned off.

After implementing it in PR #7475, discussion started about why the facts and reports collecting should (not) be enabled or disabled by default.

  • :white_check_mark: Enabled by default
  • :x: Disabled by default
  • :ballot_box_with_check: (original idea from the feature task) Enabled if user already collected some facts / reports OR Disabled when there are no facts / reports in the database.
0 voters

So my question is what community thinks about it and how we should implement it?

Let me explain, we have experienced several customers of Red Hat Satellite who had issues with performance of PostgreSQL and whole deployment because huge logs and fact_values tables. Thing it, they haven’t used facts or reports at all. While refactoring this is on our radar, we would like to ship a sane default to prevent this from happening.

The change introduces a clear message on the empty facts and reports pages so for users it’s obvious what to do. Also, this will not affect existing installations.

I’ll reply later with a more detailed response, but for now I’ve converted the options into a poll.

In a Foreman scenario I think configmanagement is a big part so I think this is an expected feature so it should be enabled. In a Katello scenario I can see environments where it is mostly used for content management especially staging the content, so I see a use for disabling it, but having different defaults seems bad, so I voted for enabled by default.

I think the entire design needs to go back to the drawing board and declare what it’s trying to achieve.

The current implementation breaks all configuration management. This is currently the default deployment meaning users will get their logs spammed with error messages. This scares away new users and overall doesn’t make sense.

I also wonder if this doesn’t break basically all use cases. subscription-manager also uploads facts. Possibly via a different endpoint, but you’ll still end up with data in your database. Now you just can’t see it anymore until you toggle some setting to on.

IMHO this is Satellite’s problem. It makes it mandatory to install Puppet meaning those facts are uploaded, even if on a technical level it’s possible to disable it. If you don’t want reports and facts, don’t upload them in the first place; don’t configure all uploaders while breaking the receiver.

A better solution might be to actually figure out how make the facts perform much better with some database redesign etc.

I already did that research, it’s been reported here on this forums. It’s just matter of priorities, Foreman is a huge codebase and it’s just handful of us working on it. But in short:

Reports should be stored as text, not as individual lines associated via single join table. Much faster updates and less space for data and indices, slightly slower searching through reports (tablescan in the worst case).

Facts should be stored also as JSON (text), only some key facts should be broken into key-value store. Those which need to be searchable. This is how Puppetmaster appears to store them (they call them volatile and stable facts IIRC).

I was the one initiating this patch when we were dealing with customer case, but I have to say at this moment that I am not sure about it. Ewoud raised good concerns about them, generally speaking we want to have less settings and I haven’t considered subscription-manager and also the fact we now parse puppet/rhsm facts creating some extra host attributes like uptime or cpu/core count and memory showing these in the UI and CLI. Also, when I initiated this months ago, I was not aware that new group led by @tbrisker will be created to figure out how to make puppet optional.

All and all, I know this might sound weird but, I am against the patch to be merged upstream. We should really priorize how facts and reports are stored in the DB.