Has anyone used PostgreSQL partitioning to help with database maintenance in Foreman? For example, creating partitions for the larger tables (logs, reports, audits, fact_values, etc) and just dropping the old partition rather than using foreman-rake crons?
Have you had any issues with it? Any positive outcomes?
It’s indeed a good idea, I haven’t heard about this myself. Be sure to make notes and get back to us with instructions and results so we could incorporate this into our docs.
We would like to optimize reports (logs) and facts (fact_values) some day, but until we get there partitioning would be great answer. Report expiration is for example huge pain and its painfully slow. Be sure to turn off the cron job which takes care of expiration and do it on the partition level instead.
One remark for fact_values table tho, we used to have created_at and updated_at columns in this table but we dropped these. If would not be good fit anyway. What makes more sense in this case is probably hash partitioning, I am not sure about this one.
Well, it was a big failure, unfortunately. I have our DBA looking at it to try to figure out why. We tried to add partitions to the fact_names, fact_values, logs, audits and reports tables. We used pg_partman for this. In order for it to work, we had to modify the tables as such:
ALTER TABLE $table_name ALTER COLUMN created_at SET NOT NULL;
When we started Foreman back up, we had all kinds of problems. One of the errors I managed to catch was:
STATEMENT: SELECT 1 AS one FROM "fact_values" WHERE "fact_values"."fact_name_id" IS NULL and "fact_values"."host_id" = $1 LIMIT $2
ERROR: null value in column "fact_name_id" violates non-null constraint
DETAIL: Failing row contains (360774385, $ipaddress, null, 217538, $datestamp, $datestamp).
CONTEXT: SQL statement "INSERT INTO public.fact_values_p2020_07_27 VALUES (NEW.*)"
public.fact_values_p2020_07_27 is one of the partition tables for fact_values.
Now, I don’t know if it mattered or not, but, when I did the upgrade, I dumped the database but excluded data from the tables I mentioned above. So when we started Foreman with 1.21, the tables were empty.
You can also look at switching fact* and messages, logs and sources tables to unlogged mode. It is less reliable however facts can be easily reconstructed once hosts check in and the same is for reports.