Slow reports? Change to hash indices!

Hello,

during my research on reports performance, I came with two small patches which can both dramatically improve performance of report upload and expiration. While these patches are still to be reviewed, I would like to hear from experience from production systems.

  1. Foreman renders reports back after creation, however most (if not all) clients (puppet, ansible, openscap) do only check for the HTTP code. The rendering of huge JSON is slow (the culprit is likely ActiveRecord), but I am proposing to drop the biggest offender from the response - all the log lines. To test this, apply this small template change and restart the server to see if report processing is faster: https://github.com/theforeman/foreman/pull/7987/files

  2. Reports are broken up into lines (messages, sources) and stored as individual records. While we would like to rewrite this completely, it is possible to do a small change in postgres to boost performance quite a bit. Dropping btree indices on messages.digest and sources.digest columns replacing them with hash indices makes everything much faster - specifically on instances with many reports:
    https://github.com/theforeman/foreman/pull/7981#issuecomment-694741365

# su postgres -c "psql foreman"
foreman=# drop index index_messages_on_digest;
DROP INDEX
foreman=# create index index_messages_on_digest on messages using hash (digest);
CREATE INDEX
foreman=# drop index index_sources_on_digest;
DROP INDEX
foreman=# create index index_sources_on_digest on sources using hash (digest);
CREATE INDEX

Index creation was pretty fast on my testing VM: 18 million records per minute. Also all the changes are safe - RABL change will be overwritten during upgrade (no matter how we end up fixing this issue) and if we choose to convert indices this will happen during database migration as well (dropping existing and recreating them once again).

Please get back to us with numbers if you apply those changes:

  • How fast was report upload before and after the RABL patch (1)?
  • How fast was report upload before and after the hash index (2)?
  • How big were sources/messages tables before and after the indices creation (2)?
  • How quick was the index creation and how many records were in the two tables (2)?
  • Any observations during report expiration rake (cron) job?

Thanks!

4 Likes

So, I definitely want to test this. Will this work on older versions (1.21) of Foreman? Can I just do the index part?

Yes, any version that has PostgreSQL 10 can do this. Even on older versions of postgres you can do this, except that on 9 or older hash indices are not WAL logged. Meaning that if you have a crash, you need to drop and recreate indices manually. Just keep that in mind.

While we are still looking for reports from Puppet-heavy deployments, I already got some results from OpenSCAP heavy deployment and here the patch/changes do not help at all. OpenSCAP uses reports in slightly different way and it does not create many records in logs/messages tables. It also has a different RABL template so no change there. So if you are OpenSCAP shop, that might not work out for you. For Puppet, expect improvements tho.