Slow generate applicability

Problem:
I have noticed recently, that after updates on a larger number of hosts it takes a while until upgradable package counts get updated. I can see that a lot of “Bulk generate applicability for hosts” are running for several minutes, sometimes more than 20 minutes. I see also that postgres is very busy. The postgres logs contain long running queries.

Expected outcome:
Faster “Bulk generate applicability for hosts” runs

Foreman and Proxy versions:
foreman-3.14.0-1.el9.noarch
katello-4.16.1-1.el9.noarch

Distribution and version:
AlmaLinux 9.6

Other relevant data:
In the postgres logs I can find a lot of queries like the following which take approx. one minute each. Each query seems to be for each host after each update task. I guess, that’s the reason why it takes so long:

2025-06-23 07:14:54 CEST LOG:  duration: 49038.861 ms  execute <unnamed>: SELECT "katello_rpms"."id" FROM "katello_rpms" INNER JOIN katello_repository_rpms ON
	            katello_rpms.id = katello_repository_rpms.rpm_id INNER JOIN katello_installed_packages ON
	            katello_rpms.name = katello_installed_packages.name AND
	            katello_rpms.arch = katello_installed_packages.arch AND
	            katello_rpms.evr > katello_installed_packages.evr AND
	            katello_installed_packages.id in (SELECT DISTINCT ON (katello_installed_packages.name, katello_installed_packages.arch) katello_installed_packages.id FROM katello_installed_packages INNER JOIN katello_host_installed_packages ON katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE katello_host_installed_packages.host_id = 347 ORDER BY katello_installed_packages.name, katello_installed_packages.arch, katello_installed_packages.evr DESC) LEFT JOIN katello_module_stream_rpms ON
	            katello_rpms.id = katello_module_stream_rpms.rpm_id INNER JOIN katello_host_installed_packages ON
	            katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE (katello_repository_rpms.repository_id in (1971,16489,28501,16492,16491,16490,16493,190048,226614)) AND (katello_host_installed_packages.host_id = 347) AND ((katello_module_stream_rpms.module_stream_id IS NULL AND
	            katello_installed_packages.id NOT IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)) OR
	            (katello_module_stream_rpms.module_stream_id IN (SELECT "katello_module_streams"."id" FROM "katello_module_streams" inner join katello_available_module_streams on
	            katello_module_streams.name = katello_available_module_streams.name and
	            katello_module_streams.stream = katello_available_module_streams.stream and
	            katello_module_streams.context = katello_available_module_streams.context inner join katello_host_available_module_streams on
	            katello_available_module_streams.id = katello_host_available_module_streams.available_module_stream_id WHERE (katello_host_available_module_streams.host_id = 347 and
	            katello_host_available_module_streams.status = 'enabled'))
	            AND katello_installed_packages.id IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)))
2025-06-23 07:14:58 CEST LOG:  duration: 48128.038 ms  execute <unnamed>: SELECT "katello_rpms"."id" FROM "katello_rpms" INNER JOIN katello_repository_rpms ON
	            katello_rpms.id = katello_repository_rpms.rpm_id INNER JOIN katello_installed_packages ON
	            katello_rpms.name = katello_installed_packages.name AND
	            katello_rpms.arch = katello_installed_packages.arch AND
	            katello_rpms.evr > katello_installed_packages.evr AND
	            katello_installed_packages.id in (SELECT DISTINCT ON (katello_installed_packages.name, katello_installed_packages.arch) katello_installed_packages.id FROM katello_installed_packages INNER JOIN katello_host_installed_packages ON katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE katello_host_installed_packages.host_id = 284 ORDER BY katello_installed_packages.name, katello_installed_packages.arch, katello_installed_packages.evr DESC) LEFT JOIN katello_module_stream_rpms ON
	            katello_rpms.id = katello_module_stream_rpms.rpm_id INNER JOIN katello_host_installed_packages ON
	            katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE (katello_repository_rpms.repository_id in (16490,16491,16489,16492,28501,1971,16493,190048,226614)) AND (katello_host_installed_packages.host_id = 284) AND ((katello_module_stream_rpms.module_stream_id IS NULL AND
	            katello_installed_packages.id NOT IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)) OR
	            (katello_module_stream_rpms.module_stream_id IN (SELECT "katello_module_streams"."id" FROM "katello_module_streams" inner join katello_available_module_streams on
	            katello_module_streams.name = katello_available_module_streams.name and
	            katello_module_streams.stream = katello_available_module_streams.stream and
	            katello_module_streams.context = katello_available_module_streams.context inner join katello_host_available_module_streams on
	            katello_available_module_streams.id = katello_host_available_module_streams.available_module_stream_id WHERE (katello_host_available_module_streams.host_id = 284 and
	            katello_host_available_module_streams.status = 'enabled'))
	            AND katello_installed_packages.id IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)))

Postgres explain analyze for the second query above:

foreman=# explain analyze SELECT "katello_rpms"."id" FROM "katello_rpms" INNER JOIN katello_repository_rpms ON
                    katello_rpms.id = katello_repository_rpms.rpm_id INNER JOIN katello_installed_packages ON
                    katello_rpms.name = katello_installed_packages.name AND
                    katello_rpms.arch = katello_installed_packages.arch AND
                    katello_rpms.evr > katello_installed_packages.evr AND
                    katello_installed_packages.id in (SELECT DISTINCT ON (katello_installed_packages.name, katello_installed_packages.arch) katello_installed_packages.id FROM katello_installed_packages INNER JOIN katello_host_installed_packages ON katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE katello_host_installed_packages.host_id = 284 ORDER BY katello_installed_packages.name, katello_installed_packages.arch, katello_installed_packages.evr DESC) LEFT JOIN katello_module_stream_rpms ON
                    katello_rpms.id = katello_module_stream_rpms.rpm_id INNER JOIN katello_host_installed_packages ON
                    katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE (katello_repository_rpms.repository_id in (16490,16491,16489,16492,28501,1971,16493,190048,226614)) AND (katello_host_installed_packages.host_id = 284) AND ((katello_module_stream_rpms.module_stream_id IS NULL AND
                    katello_installed_packages.id NOT IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)) OR
                    (katello_module_stream_rpms.module_stream_id IN (SELECT "katello_module_streams"."id" FROM "katello_module_streams" inner join katello_available_module_streams on
                    katello_module_streams.name = katello_available_module_streams.name and
                    katello_module_streams.stream = katello_available_module_streams.stream and
                    katello_module_streams.context = katello_available_module_streams.context inner join katello_host_available_module_streams on
                    katello_available_module_streams.id = katello_host_available_module_streams.available_module_stream_id WHERE (katello_host_available_module_streams.host_id = 284 and
                    katello_host_available_module_streams.status = 'enabled'))
                    AND katello_installed_packages.id IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)));
                                                                                                                                                     QUERY PLAN                                                                              
                                                                       
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----------------------------------------------------------------------
 Nested Loop  (cost=5098.04..56858.95 rows=3 width=4) (actual time=12117.250..50681.607 rows=6 loops=1)
   ->  Nested Loop Left Join  (cost=5097.61..56842.45 rows=17 width=4) (actual time=94.264..50662.319 rows=1656 loops=1)
         Filter: (((katello_module_stream_rpms.module_stream_id IS NULL) AND (NOT (hashed SubPlan 1))) OR ((hashed SubPlan 2) AND (hashed SubPlan 3)))
         Rows Removed by Filter: 499
         ->  Nested Loop  (cost=4901.74..56622.24 rows=67 width=8) (actual time=94.189..50640.951 rows=2147 loops=1)
               ->  Nested Loop  (cost=4901.31..6525.93 rows=8 width=324) (actual time=6.866..20.550 rows=619 loops=1)
                     Join Filter: ("ANY_subquery".id = katello_installed_packages.id)
                     ->  Hash Join  (cost=4901.02..6075.39 rows=82 width=8) (actual time=6.841..11.337 rows=619 loops=1)
                           Hash Cond: (katello_host_installed_packages.installed_package_id = "ANY_subquery".id)
                           ->  Bitmap Heap Scan on katello_host_installed_packages  (cost=13.59..1185.20 rows=667 width=4) (actual time=0.091..1.670 rows=627 loops=1)
                                 Recheck Cond: (host_id = 284)
                                 Heap Blocks: exact=67
                                 ->  Bitmap Index Scan on katello_host_installed_packages_host_id  (cost=0.00..13.42 rows=667 width=0) (actual time=0.075..0.076 rows=664 loops=1)
                                       Index Cond: (host_id = 284)
                           ->  Hash  (cost=4884.93..4884.93 rows=200 width=4) (actual time=6.725..6.734 rows=619 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 30kB
                                 ->  HashAggregate  (cost=4882.93..4884.93 rows=200 width=4) (actual time=6.334..6.523 rows=619 loops=1)
                                       Group Key: "ANY_subquery".id
                                       Batches: 1  Memory Usage: 89kB
                                       ->  Subquery Scan on "ANY_subquery"  (cost=4869.59..4881.26 rows=667 width=4) (actual time=5.118..6.033 rows=619 loops=1)
                                             ->  Unique  (cost=4869.59..4874.59 rows=667 width=324) (actual time=5.116..5.688 rows=619 loops=1)
                                                   ->  Sort  (cost=4869.59..4871.26 rows=667 width=324) (actual time=5.114..5.278 rows=627 loops=1)
                                                         Sort Key: katello_installed_packages_1.name, katello_installed_packages_1.arch, katello_installed_packages_1.evr DESC
                                                         Sort Method: quicksort  Memory: 345kB
                                                         ->  Nested Loop  (cost=13.88..4838.30 rows=667 width=324) (actual time=0.080..3.694 rows=627 loops=1)
                                                               ->  Bitmap Heap Scan on katello_host_installed_packages katello_host_installed_packages_1  (cost=13.59..1185.20 rows=667 width=4) (actual time=0.049..0.675 rows=627 loops=1)
                                                                     Recheck Cond: (host_id = 284)
                                                                     Heap Blocks: exact=67
                                                                     ->  Bitmap Index Scan on katello_host_installed_packages_host_id  (cost=0.00..13.42 rows=667 width=0) (actual time=0.040..0.041 rows=664 loops=1)
                                                                           Index Cond: (host_id = 284)
                                                               ->  Index Scan using katello_installed_packages_pkey on katello_installed_packages katello_installed_packages_1  (cost=0.29..5.48 rows=1 width=324) (actual time=0.004..0.004 
rows=1 loops=627)
                                                                     Index Cond: (id = katello_host_installed_packages_1.installed_package_id)
                     ->  Index Scan using katello_installed_packages_pkey on katello_installed_packages  (cost=0.29..5.48 rows=1 width=324) (actual time=0.009..0.009 rows=1 loops=619)
                           Index Cond: (id = katello_host_installed_packages.installed_package_id)
                           Filter: ((NOT (hashed SubPlan 1)) OR (hashed SubPlan 3))
                           SubPlan 1
                             ->  Result  (cost=0.00..0.00 rows=0 width=0) (actual time=0.001..0.001 rows=0 loops=2)
                                   One-Time Filter: false
                           SubPlan 3
                             ->  Result  (cost=0.00..0.00 rows=0 width=0) (never executed)
                                   One-Time Filter: false
               ->  Index Scan using katello_rpms_fields_index on katello_rpms  (cost=0.42..6262.03 rows=1 width=331) (actual time=73.828..81.768 rows=3 loops=619)
                     Index Cond: (((name)::text = (katello_installed_packages.name)::text) AND ((arch)::text = (katello_installed_packages.arch)::text))
                     Filter: (evr > katello_installed_packages.evr)
                     Rows Removed by Filter: 48
         ->  Index Scan using katello_msrpm_rpm_idx on katello_module_stream_rpms  (cost=0.29..0.33 rows=2 width=16) (actual time=0.006..0.006 rows=0 loops=2147)
               Index Cond: (rpm_id = katello_rpms.id)
         SubPlan 2
           ->  Nested Loop  (cost=0.56..195.58 rows=1 width=8) (actual time=0.052..0.054 rows=0 loops=1)
                 Join Filter: (((katello_available_module_streams.name)::text = (katello_module_streams.name)::text) AND ((katello_available_module_streams.stream)::text = (katello_module_streams.stream)::text) AND ((katello_available_mo
dule_streams.context)::text = (katello_module_streams.context)::text))
                 ->  Nested Loop  (cost=0.56..16.79 rows=1 width=21) (actual time=0.051..0.053 rows=0 loops=1)
                       ->  Index Only Scan using rpm_and_module_applicability_related_indices on katello_host_available_module_streams  (cost=0.29..8.49 rows=1 width=8) (actual time=0.050..0.051 rows=0 loops=1)
                             Index Cond: ((host_id = 284) AND (status = 'enabled'::text))
                             Heap Fetches: 0
                       ->  Index Scan using katello_available_module_streams_pkey on katello_available_module_streams  (cost=0.28..8.29 rows=1 width=29) (never executed)
                             Index Cond: (id = katello_host_available_module_streams.available_module_stream_id)
                 ->  Seq Scan on katello_module_streams  (cost=0.00..133.74 rows=2574 width=29) (never executed)
   ->  Index Only Scan using index_katello_repository_rpms_on_rpm_id_and_repository_id on katello_repository_rpms  (cost=0.43..0.96 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=1656)
         Index Cond: (rpm_id = katello_rpms.id)
         Filter: (repository_id = ANY ('{16490,16491,16489,16492,28501,1971,16493,190048,226614}'::integer[]))
         Rows Removed by Filter: 6
         Heap Fetches: 66
 Planning Time: 2.646 ms
 Execution Time: 50681.855 ms
(64 rows)

1 Like

@gvde thanks for the report, we’ve been seeing more issues lately with regards to applicability generation so we’ll need to look into what changed.

It looks like we are missing an index on installed packages for name, arch, & evr:

Indexes:
    "katello_installed_packages_pkey" PRIMARY KEY, btree (id)
    "index_katello_installed_packages_on_name_and_nvra" btree (name, nvra)
    "index_katello_installed_packages_on_nvra" btree (nvra)
    "index_katello_installed_packages_on_nvra_and_epoch" btree (nvra, epoch)
    "index_katello_installed_packages_on_nvrea" UNIQUE, btree (nvrea)

vs for RPMs:

Indexes:
    "katello_rpms_pkey" PRIMARY KEY, btree (id)
    "index_katello_rpms_on_name_and_arch_and_evr" btree (name, arch, evr)
    "index_katello_rpms_on_pulp_id" UNIQUE, btree (pulp_id)
    "katello_rpms_fields_index" btree (id, pulp_id, name, version, release, arch, version_sortable, release_sortable)

Creating that index might help with what seems to be a costly sorting of installed packages

1 Like

I have just tested it with the query above. It doesn’t seem to make a different if there is an index or not:

without:

foreman=# explain analyze SELECT "katello_rpms"."id" FROM "katello_rpms" INNER JOIN katello_repository_rpms ON
                    katello_rpms.id = katello_repository_rpms.rpm_id INNER JOIN katello_installed_packages ON
                    katello_rpms.name = katello_installed_packages.name AND
                    katello_rpms.arch = katello_installed_packages.arch AND
                    katello_rpms.evr > katello_installed_packages.evr AND
                    katello_installed_packages.id in (SELECT DISTINCT ON (katello_installed_packages.name, katello_installed_packages.arch) katello_installed_packages.id FROM katello_installed_packages INNER JOIN katello_host_installed_packages ON katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE katello_host_installed_packages.host_id = 284 ORDER BY katello_installed_packages.name, katello_installed_packages.arch, katello_installed_packages.evr DESC) LEFT JOIN katello_module_stream_rpms ONllo_rpms.evr > k                    katello_rpms.id = katello_mo                    katello_rpms.id = k                    katello_rpms.id = katello_mo                    katello                    kate                                                        katello_rpms.id = katello_module_stream_rpms.rpm_id INNER JOIN katello_host_installed_packages ON ON ON ONges ONtello_host                    katello_installe                    katello_inst                                                                    katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE (katello_repository_rpms.repository_id in (16490,16491,16489,16492,28501,1971,16493,190048,226614)) AND (katello_host_installed_packages.host_id = 284) AND ((katello_module_stream_rpms.module_stream_id IS NULL AND AND AND ANDNULL AND                  katello_installed_packages.id NOT                     katello_installed_packages.i                    katello_in                    katello_in                                katello_installed_packages.id NOT IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)) OR0)) OR0)) OR0)) ORAND 1=0)) OR         (katello_module                    (katello_module                                  (katello_module_stream_rpms.module_stream_id IN (SELECT "katello_module_streams"."id" FROM "katello_module_streams" inner join katello_available_module_streams oneams oneams oneams onule_streams on     katello_module_strea                                    katello_module_streams.name = katello_available_module_streams.name andname andname andname andstreams.name andatello_available_mo                    katello_module_streams.st                    katello_module_streams.st                                      katello_module_streams.stream = katello_available_module_streams.stream andtream andtream andtream andstreams.stream and                  katello_module_streams.context                    katello_module_streams.context                                        katello_module_streams.context = katello_available_module_streams.context inner join katello_host_available_module_streams onstreams onstreams onstreams onle_module_streams on      katello_available_module_streams.id                    ka                    katello_available_module_streams.id = katello_host_available_module_streams.available_module_stream_id WHERE (katello_host_available_module_streams.host_id = 284 andd = 284 andd = 284 andd = 284 andeams.host_id = 284 and
                    katello_host_available_module_streams.status = 'enabled'))
                    AND katello_installed_packages.id IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)));
                                                                                                                                                     QUERY PLAN                                                                                                   
                                                  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------
 Nested Loop  (cost=5102.04..56863.44 rows=3 width=4) (actual time=11109.243..46700.537 rows=6 loops=1)
   ->  Nested Loop Left Join  (cost=5101.61..56846.94 rows=17 width=4) (actual time=98.922..46677.362 rows=1656 loops=1)
         Filter: (((katello_module_stream_rpms.module_stream_id IS NULL) AND (NOT (hashed SubPlan 1))) OR ((hashed SubPlan 2) AND (hashed SubPlan 3)))
         Rows Removed by Filter: 499
         ->  Nested Loop  (cost=4905.74..56626.73 rows=67 width=8) (actual time=98.841..46652.834 rows=2147 loops=1)
               ->  Nested Loop  (cost=4905.31..6530.42 rows=8 width=324) (actual time=7.532..21.650 rows=619 loops=1)
                     Join Filter: ("ANY_subquery".id = katello_installed_packages.id)
                     ->  Hash Join  (cost=4905.02..6079.39 rows=82 width=8) (actual time=7.510..12.640 rows=619 loops=1)
                           Hash Cond: (katello_host_installed_packages.installed_package_id = "ANY_subquery".id)
                           ->  Bitmap Heap Scan on katello_host_installed_packages  (cost=13.59..1185.20 rows=667 width=4) (actual time=0.082..1.733 rows=627 loops=1)
                                 Recheck Cond: (host_id = 284)
                                 Heap Blocks: exact=67
                                 ->  Bitmap Index Scan on katello_host_installed_packages_host_id  (cost=0.00..13.42 rows=667 width=0) (actual time=0.060..0.061 rows=664 loops=1)
                                       Index Cond: (host_id = 284)
                           ->  Hash  (cost=4888.93..4888.93 rows=200 width=4) (actual time=7.411..7.422 rows=619 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 30kB
                                 ->  HashAggregate  (cost=4886.93..4888.93 rows=200 width=4) (actual time=7.046..7.232 rows=619 loops=1)
                                       Group Key: "ANY_subquery".id
                                       Batches: 1  Memory Usage: 89kB
                                       ->  Subquery Scan on "ANY_subquery"  (cost=4873.59..4885.26 rows=667 width=4) (actual time=6.021..6.799 rows=619 loops=1)
                                             ->  Unique  (cost=4873.59..4878.59 rows=667 width=324) (actual time=6.019..6.520 rows=619 loops=1)
                                                   ->  Sort  (cost=4873.59..4875.26 rows=667 width=324) (actual time=6.005..6.152 rows=627 loops=1)
                                                         Sort Key: katello_installed_packages_1.name, katello_installed_packages_1.arch, katello_installed_packages_1.evr DESC
                                                         Sort Method: quicksort  Memory: 345kB
                                                         ->  Nested Loop  (cost=13.88..4842.30 rows=667 width=324) (actual time=0.057..4.803 rows=627 loops=1)
                                                               ->  Bitmap Heap Scan on katello_host_installed_packages katello_host_installed_packages_1  (cost=13.59..1185.20 rows=667 width=4) (actual time=0.030..0.841 rows=627 loops=1)
                                                                     Recheck Cond: (host_id = 284)
                                                                     Heap Blocks: exact=67
                                                                     ->  Bitmap Index Scan on katello_host_installed_packages_host_id  (cost=0.00..13.42 rows=667 width=0) (actual time=0.025..0.026 rows=664 loops=1)
                                                                           Index Cond: (host_id = 284)
                                                               ->  Index Scan using katello_installed_packages_pkey on katello_installed_packages katello_installed_packages_1  (cost=0.29..5.48 rows=1 width=324) (actual time=0.005..0.005 rows=1 loops=627)
                                                                     Index Cond: (id = katello_host_installed_packages_1.installed_package_id)
                     ->  Index Scan using katello_installed_packages_pkey on katello_installed_packages  (cost=0.29..5.49 rows=1 width=324) (actual time=0.009..0.009 rows=1 loops=619)
                           Index Cond: (id = katello_host_installed_packages.installed_package_id)
                           Filter: ((NOT (hashed SubPlan 1)) OR (hashed SubPlan 3))
                           SubPlan 1
                             ->  Result  (cost=0.00..0.00 rows=0 width=0) (actual time=0.001..0.001 rows=0 loops=2)
                                   One-Time Filter: false
                           SubPlan 3
                             ->  Result  (cost=0.00..0.00 rows=0 width=0) (never executed)
                                   One-Time Filter: false
               ->  Index Scan using katello_rpms_fields_index on katello_rpms  (cost=0.42..6262.03 rows=1 width=331) (actual time=67.850..75.324 rows=3 loops=619)
                     Index Cond: (((name)::text = (katello_installed_packages.name)::text) AND ((arch)::text = (katello_installed_packages.arch)::text))
                     Filter: (evr > katello_installed_packages.evr)
                     Rows Removed by Filter: 48
         ->  Index Scan using katello_msrpm_rpm_idx on katello_module_stream_rpms  (cost=0.29..0.33 rows=2 width=16) (actual time=0.007..0.007 rows=0 loops=2147)
               Index Cond: (rpm_id = katello_rpms.id)
         SubPlan 2
           ->  Nested Loop  (cost=0.56..195.58 rows=1 width=8) (actual time=0.044..0.048 rows=0 loops=1)
                 Join Filter: (((katello_available_module_streams.name)::text = (katello_module_streams.name)::text) AND ((katello_available_module_streams.stream)::text = (katello_module_streams.stream)::text) AND ((katello_available_module_streams.context)
::text = (katello_module_streams.context)::text))
                 ->  Nested Loop  (cost=0.56..16.79 rows=1 width=21) (actual time=0.044..0.046 rows=0 loops=1)
                       ->  Index Only Scan using rpm_and_module_applicability_related_indices on katello_host_available_module_streams  (cost=0.29..8.49 rows=1 width=8) (actual time=0.043..0.044 rows=0 loops=1)
                             Index Cond: ((host_id = 284) AND (status = 'enabled'::text))
                             Heap Fetches: 0
                       ->  Index Scan using katello_available_module_streams_pkey on katello_available_module_streams  (cost=0.28..8.29 rows=1 width=29) (never executed)
                             Index Cond: (id = katello_host_available_module_streams.available_module_stream_id)
                 ->  Seq Scan on katello_module_streams  (cost=0.00..133.74 rows=2574 width=29) (never executed)
   ->  Index Only Scan using index_katello_repository_rpms_on_rpm_id_and_repository_id on katello_repository_rpms  (cost=0.43..0.96 rows=1 width=4) (actual time=0.012..0.012 rows=0 loops=1656)
         Index Cond: (rpm_id = katello_rpms.id)
         Filter: (repository_id = ANY ('{16490,16491,16489,16492,28501,1971,16493,190048,226614}'::integer[]))
         Rows Removed by Filter: 6
         Heap Fetches: 83
 Planning Time: 5.019 ms
 Execution Time: 46700.968 ms
(64 rows)

with index:

foreman=# create index on katello_installed_packages ( name, arch, evr );
CREATE INDEX
foreman=# \d katello_installed_packages
                                    Table "public.katello_installed_packages"
 Column  |          Type          | Collation | Nullable |                        Default                         
---------+------------------------+-----------+----------+--------------------------------------------------------
 id      | integer                |           | not null | nextval('katello_installed_packages_id_seq'::regclass)
 name    | character varying(255) |           | not null | 
 nvra    | character varying(255) |           | not null | 
 nvrea   | character varying      |           | not null | 
 epoch   | character varying      |           |          | 
 version | character varying      |           |          | 
 release | character varying      |           |          | 
 arch    | character varying      |           |          | 
 vendor  | character varying      |           |          | 
 evr     | evr_t                  |           |          | 
Indexes:
    "katello_installed_packages_pkey" PRIMARY KEY, btree (id)
    "index_katello_installed_packages_on_name_and_nvra" btree (name, nvra)
    "index_katello_installed_packages_on_nvra" btree (nvra)
    "index_katello_installed_packages_on_nvra_and_epoch" btree (nvra, epoch)
    "index_katello_installed_packages_on_nvrea" UNIQUE, btree (nvrea)
    "katello_installed_packages_name_arch_evr_idx" btree (name, arch, evr)
Referenced by:
    TABLE "katello_host_installed_packages" CONSTRAINT "katello_host_installed_packages_installed_package_id" FOREIGN KEY (installed_package_id) REFERENCES katello_installed_packages(id)
Triggers:
    evr_insert_trigger_katello_installed_packages BEFORE INSERT ON katello_installed_packages FOR EACH ROW EXECUTE FUNCTION evr_trigger()
    evr_update_trigger_katello_installed_packages BEFORE UPDATE OF epoch, version, release ON katello_installed_packages FOR EACH ROW WHEN (old.epoch::text IS DISTINCT FROM new.epoch::text OR old.version::text IS DISTINCT FROM new.version::text OR old.release::text IS DISTINCT FROM new.release::text) EXECUTE FUNCTION evr_trigger()

foreman=# explain analyze SELECT "katello_rpms"."id" FROM "katello_rpms" INNER JOIN katello_repository_rpms ON
                    katello_rpms.id = katello_repository_rpms.rpm_id INNER JOIN katello_installed_packages ON
                    katello_rpms.name = katello_installed_packages.name AND
                    katello_rpms.arch = katello_installed_packages.arch AND
                    katello_rpms.evr > katello_installed_packages.evr AND
                    katello_installed_packages.id in (SELECT DISTINCT ON (katello_installed_packages.name, katello_installed_packages.arch) katello_installed_packages.id FROM katello_installed_packages INNER JOIN katello_host_installed_packages ON katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE katello_host_installed_packages.host_id = 284 ORDER BY katello_installed_packages.name, katello_installed_packages.arch, katello_installed_packages.evr DESC) LEFT JOIN katello_module_stream_rpms ON
                    katello_rpms.id = katello_module_stream_rpms.rpm_id INNER JOIN katello_host_installed_packages ON
                    katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE (katello_repository_rpms.repository_id in (16490,16491,16489,16492,28501,1971,16493,190048,226614)) AND (katello_host_installed_packages.host_id = 284) AND ((katello_module_stream_rpms.module_stream_id IS NULL AND
                    katello_installed_packages.id NOT IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)) OR
                    (katello_module_stream_rpms.module_stream_id IN (SELECT "katello_module_streams"."id" FROM "katello_module_streams" inner join katello_available_module_streams on
                    katello_module_streams.name = katello_available_module_streams.name and
                    katello_module_streams.stream = katello_available_module_streams.stream and
                    katello_module_streams.context = katello_available_module_streams.context inner join katello_host_available_module_streams on
                    katello_available_module_streams.id = katello_host_available_module_streams.available_module_stream_id WHERE (katello_host_available_module_streams.host_id = 284 and
                    katello_host_available_module_streams.status = 'enabled'))
                    AND katello_installed_packages.id IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)));
                                                                                                                                                     QUERY PLAN                                                                                                   
                                                  
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------
 Nested Loop  (cost=5102.04..56859.76 rows=2 width=4) (actual time=10857.102..45728.238 rows=6 loops=1)
   ->  Nested Loop Left Join  (cost=5101.61..56844.24 rows=16 width=4) (actual time=77.128..45709.324 rows=1656 loops=1)
         Filter: (((katello_module_stream_rpms.module_stream_id IS NULL) AND (NOT (hashed SubPlan 1))) OR ((hashed SubPlan 2) AND (hashed SubPlan 3)))
         Rows Removed by Filter: 499
         ->  Nested Loop  (cost=4905.74..56625.13 rows=64 width=8) (actual time=77.097..45688.026 rows=2147 loops=1)
               ->  Nested Loop  (cost=4905.31..6530.42 rows=8 width=324) (actual time=4.521..18.116 rows=619 loops=1)
                     Join Filter: ("ANY_subquery".id = katello_installed_packages.id)
                     ->  Hash Join  (cost=4905.02..6079.39 rows=82 width=8) (actual time=4.488..9.468 rows=619 loops=1)
                           Hash Cond: (katello_host_installed_packages.installed_package_id = "ANY_subquery".id)
                           ->  Bitmap Heap Scan on katello_host_installed_packages  (cost=13.59..1185.20 rows=667 width=4) (actual time=0.042..1.723 rows=627 loops=1)
                                 Recheck Cond: (host_id = 284)
                                 Heap Blocks: exact=67
                                 ->  Bitmap Index Scan on katello_host_installed_packages_host_id  (cost=0.00..13.42 rows=667 width=0) (actual time=0.032..0.050 rows=664 loops=1)
                                       Index Cond: (host_id = 284)
                           ->  Hash  (cost=4888.93..4888.93 rows=200 width=4) (actual time=4.438..4.446 rows=619 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 30kB
                                 ->  HashAggregate  (cost=4886.93..4888.93 rows=200 width=4) (actual time=4.122..4.275 rows=619 loops=1)
                                       Group Key: "ANY_subquery".id
                                       Batches: 1  Memory Usage: 89kB
                                       ->  Subquery Scan on "ANY_subquery"  (cost=4873.59..4885.26 rows=667 width=4) (actual time=3.115..3.892 rows=619 loops=1)
                                             ->  Unique  (cost=4873.59..4878.59 rows=667 width=324) (actual time=3.114..3.623 rows=619 loops=1)
                                                   ->  Sort  (cost=4873.59..4875.26 rows=667 width=324) (actual time=3.113..3.249 rows=627 loops=1)
                                                         Sort Key: katello_installed_packages_1.name, katello_installed_packages_1.arch, katello_installed_packages_1.evr DESC
                                                         Sort Method: quicksort  Memory: 345kB
                                                         ->  Nested Loop  (cost=13.88..4842.30 rows=667 width=324) (actual time=0.035..2.040 rows=627 loops=1)
                                                               ->  Bitmap Heap Scan on katello_host_installed_packages katello_host_installed_packages_1  (cost=13.59..1185.20 rows=667 width=4) (actual time=0.025..0.307 rows=627 loops=1)
                                                                     Recheck Cond: (host_id = 284)
                                                                     Heap Blocks: exact=67
                                                                     ->  Bitmap Index Scan on katello_host_installed_packages_host_id  (cost=0.00..13.42 rows=667 width=0) (actual time=0.021..0.021 rows=664 loops=1)
                                                                           Index Cond: (host_id = 284)
                                                               ->  Index Scan using katello_installed_packages_pkey on katello_installed_packages katello_installed_packages_1  (cost=0.29..5.48 rows=1 width=324) (actual time=0.002..0.002 rows=1 loops=627)
                                                                     Index Cond: (id = katello_host_installed_packages_1.installed_package_id)
                     ->  Index Scan using katello_installed_packages_pkey on katello_installed_packages  (cost=0.29..5.49 rows=1 width=324) (actual time=0.009..0.009 rows=1 loops=619)
                           Index Cond: (id = katello_host_installed_packages.installed_package_id)
                           Filter: ((NOT (hashed SubPlan 1)) OR (hashed SubPlan 3))
                           SubPlan 1
                             ->  Result  (cost=0.00..0.00 rows=0 width=0) (actual time=0.001..0.001 rows=0 loops=2)
                                   One-Time Filter: false
                           SubPlan 3
                             ->  Result  (cost=0.00..0.00 rows=0 width=0) (never executed)
                                   One-Time Filter: false
               ->  Index Scan using katello_rpms_fields_index on katello_rpms  (cost=0.42..6261.83 rows=1 width=331) (actual time=66.475..73.771 rows=3 loops=619)
                     Index Cond: (((name)::text = (katello_installed_packages.name)::text) AND ((arch)::text = (katello_installed_packages.arch)::text))
                     Filter: (evr > katello_installed_packages.evr)
                     Rows Removed by Filter: 48
         ->  Index Scan using katello_msrpm_rpm_idx on katello_module_stream_rpms  (cost=0.29..0.33 rows=2 width=16) (actual time=0.006..0.006 rows=0 loops=2147)
               Index Cond: (rpm_id = katello_rpms.id)
         SubPlan 2
           ->  Nested Loop  (cost=0.56..195.58 rows=1 width=8) (actual time=0.015..0.017 rows=0 loops=1)
                 Join Filter: (((katello_available_module_streams.name)::text = (katello_module_streams.name)::text) AND ((katello_available_module_streams.stream)::text = (katello_module_streams.stream)::text) AND ((katello_available_module_streams.context)
::text = (katello_module_streams.context)::text))
                 ->  Nested Loop  (cost=0.56..16.79 rows=1 width=21) (actual time=0.014..0.016 rows=0 loops=1)
                       ->  Index Only Scan using rpm_and_module_applicability_related_indices on katello_host_available_module_streams  (cost=0.29..8.49 rows=1 width=8) (actual time=0.013..0.014 rows=0 loops=1)
                             Index Cond: ((host_id = 284) AND (status = 'enabled'::text))
                             Heap Fetches: 0
                       ->  Index Scan using katello_available_module_streams_pkey on katello_available_module_streams  (cost=0.28..8.29 rows=1 width=29) (never executed)
                             Index Cond: (id = katello_host_available_module_streams.available_module_stream_id)
                 ->  Seq Scan on katello_module_streams  (cost=0.00..133.74 rows=2574 width=29) (never executed)
   ->  Index Only Scan using index_katello_repository_rpms_on_rpm_id_and_repository_id on katello_repository_rpms  (cost=0.43..0.96 rows=1 width=4) (actual time=0.009..0.009 rows=0 loops=1656)
         Index Cond: (rpm_id = katello_rpms.id)
         Filter: (repository_id = ANY ('{16490,16491,16489,16492,28501,1971,16493,190048,226614}'::integer[]))
         Rows Removed by Filter: 6
         Heap Fetches: 83
 Planning Time: 2.434 ms
 Execution Time: 45728.394 ms
(64 rows)

1 Like

Interesting, yeah it seems there is no difference at all. Appreciate the test. We’ll probably need to compare the postgres explain output between a 4.16 machine and an older one to get a better idea.

We did change the EVR postgres type from being introduced in the postgresql-evr extension vs a migration for Katello 4.16. I can’t think why that would cause a slowdown, but it does seem to coincide with the report here.

Hello, in our setup (about 1500 hosts) the Bulk generate applicability for hosts task sometimes takes several hours. Will it be possible to focus on improving of the performance of this task ? Thank you.

Hello, shoudn’t there be an index created also for katello_rpms table:
“index_katello_rpms_on_name_and_arch_and_evr” btree (name, arch, evr) ?
I do not have it in my foreman database.

1 Like

You may be on to something there. I have just added:

foreman=# create index on katello_rpms ( name, arch, evr );
CREATE INDEX
foreman=# \d katello_rpms
                                           Table "public.katello_rpms"
      Column      |            Type             | Collation | Nullable |                 Default                  
------------------+-----------------------------+-----------+----------+------------------------------------------
 id               | integer                     |           | not null | nextval('katello_rpms_id_seq'::regclass)
 pulp_id          | character varying(255)      |           | not null | 
 created_at       | timestamp without time zone |           |          | 
 updated_at       | timestamp without time zone |           |          | 
 name             | character varying(255)      |           |          | 
 version          | character varying(255)      |           |          | 
 release          | character varying(255)      |           |          | 
 arch             | character varying(255)      |           |          | 
 epoch            | character varying(255)      |           |          | 
 filename         | character varying(255)      |           |          | 
 sourcerpm        | character varying(255)      |           |          | 
 checksum         | character varying(255)      |           |          | 
 version_sortable | character varying(255)      |           |          | 
 release_sortable | character varying(255)      |           |          | 
 summary          | character varying(255)      |           |          | 
 nvra             | character varying(1020)     |           |          | 
 modular          | boolean                     |           |          | false
 evr              | evr_t                       |           |          | 
Indexes:
    "katello_rpms_pkey" PRIMARY KEY, btree (id)
    "index_katello_rpms_on_pulp_id" UNIQUE, btree (pulp_id)
    "katello_rpms_fields_index" btree (id, pulp_id, name, version, release, arch, version_sortable, release_sortable)
    "katello_rpms_name_arch_evr_idx" btree (name, arch, evr)
Referenced by:
    TABLE "katello_repository_rpms" CONSTRAINT "fk_rails_39a260fd51" FOREIGN KEY (rpm_id) REFERENCES katello_rpms(id)
    TABLE "katello_module_stream_rpms" CONSTRAINT "katello_msrpm_rpm_id_fk" FOREIGN KEY (rpm_id) REFERENCES katello_rpms(id)
Triggers:
    evr_insert_trigger_katello_rpms BEFORE INSERT ON katello_rpms FOR EACH ROW EXECUTE FUNCTION evr_trigger()
    evr_update_trigger_katello_rpms BEFORE UPDATE OF epoch, version, release ON katello_rpms FOR EACH ROW WHEN (old.epoch::text IS DISTINCT FROM new.epoch::text OR old.version::text IS DISTINCT FROM new.version::text OR old.release::text IS DISTINCT FROM new.release::text) EXECUTE FUNCTION evr_trigger()

And the query which I have posted above went from 42196.113 ms down to 42.239 ms.

@iballou Here are the explains before and after:

foreman=# explain analyze SELECT "katello_rpms"."id" FROM "katello_rpms" INNER JOIN katello_repository_rpms ON
                    katello_rpms.id = katello_repository_rpms.rpm_id INNER JOIN katello_installed_packages ON
                    katello_rpms.name = katello_installed_packages.name AND
                    katello_rpms.arch = katello_installed_packages.arch AND
                    katello_rpms.evr > katello_installed_packages.evr AND
                    katello_installed_packages.id in (SELECT DISTINCT ON (katello_installed_packages.name, katello_installed_packages.arch) katello_installed_packages.id FROM katello_installed_packages INNER JOIN katello_host_installed_packages ON katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE katello_host_installed_packages.host_id = 284 ORDER BY katello_installed_packages.name, katello_installed_packages.arch, katello_installed_packages.evr DESC) LEFT JOIN katello_module_stream_rpms ON
                    katello_rpms.id = katello_module_stream_rpms.rpm_id INNER JOIN katello_host_installed_packages ON
                    katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE (katello_repository_rpms.repository_id in (16490,16491,16489,16492,28501,1971,16493,190048,226614)) AND (katello_host_installed_packages.host_id = 284) AND ((katello_module_stream_rpms.module_stream_id IS NULL AND
                    katello_installed_packages.id NOT IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)) OR
                    (katello_module_stream_rpms.module_stream_id IN (SELECT "katello_module_streams"."id" FROM "katello_module_streams" inner join katello_available_module_streams on
                    katello_module_streams.name = katello_available_module_streams.name and
                    katello_module_streams.stream = katello_available_module_streams.stream and
                    katello_module_streams.context = katello_available_module_streams.context inner join katello_host_available_module_streams on
                    katello_available_module_streams.id = katello_host_available_module_streams.available_module_stream_id WHERE (katello_host_available_module_streams.host_id = 284 and
                    katello_host_available_module_streams.status = 'enabled'))
                    AND katello_installed_packages.id IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)));
                                                                                                                                                    QUERY PLAN                                                             
                                                                                       
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------
 Nested Loop  (cost=5123.35..56904.20 rows=3 width=4) (actual time=42195.627..42195.656 rows=0 loops=1)
   ->  Nested Loop Left Join  (cost=5122.92..56888.21 rows=16 width=4) (actual time=88.399..42176.681 rows=1846 loops=1)
         Filter: (((katello_module_stream_rpms.module_stream_id IS NULL) AND (NOT (hashed SubPlan 1))) OR ((hashed SubPlan 2) AND (hashed SubPlan 3)))
         Rows Removed by Filter: 532
         ->  Nested Loop  (cost=4947.11..56688.26 rows=66 width=8) (actual time=88.314..42155.489 rows=2370 loops=1)
               ->  Nested Loop  (cost=4946.68..6593.33 rows=8 width=324) (actual time=6.231..20.862 rows=619 loops=1)
                     Join Filter: ("ANY_subquery".id = katello_installed_packages.id)
                     ->  Hash Join  (cost=4946.39..6126.98 rows=85 width=8) (actual time=6.208..11.553 rows=619 loops=1)
                           Hash Cond: (katello_host_installed_packages.installed_package_id = "ANY_subquery".id)
                           ->  Bitmap Heap Scan on katello_host_installed_packages  (cost=13.65..1191.46 rows=675 width=4) (actual time=0.076..1.792 rows=627 loops=1)
                                 Recheck Cond: (host_id = 284)
                                 Heap Blocks: exact=72
                                 ->  Bitmap Index Scan on katello_host_installed_packages_host_id  (cost=0.00..13.48 rows=675 width=0) (actual time=0.055..0.056 rows=729 loops=1)
                                       Index Cond: (host_id = 284)
                           ->  Hash  (cost=4930.24..4930.24 rows=200 width=4) (actual time=6.112..6.121 rows=619 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 30kB
                                 ->  HashAggregate  (cost=4928.24..4930.24 rows=200 width=4) (actual time=5.770..5.933 rows=619 loops=1)
                                       Group Key: "ANY_subquery".id
                                       Batches: 1  Memory Usage: 89kB
                                       ->  Subquery Scan on "ANY_subquery"  (cost=4914.74..4926.55 rows=675 width=4) (actual time=4.746..5.512 rows=619 loops=1)
                                             ->  Unique  (cost=4914.74..4919.80 rows=675 width=324) (actual time=4.744..5.235 rows=619 loops=1)
                                                   ->  Sort  (cost=4914.74..4916.43 rows=675 width=324) (actual time=4.743..4.887 rows=627 loops=1)
                                                         Sort Key: katello_installed_packages_1.name, katello_installed_packages_1.arch, katello_installed_packages_1.evr DESC
                                                         Sort Method: quicksort  Memory: 346kB
                                                         ->  Nested Loop  (cost=13.94..4883.02 rows=675 width=324) (actual time=0.062..3.457 rows=627 loops=1)
                                                               ->  Bitmap Heap Scan on katello_host_installed_packages katello_host_installed_packages_1  (cost=13.65..1191.46 rows=675 width=4) (actual time=0.033..0.927 
rows=627 loops=1)
                                                                     Recheck Cond: (host_id = 284)
                                                                     Heap Blocks: exact=72
                                                                     ->  Bitmap Index Scan on katello_host_installed_packages_host_id  (cost=0.00..13.48 rows=675 width=0) (actual time=0.026..0.027 rows=729 loops=1)
                                                                           Index Cond: (host_id = 284)
                                                               ->  Index Scan using katello_installed_packages_pkey on katello_installed_packages katello_installed_packages_1  (cost=0.29..5.47 rows=1 width=324) (actual 
time=0.003..0.003 rows=1 loops=627)
                                                                     Index Cond: (id = katello_host_installed_packages_1.installed_package_id)
                     ->  Index Scan using katello_installed_packages_pkey on katello_installed_packages  (cost=0.29..5.47 rows=1 width=324) (actual time=0.009..0.009 rows=1 loops=619)
                           Index Cond: (id = katello_host_installed_packages.installed_package_id)
                           Filter: ((NOT (hashed SubPlan 1)) OR (hashed SubPlan 3))
                           SubPlan 1
                             ->  Result  (cost=0.00..0.00 rows=0 width=0) (actual time=0.001..0.001 rows=0 loops=2)
                                   One-Time Filter: false
                           SubPlan 3
                             ->  Result  (cost=0.00..0.00 rows=0 width=0) (never executed)
                                   One-Time Filter: false
               ->  Index Scan using katello_rpms_fields_index on katello_rpms  (cost=0.42..6261.86 rows=1 width=331) (actual time=60.772..68.060 rows=4 loops=619)
                     Index Cond: (((name)::text = (katello_installed_packages.name)::text) AND ((arch)::text = (katello_installed_packages.arch)::text))
                     Filter: (evr > katello_installed_packages.evr)
                     Rows Removed by Filter: 48
         ->  Index Scan using katello_msrpm_rpm_idx on katello_module_stream_rpms  (cost=0.29..0.34 rows=2 width=16) (actual time=0.005..0.005 rows=0 loops=2370)
               Index Cond: (rpm_id = katello_rpms.id)
         SubPlan 2
           ->  Hash Join  (cost=12.81..175.52 rows=1 width=8) (actual time=0.099..0.105 rows=0 loops=1)
                 Hash Cond: (((katello_module_streams.name)::text = (katello_available_module_streams.name)::text) AND ((katello_module_streams.stream)::text = (katello_available_module_streams.stream)::text) AND ((kate
llo_module_streams.context)::text = (katello_available_module_streams.context)::text))
                 ->  Seq Scan on katello_module_streams  (cost=0.00..133.74 rows=2574 width=29) (actual time=0.014..0.014 rows=1 loops=1)
                 ->  Hash  (cost=12.79..12.79 rows=1 width=21) (actual time=0.063..0.067 rows=0 loops=1)
                       Buckets: 1024  Batches: 1  Memory Usage: 8kB
                       ->  Nested Loop  (cost=0.56..12.79 rows=1 width=21) (actual time=0.062..0.065 rows=0 loops=1)
                             ->  Index Only Scan using rpm_and_module_applicability_related_indices on katello_host_available_module_streams  (cost=0.29..4.49 rows=1 width=8) (actual time=0.061..0.062 rows=0 loops=1)
                                   Index Cond: ((host_id = 284) AND (status = 'enabled'::text))
                                   Heap Fetches: 0
                             ->  Index Scan using katello_available_module_streams_pkey on katello_available_module_streams  (cost=0.28..8.29 rows=1 width=29) (never executed)
                                   Index Cond: (id = katello_host_available_module_streams.available_module_stream_id)
   ->  Index Only Scan using index_katello_repository_rpms_on_rpm_id_and_repository_id on katello_repository_rpms  (cost=0.43..0.99 rows=1 width=4) (actual time=0.008..0.008 rows=0 loops=1846)
         Index Cond: (rpm_id = katello_rpms.id)
         Filter: (repository_id = ANY ('{16490,16491,16489,16492,28501,1971,16493,190048,226614}'::integer[]))
         Rows Removed by Filter: 5
         Heap Fetches: 51
 Planning Time: 6.042 ms
 Execution Time: 42196.113 ms
(66 rows)

after with two more indexes:

foreman=# create index on katello_installed_packages ( name, arch, evr );
CREATE INDEX
foreman=# create index on katello_rpms ( name, arch, evr );
CREATE INDEX
foreman=# explain analyze SELECT "katello_rpms"."id" FROM "katello_rpms" INNER JOIN katello_repository_rpms ON
                    katello_rpms.id = katello_repository_rpms.rpm_id INNER JOIN katello_installed_packages ON
                    katello_rpms.name = katello_installed_packages.name AND
                    katello_rpms.arch = katello_installed_packages.arch AND
                    katello_rpms.evr > katello_installed_packages.evr AND
                    katello_installed_packages.id in (SELECT DISTINCT ON (katello_installed_packages.name, katello_installed_packages.arch) katello_installed_packages.id FROM katello_installed_packages INNER JOIN katello_host_installed_packages ON katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE katello_host_installed_packages.host_id = 284 ORDER BY katello_installed_packages.name, katello_installed_packages.arch, katello_installed_packages.evr DESC) LEFT JOIN katello_module_stream_rpms ON
                    katello_rpms.id = katello_module_stream_rpms.rpm_id INNER JOIN katello_host_installed_packages ON
                    katello_installed_packages.id = katello_host_installed_packages.installed_package_id WHERE (katello_repository_rpms.repository_id in (16490,16491,16489,16492,28501,1971,16493,190048,226614)) AND (katello_host_installed_packages.host_id = 284) AND ((katello_module_stream_rpms.module_stream_id IS NULL AND
                    katello_installed_packages.id NOT IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)) OR
                    (katello_module_stream_rpms.module_stream_id IN (SELECT "katello_module_streams"."id" FROM "katello_module_streams" inner join katello_available_module_streams on
                    katello_module_streams.name = katello_available_module_streams.name and
                    katello_module_streams.stream = katello_available_module_streams.stream and
                    katello_module_streams.context = katello_available_module_streams.context inner join katello_host_available_module_streams on
                    katello_available_module_streams.id = katello_host_available_module_streams.available_module_stream_id WHERE (katello_host_available_module_streams.host_id = 284 and
                    katello_host_available_module_streams.status = 'enabled'))
                    AND katello_installed_packages.id IN (SELECT "katello_installed_packages"."id" FROM "katello_installed_packages" WHERE 1=0 AND 1=0)));
                                                                                                                                                    QUERY PLAN                                                             
                                                                                       
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------------------------------------------
 Nested Loop  (cost=5123.60..6863.44 rows=3 width=4) (actual time=42.063..42.083 rows=0 loops=1)
   ->  Nested Loop Left Join  (cost=5123.17..6847.48 rows=16 width=4) (actual time=4.909..35.802 rows=1846 loops=1)
         Filter: (((katello_module_stream_rpms.module_stream_id IS NULL) AND (NOT (hashed SubPlan 1))) OR ((hashed SubPlan 2) AND (hashed SubPlan 3)))
         Rows Removed by Filter: 532
         ->  Nested Loop  (cost=4947.36..6647.54 rows=66 width=8) (actual time=4.892..30.562 rows=2370 loops=1)
               ->  Nested Loop  (cost=4946.68..6593.33 rows=8 width=324) (actual time=4.827..7.525 rows=619 loops=1)
                     Join Filter: ("ANY_subquery".id = katello_installed_packages.id)
                     ->  Hash Join  (cost=4946.39..6126.98 rows=85 width=8) (actual time=4.809..5.626 rows=619 loops=1)
                           Hash Cond: (katello_host_installed_packages.installed_package_id = "ANY_subquery".id)
                           ->  Bitmap Heap Scan on katello_host_installed_packages  (cost=13.65..1191.46 rows=675 width=4) (actual time=0.039..0.387 rows=627 loops=1)
                                 Recheck Cond: (host_id = 284)
                                 Heap Blocks: exact=72
                                 ->  Bitmap Index Scan on katello_host_installed_packages_host_id  (cost=0.00..13.48 rows=675 width=0) (actual time=0.029..0.030 rows=729 loops=1)
                                       Index Cond: (host_id = 284)
                           ->  Hash  (cost=4930.24..4930.24 rows=200 width=4) (actual time=4.761..4.768 rows=619 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 30kB
                                 ->  HashAggregate  (cost=4928.24..4930.24 rows=200 width=4) (actual time=4.413..4.588 rows=619 loops=1)
                                       Group Key: "ANY_subquery".id
                                       Batches: 1  Memory Usage: 89kB
                                       ->  Subquery Scan on "ANY_subquery"  (cost=4914.74..4926.55 rows=675 width=4) (actual time=3.231..4.141 rows=619 loops=1)
                                             ->  Unique  (cost=4914.74..4919.80 rows=675 width=324) (actual time=3.230..3.836 rows=619 loops=1)
                                                   ->  Sort  (cost=4914.74..4916.43 rows=675 width=324) (actual time=3.228..3.384 rows=627 loops=1)
                                                         Sort Key: katello_installed_packages_1.name, katello_installed_packages_1.arch, katello_installed_packages_1.evr DESC
                                                         Sort Method: quicksort  Memory: 346kB
                                                         ->  Nested Loop  (cost=13.94..4883.02 rows=675 width=324) (actual time=0.039..2.138 rows=627 loops=1)
                                                               ->  Bitmap Heap Scan on katello_host_installed_packages katello_host_installed_packages_1  (cost=13.65..1191.46 rows=675 width=4) (actual time=0.027..0.329 
rows=627 loops=1)
                                                                     Recheck Cond: (host_id = 284)
                                                                     Heap Blocks: exact=72
                                                                     ->  Bitmap Index Scan on katello_host_installed_packages_host_id  (cost=0.00..13.48 rows=675 width=0) (actual time=0.021..0.021 rows=729 loops=1)
                                                                           Index Cond: (host_id = 284)
                                                               ->  Index Scan using katello_installed_packages_pkey on katello_installed_packages katello_installed_packages_1  (cost=0.29..5.47 rows=1 width=324) (actual 
time=0.002..0.002 rows=1 loops=627)
                                                                     Index Cond: (id = katello_host_installed_packages_1.installed_package_id)
                     ->  Index Scan using katello_installed_packages_pkey on katello_installed_packages  (cost=0.29..5.47 rows=1 width=324) (actual time=0.002..0.002 rows=1 loops=619)
                           Index Cond: (id = katello_host_installed_packages.installed_package_id)
                           Filter: ((NOT (hashed SubPlan 1)) OR (hashed SubPlan 3))
                           SubPlan 1
                             ->  Result  (cost=0.00..0.00 rows=0 width=0) (actual time=0.001..0.001 rows=0 loops=2)
                                   One-Time Filter: false
                           SubPlan 3
                             ->  Result  (cost=0.00..0.00 rows=0 width=0) (never executed)
                                   One-Time Filter: false
               ->  Index Scan using katello_rpms_name_arch_evr_idx on katello_rpms  (cost=0.68..6.77 rows=1 width=331) (actual time=0.028..0.035 rows=4 loops=619)
                     Index Cond: (((name)::text = (katello_installed_packages.name)::text) AND ((arch)::text = (katello_installed_packages.arch)::text) AND (evr > katello_installed_packages.evr))
         ->  Index Scan using katello_msrpm_rpm_idx on katello_module_stream_rpms  (cost=0.29..0.34 rows=2 width=16) (actual time=0.001..0.001 rows=0 loops=2370)
               Index Cond: (rpm_id = katello_rpms.id)
         SubPlan 2
           ->  Hash Join  (cost=12.81..175.52 rows=1 width=8) (actual time=0.041..0.045 rows=0 loops=1)
                 Hash Cond: (((katello_module_streams.name)::text = (katello_available_module_streams.name)::text) AND ((katello_module_streams.stream)::text = (katello_available_module_streams.stream)::text) AND ((kate
llo_module_streams.context)::text = (katello_available_module_streams.context)::text))
                 ->  Seq Scan on katello_module_streams  (cost=0.00..133.74 rows=2574 width=29) (actual time=0.004..0.005 rows=1 loops=1)
                 ->  Hash  (cost=12.79..12.79 rows=1 width=21) (actual time=0.017..0.019 rows=0 loops=1)
                       Buckets: 1024  Batches: 1  Memory Usage: 8kB
                       ->  Nested Loop  (cost=0.56..12.79 rows=1 width=21) (actual time=0.016..0.017 rows=0 loops=1)
                             ->  Index Only Scan using rpm_and_module_applicability_related_indices on katello_host_available_module_streams  (cost=0.29..4.49 rows=1 width=8) (actual time=0.015..0.015 rows=0 loops=1)
                                   Index Cond: ((host_id = 284) AND (status = 'enabled'::text))
                                   Heap Fetches: 0
                             ->  Index Scan using katello_available_module_streams_pkey on katello_available_module_streams  (cost=0.28..8.29 rows=1 width=29) (never executed)
                                   Index Cond: (id = katello_host_available_module_streams.available_module_stream_id)
   ->  Index Only Scan using index_katello_repository_rpms_on_rpm_id_and_repository_id on katello_repository_rpms  (cost=0.43..0.99 rows=1 width=4) (actual time=0.003..0.003 rows=0 loops=1846)
         Index Cond: (rpm_id = katello_rpms.id)
         Filter: (repository_id = ANY ('{16490,16491,16489,16492,28501,1971,16493,190048,226614}'::integer[]))
         Rows Removed by Filter: 5
         Heap Fetches: 51
 Planning Time: 3.102 ms
 Execution Time: 42.239 ms
(64 rows)
2 Likes

Thank you, I really appreciate your work ! Do you think it is safe for me to create those two indexes manually too after making the backup snapshot ( I think it should be, they can be dropped, can’t they ) ? We are really struggling with performance of this task.

I can’t really tell you what side effects it may have. I am just a foreman user. Maybe one of the devs can give feedback.

You should, however, make sure to remove the indices before making any foreman update or change (i.e. when running foreman-installer). Otherwise, I think it might interfere with the migrations which automatically run during the update. Even if I think it should not matter as long as the index names differ from what foreman uses and having the same index twice shouldn’t hurt either.

I want to note that the second index on katello_rpms took significantly longer to create than the first one. But we have a rather small installation and it took ~20s. If you have a very large installation it may take much longer.

1 Like

On my system, this index exists. Looking at the code, it exists since:

Maybe it was removed during upgrade accidentally by katello/db/migrate/20240924161240_katello_recreate_evr_constructs.rb at master · Katello/katello · GitHub

What do you think @iballou ?

I guess you are running 4.15 or earlier and not 4.16.

I suppose that means that we might lost all indexes which references evr with 4.16? Are there others?

image

I have checked the foreman-installer log. The migration ran during the 3.14/4.16 update. However, according to the logs, this is where the evr column was added:

2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: == 20240924161240 KatelloRecreateEvrConstructs: migrating =====================
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: -- extension_enabled?("evr")
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns:    -> 0.0215s
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: -- execute("DROP EXTENSION evr CASCADE;\n")
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns:    -> 0.0110s
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: -- execute("      create type evr_array_item as (\n  n       NUMERIC,\n  s       TEXT\n);\n\ncreate type evr_t as (\n  epoch INT,\n  version evr_array_item[],\n  release evr_array_item[]\n);\n\nCREATE FUNCTION evr_trigger() RETURNS trigger AS $$\n  BEGIN\n    NEW.evr = (select ROW(coalesce(NEW.epoch::numeric,0),\n                          rpmver_array(coalesce(NEW.version,'empty'))::evr_array_item[],\n                          rpmver_array(coalesce(NEW.release,'empty'))::evr_array_item[])::evr_t);\n    RETURN NEW;\n  END;\n$$ language 'plpgsql';\n\ncreate or replace FUNCTION empty(t TEXT)\n\tRETURNS BOOLEAN as $$\n\tBEGIN\n\t\treturn t ~ '^[[:space:]]*$';\n\tEND;\n$$ language 'plpgsql';\n\ncreate or replace FUNCTION isalpha(ch CHAR)\n  RETURNS BOOLEAN as $$\n  BEGIN\n    if ascii(ch) between ascii('a') and ascii('z') or\n        ascii(ch) between ascii('A') and ascii('Z')\n    then\n      return TRUE;\n    end if;\n    return FALSE;\n  END;\n$$ language 'plpgsql';\n\ncreate or replace FUNCTION isalphanum(ch CHAR)\n\tRETURNS BOOLEAN as $$\n\tBEGIN\n\t\tif ascii(ch) between ascii('a') and ascii('z') or\n\t\t\tascii(ch) between ascii('A') and ascii('Z') or\n\t\t\tascii(ch) between ascii('0') and ascii('9')\n\t\tthen\n\t\t\treturn TRUE;\n\t\tend if;\n\t\treturn FALSE;\n\tEND;\n$$ language 'plpgsql';\n\ncreate or replace function isdigit(ch CHAR)\n\tRETURNS BOOLEAN as $$\n\tBEGIN\n\t  if ascii(ch) between ascii('0') and ascii('9')\n\t  then\n\t\treturn TRUE;\n\t  end if;\n\t  return FALSE;\n\tEND ;\n$$ language 'plpgsql';\n\ncreate or replace FUNCTION rpmver_array (string1 IN VARCHAR)\n\tRETURNS evr_array_item[] as $$\n\tdeclare\n\t\tstr1 VARCHAR := string1;\n\t\tdigits VARCHAR(10) := '0123456789';\n\t\tlc_alpha VARCHAR(27) := 'abcdefghijklmnopqrstuvwxyz';\n\t\tuc_alpha VARCHAR(27) := 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';\n\t\talpha VARCHAR(54) := lc_alpha || uc_alpha;\n\t\tone VARCHAR;\n\t\tisnum BOOLEAN;\n\t\tver_array evr_array_item[] := ARRAY[]::evr_array_item[];\n\tBEGIN\n\t\tif str1 is NULL\n\t\tthen\n\t\t\tRAISE EXCEPTION 'VALUE_ERROR.';\n\t\tend if;\n\n\t\tone := str1;\n\t\t<<segment_loop>>\n\t\twhile one <> ''\n\t\tloop\n\t\t\tdeclare\n\t\t\t\tsegm1 VARCHAR;\n\t\t\t\tsegm1_n NUMERIC := 0;\n\t\t\tbegin\n\t\t\t\t-- Throw out all non-alphanum characters\n\t\t\t\twhile one <> '' and not isalphanum(one)\n\t\t\t\tloop\n\t\t\t\t\tone := substr(one, 2);\n\t\t\t\tend loop;\n\t\t\t\tstr1 := one;\n\t\t\t\tif str1 <> '' and isdigit(str1)\n\t\t\t\tthen\n\t\t\t\t\tstr1 := ltrim(str1, digits);\n\t\t\t\t\tisnum := true;\n\t\t\t\telse\n\t\t\t\t\tstr1 := ltrim(str1, alpha);\n\t\t\t\t\tisnum := false;\n\t\t\t\tend if;\n\t\t\t\tif str1 <> ''\n\t\t\t\tthen segm1 := substr(one, 1, length(one) - length(str1));\n\t\t\t\telse segm1 := one;\n\t\t\t\tend if;\n\n\t\t\t\tif segm1 = '' then return ver_array; end if; /* arbitrary */\n\t\t\t\tif isnum\n\t\t\t\tthen\n\t\t\t\t\tsegm1 := ltrim(segm1, '0');\n\t\t\t\t\tif segm1 <> '' then segm1_n := segm1::numeric; end if;\n\t\t\t\t\tsegm1 := NULL;\n\t\t\t\telse\n\t\t\t\tend if;\n\t\t\t\tver_array := array_append(ver_array, (segm1_n, segm1)::evr_array_item);\n\t\t\t\tone := str1;\n\t\t\tend;\n\t\tend loop segment_loop;\n\n\t\treturn ver_array;\n\tEND ;\n$$ language 'plpgsql';\n\n")
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns:    -> 0.0087s
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: -- add_column(:katello_rpms, :evr, :evr_t)
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns:    -> 0.0005s
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: -- add_column(:katello_installed_packages, :evr, :evr_t)
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns:    -> 0.0003s
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: -- execute("        update katello_rpms SET evr = (ROW(coalesce(epoch::numeric,0),\n                                           rpmver_array(coalesce(version,'empty'))::evr_array_item[],\n                                           rpmver_array(coalesce(release,'empty'))::evr_array_item[])::evr_t);\n\n        update katello_installed_packages SET evr = (ROW(coalesce(epoch::numeric,0),\n                                                         rpmver_array(coalesce(version,'empty'))::evr_array_item[],\n                                                         rpmver_array(coalesce(release,'empty'))::evr_array_item[])::evr_t);\n")
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns:    -> 60.6772s
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: -- create_trigger(:evr_insert_trigger_katello_rpms, {:on=>:katello_rpms})
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns:    -> 0.0030s
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: -- create_trigger(:evr_update_trigger_katello_rpms, {:on=>:katello_rpms})
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns:    -> 0.0013s
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: -- create_trigger(:evr_insert_trigger_katello_installed_packages, {:on=>:katello_installed_packages})
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns:    -> 0.0005s
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns: -- create_trigger(:evr_update_trigger_katello_installed_packages, {:on=>:katello_installed_packages})
2025-05-19 08:22:26 [INFO  ] [configure] /Stage[main]/Foreman::Database/Foreman::Rake[db:migrate]/Exec[foreman-rake-db:migrate]/returns:    -> 0.0004s

But no indicies…

There is a question also if it is sufficient to restore the performance to recreate the index index_katello_rpms_on_name_and_arch_and_evr as before katello 4.16. or it is necessary to create the second one on katello_installed_packages too.

In topic Katello installable updates not updating there also seem to be some issue related to evr in 4.16. Possibly related?

@iballou Do you need any more information to fix this? Should I open a ticket/issue somewhere for this?

It it helps, I also still have the snapshot before updating to 4.16 and could check the database or dump the database schema…

A quick test shows that

DROP EXTENSION evr CASCADE;

removes the evr columns from the tables and with that also the indices which include the evr column…

So to me it seems migration 20240924161240 is the reason.

But I don’t understand why @Bernhard_Suttner still has it…

2 Likes

Probably 20240924161240 runs during upgrade from lower 4.15 only not during fresh install ?