I have just looked in the postgresql log files for long running queries because it took pretty long until a content view publish began to start after sitting on planning for a few minutes.
I didn’t find anything related to that there, but I have noticed another thing which made me wonder. There are occasional SELECT statements like this:
2025-03-29 07:31:25 CET LOG: duration: 8521.372 ms execute <unnamed>: SELECT "katello_repository_rpms".* FROM "katello_repository_rpms" WHERE "katello_repository_rpms"."repository_id" = $1 AND NOT (rpm_id in (44973251,44973235,44973207,44973177,44973176,44973232,44973212,44973199,44973206,44973246,44973234,44973185,44973226,44973195,44973221,44973179,44973215,44973213,449...
2025-03-29 07:31:25 CET DETAIL: parameters: $1 = '227251'
The first line is 717.132 bytes long!
I started looking at line lengths in the logs and find some others even longer:
2025-03-29 02:59:50 CET LOG: duration: 2487.740 ms execute <unnamed>: SELECT "katello_rpms"."id", "katello_rpms"."pulp_id" FROM "katello_rpms" WHERE (pulp_id in ('/pulp/api/v3/content/rpm/packages/0195df97-9871-7d12-95a0-7cf150f58328/','/pulp/api/v3/content/rpm/packages/0195df95-95c2-7e37-baa4-329d3af3c13a/','/pulp/api/v3/content/rpm/packages/0195df95-95c0-769a-8f22-a29c1...
2025-03-29 02:59:57 CET LOG: duration: 5281.817 ms execute <unnamed>: SELECT "katello_repository_rpms".* FROM "katello_repository_rpms" WHERE "katello_repository_rpms"."repository_id" = 1971 AND "katello_repository_rpms"."rpm_id" NOT IN (42726595, 18385409, 43387778, 1507875, 31453554, 31100335, 8168606, 11398189, 28568674, 12159919, 48625246, 1509134, 1507261, 21820778, ...
2025-03-29 03:00:04 CET LOG: duration: 1377.383 ms execute <unnamed>: INSERT INTO "katello_repository_rpms" ("repository_id","rpm_id","created_at","updated_at") VALUES (1971, 42726595, '2025-03-29 01:59:51.165391', '2025-03-29 01:59:51.165426'), (1971, 18385409, '2025-03-29 01:59:51.165430', '2025-03-29 01:59:51.165433'), (1971, 43387778, '2025-03-29 01:59:51.165436', '20...
Line lengths of these three: 6.015.773, 798.930, 6.326.853
I am not sure what the duration time of postgresql includes, i.e. whether it includes parsing time or only execution. I am amazed that it takes megabyte long statements that well.
Still I just wonder in general, if there isn’t something going fundamentally wrong if you have statements listing some 80.000 pulp or rpm ids for processing. I guess those lists of ids came from another SELECT statement before, thus shouldn’t be there a much more efficient way for that?
I have just updated to 3.13/4.15 latest this week but logs show these kinds of statements before running with 3.12/4.14.