Katello purge pulp_database.repo_profile_applicability?

Hello,

Is there a way to clean up the MongoDB collection pulp_database.repo_profile_applicability on Katello 3.14?
It seems to keep the history of all deleted hosts:

	{
	"ns" : "pulp_database.repo_profile_applicability",
	"size" : 41127952,
	"count" : 14079,
	"avgObjSize" : 2921,
	"numExtents" : 39,
	"storageSize" : 45794504416,
	"lastExtentSize" : 2146426864,
	"paddingFactor" : 1,
	"paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
	"userFlags" : 1,
	"capped" : false,
	"nindexes" : 3,
	"totalIndexSize" : 6516272,
	"indexSizes" : {
		"_id_" : 637728,
		"repo_id_-1" : 1381744,
		"all_profiles_hash_-1_profile_hash_-1_repo_id_-1" : 4496800
	},
	"ok" : 1
},

{
	"ns" : "pulp_database.consumers",
	"size" : 205680,
	"count" : 857,
	"avgObjSize" : 240,
	"numExtents" : 4,
	"storageSize" : 696320,
	"lastExtentSize" : 524288,
	"paddingFactor" : 1,
	"paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
	"userFlags" : 1,
	"capped" : false,
	"nindexes" : 3,
	"totalIndexSize" : 179872,
	"indexSizes" : {
		"_id_" : 49056,
		"id_-1" : 81760,
		"notes_-1" : 49056
	},
	"ok" : 1
},

Thanks!

Note the storageSize versus the number of content hosts: about 45 GB of allocated storage for only 857 hosts. That looks really huge to me.
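To put numbers on that, here is a rough Python sketch that compares the allocated storage with the live data size, using only the figures from the collStats output above. It assumes the 857 documents in pulp_database.consumers correspond to the 857 content hosts; the function names are made up for illustration.

```python
# Figures copied from the collStats output earlier in this thread.
applicability = {"size": 41127952, "storageSize": 45794504416, "count": 14079}
consumers = {"count": 857}  # assumed to match the number of content hosts


def storage_overhead(stats):
    """How many times larger the allocated storage is than the live data."""
    return stats["storageSize"] / stats["size"]


def storage_per_host(stats, host_count):
    """Allocated bytes per registered consumer."""
    return stats["storageSize"] / host_count


overhead = storage_overhead(applicability)
per_host = storage_per_host(applicability, consumers["count"])

print(f"allocated / live data: {overhead:.0f}x")
print(f"allocated per host: {per_host / 1e6:.1f} MB")
```

The live data ("size") is only about 41 MB, so almost all of the allocated space is empty extents left behind, not actual documents.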

I reached out to the Pulp team and they said that there is a monthly cleanup task that runs. It doesn't appear that it can be configured to run more frequently.


Thanks for looking into it. That pointed me in the right direction. Here is what I just grepped from the logs:

/var/log/messages-20200828:Aug 27 12:07:48 foreman-dev pulp: celery.app.trace:ERROR: [a3b09054] (91427-92576) Task pulp.server.maintenance.monthly.monthly_maintenance[a3b09054-4a6b-48d3-ad61-93c66667aece] raised unexpected: OperationFailure(u"command SON([('aggregate', u'repo_profile_applicability'), ('pipeline', [{'$group': {'rpa_profiles': {'$addToSet': '$profile_hash'}, '_id': None}}, {'$project': {'orphaned_profiles': {'$setDifference': ['$rpa_profiles', [u'35bcbf0f4604105d8f75cb53db3f621123b614602a5c109c0f3376a812418376', u'3fb702d0d3d6b5b45bface72ffe91fa40343dcafe7787edef678584ab662f1f3', u'89288efe14136995a305516f89d2ed4da2a0834df3e78abd9cd0af3d2fa93b9e', u'27b8dca585588459da5ee08400c19b6627cab28db447edcea652b3862d7acd10', u'ee1089bfa99d4d663c0ea7177b7da6d3ea58d3c32e7f437df4eaa217fb1067bf', u'174fd1bf707f969fdcaed6decb09b1e100805640e2bca60584b50d63e5f57bcf', u'f9df7ad193846055b9321203e27877d58e757fc02fe4b82fa78e222174e1e8b3', u'd0592d480cb9cc7acd336c27f238f6c055e8a4ac4e2a9a0db023df27faaeb10a', u'6514316ff32ea40986d9c7d18d73f2bcd6e9d110d88a5e84bde8d52c0114a5b1', u'4d1ece3858e0e707abe3c52307ee40372beda4ae243030eccf48bcba505129e6', u'7eab6930105e7339bb8071ebdb9e924c0eca53204d9acc8db2f7e70f477ec687', u'40ea48cf60e2016de68feb9e422797eb2a96e4f3b90fd6db7c0703a086eaedcf', u'f67b5568875867f268f7a419c60660628fa3e94cd2f928e7ebf59ce5b48d94c2', u'7cf66d66bcc98071c83d566a16b5eb04b40c35c2d08fc0711e2a0fa6db856a81', u'66940e834647b66384b1b1ecb9e5499c9757cf5ef8aceb7f48eb58f2271043df', u'34612769742f0b4653623154dd6d53fc3557addcdb85b61c96e0c99038f9922e', u'79ee0a81a4d446b4a9d4aaefaecce7a88161dd1a718331c9f079107d1fa40c65', u'bcef9ec32b787429a06f888e7c65cf3c442a13a6828e328fd88b52b423996f4b', u'523b5517f5ff7c8d4d1ff647220d41a41866ff969f9843bfdf698bb087b02626', u'2a34141af626c104c4d5945a5ec5c72aa4a62f1f0a44e2226b4b6f479ae3afb5', u'48f55d66671eff06b6a8e14d3972562133d3dae14a1b8c4b149dacecd7e65e74', u'c5d83c916f27fac21427e593c9a9370487458df93153733b78f97bf725477bd3', 
u'77a5ea0d7c3b58e5c712fecc46e946f8c057a19e3113c8457f95141b8070555a', u'1ea9653508f4731802aa5ce2fd2f9a9644055a1f7fb
/var/log/messages-20200828:Aug 27 12:07:48 foreman-dev pulp: celery.app.trace:ERROR: [a3b09054] (91427-92576)   File "/usr/lib/python2.7/site-packages/pulp/server/maintenance/monthly.py", line 22, in monthly_maintenance
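For anyone reading the pipeline in that error: as far as I can tell from the log, it collects every distinct profile_hash in repo_profile_applicability ($group with $addToSet) and then subtracts a list of hashes ($setDifference) to get "orphaned_profiles". Below is a plain Python sketch of that same set logic. It is not Pulp's code; the assumption that the long hash list holds the profiles still in use, and the sample documents, are mine.

```python
def orphaned_profiles(applicability_docs, active_profile_hashes):
    """Return profile hashes found in applicability docs but not in the active list."""
    # Counterpart of {'$group': {'rpa_profiles': {'$addToSet': '$profile_hash'}}}
    rpa_profiles = {doc["profile_hash"] for doc in applicability_docs}
    # Counterpart of {'$setDifference': ['$rpa_profiles', [...]]}
    return rpa_profiles - set(active_profile_hashes)


# Hypothetical sample data, not taken from a real database.
docs = [
    {"profile_hash": "aaa", "repo_id": "repo-1"},
    {"profile_hash": "bbb", "repo_id": "repo-1"},
    {"profile_hash": "aaa", "repo_id": "repo-2"},
]
active = ["aaa"]

orphans = orphaned_profiles(docs, active)
print(sorted(orphans))
```

If that reading is right, the entries for deleted hosts only go away when this task completes, which would explain why the collection kept growing while the task was failing.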

I have CentOS errata imported from this external tool, so it might be related. I will dig into it further.

After removing the CentOS errata and a few content views, the monthly cleanup task was able to run. After reclaiming space with db.repairDatabase(), the storageSize of repo_profile_applicability went down to 27.8 MB!

   {
            "ns" : "pulp_database.repo_profile_applicability",
            "size" : 22253872,
            "count" : 12909,
            "avgObjSize" : 1723,
            "numExtents" : 7,
            "storageSize" : 29159424,
            "lastExtentSize" : 10330112,
            "paddingFactor" : 1,
            "paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
            "userFlags" : 1,
            "capped" : false,
            "nindexes" : 3,
            "totalIndexSize" : 3630144,
            "indexSizes" : {
                    "_id_" : 392448,
                    "repo_id_-1" : 711312,
                    "all_profiles_hash_-1_profile_hash_-1_repo_id_-1" : 2526384
            },
            "ok" : 1
    },

I added the CentOS errata back and published a new content view. The Pulp monthly maintenance still works. The storageSize is now 41.1 MB.

{
	"ns" : "pulp_database.repo_profile_applicability",
	"size" : 27526400,
	"count" : 12976,
	"avgObjSize" : 2121,
	"numExtents" : 8,
	"storageSize" : 43106304,
	"lastExtentSize" : 13946880,
	"paddingFactor" : 1,
	"paddingFactorNote" : "paddingFactor is unused and unmaintained in 3.0. It remains hard coded to 1.0 for compatibility only.",
	"userFlags" : 1,
	"capped" : false,
	"nindexes" : 3,
	"totalIndexSize" : 5077296,
	"indexSizes" : {
		"_id_" : 433328,
		"repo_id_-1" : 1398096,
		"all_profiles_hash_-1_profile_hash_-1_repo_id_-1" : 3245872
	},
	"ok" : 1
},
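For reference, a small Python sketch that converts the three storageSize values posted in this thread into the same unit. The 27.8 and 41.1 figures above are in units of 1024 * 1024 bytes, so the sketch uses that divisor throughout; the labels are just my own descriptions of each snapshot.

```python
MIB = 1024 * 1024

# storageSize values copied from the three collStats outputs in this thread.
snapshots = [
    ("before the cleanup task ran", 45794504416),
    ("after cleanup and db.repairDatabase()", 29159424),
    ("after re-adding the errata", 43106304),
]


def to_mib(num_bytes):
    """Convert a byte count to mebibytes."""
    return num_bytes / MIB


for label, storage_size in snapshots:
    print(f"{label}: {to_mib(storage_size):.1f} MiB")

# Bytes given back to the filesystem between the first two snapshots.
reclaimed = snapshots[0][1] - snapshots[1][1]
print(f"reclaimed: {to_mib(reclaimed):.0f} MiB")
```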