Performance Issues Publishing/Promoting Some Content Views

Hello Everyone,

I'm seeing very long run times when creating new composite content view versions and publishing them for some products.

About my setup:

Katello 3.1
Foreman 1.12.4

CentOS 7
6 vCPU (backed by 2.8 GHz Xeons)
ESXi 6.x hypervisor
Virtual host is all but dedicated to this one VM right now
VM has been migrated to host local storage (15k SAS in RAID 5)

I have composite content views for CentOS 6 & 7 and OEL 6 & 7. These composite views are made up of individual content views of the OS, EPEL, Puppet, and Katello client repos. For example, my "CentOS 6 Stable" CCV is made up of the following CVs:

CentOS 6 Core
CentOS 6 SCL
Copr - dgoodwin - EL 6 - Subscription Manager
Fedora People - Katello - EL 6 - Client
Puppet Labs EL 6
Puppet Modules
VMware - EL 6 - OSP

Each of those CVs has one or more repositories corresponding to the name of the CV (C6 core includes the base, updates, fasttrack, and extras repos). If it matters, all of my repos are using "on demand" for their download policy. I have a script that goes through each CCV, creates new versions of each of the components, updates the CCV for the new component versions, creates a new version of the CCV, then publishes it to the first environment in the lifecycle. All of these steps run in serial.
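For illustration, the serial workflow described above could be sketched as follows. This is a dry-run plan builder, not the actual script: the organization name, the environment name "Dev", and the angle-bracket placeholders are hypothetical. It also assumes that "creating a new version" maps to `hammer content-view publish` and that moving a version into the first lifecycle environment maps to `hammer content-view version promote`.

```python
import shlex
from typing import List

ORG = "Example Org"  # hypothetical organization name


def plan_ccv_refresh(ccv: str, components: List[str], first_env: str) -> List[List[str]]:
    """Return the hammer commands, in serial order, for refreshing one CCV."""
    steps: List[List[str]] = []

    # 1. Create a new version of every component content view.
    for cv in components:
        steps.append(["hammer", "content-view", "publish",
                      "--organization", ORG, "--name", cv])

    # 2. Point the composite view at the new component versions.
    #    The IDs are only known after step 1, so a placeholder is used here.
    steps.append(["hammer", "content-view", "update",
                  "--organization", ORG, "--name", ccv,
                  "--component-ids", "<new-component-version-ids>"])

    # 3. Create a new version of the composite view itself.
    steps.append(["hammer", "content-view", "publish",
                  "--organization", ORG, "--name", ccv])

    # 4. Promote that version to the first environment after Library.
    steps.append(["hammer", "content-view", "version", "promote",
                  "--organization", ORG, "--content-view", ccv,
                  "--version", "<new-ccv-version>",
                  "--to-lifecycle-environment", first_env])
    return steps


if __name__ == "__main__":
    plan = plan_ccv_refresh("CentOS 6 Stable",
                            ["CentOS 6 Core", "CentOS 6 SCL"],
                            "Dev")
    for argv in plan:
        print(shlex.join(argv))  # print only; nothing is executed
```

Because every step waits for the previous one, the total run time is the sum of all the publish and promote times across every CCV.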

Currently my "CentOS 6 Stable" CCV has 23,434 packages, 3,911 errata (from EPEL), and 17 Puppet modules. None of the CVs have any filters. There is one filter applied which... Today it took about 7.5 minutes to create a new version of this CCV and about 3.5 minutes to publish it to the first environment in the lifecycle after Library. Not too shabby in my opinion.

Conversely, my "OEL 6 Stable - UEK" CCV has 47,374 packages (because Oracle puts every version of the packages in their repos), 6,552 errata (Oracle publishes errata), and 17 Puppet modules. The "OEL 6 Core" CV has one filter which excludes the RHN packages (by name with "all versions"). Today it took 3 hours, 10 minutes to create a new version of this CCV. The publish has been running for a couple hours and is stuck at 94% (this is common - see next for a bit more info).

When I publish a content view, MongoDB always seems to be the bottleneck. There is always a single MongoDB process taking up nearly 100% of one processor. The load average on the system hovers around 1.3. The task monitor in the web UI shows the tasks running along fine until they pass 90%, where the task normally hangs at either 94% or 98% for the majority of the time. The task says that it is "initiating pulp task" and the Pulp task is "yum_clone_distributor".

Can someone give me ideas as to how to go about addressing this? If the difference between my CentOS and OEL repos were a factor of two I would understand (because the number of packages is about double), but it's over 17 times as long. I've thought about trying to lower the package count by filtering out package groups (such as all the languages), but I don't know if that will help or hurt since there is more processing that has to be done.
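To make the comparison concrete, here is the arithmetic behind those ratios using the numbers quoted above. It assumes the "over 17 times" figure compares the OEL version-creation time alone against the CentOS total (new version plus publish), since the OEL publish had not finished; that reading is an assumption.

```python
# Package counts quoted for the two composite content views.
centos_packages = 23_434
oel_packages = 47_374

# Timings in minutes, as reported.
centos_new_version = 7.5
centos_publish = 3.5
oel_new_version = 3 * 60 + 10  # 3 hours, 10 minutes; publish still running

package_ratio = oel_packages / centos_packages
ratio_vs_total = oel_new_version / (centos_new_version + centos_publish)
ratio_vs_new_version = oel_new_version / centos_new_version

print(f"packages:           {package_ratio:.2f}x")         # 2.02x
print(f"time vs total:      {ratio_vs_total:.1f}x")        # 17.3x
print(f"time vs create-only: {ratio_vs_new_version:.1f}x")  # 25.3x
```

Either way, the run time grows far faster than the package count, which suggests the cost is not linear in the number of packages.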

Any input is very much appreciated.