Split External DBs

With the movement to Pulp3 and the removal of MongoDB, a managed database Katello installation now only uses a single database service, PostgreSQL, which contains Foreman, Candlepin, and Pulp databases.

That said, we can also deploy with unmanaged databases, and I’d like to start a discussion about the capabilities and installation experience of this feature, specifically regarding the following use cases:

  1. For users who wish to deploy with a single external database server, could we have a single set of parameters for this DB server which simplifies the installation experience for external DBs? I.e. the user would only need to specify once the address and port of the external database server and by default all components would use these.

  2. For very large deployments where Foreman, Candlepin, and Pulp are all competing for PostgreSQL connections, I think we should also make it possible to configure these external DBs separately per service, so that for example repository sync or CV publish/promote performance would not be throttled by the DB during times of high registration load.

Regarding the 2nd use case, it should already be possible with the way Pulp3 is installed to specify a separate external DB, but for Candlepin it’s not the case currently and would require some additional work.

I think one way to meet both requirements would be to implement parameters such as --candlepin-use-foreman-db and --pulpcore-use-foreman-db, which default to true. A user who wants a single external database server could then simply pass the --foreman-db* parameters to the installer, and Candlepin and Pulpcore would use the same configuration. A user who wants separate DBs per service could instead set those parameters to false and supply additional configuration for the Pulpcore and/or Candlepin DBs.
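Until something like that exists, the single-server case can be approximated with the per-component flags the installer already has. The following is only a rough sketch: `shared_db_args` is a hypothetical helper, not part of the installer, and it covers only the manage/host flags. Database names, usernames, and passwords would still have to be passed per component.

```shell
# Sketch: fan one external PostgreSQL host out to Foreman, Pulpcore and
# Candlepin using the existing per-component flags.
# shared_db_args is a hypothetical helper name, not an installer feature.
shared_db_args() {
  db_host="$1"
  # One "flag value" pair per output line.
  printf '%s %s\n' \
    --foreman-db-manage false \
    --foreman-db-host "$db_host" \
    --foreman-proxy-content-pulpcore-manage-postgresql false \
    --foreman-proxy-content-pulpcore-postgresql-host "$db_host" \
    --katello-candlepin-manage-db false \
    --katello-candlepin-db-host "$db_host"
}

# Dry run: print the command instead of running it.
# The unquoted substitution is intentional so each flag becomes its own word.
echo foreman-installer --scenario katello $(shared_db_args postgres.example.com)
```

Dropping the `echo` would run the installer for real. The proposed --candlepin-use-foreman-db and --pulpcore-use-foreman-db defaults would make a wrapper like this unnecessary.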

Please weigh in if you have any thoughts on this proposed design. Thanks!

So, I looked further into this and I’m already able to perform this type of installation in nightly – I didn’t run into the problems that I anticipated.

The one issue I did run into was that the server hosting the Foreman database was also required to have https://fedorapeople.org/groups/katello/releases/yum/nightly/pulpcore/el7/x86_64/rh-postgresql12-postgresql-evr-0.0.2-1.el7.x86_64.rpm installed, which was not documented anywhere.

Thank you for getting back to us. Can you share your full notes, if you have any? This deployment is indeed very interesting. Do you even plan to split databases into separate hosts or VMs?

I am interested mainly from the documentation perspective; our @installer team can definitely share an opinion on the technical side.

@wbclark is on the Red Hat platform team and has a strong interest in the installer.

I have thought about this as well. It’s probably safe to assume that there are three groups to consider:

  • Default all-in-one. Everything is on the same host. This will likely be the majority of users.
  • Application server + DB server. Once you start to scale out, this is a logical first step. You may have a DBA team. Most likely you want every DB on this DB server.
  • Full scale-out: split everything. Ideally you’d also split the applications themselves.

That last part is my long-term desire. Currently Katello has bad entry points. I can’t make anything else out of it - it’s bad. You have --katello-candlepin-* options. Then on a Katello server you have --katello-pulp-*, but you also see --foreman-proxy-content-pulp-* options that are actually ignored. I could go on a long rant with more examples, but that’s bad UI/UX.

With the move to Pulp 3 we can actually improve a lot - it has an architecture that makes it easier to deploy. I also started a PR to expose --candlepin-* options. It also allows --no-enable-candlepin to not deploy candlepin at all. That allows for a composable setup. It also makes it more transparent which systems are being deployed and how to configure them.

Ideally we’ll also have top level pulpcore parameters (--pulpcore-* instead of --foreman-proxy-content-pulpcore-*). That would allow users to deploy Pulpcore on a different server.

I’m waiting for the Pulp 2 removal before picking this up again. Not having to think about that old deployment makes refactoring easier.

Hi, thanks for your reply.

Here are the installer options I used with Katello 4.0.0-0.2:

# foreman-installer --scenario katello --verbose \
--foreman-db-manage false \
--foreman-db-host postgres-foreman.example.com \
--foreman-db-database foreman \
--foreman-db-username foreman \
--foreman-db-password fakepass \
--foreman-proxy-content-pulpcore-manage-postgresql false \
--foreman-proxy-content-pulpcore-postgresql-host postgres-pulpcore.example.com \
--foreman-proxy-content-pulpcore-postgresql-user pulp \
--foreman-proxy-content-pulpcore-postgresql-password fakepass \
--katello-candlepin-manage-db false \
--katello-candlepin-db-host postgres-candlepin.example.com \
--katello-candlepin-db-name candlepin \
--katello-candlepin-db-user candlepin \
--katello-candlepin-db-password fakepass 

For the infrastructure, I deployed 4 separate VMs running the latest RHEL 7:

  • The Candlepin DB server used postgresql-server-9.2.24-4.el7_8.x86_64
  • The Pulpcore DB server used rh-postgresql96-postgresql-server-syspaths-9.6.10-1.el7.x86_64
  • The Foreman DB server used rh-postgresql12-postgresql-server-syspaths-12.1-2.el7.x86_64 and was also required to have rh-postgresql12-postgresql-evr-0.0.2-1.el7.x86_64

The only real reason for the different versions is that I started with the versions shipped in RHEL repos and had to replace them with more recent versions to get foreman-rake db:migrate and sudo -u pulp DJANGO_SETTINGS_MODULE=pulpcore.app.settings PULP_SETTINGS=/etc/pulp/settings.py python3-django-admin migrate --noinput to complete successfully.

If I had to do it again, I would just start with PostgreSQL 12 on each DB server.

The basic steps for setting up each DB were roughly as follows (shown here for the Foreman DB):

# /opt/rh/rh-postgresql12/root/usr/bin/postgresql-setup --initdb
# vim /var/opt/rh/rh-postgresql12/lib/pgsql/data/pg_hba.conf # added the following line
host    all             all             katello.example.com            md5
# vim /var/opt/rh/rh-postgresql12/lib/pgsql/data/postgresql.conf # added the following line
listen_addresses = '*'
# systemctl enable --now postgresql
# su - postgres -c psql
postgres=# CREATE USER foreman WITH PASSWORD 'fakepass';
postgres=# CREATE DATABASE foreman OWNER foreman;
postgres=# \q
# firewall-cmd --add-service=postgresql
# firewall-cmd --runtime-to-permanent
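
Since the same few statements get repeated on each DB server, here is a small sketch that prints them per service for review before pasting into psql. `db_bootstrap` is a hypothetical helper, and the Pulpcore database name `pulpcore` is an assumption on my part, since the installer options above only set the Pulpcore user. It would also break on passwords containing a single quote.

```shell
# Sketch: print the SQL (and a matching pg_hba.conf entry as an SQL comment)
# for one service. db_bootstrap is a hypothetical helper; it only prints text
# and does not touch any database.
db_bootstrap() {
  db_name="$1"
  db_user="$2"
  db_pass="$3"
  client="$4"
  # PostgreSQL requires the password to be a quoted string literal.
  printf "CREATE USER %s WITH PASSWORD '%s';\n" "$db_user" "$db_pass"
  printf 'CREATE DATABASE %s OWNER %s;\n' "$db_name" "$db_user"
  printf '%s\n' "-- pg_hba.conf: host $db_name $db_user $client md5"
}

db_bootstrap foreman   foreman   fakepass katello.example.com
db_bootstrap candlepin candlepin fakepass katello.example.com
# Assumption: "pulpcore" as the database name; adjust to your setup.
db_bootstrap pulpcore  pulp      fakepass katello.example.com
```

The suggested pg_hba.conf entry is narrower than the `all all` line I used above, limiting each server to the one database and user it actually hosts.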