Yet another Foreman upgrade fail

Yes. You are right. Even the postgresql-13-setup script from postgresql does it with runuser -l.

But by now, it seems we have found the reason and a potential workaround.

Looking the last katello.log on my server I see this line:

[configure] /Stage[main]/Postgresql::Server::Initdb/Postgresql::Server::Instance::Initdb[main]/Exec[postgresql_initdb_instance_main]: '/usr/bin/initdb --pgdata '/var/lib/pgsql/data' --username 'postgres' ' won't be executed because of failed check 'creates'

This looks to me as if foreman-installer initializes the database as user root, calling initdb directly, which probably means in that step the current userā€™s locale applies.

The upgrade runs through runuser -l which uses the system locale.

Both together would suggest that foreman-installer causes this, because it doesnā€™t use runuser -l for initdb (and doesnā€™t use postgresql-setup --initdb either). So this would make it a foreman-installer bug, because it relies on the system locale being used by the user running foreman-installer.

I would say it would be good if foreman-installer would check the locale of the database and the system locale and would raise an error if it finds an inconsistency. That would be better than people trying to find the right log file with the right piece of information pointing to the cause of the problemā€¦

Thatā€™s puppetlabs-postgresql/manifests/server/initdb.pp at main Ā· puppetlabs/puppetlabs-postgresql Ā· GitHub which calls puppetlabs-postgresql/manifests/server/instance/initdb.pp at main Ā· puppetlabs/puppetlabs-postgresql Ā· GitHub.

I think for most Puppet users this isnā€™t a problem because Puppet runs as a daemon with the system locale, but our installer is different. Iā€™m wondering if it would be better to let the installer ensure Puppet runs with the system locale.

I am using AlmaLinux 8.10 with Foreman 3.10. and the upgrade failed with:

lc_collate values for database ā€œpostgresā€ do not match:

I simply changed the locale and upgraded the database and that seems to work.

# localectl set-locale C.UTF-8
# localectl status
   System Locale: LANG=C.UTF-8
       VC Keymap: gb
      X11 Layout: gb
# postgresql-setup --upgrade
 * Upgrading database.
 * Upgraded OK.

I then ran: foreman-installer and everything worked as expected. I have left the new locale and will see if it makes any difference.

For anyone experiencing issues with locales I found a fix that I was able to leverage for my systems

If you list out the databases for your postgres servers you can see the locale config
make sure you are logged in as the postgres user

su postgres
psql 
\l

My locales were being set as C.UTF-8 for some reason, I suspect it is because of an update to one of the postgres packages

Regardless, after updating the postgres databases for my foreman and smartproxy servers the pg_upgrade ran successfully

You can update each table with

update pg_database set datcollate='en_US.UTF-8', datctype='en_US.UTF-8' where datname='postgres';

I also updated these tables on my smartproxy servers

You can execute these from a file as well

psql -f filename

My foreman upgrade ran successfully after this

Screenshot 2024-08-22 110837

1 Like

This solved the issue for me!

Please do not do that! Never do that! That is very dangerous! Technically your database is corrupted after that because you just told postgresql that the main postgres database has a different collation and ctype from what it actually does. It may just work for you because you are coming from C.UTF-8 and thus any strings in the database should directly map to en_US.UTF-8 without modification.

For someone else coming from a different collation it may cause serious issues.

But again: technically itā€™s corrupted! Do not attempt it. String may be broken. Indexes may be broken. Anything string related may be broken. You may not even notice where itā€™s broken until much later when you cannot find the texts you are looking for or worse itā€™s causes internal crashes/exceptions because a text cannot be interpreted as itā€™s the wrong locale.

Anything supported to be changed, can be changed through ALTER commands. Messing in the tables in postgres database can have serious effects.

The only supported way to ā€œchangeā€ the collation or ctype is by dumping the database and restoring the dump into a new database created with the correct locale.

Otherwise, change the system locale to match and do the upgrade of the database. After that, it doesnā€™t matter. This is a bug of the redhat postgresql-setup script. Donā€™t break the database as workaround to this bugā€¦

1 Like

Okay, so I did use this fix. Is there a way to fix the database if it is indeed broken? Not sure what the localeā€™s were before I did this.

Only a postgresql developer might know that. Itā€™s an inconsistent state. I donā€™t know where the information in the pg_database table is used. But itā€™s not a trivial change otherwise there would probably an ALTER DATABASE option for that.

You can only revert to a snapshot before the upgrade or check a backup to find out the locale before. But even then, itā€™s not safe to assume that simply changing it back would bring it into a consistent state again. You have changed it in postgresql 12 and upgraded to 13 with this. Even if I would guess that itā€™s not an issue because the system postgres database will hopefully hold mostly ascii strings, unless someone know exactly the implications of these two columns in the pg_database table itā€™s impossible to predict. Chances are it doesnā€™t matter.

To be safe, revert to a snapshot or backup before the upgrade and upgrade again.

Iā€™ve taken your suggestion and I developed a different solution where I dump the old database and create a new database with the appropriate locales

/* create db */
create database postgres_new with owner=postgres encoding='UTF-8' lc_collate='en_US.UTF-8'
alter database postgres_new with allow_connections=true;

/* dump old db and restore in new db */
pg_dump -d postgres > /tmp/postgresbak.sql
psql -d postgres_new < /tmp/postgresbak.sql
 
/* drop old db */
lc_ctype='en_US.UTF-8' template template0;
alter database postgres rename to postgres_old;
alter database postgres_new rename to postgres;
drop database postgres_old

This solution also worked for me however Iā€™m trying to understand how this is any different than my original solution when we have always used the en_US.UTF-8 locales.

Whatever changed in the recent version of this package to cause the locales to be set to C.UTF-8 for all users seems to be done in the interest of ensuring everyone has a POSIX standard locale that will be compatible for upgrade but this only seems to have created a mismatch for every user. What is the philosophy behind this?

I donā€™t think that is any better than messing with the postgres internal table. The database postgres is the internal table. I wouldnā€™t mess with it. I am surprised that this is working at all because when you rename the database you donā€™t have a database postgres for a moment. It wonder how it works at all.

Did you check after the drop database if there are still old database files opened which have been deleted. Did you restart the postgresql server since then? My guess would be that it wrote the information into files which were already openā€¦

If you dump and restore to change the locale of the whole database, you have to dump all databases on the server, run initdb with the correct locale options and then restore.

I can only repeat: set the correct locale for the upgrade. Itā€™s a bug of the postgresql-setup that it doesnā€™t use the current database locale for the inplace upgrade. With the correct locale set, at least for the duration of the upgrade, itā€™ll work. Donā€™t try to change something internally which isnā€™t broken only because something else is broken.

With your change you never know what the impact may be. As far as I understand postgresql initializes the postgres, template0 and template1 database with the same locale. Possibly with your change, youā€™ll end up with a database which cannot be upgraded anymore because those three arenā€™t matching. Or whatever.

Donā€™t try to tweak internals to work around an issue, which can be handled by setting the correct locale while running the database upgrade, until the upgrade script has been fixed by redhat to handle this case correctly.

I donā€™t understand why you donā€™t want to set the locale but rather mess with internals for which you donā€™t understand the consequences.

Weā€™re working on a ā€œcorrectā€ fix in Fixes #37797 - ensure PostgreSQL upgrade uses the right locales by evgeni Ā· Pull Request #971 Ā· theforeman/foreman-installer Ā· GitHub

2 Likes

Any update on the fix?

Itā€™s part of 3.12.0, we still need to backport it to 3.11.x

1 Like

Ran into this issue as well. Instead of waiting for the fix I just fixed the Collate and Ctype and carry on.

Before upgrade I have:

pulpcore=# \l
                                  List of databases
   Name    |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
-----------+----------+----------+-------------+-------------+-----------------------
 postgres  | postgres | UTF8     | en_AU.UTF-8 | en_AU.UTF-8 |
 pulpcore  | postgres | UTF8     | en_US.utf8  | en_US.utf8  | =T/postgres          +
           |          |          |             |             | postgres=CTc/postgres+
           |          |          |             |             | pulp=CTc/postgres
 template0 | postgres | UTF8     | en_AU.UTF-8 | en_AU.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
 template1 | postgres | UTF8     | en_AU.UTF-8 | en_AU.UTF-8 | =c/postgres          +
           |          |          |             |             | postgres=CTc/postgres
(4 rows)

This was not expected but I am pretty sure it happened because that is the locale on my laptop and I SSH in to install the proxy so the remote shell inherited my environments.

I used sudo -i which keeps some of these environment as well.

Here is how I did it:

  1. Took a VM snapshot.
  2. sudo foreman-maintain service stop --exclude postgresql
  3. sudo su - postgres # donā€™t use sudo -i as it will inherit remote locale
  4. pg_dump -F d pulpcore -c -C -j 5 -f /var/lib/pgsql/backups/pulpcore.$(hostname -s).$(date -I)
  5. mv /var/lib/pgsql/data /var/lib/pgsql/data.v12
  6. double check for any other LC_* environment with export | grep en_
  7. initdb -E UTF-8 --locale en_US.UTF-8 /var/lib/pgsql/data
  8. pg_restore -d postgres --create /var/lib/pgsql/backups/pulpcore.$(hostname -s).$(date -I)
  9. foreman-installer
  10. Now you have a working v3.10 proxy, continue with the instructions from: Upgrading Foreman to 3.11

Any update on backport to 3.11?

For me the issue was:

# grep '^lc_' /var/lib/pgsql/data/postgresql.conf
lc_messages = 'en_US.UTF-8'
...
# localectl
   System Locale: LANG=en_AU.UTF-8

So for me the fix was to revert to snapshot before my failed upgrade attempt, and then re-do the upgrade after switching system locale to match the postgres database:

localectl set-locale en_US.UTF-8
#followed by:
dnf -y module switch-to postgresql:13
foreman-maintain service stop
dnf upgrade
foreman-installer

I expect i could switch locale back now, until the next postgres upgrade.
But Iā€™ll probably leave it as-is.

3.11.4 is about to be released and will contain the necessary fixes.

ah nice, ta evgeni that will avoid much grief for users, ty.

Thank you. We will be upgrading to that when released

1 Like