Need help after accidental update to nightly build

Polle · October 7, 2020, 3:41pm

So, when I decided to upgrade our Foreman/Katello server from 2.0.3 to 2.1.3, I noticed too late that I was copy/pasting commands from the nightly version instead of 3.16.

So I went through a rather painful process of downgrading packages and trying to fix things. To some extent that worked fairly well - I got to a point where I managed to successfully run ‘foreman-installer’, after which I could access the GUI again with the ‘About’ page showing version 2.1.3. It looks like most things are working again - the only issues left all seem to be related to hostgroup stuff:
Edit a host group:
undefined method `kickstart_repository_id' for #<Hostgroup:0x00007fa8fb6d6fb8> Did you mean? …
Provision a host:
undefined method `kickstart_repository_id' for #<Hostgroup:0x00007fa8fa3d0428> Did you mean? …
Edit a host:
undefined method `lifecycle_environment_id' for #<Hostgroup:0x00007fa8e0053a30> Did you mean? …
Remote execute command:
undefined method `sudo_password=' for #<JobInvocation:0x00007fa8f9b6ed70> Did you mean? sudo_password

And I guess there are some more I didn’t run into yet.
So, still pretty messy although things don’t look too bad at first sight - looks like some generic issue that might be fixed by reverting some more files. Any suggestions?

Another thing I’ve been trying to figure out: is it in some way possible to reinstall things without touching the db(s)? (Well the nightly update also did some db migration so my biggest fear is that there is some work there too).

The final option is to shoot myself but I’d rather not go for that solution … (should have done that before starting the upgrade maybe)

Polle · October 8, 2020, 6:18am

Well, I spent a night digging and made some important progress - I added
kickstart_repository_id, lifecycle_environment_id, content_view_id and content_source_id columns to the hostgroups table (with the correct foreign keys) and that got editing host groups and provisioning hosts working again!
Now, opening a host group shows empty fields for those columns - that is logical, but when I select values and save, the values aren’t saved and no error is shown. Can anyone point me to the code/file that is responsible for saving the data from the hostgroups form? (As a temporary solution, I just used an update query to fill in those columns.)

Polle · October 14, 2020, 1:29pm

So I got to the point where almost everything is back to normal - almost everything.
I made a fresh installation on a spare machine and compared the postgres db schemas. Using that comparison I made some (mostly minor) changes to the db so that the schema of the ‘broken’ one is identical to the freshly installed one.
I copied some directory trees too, making sure not to mess up settings / certificates / …
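
The schema comparison described above can be sketched roughly as follows. This is only an illustration, not the procedure actually used in this thread: it assumes the columns of each database were exported as (table, column) pairs, for example from PostgreSQL's information_schema.columns view, and the function name and sample rows are made up.

```python
from collections import defaultdict


def diff_columns(reference, broken):
    """Return columns missing from / extra in `broken`, grouped by table.

    Both arguments are iterables of (table_name, column_name) pairs.
    """
    ref, cur = set(reference), set(broken)
    missing, extra = defaultdict(list), defaultdict(list)
    for table, column in sorted(ref - cur):
        missing[table].append(column)
    for table, column in sorted(cur - ref):
        extra[table].append(column)
    return dict(missing), dict(extra)


# Hypothetical sample rows; real ones would come from the two databases.
fresh = [("hostgroups", "id"), ("hostgroups", "content_view_id"), ("nics", "execution")]
damaged = [("hostgroups", "id"), ("nics", "execution"), ("nics", "leftover_column")]

missing, extra = diff_columns(fresh, damaged)
print("missing from damaged schema:", missing)
print("extra in damaged schema:", extra)
```

A column-level diff like this only catches missing or leftover columns; types, indexes and foreign keys would still need a separate check.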

So I can do just about everything now, but when I deploy a bare metal host and boot it into the discovery image, I get a 422 error stating:
undefined method `execution' for #<Nic::Managed:0x…>

Execution is a column in the nics table - that’s one that I needed to add - but I have no clue why I’m getting this error or how to fix it. I get the same message if I try to register a host that’s already deployed.
Really driving me nuts - I scanned/compared lots of ruby code, all seems to be identical to my freshly installed machine so getting a little desperate here (knowing it’s the only thing that’s not working)

lzap · October 15, 2020, 9:49am

It comes from the Remote Execution plugin. Compare which plugins were installed before with which are installed now. Plugins do actually extend db tables quite often. You need to make sure you have exactly the same set of plugins on your new installation.
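
One way to compare the two plugin sets is sketched below. It assumes each host's package list was saved as plain "name-version-release.arch" strings (the form seen elsewhere in this thread) and that plugin packages can be recognised by a substring of their name. The `marker` default, the function names, and the placeholder version numbers are all assumptions made for illustration.

```python
def package_name(nvra):
    """Strip version, release and arch from 'name-version-release.arch'."""
    name, _version, _release_arch = nvra.rsplit("-", 2)
    return name


def plugin_diff(old_packages, new_packages, marker="rubygem-foreman"):
    """Return (only_on_old, only_on_new) plugin package names.

    `marker` is an assumed substring used to pick out plugin packages.
    """
    old = {package_name(p) for p in old_packages if marker in p}
    new = {package_name(p) for p in new_packages if marker in p}
    return sorted(old - new), sorted(new - old)


# Hypothetical package lists with placeholder versions.
old_host = [
    "tfm-rubygem-foreman_discovery-0.0.0-1.el7.noarch",
    "tfm-rubygem-foreman_remote_execution-0.0.0-1.el7.noarch",
    "bash-0.0.0-1.el7.x86_64",
]
new_host = [
    "tfm-rubygem-foreman_discovery-0.0.0-1.el7.noarch",
]

only_old, only_new = plugin_diff(old_host, new_host)
print("only on old host:", only_old)
print("only on new host:", only_new)
```

Anything listed as present on only one side is a candidate for missing or leftover plugin columns in the database.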

I will add a big fat warning once I start migrating our installation guide to asciidoc, @mcorr please remind me. It needs to be next to the commands users copy so it’s obvious.

Alternatively, you could sort out all those nightly problems and stay on nightly until rc1 is released. It’s gonna be pretty soon, less than two months.

Polle · October 15, 2020, 11:25am

Thanks for your advice - got good news however
I finally managed to get everything working again - actually, I’m not 100% sure what really did the trick. Since the issue was in kickstart/discovery I did a yum reinstall of
tfm-rubygem-foreman_discovery-16.1.2-1.fm2_1.el7.noarch
tfm-rubygem-smart_proxy_discovery-1.0.5-6.fm2_1.el7.noarch
and I also ran foreman-rake facts:clean - it looks to me like this clean shouldn’t make much of a difference, so I tend to believe that the reinstalls did it.
Anyway: big lesson learned here: triple check what you’re updating to …

BTW, it might be a good idea to log all package upgrades (from version / to version) in the upgrade log and all database actions just in case some idiot updates to the wrong version …

lzap · October 16, 2020, 7:40am

Yum provides that in /var/log/yum.log. It also has a feature to roll back whole transactions using the yum history commands. However, your database was tainted anyway; that could also be fixed using foreman-rake db:rollback, but it’s untested and complicated. Reinstall was probably cleaner.
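
As a rough sketch of how the package history in /var/log/yum.log could be summarised, the snippet below parses log lines into (timestamp, action, package) tuples. The line layout assumed here ("Mon DD HH:MM:SS Action: package") should be checked against the actual file, and the sample lines and function names are invented for illustration.

```python
import re

# Assumed layout of a yum.log entry: "Oct 07 15:12:44 Updated: pkg-1.0-1.el7.noarch"
LINE = re.compile(r"^(\w{3} \d{2} \d{2}:\d{2}:\d{2}) ([A-Za-z-]+): (\S+)$")


def parse_yum_log(lines):
    """Return (timestamp, action, package) tuples for lines that match."""
    events = []
    for line in lines:
        match = LINE.match(line.strip())
        if match:
            events.append(match.groups())
    return events


def packages_with_action(events, action):
    """List the packages recorded with the given action, in log order."""
    return [pkg for _ts, act, pkg in events if act == action]


# Hypothetical sample lines; a real run would read /var/log/yum.log instead.
sample = [
    "Oct 07 15:12:44 Updated: example-package-1.0-2.el7.noarch",
    "Oct 07 15:12:45 Installed: another-example-0.1-1.el7.noarch",
    "this line does not look like a yum.log entry",
]

events = parse_yum_log(sample)
print(packages_with_action(events, "Updated"))
```

Filtering the parsed events by timestamp would then give the list of packages touched during the accidental upgrade.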