Katello 3.18 upgrade failed with PG::NotNullViolation

Problem:
I want to upgrade to Katello 4.x, so I have to migrate from Pulp 2 to Pulp 3.

[root@katello ~]# foreman-maintain content prepare
Running Prepare content for Pulp 3
 ================================================================================
Prepare content for Pulp 3: 
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
Checking for valid Katello configuraton.
Starting task.
2021-08-10 08:48:11 +0200: Importing migrated yum repositories: 1/4303Migration failed, You will want to investigate: https://katello.balu.lan/foreman_tasks/tasks/1b08d6b1-ddc1-4856-a659-addf61199a6e
rake aborted!
ForemanTasks::TaskError: Task 1b08d6b1-ddc1-4856-a659-addf61199a6e: ActiveRecord::NotNullViolation: PG::NotNullViolation: ERROR:  null value in column "repository_href" violates not-null constraint
DETAIL:  Failing row contains (20, null, 5, 8).
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.4/lib/katello/tasks/pulp3_migration.rake:39:in `block (2 levels) in <top (required)>'
/opt/rh/rh-ruby25/root/usr/share/gems/gems/rake-12.3.3/exe/rake:27:in `<top (required)>'
Tasks: TOP => katello:pulp3_migration
(See full trace by running task with --trace)
                                                                      [FAIL]
Failed executing foreman-rake katello:pulp3_migration, exit status 1
--------------------------------------------------------------------------------
Scenario [Prepare content for Pulp 3] failed.

The following steps ended up in failing state:

  [content-prepare]

Resolve the failed steps and rerun
the command. In case the failures are false positives,
use --whitelist="content-prepare"

Content prepare fails with a PostgreSQL not-null violation on the repository_href column. The Foreman upgrade/installer itself runs without errors.
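For anyone following along: the ID of the failed task can be pulled out of saved prepare output with a small shell helper, which makes it easier to look the task up under /foreman_tasks/tasks/ in the web UI. This is only a sketch. The function name extract_task_id is made up for illustration, and it assumes a grep that supports -o (GNU and BSD grep do).

```shell
# Hypothetical helper: read foreman-maintain output on stdin and print
# the UUID of the first foreman_tasks task URL found in it.
extract_task_id() {
  grep -oE 'foreman_tasks/tasks/[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' \
    | head -n 1 \
    | sed 's|.*/||'   # keep only the part after the last slash
}
```

Usage, assuming the output was saved to a file of your choosing: extract_task_id < prepare.log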

Expected outcome:
I expect content prepare to finish successfully so that I can start the switchover to Pulp 3.

Foreman and Proxy versions:
Foreman 2.3.5
foreman-proxy-2.3.5-1.el7.noarch

Foreman and Proxy plugin versions:
Katello 3.18.4

Distribution and version:
Static hostname: katello.balu.lan
Icon name: computer-vm
Chassis: vm
Machine ID: 18d10fd53a634db8a8fb09003c704f5f
Boot ID: 6b70a8c335b942a2a8ddffe1c234ed6b
Virtualization: kvm
Operating System: CentOS Linux 7 (Core)
CPE OS Name: cpe:/o:centos:centos:7
Kernel: Linux 3.10.0-1160.36.2.el7.x86_64
Architecture: x86-64

Other relevant data:
To prepare for the migration, I removed all CentOS 8 data and all Red Hat data: repositories, products, and content/composite views.

Hi @balu,

Could you please enter the console with foreman-rake console and show me the output of the following:

::Katello::Repository.where(id: ::Katello::Pulp3::RepositoryReference.where(repository_href: "nil").select(:id))

Let me know if there is anything surprising about the repositories that pop up from that command.

A couple things to try that might help you get through the migration:

  1. Run foreman-rake katello:delete_orphaned_content
  2. Don’t create repositories or publish content view versions during the migration. There is a bug related to this that was fixed very recently in the 3.18 code.

The content prepare step can be run any number of times, so no worries about re-running it.
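The two commands involved can be wrapped as below. This is a rough sketch, not an official procedure: run, cleanup_orphans, prepare_content and the DRY_RUN variable are names invented for this example. The orphan cleanup only starts a background task, so the two steps are kept separate on purpose.

```shell
# Print the command instead of running it when DRY_RUN=1 (hypothetical switch).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1: start the orphaned content cleanup. The rake task returns
# immediately; wait for the task to finish in the web UI before step 2.
cleanup_orphans() {
  run foreman-rake katello:delete_orphaned_content
}

# Step 2: (re)run the Pulp 3 content prepare. Safe to repeat.
prepare_content() {
  run foreman-maintain content prepare
}
```

With DRY_RUN=1 set, calling cleanup_orphans or prepare_content only prints what would be executed, which is handy for checking the sequence first.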

Hi @iballou ,

thanks for your answer. Here is the output of the foreman-rake console command:

[root@katello ~]# foreman-rake console
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
Loading production environment (Rails 6.0.3.4)
irb(main):001:0> ::Katello::Repository.where(id: ::Katello::Pulp3::RepositoryReference.where(repository_href: "nil").select(:id))
=> #<ActiveRecord::Relation []>
irb(main):002:0> 

I ran foreman-rake katello:delete_orphaned_content, with the following output:

[root@katello ~]# foreman-rake katello:delete_orphaned_content
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
Orphaned content deletion started in background.

The task exited at 100% with result success and a green bar.

So I ran content prepare again, with the same result:

[root@katello ~]# foreman-maintain content prepare
Running Prepare content for Pulp 3
================================================================================
Prepare content for Pulp 3: 
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
Checking for valid Katello configuraton.
Starting task.
2021-08-11 19:00:31 +0200: Importing migrated yum repositories: 1/2988Migration failed, You will want to investigate: https://katello.balu.lan/foreman_tasks/tasks/b13209cb-a05f-4a3d-93c7-8b3f9338dd6a
rake aborted!
ForemanTasks::TaskError: Task b13209cb-a05f-4a3d-93c7-8b3f9338dd6a: ActiveRecord::NotNullViolation: PG::NotNullViolation: ERROR:  null value in column "repository_href" violates not-null constraint
DETAIL:  Failing row contains (20, null, 5, 8).
/opt/theforeman/tfm/root/usr/share/gems/gems/katello-3.18.4/lib/katello/tasks/pulp3_migration.rake:39:in `block (2 levels) in <top (required)>'
/opt/rh/rh-ruby25/root/usr/share/gems/gems/rake-12.3.3/exe/rake:27:in `<top (required)>'
Tasks: TOP => katello:pulp3_migration
(See full trace by running task with --trace)
                                                                      [FAIL]
Failed executing foreman-rake katello:pulp3_migration, exit status 1
--------------------------------------------------------------------------------
Scenario [Prepare content for Pulp 3] failed.

The following steps ended up in failing state:

  [content-prepare]

Resolve the failed steps and rerun
the command. In case the failures are false positives,
use --whitelist="content-prepare"

I have configured a cron job that publishes content views and composite content views automatically. I don’t think a publish or promote ran while content prepare was running, but it is possible. What can I do if that happened?

Thanks very much

Hi @balu,

Apologies for the wait. It looks like the data coming from Pulp 3 might be bad since the missing repository_href comes right from Pulp 3.

Was your original content migration very long? If not, I might recommend trying a foreman-maintain content migration-reset and then migrate again. I’d also consider turning off the content view publishing cron job if possible. The protections for content view publishing during the migration are still waiting on Katello 3.18.5.
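To spot the cron entries in question before resetting, something like the following may help. It is a sketch under one big assumption: that the cron job calls hammer (hammer content-view publish or hammer content-view version promote). If your job uses a different tool or a wrapper script, the pattern will need adjusting. The function name find_publish_jobs is made up for this example.

```shell
# Hypothetical helper: read crontab text on stdin and print the active
# (non-comment) lines that publish or promote content views via hammer.
find_publish_jobs() {
  grep -v '^[[:space:]]*#' \
    | grep -E 'content-view[[:space:]]+(publish|version[[:space:]]+promote)'
}
```

Usage: crontab -l | find_publish_jobs. Once those entries are commented out, foreman-maintain content migration-reset followed by foreman-maintain content prepare can run without a publish landing in the middle.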

Hi @iballou,

no reason to say sorry :slight_smile:

I will start the migration-reset and try again. It may take 2-3 days. I will post an update when there is one.

Thanks very much for your help.

@balu sounds good. If you see the same issue, please let me know and I’ll make sure to ask the Pulp team about it. I’m pretty sure I’ve seen this before and it wasn’t reproducible after a reset.

Okay, so I did multiple resets because several packages on the system were damaged and could not be repaired by any kind of repository sync or checksum check. So I deleted them directly on the filesystem, ran an orphaned content cleanup, and tried again. The latest result is the following:

On content prepare, I get the following:

[root@katello ~]# foreman-maintain content prepare                                                                     
Running Prepare content for Pulp 3
================================================================================
Prepare content for Pulp 3: 
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
Checking for valid Katello configuraton.
Starting task.
2021-09-02 17:07:31 +0200: Importing migrated content type package_group: 132/132              
Content Migration completed successfully
I, [2021-09-02T17:07:56.923783 #30984]  INFO -- /default_dead_letter_handler: got dead letter #<Concurrent::Actor::Envelope:100819740> @message=:tick, @sender=#<Thread:0x000000000c091e40@io-worker-75@/opt/theforeman/tfm/root/usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:471 run>, @address=#<Dynflow::ClockReference:0x000000000a848460 /clock (Dynflow::Clock)>>

Script started on Thu 02 Sep 2021 10:50:10 AM CEST
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
Checking for valid Katello configuraton.
Starting task.
2021-09-02 17:07:31 +0200: Importing migrated content type package_group: 132/132              
Content Migration completed successfully
I, [2021-09-02T17:07:56.923783 #30984]  INFO -- /default_dead_letter_handler: got dead letter #<Concurrent::Actor::Envelope:100819740> @message=:tick, @sender=#<Thread:0x000000000c091e40@io-worker-75@/opt/theforeman/tfm/root/usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:471 run>, @address=#<Dynflow::ClockReference:0x000000000a848460 /clock (Dynflow::Clock)>>

                                                                      [OK]
--------------------------------------------------------------------------------

After this, I tried a switchover and got this:

[root@katello ~]# foreman-maintain content prepare                                                                     
Running Prepare content for Pulp 3
================================================================================
Prepare content for Pulp 3: 
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
Checking for valid Katello configuraton.
Starting task.
2021-09-02 17:07:31 +0200: Importing migrated content type package_group: 132/132              
Content Migration completed successfully
I, [2021-09-02T17:07:56.923783 #30984]  INFO -- /default_dead_letter_handler: got dead letter #<Concurrent::Actor::Envelope:100819740> @message=:tick, @sender=#<Thread:0x000000000c091e40@io-worker-75@/opt/theforeman/tfm/root/usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:471 run>, @address=#<Dynflow::ClockReference:0x000000000a848460 /clock (Dynflow::Clock)>>

Script started on Thu 02 Sep 2021 10:50:10 AM CEST
API controllers newer than Apipie cache! Run apipie:cache rake task to regenerate cache.
Checking for valid Katello configuraton.
Starting task.
2021-09-02 17:07:31 +0200: Importing migrated content type package_group: 132/132              
Content Migration completed successfully
I, [2021-09-02T17:07:56.923783 #30984]  INFO -- /default_dead_letter_handler: got dead letter #<Concurrent::Actor::Envelope:100819740> @message=:tick, @sender=#<Thread:0x000000000c091e40@io-worker-75@/opt/theforeman/tfm/root/usr/share/gems/gems/logging-2.3.0/lib/logging/diagnostic_context.rb:471 run>, @address=#<Dynflow::ClockReference:0x000000000a848460 /clock (Dynflow::Clock)>>

                                                                      [OK]
--------------------------------------------------------------------------------

Do I have to run a reimport of everything? I did multiple resets, but that has not solved the problem yet.

Thanks very much,
Ludwig

Hey @balu, I think you copy-and-pasted the content prepare step twice.

You shouldn’t have to run reimport_all, especially after a migration reset, so I’m assuming there is an error going on.

Also, when you mentioned “damaged packages”, do you mean you saw the migration-stats report them as corrupt?

Hi @iballou,

oh sorry :-/
Yes, migration-stats gave me a list of packages in a file. The file was saved somewhere in /tmp.

OK, I will try it one last time. I think something strange is happening on my installation. One error follows another, and I don’t know why.

Next time you get your switchover error, let me know and we’ll look more into it. It could potentially be related to a bug that would be fixed in 3.18.5.

I just wanted to give you a heads up that 3.18.5 is now out, which may help the migration issues.