Foreman online backup stuck on "Backup Pulp data: - Collecting Pulp data"

Problem:
Foreman and Katello were upgraded from Foreman 3.10 and Katello 4.12 to Foreman 3.11 and Katello 4.13.
During the upgrade, the command

dnf -y module switch-to postgresql:13

moved the postgresql service to a failed state.
After changing the locale from en_US.UTF-8 to C.UTF-8 with the command below, foreman-installer completed without issues:

localectl set-locale C.UTF-8

After the Foreman upgrade, all services are running fine and the UI works without any problem.
But when I try to run an online Foreman backup with the command
foreman-maintain backup online /var
the backup starts normally and then gets stuck at the line

Backup Pulp data: - Collecting Pulp data

A "No disk space" error was raised on the VM even though it had around 800GB of free space left.
Expected outcome:
Backup command runs successfully.
Foreman and Proxy versions:
Foreman 3.11 and Katello 4.13
Distribution and version:
OS: Rocky Linux release 8.10 (Green Obsidian)
Other relevant data:
VM Spec:
CPU: 8
RAM: 40GB
Disk: 2TB
Size of products and content views: 550GB

Were you using online backups before successfully?
Does an offline backup produce the same results?
Can you post the whole output of the backup process please?
That’s rubygem-foreman_maintain 1.6.9 you are running, right?

Our requirement is to take backups without any downtime on the production server, so we use online backups. I will check the offline backup and update you.
Online backup:

[root@foreman ~]# foreman-maintain backup online /var 
Starting backup: 2024-09-27 11:26:09 +0530
Running preparation steps required to run the next scenarios
================================================================================
Make sure Foreman DB is up:
/ Checking connection to the Foreman DB                               [OK]
--------------------------------------------------------------------------------
Make sure Candlepin DB is up:
- Checking connection to the Candlepin DB                             [OK]
--------------------------------------------------------------------------------
Make sure Pulpcore DB is up:
\ Checking connection to the Pulpcore DB                              [OK]
--------------------------------------------------------------------------------


Running Backup
================================================================================
Check if the incremental backup has the right type:                   [OK]
--------------------------------------------------------------------------------
Check for running tasks:                                              [OK]
--------------------------------------------------------------------------------
Check for running pulpcore tasks:                                     [OK]
--------------------------------------------------------------------------------
Prepare backup Directory:
Creating backup folder /var/katello-backup-2024-09-27-11-26-09        [OK]
--------------------------------------------------------------------------------
Generate metadata:
| Saving metadata to metadata.yml                                     [OK]
--------------------------------------------------------------------------------
Stop applicable services:

Stopping the following service(s):
pulpcore-worker@1.service, pulpcore-worker@2.service, pulpcore-worker@3.service, pulpcore-worker@4.service, pulpcore-worker@5.service, pulpcore-worker@6.service, pulpcore-worker@7.service, pulpcore-worker@8.service, dynflow-sidekiq@worker-1, dynflow-sidekiq@worker-hosts-queue-1
/ All services stopped                                                [OK]
--------------------------------------------------------------------------------
Backup config files:
/ Collecting config files to backup                                   [OK]
--------------------------------------------------------------------------------
Backup Pulp data:
\ Collecting Pulp data

The backup is stuck on the task above.

rubygem-foreman_maintain version:

[root@foreman ~]# rpm -qa | grep rubygem-foreman_maintain
rubygem-foreman_maintain-1.7.4-1.el8.noarch

Thanks!
(Before Foreman 3.12 online backup was not officially considered “production” ready)

That’s from Foreman 3.12, not 3.11.
It’s actually good that you have that, it has a few fixes for online backups, but unexpected :slight_smile:

And just to make sure, you do not see “Data in /var/lib/pulp changed during backup. Retrying…” messages printed, right?

The offline backup also produces the same issue:

[root@foreman ~]# foreman-maintain backup offline /var
Starting backup: 2024-09-24 20:40:52 +0530
Running preparation steps required to run the next scenarios
================================================================================
Make sure Foreman DB is up:
/ Checking connection to the Foreman DB                               [OK]
--------------------------------------------------------------------------------
Make sure Candlepin DB is up:
- Checking connection to the Candlepin DB                             [OK]
--------------------------------------------------------------------------------
Make sure Pulpcore DB is up:
\ Checking connection to the Pulpcore DB                              [OK]
--------------------------------------------------------------------------------


Running Backup
================================================================================
Check if the incremental backup has the right type:                   [OK]
--------------------------------------------------------------------------------
Check for running tasks:                                              [OK]
--------------------------------------------------------------------------------
Check for running pulpcore tasks:                                     [OK]
--------------------------------------------------------------------------------
Confirm turning off services is allowed:
WARNING: This script will stop your services.

Do you want to proceed?, [y(yes), q(quit)] y
                                                                      [OK]
--------------------------------------------------------------------------------
Prepare backup Directory:
Creating backup folder /var/katello-backup-2024-09-24-20-40-52        [OK]
--------------------------------------------------------------------------------
Generate metadata:
/ Saving metadata to metadata.yml                                     [OK]
--------------------------------------------------------------------------------
Detect features available in the local proxy:                         [OK]
--------------------------------------------------------------------------------
Stop applicable services:

Stopping the following service(s):
redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-api.socket, pulpcore-content.socket, pulpcore-worker@1.service, pulpcore-worker@2.service, pulpcore-worker@3.service, pulpcore-worker@4.service, pulpcore-worker@5.service, pulpcore-worker@6.service, pulpcore-worker@7.service, pulpcore-worker@8.service, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, foreman.socket, dynflow-sidekiq@worker-1, dynflow-sidekiq@worker-hosts-queue-1, foreman-proxy
- All services stopped                                                [OK]
--------------------------------------------------------------------------------
Backup config files:
/ Collecting config files to backup                                   [OK]
--------------------------------------------------------------------------------
Backup Pulp data:
/ Collecting Pulp data

The backup stops here.
Disk details before the VM got stuck:

[root@foreman ~]# df -h
Filesystem           Size  Used Avail Use% Mounted on
devtmpfs              20G     0   20G   0% /dev
tmpfs                 20G     0   20G   0% /dev/shm
tmpfs                 20G  113M   20G   1% /run
tmpfs                 20G     0   20G   0% /sys/fs/cgroup
/dev/mapper/rl-root  2.0T  1.1T  913G  56% /
/dev/sda2           1014M  309M  706M  31% /boot
tmpfs                4.0G     0  4.0G   0% /run/user/0
[root@foreman ~]# du -sh /var/katello-backup-2024-09-24-20-40-52/
484G    /var/katello-backup-2024-09-24-20-40-52/
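
One way to rule free space in or out before starting a backup is to compare the size of the data to be archived with the space available on the target filesystem. The sketch below is only an illustration: the function name `enough_space` and the 10% safety margin are assumptions, not part of foreman-maintain, and the commented `du`/`df` calls assume GNU coreutils.

```shell
#!/bin/sh
# Succeeds when the available bytes cover the needed bytes plus a safety
# margin given in percent (hypothetical helper, default margin 10%).
enough_space() {
    needed=$1
    available=$2
    margin=${3:-10}
    [ "$available" -ge $(( needed + needed * margin / 100 )) ]
}

# Example with real numbers (run as root; paths taken from this thread):
# needed=$(du -sb /var/lib/pulp | cut -f1)
# available=$(df --output=avail -B1 /var | tail -n 1 | tr -d ' ')
# enough_space "$needed" "$available" || echo "not enough room on /var"
```

Because the backup here is written to /var on the same filesystem that holds /var/lib/pulp, both numbers come from the same volume.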

I have upgraded Foreman to 3.12 and am still facing the same issue.

No, the above message is not seen.
Could changing the locale to C.UTF-8 with localectl be the reason for this issue?

“good”, well, at least in the sense of “this is not caused by our recent changes to online backups”.

I don’t think so, no.

When the backup is stuck at the “Collecting Pulp data”, I’d expect there is a process running tar on /var/lib/pulp, can you check that such a process is running and actually reading and writing data?
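
A minimal sketch of how such a check could be done on Linux, assuming the kernel exposes per-process I/O counters in /proc/<pid>/io (readable by root). The helper names `io_counter` and `io_progress` are made up for this example.

```shell
#!/bin/sh
# Print one counter (for example read_bytes) from a /proc/<pid>/io style file.
io_counter() {
    # $1 = path to the io file, $2 = counter name without the colon
    awk -v key="$2:" '$1 == key { print $2 }' "$1"
}

# Sample a process twice and report how many bytes it read and wrote
# between the two samples.
io_progress() {
    # $1 = PID, $2 = seconds between samples (default 10)
    pid=$1
    interval=${2:-10}
    r1=$(io_counter "/proc/$pid/io" read_bytes)
    w1=$(io_counter "/proc/$pid/io" write_bytes)
    sleep "$interval"
    r2=$(io_counter "/proc/$pid/io" read_bytes)
    w2=$(io_counter "/proc/$pid/io" write_bytes)
    echo "read: $((r2 - r1)) bytes, written: $((w2 - w1)) bytes in ${interval}s"
}

# Example (as root, while the backup appears stuck):
# io_progress "$(pgrep -o -x tar)" 10
```

If both deltas stay at zero over several samples, the tar process is probably not making progress. Non-zero deltas suggest it is still working, just slowly.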

These are the running processes related to tar:

[root@rhss-3 ~]# ps aux | grep tar
tomcat    115996  3.8  3.0 10526428 1260256 ?    Ssl  01:06   1:44 /usr/lib/jvm/jre-17/bin/java -Xms1024m -Xmx4096m -Dcom.redhat.fips=false -Djava.security.auth.login.config=/usr/share/tomcat/conf/login.config -classpath /usr/share/tomcat/bin/bootstrap.jar:/usr/share/tomcat/bin/tomcat-juli.jar: -Dcatalina.base=/usr/share/tomcat -Dcatalina.home=/usr/share/tomcat -Djava.endorsed.dirs= -Djava.io.tmpdir=/var/cache/tomcat/temp -Djava.util.logging.config.file=/usr/share/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager org.apache.catalina.startup.Bootstrap start
root      125854  0.0  0.0 229764  2800 pts/1    S+   01:41   0:00 sh -c tar --selinux --no-check-device --create --file=/var/katello-backup-2024-09-22-01-40-47/pulp_data.tar --new-volume-script=/usr/share/gems/gems/foreman_maintain-1.7.4/bin/foreman-maintain-rotate-tar --exclude=assets --exclude=exports --exclude=imports --exclude=sync_imports --exclude=tmp --listed-incremental=/var/katello-backup-2024-09-22-01-40-47/.pulp.snar --transform 's,^,var/lib/pulp/,S' -S * 2>&1
root      125855 17.5  0.0 254720 17520 pts/1    D+   01:41   1:46 tar --selinux --no-check-device --create --file=/var/katello-backup-2024-09-22-01-40-47/pulp_data.tar --new-volume-script=/usr/share/gems/gems/foreman_maintain-1.7.4/bin/foreman-maintain-rotate-tar --exclude=assets --exclude=exports --exclude=imports --exclude=sync_imports --exclude=tmp --listed-incremental=/var/katello-backup-2024-09-22-01-40-47/.pulp.snar --transform s,^,var/lib/pulp/,S -S assets exports imports media sync_imports tmp
root      125946  0.0  0.0 229176  1188 pts/2    S+   01:51   0:00 grep --color=auto tar

Files under /var/lib/pulp:

[root@rhss-3 ~]# ls /var/lib/pulp/
assets exports imports media sync_imports tmp

The processes below are related to pulp:

ps aux | grep pulp
root      125854  0.0  0.0 229764  2652 pts/1    S+   01:41   0:00 sh -c tar --selinux --no-check-device --create --file=/var/katello-backup-2024-09-22-01-40-47/pulp_data.tar --new-volume-script=/usr/share/gems/gems/foreman_maintain-1.7.4/bin/foreman-maintain-rotate-tar --exclude=assets --exclude=exports --exclude=imports --exclude=sync_imports --exclude=tmp --listed-incremental=/var/katello-backup-2024-09-22-01-40-47/.pulp.snar --transform 's,^,var/lib/pulp/,S' -S * 2>&1
root      125855 15.6  0.0 265160 27576 pts/1    D+   01:41   7:09 tar --selinux --no-check-device --create --file=/var/katello-backup-2024-09-22-01-40-47/pulp_data.tar --new-volume-script=/usr/share/gems/gems/foreman_maintain-1.7.4/bin/foreman-maintain-rotate-tar --exclude=assets --exclude=exports --exclude=imports --exclude=sync_imports --exclude=tmp --listed-incremental=/var/katello-backup-2024-09-22-01-40-47/.pulp.snar --transform s,^,var/lib/pulp/,S -S assets exports imports media sync_imports tmp

I would assume that the process from the 22nd is hanging.
Can you try killing it and see if a backup would now proceed?
If not I’d be interested what the (new) tar process is doing.

If I stop that process, the backup fails:

[root@foreman ~]# foreman-maintain backup online /var/ --whitelist="pulpcore-no-running-tasks,foreman-tasks-not-running"
Starting backup: 2024-09-30 14:07:10 +0530
Running preparation steps required to run the next scenarios
================================================================================
Make sure Foreman DB is up:
/ Checking connection to the Foreman DB                               [OK]
--------------------------------------------------------------------------------
Make sure Candlepin DB is up:
- Checking connection to the Candlepin DB                             [OK]
--------------------------------------------------------------------------------
Make sure Pulpcore DB is up:
\ Checking connection to the Pulpcore DB                              [OK]
--------------------------------------------------------------------------------


Running Backup
================================================================================
Check if the incremental backup has the right type:                   [OK]
--------------------------------------------------------------------------------
Check for running tasks:                                              [SKIPPED]
--------------------------------------------------------------------------------
Check for running pulpcore tasks:                                     [SKIPPED]
--------------------------------------------------------------------------------
Prepare backup Directory:
Creating backup folder /var/katello-backup-2024-09-30-14-07-10        [OK]
--------------------------------------------------------------------------------
Generate metadata:
- Saving metadata to metadata.yml                                     [OK]
--------------------------------------------------------------------------------
Stop applicable services:

Stopping the following service(s):
pulpcore-worker@1.service, pulpcore-worker@2.service, pulpcore-worker@3.service, pulpcore-worker@4.service, pulpcore-worker@5.service, pulpcore-worker@6.service, pulpcore-worker@7.service, pulpcore-worker@8.service, dynflow-sidekiq@worker-1, dynflow-sidekiq@worker-hosts-queue-1
| stopping pulpcore-worker@8.service
\ stopping pulpcore-worker@8.service
/ All services stopped                                                [OK]
--------------------------------------------------------------------------------
Backup config files:
\ Collecting config files to backup                                   [OK]
--------------------------------------------------------------------------------
Backup Pulp data:
\ Collecting Pulp datash: line 1: 130939 Killed                  tar --selinux --no-check-device --create --file=/var/katello-backup-2024-09-30-14-07-10/pulp_data.tar --new-volume-script=/usr/share/gems/gems/foreman_maintain-1.7.4/bin/foreman-maintain-rotate-tar --exclude=assets --exclude=exports --exclude=imports --exclude=sync_imports --exclude=tmp --listed-incremental=/var/katello-backup-2024-09-30-14-07-10/.pulp.snar --transform 's,^,var/lib/pulp/,S' -S * 2>&1
                                                [FAIL]
Failed executing tar --selinux --no-check-device --create --file=/var/katello-backup-2024-09-30-14-07-10/pulp_data.tar --new-volume-script=/usr/share/gems/gems/foreman_maintain-1.7.4/bin/foreman-maintain-rotate-tar --exclude=assets --exclude=exports --exclude=imports --exclude=sync_imports --exclude=tmp --listed-incremental=/var/katello-backup-2024-09-30-14-07-10/.pulp.snar --transform 's,^,var/lib/pulp/,S' -S *, exit status 137
--------------------------------------------------------------------------------
Scenario [Backup] failed.

The following steps ended up in failing state:

  [backup-pulp]

Resolve the failed steps and rerun the command.
In case the failures are false positives, use
--whitelist="backup-pulp,foreman-tasks-not-running,pulpcore-no-running-tasks"



Running Failed backup cleanup
================================================================================
Start applicable services:

Starting the following service(s):
redis, postgresql, pulpcore-api, pulpcore-content, pulpcore-worker@1.service, pulpcore-worker@2.service, pulpcore-worker@3.service, pulpcore-worker@4.service, pulpcore-worker@5.service, pulpcore-worker@6.service, pulpcore-worker@7.service, pulpcore-worker@8.service, tomcat, dynflow-sidekiq@orchestrator, foreman, httpd, dynflow-sidekiq@worker-1, dynflow-sidekiq@worker-hosts-queue-1, foreman-proxy
| starting dynflow-sidekiq@worker-hosts-queue-1
| All services started                                                [OK]
--------------------------------------------------------------------------------
Clean up backup directory:                                            [OK]
--------------------------------------------------------------------------------

Done with backup: 2024-09-30 14:09:50 +0530
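
A note on the exit status 137 reported for the failed tar step: shells report a command terminated by a signal as 128 plus the signal number, so 137 corresponds to signal 9 (SIGKILL), which matches the "Killed" message in the output. A small sketch of that decoding follows; the function name `describe_exit` is made up for this example.

```shell
#!/bin/sh
# Translate a shell exit status into a short description.
# Statuses above 128 conventionally mean "terminated by signal (status - 128)".
describe_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$(( status - 128 ))
        # kill -l <number> prints the signal name without the SIG prefix
        echo "terminated by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}

describe_exit 137   # prints: terminated by signal 9 (SIGKILL)
```

This only says that tar was killed from outside, not by whom. Here that is consistent with the process having been stopped manually.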

But your ps output had a tar with --file=/var/katello-backup-2024-09-22-01-40-47 (so from last Sunday), did you kill that one too?

The date and time were not synced. I captured that command output today, while the VM's clock showed 22-09-2024. I have now synced the time, so it shows correctly.

Huh, why did you run with whitelist options? foreman-maintain backup online /var/ --whitelist="pulpcore-no-running-tasks,foreman-tasks-not-running"

(But that’s probably unrelated – your earlier pastes didn’t have those and also got stuck)

Some failures were reported due to running tasks when starting the backup, so I added the whitelist to run it.

One general question regarding online backups:
Pulp services are stopped during the backup, so in the latest Foreman version creating new repos and syncing are not available until the backup completes?
Is that why you said online backups can be used for production in the latest versions of Foreman and Katello?

Correct. We need to ensure things don’t change while we’re taking the backup.