Unable to take incremental backup with foreman-maintain service

katello
foreman-debug

#21

Hi Martin,

###############################################
D, [2019-02-05 01:00:06+0000 #42893] DEBUG – : cp: ["/backup_directory/katello-backup-2019-02-03-01-00-04/.config.snar", “/backup_directory/katello-backup-2019-02-03-01-00-04/.pulp.snar”, “/backup_directory/katello-backup-2019-02-03-01-00-04/.postgres.snar”, “/backup_directory/katello-backup-2019-02-03-01-00-04/.mongo.snar”]
D, [2019-02-05 01:00:06+0000 #42893] DEBUG – : backup_dir: /backup_directory/katello-backup-2019-02-05-01-00-06
D, [2019-02-05 01:00:06+0000 #42893] DEBUG – : tar: /bin/tar

D, [2019-02-05 01:00:06+0000 #42893] DEBUG – : ENV:{“XDG_SESSION_ID”=>“60287”, “SHELL”=>"/bin/bash", “USER”=>“root”, “PATH”=>"/sbin:/bin:/usr/sbin:/usr/bin", “PWD”=>"/root", “LANG”=>“en_US.UTF-8”, “HOME”=>"/root", “SHLVL”=>“2”, “LOGNAME”=>“root”, “XDG_RUNTIME_DIR”=>"/run/user/0", “_”=>"/bin/foreman-maintain"}
I, [2019-02-05 01:00:07+0000 #42893] INFO – : — Execution step ‘Prepare backup Directory’ finished —
################################################

Few points from above lines:

  1. The script runs on 5th of Feb.
  2. *.snar files are copied from 3rd Feb, but where it is copied???
  3. A new folder is created for 5th Feb, this will be destination for backup to happen.

I could make out these statements, let me know if i can get answers.

Also provide your analysis from the above attachment.

Thanks
Farista Singh


#22

It looks like the expected output to me

tagging @mbacovsky so he gets a notification and can take a look


#23

Hi Farista,
As John said this results looks expected. Can you confirm the resulting backup is NOT incremental?

Could you share the same values from the backup run manually with the same params? Is the resulting backup incremental?

To your questions:
The .snar files are copied from the source backup folder (3rd Feb) into the destination folder (5th Feb) Then in that folder the tar is executed. From the logs everything looks okay, the .snars are found and seems to be copied correctly to new destination. The environment looks set correctly too.


#24

Hi Martin,

The resulting backup is not incremental, its full backup. (script runs through cron).

IF I run the same script using " sh /root/katellobackupscript.sh" on CLI, it again goes for full backup (it should be incremental as script says).
Please find the logs below
################################################
I, [2019-02-07 04:15:07+0000 #27744] INFO – : === Scenario ‘Backup’ started ===
I, [2019-02-07 04:15:07+0000 #27744] INFO – : — Execution step ‘Confirm turning off services is allowed’ [backup-accessibility-confirmation] started —
I, [2019-02-07 04:15:07+0000 #27744] INFO – : — Execution step ‘Confirm turning off services is allowed’ finished —
I, [2019-02-07 04:15:07+0000 #27744] INFO – : — Execution step ‘Prepare backup Directory’ [backup-prepare-directory] started —
D, [2019-02-07 04:15:07+0000 #27744] DEBUG – : cp: ["/backup_directory/katello-backup-2019-02-06-01-00-07/.config.snar", “/backup_directory/katello-backup-2019-02-06-01-00-07/.pulp.snar”, “/backup_directory/katello-backup-2019-02-06-01-00-07/.postgres.snar”, “/backup_directory/katello-backup-2019-02-06-01-00-07/.mongo.snar”]
D, [2019-02-07 04:15:07+0000 #27744] DEBUG – : backup_dir: /backup_directory/katello-backup-2019-02-07-04-15-07
D, [2019-02-07 04:15:07+0000 #27744] DEBUG – : tar: /bin/tar
##############################################

Please suggest.

Farista Singh


#25

Just for curiosity, how can you say that
D, [2019-02-07 04:15:07+0000 #27744] DEBUG – : cp: ["/backup_directory/katello-backup-2019-02-06-01-00-07/.config.snar", “/backup_directory/katello-backup-2019-02-06-01-00-07/.pulp.snar”, “/backup_directory/katello-backup-2019-02-06-01-00-07/.postgres.snar”, “/backup_directory/katello-backup-2019-02-06-01-00-07/.mongo.snar”]
D, [2019-02-07 04:15:07+0000 #27744] DEBUG – : backup_dir: /backup_directory/katello-backup-2019-02-07-04-15-07

the *.snar files are copied to the new folder (katello-backup-2019-02-07-04-15-07) ?
Can we make make this profound?
or something like first create directory and then do copy <-- not visible in the logs

Farista Singh


#26

Hi There,

I have update foreman_maintain version from 0.2 to below one:
rubygem-foreman_maintain-0.3.0-1.el7.noarch
I still see that it goes for full backup daily instead of incremental.
Please suggest.

Thanks
Farista Singh


#27

Hi Farista,

from the information you provided it looks the problem persist even if run from shell so it seems it is not related to running via cron vs. shell. That is suggesting we are not looking at the right place and the cause is not in the cron env setup.

To your question:

right after the logs the tool does the copying - files logged as cp: are copied to destination logged as backup_dir:. While we don’t see any falures there I guess we can assume the files were copied correctly. Makes sense?

I’ve checked how tar actually does the incremental backups and I found out the .snar file is a snapshot of filenames and timestamps in the archive. Tar is sensitive to two things - change of the timestamps of the files and change of the device numbers (it can be the case when using e.g. NFS or LVM ). For more details see http://www.gnu.org/software/tar/manual/html_node/Incremental-Dumps.html.

Is it possible some tool is changing timestamps on your system in the archived directories?
Is it possible the device numbers can change during the time (I’m not sure when that can happen)?

We can probably tell if you check the device numbers in some of the consequent snar files using tar-snapshot-edit tool. More details in https://www.gnu.org/software/tar/utils/tar-snapshot-edit.html.
Can you confirm?

E.g. on my system it is preserved:

# ./tar-snapshot-edit /tmp/satellite-backup-2019-02-06-23-52-14/.pulp.snar 

File: /tmp/satellite-backup-2019-02-06-23-52-14/.pulp.snar
  Detected snapshot file version: 2

  Device 0xfd03 occurs 17 times.

# ./tar-snapshot-edit /tmp/satellite-backup-2019-02-11-08-55-45/.pulp.snar 

File: /tmp/satellite-backup-2019-02-11-08-55-45/.pulp.snar
  Detected snapshot file version: 2

  Device 0xfd03 occurs 17 times.

If that is different on yours we can add --no-check-device option to the tar execution as the tar manual suggests.

Regards,
Martin


#28

Hi Martin,

Thanks for clarifying the backup source/destination doubt, appreciated.
I checked the device numbers using tar-snapshot-edit utility, its preserved:

#./tar-snapshot-edit /backup_directory/katello-backup-2019-02-12-01-00-06/.pulp.snar

File: /backup_directory/katello-backup-2019-02-12-01-00-06/.pulp.snar
Detected snapshot file version: 2

Device 0xfd00 occurs 22372 times.

#./tar-snapshot-edit /backup_directory/katello-backup-2019-02-11-01-00-07/.pulp.snar

File: /backup_directory/katello-backup-2019-02-11-01-00-07/.pulp.snar
Detected snapshot file version: 2

Device 0xfd00 occurs 22372 times.

No it is not related to running via cron/shell issue.
Instead in brief, i would like to mention below:

On sunday, it should be full backup, other day should be incremental as per script
BUT, the other day as well script takes full backup,
AND if I run the script again (lets say on Monday, thru cron or CLI, it takes incremental)
BECAUSE the folder with today’s date is already present.

I hope this makes sense in investigating further.

NOTE: I have not added --no-check-device option yet.

Thanks
Farista Singh


#29

Hi Martin,

I do not see any --no-check-device options in --help here:
##########################################################
#foreman-maintain backup offline --help
Usage:
foreman-maintain backup offline [OPTIONS] BACKUP_DIR

Parameters:
    BACKUP_DIR                    Path to backup dir

Options:
    -y, --assumeyes               Automatically answer yes for all questions
    -w, --whitelist whitelist     Comma-separated list of labels of steps to be skipped
    -f, --force                   Force steps that would be skipped as they were already run
    -s, --skip-pulp-content       Do not backup Pulp content
    -p, --preserve-directory      Do not create a time-stamped subdirectory
    -t, --split-pulp-tar SPLIT_SIZE Split pulp data into files of a specified size, i.e. (100M, 50G). See '--tape-length' in 'info tar' for all sizes
    -i, --incremental PREVIOUS_BACKUP_DIR Backup changes since previous backup
    --features FEATURES           Foreman Proxy features to include in the backup. Valid features are tftp, dns, dhcp, openscap, and all. (comma-separated list)
    --include-db-dumps            Also dump full database schema before offline backup
    -h, --help                    print help

#30

Hi Farista,
thanks for the information provided.

--no-check-device is option of tar not foreman-maintain. To add we would have to patch our backup script first. But as your device ids keeps preserved it should have no impact.

With the new findings it is not very clear where to look next as the original hypothesis (the problem is caused by running via cron) is not valid anymore. I’d suggest to find a clean minimal reproducer again.

  • do a fresh full backup foreman-maintain backup offline --assumeyes /backup_test
  • do a fresh incremental backup foreman-maintain backup offline --assumeyes --incremental /backup_test/<last full backup path> /backup_test
  • check if the new backup is incremental.
    • If YES you can check how what you did differs from what the cron script is doing. Do we have cron logs with the commands it runs?
    • If NO, we can then try to reduce the steps run by foreman-maintain to create a folder, copying .snar file manually and run tar manually. Then we could see if should look into tar docs or fix foreman-maintain.

Make ssense? Any other ideas?


#31

Hi Martin,
I guess I was not able to make you understand then:

Let me put here once more:
We have once script that should take full backup on sunday and incremental on other days, below is the script:
###################################################################
#!/bin/bash -e
PATH=/sbin:/bin:/usr/sbin:/usr/bin
DESTINATION=/backup_directory
if [[ $(date +%w) == 0 ]]; then
foreman-maintain backup offline --assumeyes $DESTINATION
else
LAST=$(ls -td – $DESTINATION/*/ | head -n 1)
foreman-maintain backup offline --assumeyes --incremental $LAST $DESTINATION
fi
exit 0
###################################################################
This script is scheduled in cron to run once daily.

The problem is:

  1. The script takes full backup on other days as well, it should take only on sunday followed by incremental on other days.
  2. When we run the same script second time (using cron or CLI), it takes incremental because there is already one folder present for today’s date. <---- Point to be noted

Resolution try-outs:

  1. I troubleshooted by upgrading foreman-maintain version from 0.2 to 0.3.x.x.x, it didnt worked.
  2. To troubleshoot, we also kept DEBUG mode on, but that looks perfect in foreman-maintain logs.

Let me know if you have got the issue.

Thanks
Farista Singh


#32

Hi Farista,
thanks for the resume. Let us see if I understand it correctly:

  1. It is Sunday. foreman-maintain backup offline --assumeyes /backup_directory will create full backup (BACKUP-0)
  2. It is Monday. foreman-maintain backup offline --assumeyes --incremental /backup_directory/BACKUP-0 /backup_directory creates full backup (BACKUP-1) despite the --incremental.
  3. It is Monday. The backup script is invoked again manually. I assume it will result in foreman-maintain backup offline --assumeyes --incremental /backup_directory/BACKUP-1 /backup_directory and incremental backup (BACKUP-2)

Correct?
How does it look on Tuesday? Is it full backup again by cron and incremental manually?

If I read it correctly there must be difference between BACKUP-0 and BACKUP-1 .snar files. Could you confirm the corresponding .snar files have the same content (device ids and timestamps) in these two backups?

To explain my motivation for the instructions in my previous post - there is a lot of extra things (the script with its logic and the cron vs. manual invocation) which can influence the results. I wanted to find clear and minimal reproducer so that we can understand what is going on without hitting dead-ends as we did with the cron ENV.


#33

Hi Martin,
You got it perfect.

It is Tuesday,
It takes full backup despite the --incremental
Similarly for other days till saturday.

If the backup script runs SECOND time (manually or cron) in same day, the second backup is incremental.

Yes it is full by cron and incremental manually.
ONE THING to NOTE here is instead of manual, if i run cron again, it goes for incremental because we already have todays’ folder present there < — try to get this.

Yes, BACKUP-0 and BACKUP-1 has same content
drwxrwx—. 2 root postgres 4.0K Feb 9 01:18 katello-backup-2019-02-09-01-00-06
drwxrwx—. 2 root postgres 4.0K Feb 10 01:25 katello-backup-2019-02-10-01-00-05

Hope all this helps to narrow down your search area.

Thanks
Farista Singh


#34

Hi Team,
Any updates for me ?


#35

@FaristaS following along the thread and with what @mbacovsky suggested, I think he is looking for more information from those two backups. Take a look at the actual .snar files in the backups and look at the device ids and timestamps.

If we are still out of ideas after that, we can perhaps set up a reproducer using your cron script


#36

Hi John,
I have taken timestamp of 18th and 19th Feb folder.
Please let me know if more input is required.

###########################################################################
[root@abcdefgh katello-backup-2019-02-18-01-00-07]# stat .pulp.snar
File: ‘.pulp.snar’
Size: 11969821 Blocks: 23384 IO Block: 4096 regular file
Device: fd03h/64771d Inode: 8388611 Links: 1
Access: (0644/-rw-r–r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-02-19 01:00:06.000000000 +0000
Modify: 2019-02-18 01:13:21.000000000 +0000
Change: 2019-02-19 01:22:32.762005459 +0000
Birth: -
[root@abcdefgh katello-backup-2019-02-18-01-00-07]# stat .config.snar
File: ‘.config.snar’
Size: 549796 Blocks: 1080 IO Block: 4096 regular file
Device: fd03h/64771d Inode: 8388610 Links: 1
Access: (0644/-rw-r–r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-02-19 01:00:06.000000000 +0000
Modify: 2019-02-18 01:00:36.000000000 +0000
Change: 2019-02-19 01:22:07.122006921 +0000
Birth: -
[root@abcdefgh katello-backup-2019-02-18-01-00-07]# stat .mongo.snar
File: ‘.mongo.snar’
Size: 2420 Blocks: 8 IO Block: 4096 regular file
Device: fd03h/64771d Inode: 8388613 Links: 1
Access: (0644/-rw-r–r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-02-19 01:00:07.000000000 +0000
Modify: 2019-02-18 01:13:38.000000000 +0000
Change: 2019-02-19 01:22:32.852005454 +0000
Birth: -
[root@abcdefgh katello-backup-2019-02-18-01-00-07]# stat .postgres.snar
File: ‘.postgres.snar’
Size: 22979 Blocks: 48 IO Block: 4096 regular file
Device: fd03h/64771d Inode: 8388612 Links: 1
Access: (0644/-rw-r–r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-02-19 01:00:07.000000000 +0000
Modify: 2019-02-18 01:13:45.000000000 +0000
Change: 2019-02-19 01:22:32.815005456 +0000
Birth: -
[root@abcdefgh katello-backup-2019-02-18-01-00-07]#
##############################################################################

[root@abcdefgh katello-backup-2019-02-19-01-00-06]# stat .pulp.snar
File: ‘.pulp.snar’
Size: 11969821 Blocks: 23384 IO Block: 4096 regular file
Device: fd03h/64771d Inode: 7340035 Links: 1
Access: (0644/-rw-r–r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-02-19 01:00:57.000000000 +0000
Modify: 2019-02-19 01:13:33.000000000 +0000
Change: 2019-02-19 01:24:31.213998709 +0000
Birth: -
[root@abcdefgh katello-backup-2019-02-19-01-00-06]# stat .config.snar
File: ‘.config.snar’
Size: 549793 Blocks: 1080 IO Block: 4096 regular file
Device: fd03h/64771d Inode: 7340034 Links: 1
Access: (0644/-rw-r–r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-02-19 01:00:29.000000000 +0000
Modify: 2019-02-19 01:00:35.000000000 +0000
Change: 2019-02-19 01:24:19.830999357 +0000
Birth: -
[root@abcdefgh katello-backup-2019-02-19-01-00-06]# stat .mongo.snar
File: ‘.mongo.snar’
Size: 2273 Blocks: 8 IO Block: 4096 regular file
Device: fd03h/64771d Inode: 7340037 Links: 1
Access: (0644/-rw-r–r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-02-19 01:13:35.000000000 +0000
Modify: 2019-02-19 01:13:51.000000000 +0000
Change: 2019-02-19 01:24:31.303998703 +0000
Birth: -
[root@abcdefgh katello-backup-2019-02-19-01-00-06]# stat .postgres.snar
File: ‘.postgres.snar’
Size: 22984 Blocks: 48 IO Block: 4096 regular file
Device: fd03h/64771d Inode: 7340036 Links: 1
Access: (0644/-rw-r–r--) Uid: ( 0/ root) Gid: ( 0/ root)
Context: unconfined_u:object_r:unlabeled_t:s0
Access: 2019-02-19 01:13:54.000000000 +0000
Modify: 2019-02-19 01:14:06.000000000 +0000
Change: 2019-02-19 01:24:31.282998705 +0000
Birth: -
[root@abcdefgh katello-backup-2019-02-19-01-00-06]#
##########################################################################

Thanks
Farista Singh


#37

Hi Farista,
sorry for the delay.

I’m aware of this. Let me explain what I think is going on. If the backup is run for the second time in the day the LAST is evaluated to be the previous backup ran first in the day. SO the difference actually is the first backup is using the .snar files from previous day while the second is using the .snar from the same day (earlier backup the same day). That is why I wanted you to compare what is inside the snar files. While the .snar file is a snapshot of the archived files metadata my guess is something will have to differ in the metadata. My assumption is the ‘non-incremental’ backup is actually incremental but tar has some reason to thing everything changed. Could you please share e.g. mongo.snar from Sunday and Monday(or Tue) and 2nd Monday(or Tue) backup?

Thanks for the stats. It looks okay, no unexpected things there. Now we should look inside the .snars.

Would you want to try to patch your foreman-maintain and add the --no-check-device switch to see if that makes any difference. I can prepare it for you.

Regards,
Martin


#38

Hi,
You are absolutely correct.

Even I find the same root cause for not happening incremental backups.

I also wanted to create a script that takes backup using “tar” utility (not foreman-maintain)
but then there are lot of files to consider for backup, and even if a single file is missed, then restoration will be a problem, so i omitted this thought.

Further I have attached mongo.snar files, please have a look
sunday_mongo.snar <= for sunday (full backup through cron)
monday_mongo.snar <= for monday (should be incremental, but it is full)
wednesday_first_mongo.snar <= for wednesday (should be incremental, but it is full)
wednesday_second_mongo.snar <= incremental

Let me know if anything else is required.

Thanks
Farista Singh


#39

mongo-snar.tar (20 KB)


#40

Sure Martin, lets go ahead with that.

Thanks
Farista Singh