amazon-s3, duplicity

Duplicity incremental backup taking too long


I have duplicity running an incremental daily backup to S3. The source is about 37 GiB.

For the first month or so it ran fine, finishing in about an hour. Then it started taking far too long to complete. Right now, as I type, the daily backup that started 7 hours ago is still running.

I'm running two commands, first the backup and then the cleanup:

duplicity --full-if-older-than 1M LOCAL.SOURCE S3.DEST --volsize 666 --verbosity 8
duplicity remove-older-than 2M S3.DEST

The logs

Temp has 54774476800 available, backup will use approx 907857100.

So the temp has enough space, good. Then it starts with this...

Copying duplicity-full-signatures.20161107T090303Z.sigtar.gpg to local cache.
Deleting /tmp/duplicity-ipmrKr-tempdir/mktemp-13tylb-2
Deleting /tmp/duplicity-ipmrKr-tempdir/mktemp-NanCxQ-3
[...]
Copying duplicity-inc.20161110T095624Z.to.20161111T103456Z.manifest.gpg to local cache.
Deleting /tmp/duplicity-ipmrKr-tempdir/mktemp-VQU2zx-30
Deleting /tmp/duplicity-ipmrKr-tempdir/mktemp-4Idklo-31
[...]

This continues for every day up until today, taking several minutes per file. Then it continues with this...

Ignoring incremental Backupset (start_time: Thu Nov 10 09:56:24 2016; needed: Mon Nov  7 09:03:03 2016)
Ignoring incremental Backupset (start_time: Thu Nov 10 09:56:24 2016; needed: Wed Nov  9 18:09:07 2016)
Added incremental Backupset (start_time: Thu Nov 10 09:56:24 2016 / end_time: Fri Nov 11 10:34:56 2016)

After a long time...

Warning, found incomplete backup sets, probably left from aborted session
Last full backup date: Sun Mar 12 09:54:00 2017
Collection Status
-----------------
Connecting with backend: BackendWrapper
Archive dir: /home/user/.cache/duplicity/700b5f90ee4a620e649334f96747bd08

Found 6 secondary backup chains.
Secondary chain 1 of 6:
-------------------------
Chain start time: Mon Nov  7 09:03:03 2016
Chain end time: Mon Nov  7 09:03:03 2016
Number of contained backup sets: 1
Total number of contained volumes: 2
Type of backup set:                            Time:      Num volumes:
               Full         Mon Nov  7 09:03:03 2016                 2
-------------------------
Secondary chain 2 of 6:
-------------------------
Chain start time: Wed Nov  9 18:09:07 2016
Chain end time: Wed Nov  9 18:09:07 2016
Number of contained backup sets: 1
Total number of contained volumes: 11
Type of backup set:                            Time:      Num volumes:
               Full         Wed Nov  9 18:09:07 2016                11
-------------------------

Secondary chain 3 of 6:
-------------------------
Chain start time: Thu Nov 10 09:56:24 2016
Chain end time: Sat Dec 10 09:44:31 2016
Number of contained backup sets: 31
Total number of contained volumes: 41
Type of backup set:                            Time:      Num volumes:
               Full         Thu Nov 10 09:56:24 2016                11
        Incremental         Fri Nov 11 10:34:56 2016                 1
        Incremental         Sat Nov 12 09:59:47 2016                 1
        Incremental         Sun Nov 13 09:57:15 2016                 1
        Incremental         Mon Nov 14 09:48:31 2016                 1
        [...]

After listing all chains:

Also found 0 backup sets not part of any chain, and 1 incomplete backup set.
These may be deleted by running duplicity with the "cleanup" command.

That was only the backup part. It spends hours on all of this, yet only about 10 minutes actually uploading the 37 GiB to S3:

ElapsedTime 639.59 (10 minutes 39.59 seconds)
SourceFiles 288
SourceFileSize 40370795351 (37.6 GB)

Then comes the cleanup, which gives me this:

Cleaning up
Local and Remote metadata are synchronized, no sync needed.
Warning, found incomplete backup sets, probably left from aborted session
Last full backup date: Sun Mar 12 09:54:00 2017
There are backup set(s) at time(s):
Tue Jan 10 09:58:05 2017
Wed Jan 11 09:54:03 2017
Thu Jan 12 09:56:42 2017
Fri Jan 13 10:05:05 2017
Sat Jan 14 10:24:54 2017
Sun Jan 15 09:49:31 2017
Mon Jan 16 09:39:41 2017
Tue Jan 17 09:59:05 2017
Wed Jan 18 09:59:56 2017
Thu Jan 19 10:01:51 2017
Fri Jan 20 09:35:30 2017
Sat Jan 21 09:53:26 2017
Sun Jan 22 09:48:57 2017
Mon Jan 23 09:38:45 2017
Tue Jan 24 09:54:29 2017
Which can't be deleted because newer sets depend on them.
Found old backup chains at the following times:
Mon Nov  7 09:03:03 2016
Wed Nov  9 18:09:07 2016
Sat Dec 10 09:44:31 2016
Mon Jan  9 10:04:51 2017
Rerun command with --force option to actually delete.

Solution

  • I found the problem. Because of an earlier issue, I followed this answer and added this code to my script:

    rm -rf ~/.cache/deja-dup/*
    rm -rf ~/.cache/duplicity/*
    

    This was supposed to be a one-time fix for a random bug duplicity had, but the answer didn't mention that. So every day the script was removing the cache right after syncing it, and on the next day duplicity had to download the whole thing from S3 again.
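For reference, here is a sketch of the corrected daily script, under the assumption that the cache-wipe lines are simply dropped (LOCAL.SOURCE and S3.DEST are the same placeholders as in the question):

```shell
#!/bin/sh
# Daily backup: with ~/.cache/duplicity left intact, duplicity reuses its
# local signatures/manifests instead of re-downloading them all from S3.
duplicity --full-if-older-than 1M --volsize 666 --verbosity 8 LOCAL.SOURCE S3.DEST

# Note: as the cleanup log above says, remove-older-than needs --force to
# actually delete the old chains rather than just list them.
duplicity remove-older-than 2M --force S3.DEST
```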