Quick, cheap and easy (cheap as in “free”, “free” as in “freedom”) alternative to both heavy backup solutions and lightweight git-repo based backups with much less overhead and better compression: 7-zip.
7-zip’s -u switch provides fine-grained control over creating and updating archives based on the states of already-archived and to-be-archived files. The switch is specified as a combination of state-action flags (more info):
State | State condition | File on Disk | File in Archive |
---|---|---|---|
p | File exists in archive, but is not matched with wildcard | ? | Exists, but is not matched |
q | File exists in archive, but doesn’t exist on disk | Doesn’t exist | Exists |
r | File doesn’t exist in archive, but exists on disk | Exists | Doesn’t exist |
x | File in archive is newer than the file on disk | Older | Newer |
y | File in archive is older than the file on disk | Newer | Older |
z | File in archive is same as the file on disk | Same | Same |
w | Cannot detect which file is newer (times are the same, sizes are different) | ? | ? |
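To make the state table a bit more concrete, here is a minimal sketch that maps a file's metadata to one of the state letters. It only mirrors the table above: classify_state is a made-up helper, not 7-zip's own logic, it takes modification times as integer seconds (an assumption), and it leaves out state p because that one depends on the wildcard rather than on the file itself.

```shell
#!/bin/sh
# classify_state is a hypothetical helper that mirrors the state table;
# it is not how 7-zip itself is implemented.
# Arguments: disk mtime, disk size, archive mtime, archive size.
# Mtimes are integer seconds; pass empty strings for a side that has no file.
classify_state() {
    disk_mtime=$1 disk_size=$2 arc_mtime=$3 arc_size=$4
    if [ -z "$disk_mtime" ] && [ -z "$arc_mtime" ]; then
        echo "file exists on neither side" >&2
        return 1
    elif [ -z "$disk_mtime" ]; then
        echo q    # in archive, gone from disk
    elif [ -z "$arc_mtime" ]; then
        echo r    # on disk, not yet in archive
    elif [ "$arc_mtime" -gt "$disk_mtime" ]; then
        echo x    # archive copy is newer
    elif [ "$arc_mtime" -lt "$disk_mtime" ]; then
        echo y    # archive copy is older
    elif [ "$arc_size" = "$disk_size" ]; then
        echo z    # same time, same size
    else
        echo w    # same time, different size
    fi
}

# Example: same timestamp but a different size
classify_state 100 9 100 5
```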
Action | Description |
---|---|
0 | Ignore file (don’t create item in new archive for this file) |
1 | Copy file (copy from old archive to new) |
2 | Compress (compress file from disk to new archive) |
3 | Create Anti-item (item that will delete file or directory during extracting) |
Full combination of the above states-actions covers every possible backup scenario without any extra file comparison logic.
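Since these flag strings are hard to read at a glance, a small decoder can help when reviewing a backup script. The following is only a sketch: describe_update_switch is a hypothetical helper of mine, not something that ships with 7-zip, and it expects the bare action set without the leading -u or the trailing archive name.

```shell
#!/bin/sh
# describe_update_switch is a hypothetical helper, not part of 7-zip.
# It expands an action set such as "p0q3r2x2y2z0w2" into one line per flag,
# using the state and action tables above.
describe_update_switch() {
    spec=$1
    while [ -n "$spec" ]; do
        case $spec in
            ??*) ;;
            *) echo "dangling character in action set: $spec" >&2; return 1 ;;
        esac
        rest=${spec#??}          # everything after the first two characters
        pair=${spec%"$rest"}     # the first two characters: state + action
        state=${pair%?}
        action=${pair#?}
        case $state in
            p) cond="in archive, not matched by wildcard" ;;
            q) cond="in archive, missing on disk" ;;
            r) cond="on disk, missing in archive" ;;
            x) cond="archive copy newer than disk" ;;
            y) cond="archive copy older than disk" ;;
            z) cond="archive copy same as disk" ;;
            w) cond="same time, different size" ;;
            *) echo "unknown state: $state" >&2; return 1 ;;
        esac
        case $action in
            0) act="ignore" ;;
            1) act="copy from old archive" ;;
            2) act="compress from disk" ;;
            3) act="create anti-item" ;;
            *) echo "unknown action: $action" >&2; return 1 ;;
        esac
        printf '%s: %s -> %s\n' "$pair" "$cond" "$act"
        spec=$rest
    done
}

# Example: print one line per flag of an action set
describe_update_switch p0q3r2x2y2z0w2
```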
Examples:
Full backup of the $HOME/* directory:
7z u full_backup.7z $HOME/* -up0q0r2x2y2z1w2
- p0: ignore files not matched by the wildcard (irrelevant for the $HOME/* wildcard)
- q0: ignore removed files
- r2: if a new file was created, compress it
- x2, y2: if the file is newer or older, compress it
- z1: if the file is the same, copy it from the old archive without recompressing it (this flag significantly reduces compression time)
- w2: if in doubt, compress the file
Differential backup relative to full_backup.7z:
7z u full_backup.7z -u- -up0q3r2x2y2z0w2'!'differential_backup.7z $HOME/*
- -u-: the “dash” parameter disables updates in the base archive full_backup.7z
- q3: if a file was removed, “remember” the removal by creating an “anti-item”
- z0: if the file is the same, skip it since the backup is differential
Incremental backups can be achieved by creating “decremental” backups along the way with a rolling up-to-date full backup (order matters!) in two steps:
7z u full_backup.7z $HOME/* -u- -up1q1r3x1y1z0w1'!'incremental_backup.7z
7z u full_backup.7z $HOME/* -up0q0r2x2y2z1w2
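The two steps are easy to wrap into a function so that the order can't be mixed up. This is a minimal sketch under a few assumptions of mine: run_backup and the DRY_RUN variable are made-up names, and the incremental archive gets a date suffix (incremental_backup_YYYY_MM_DD.7z) so that each run writes a new file.

```shell
#!/bin/sh
# run_backup and DRY_RUN are hypothetical names used for illustration.
# Step 1 stores the pre-change state in a dated "decremental" archive,
# step 2 brings the rolling full backup up to date. The order matters.
run_backup() {
    full=$1                          # rolling full backup, e.g. full_backup.7z
    src=$2                           # directory to back up
    stamp=${3:-$(date +%Y_%m_%d)}    # date suffix, defaults to today
    inc="incremental_backup_${stamp}.7z"

    # With DRY_RUN=1 the commands are printed instead of executed
    run=
    [ "${DRY_RUN:-0}" = 1 ] && run=echo

    # The '!' is single-quoted so an interactive shell leaves it alone
    $run 7z u "$full" -u- "-up1q1r3x1y1z0w1"'!'"$inc" "$src"/* || return 1
    $run 7z u "$full" -up0q0r2x2y2z1w2 "$src"/*
}

# Usage (prints the two commands without touching any archive):
#   DRY_RUN=1 run_backup full_backup.7z "$HOME"
```

Running it with DRY_RUN=1 first is a cheap way to check the generated commands before letting it loose on real data.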
Using this approach, files can be rolled back to the state of any incremental backup by simply extracting all the backups in reverse chronological order, e.g. files can be rolled back to “three backups back” in four steps:
7z x -y full_backup.7z -o$HOME
7z x -y incremental_backup_2020_04_05.7z -o$HOME
7z x -y incremental_backup_2020_04_04.7z -o$HOME
7z x -y incremental_backup_2020_04_03.7z -o$HOME
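The same sequence can be generated instead of typed by hand. Below is a rough sketch, assuming all archives sit in one directory and follow the incremental_backup_YYYY_MM_DD.7z naming used above; rollback_to and DRY_RUN are hypothetical names of mine.

```shell
#!/bin/sh
# rollback_to and DRY_RUN are hypothetical names used for illustration.
# Extracts the full backup, then every incremental archive from the newest
# one down to (and including) the target date.
rollback_to() {
    target=$1    # date suffix of the state to restore, e.g. 2020_04_03
    dir=$2       # directory holding full_backup.7z and the incrementals
    dest=$3      # extraction directory

    if [ ! -f "$dir/incremental_backup_$target.7z" ]; then
        echo "no incremental backup for $target in $dir" >&2
        return 1
    fi

    # With DRY_RUN=1 the commands are printed instead of executed
    run=
    [ "${DRY_RUN:-0}" = 1 ] && run=echo

    $run 7z x -y "$dir/full_backup.7z" "-o$dest" || return 1

    # Globs expand in ascending order and zero-padded dates sort
    # chronologically, so prepending each match yields newest-first.
    set --
    for f in "$dir"/incremental_backup_*.7z; do
        [ -e "$f" ] || continue
        set -- "$f" "$@"
    done

    for f in "$@"; do
        $run 7z x -y "$f" "-o$dest" || return 1
        if [ "$f" = "$dir/incremental_backup_$target.7z" ]; then
            break
        fi
    done
}

# Usage (prints the extraction commands only):
#   DRY_RUN=1 rollback_to 2020_04_03 /path/to/backups "$HOME"
```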
Thanks to 7-zip’s open file format, you can easily peek inside any incremental/differential backup. Combined with strong encryption and incredible compression, this makes 7-zip my go-to choice for all of my backups.
AFAIK the only way to support the development of 7-zip is to use the developer’s referral link to DigitalOcean, so please do so if you can :) https://m.do.co/c/cab893b82fa8
Here’s a little MWE to test incremental backups.
Prepare a test folder (echo is used instead of touch so the size of files can be changed and tracked):
cd /tmp
mkdir test
echo 'test' >test/1
echo 'test' >test/2
mkdir test/3
echo 'test' >test/3/4
ls -ld $(find test)
Get the expected list of 5-byte files:
drwxr-xr-x 3 nagimov nagimov 4096 Apr 5 21:15 test
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 test/1
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 test/2
drwxr-xr-x 2 nagimov nagimov 4096 Apr 5 21:15 test/3
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 test/3/4
Create initial full backup and list its files:
7z a test.7z test/*
7z l test.7z
Note the timestamps and file sizes:
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
2020-04-05 21:15:06 D.... 0 0 test/3
2020-04-05 21:15:06 ....A 5 19 test/1
2020-04-05 21:15:06 ....A 5 test/2
2020-04-05 21:15:06 ....A 5 test/3/4
------------------- ----- ------------ ------------ ------------------------
2020-04-05 21:15:06 15 19 3 files, 1 folders
Increase the size of file 2 and create a new folder 5 with file 6:
echo 'testtest' >test/2
mkdir test/5
echo 'testtest' >test/5/6
ls -ld $(find test)
The output is as expected: file 2 is larger and the new file 5/6 has appeared:
drwxr-xr-x 4 nagimov nagimov 4096 Apr 5 21:15 test
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 test/1
-rw-r--r-- 1 nagimov nagimov 9 Apr 5 21:15 test/2
drwxr-xr-x 2 nagimov nagimov 4096 Apr 5 21:15 test/3
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 test/3/4
drwxr-xr-x 2 nagimov nagimov 4096 Apr 5 21:15 test/5
-rw-r--r-- 1 nagimov nagimov 9 Apr 5 21:15 test/5/6
Create an incremental backup:
7z u test.7z -u- -up1q1r3x1y1z0w1'!'test_inc1.7z test/*
7z l test_inc1.7z
As expected, the incremental backup contains only the pre-modification versions of the modified files: file 2 is still at its original 5 bytes, and 5/6 is an “anti-item” for the newly created file (note that its size is 0):
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
..... 0 0 test/5/6
D.... 0 0 test/5
2020-04-05 21:15:06 ....A 5 9 test/2
------------------- ----- ------------ ------------ ------------------------
2020-04-05 21:15:06 5 9 2 files, 1 folders
Update full backup:
7z u test.7z test/* -up0q0r2x2y2z1w2
7z l test.7z
It now contains the current versions of all the files, including the larger 2 and the new 5/6:
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
2020-04-05 21:15:06 D.... 0 0 test/3
2020-04-05 21:15:31 D.... 0 0 test/5
2020-04-05 21:15:06 ....A 5 14 test/1
2020-04-05 21:15:06 ....A 5 test/3/4
2020-04-05 21:15:31 ....A 9 20 test/2
2020-04-05 21:15:31 ....A 9 test/5/6
------------------- ----- ------------ ------------ ------------------------
2020-04-05 21:15:31 28 34 4 files, 2 folders
Make 2 even larger and remove 1:
echo 'testtesttest' >test/2
rm test/1
ls -ld $(find test)
Now 2 is the largest and 1 isn’t present anymore:
drwxr-xr-x 4 nagimov nagimov 4096 Apr 5 21:16 test
-rw-r--r-- 1 nagimov nagimov 13 Apr 5 21:16 test/2
drwxr-xr-x 2 nagimov nagimov 4096 Apr 5 21:15 test/3
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 test/3/4
drwxr-xr-x 2 nagimov nagimov 4096 Apr 5 21:15 test/5
-rw-r--r-- 1 nagimov nagimov 9 Apr 5 21:15 test/5/6
Another step of incremental backup:
7z u test.7z -u- -up1q1r3x1y1z0w1'!'test_inc2.7z test/*
7z l test_inc2.7z
Pre-modification versions of 1 and 2 are archived: 1 at its original size and 2 at its intermediate size:
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
2020-04-05 21:15:06 ....A 5 9 test/1
2020-04-05 21:15:31 ....A 9 13 test/2
------------------- ----- ------------ ------------ ------------------------
2020-04-05 21:15:31 14 22 2 files
Another update of the full backup:
7z u test.7z test/* -up0q0r2x2y2z1w2
7z l test.7z
Full archive is now up to date:
Date Time Attr Size Compressed Name
------------------- ----- ------------ ------------ ------------------------
2020-04-05 21:15:06 D.... 0 0 test/3
2020-04-05 21:15:31 D.... 0 0 test/5
2020-04-05 21:15:06 ....A 5 9 test/3/4
2020-04-05 21:15:31 ....A 9 13 test/5/6
2020-04-05 21:16:00 ....A 13 17 test/2
------------------- ----- ------------ ------------ ------------------------
2020-04-05 21:16:00 27 39 3 files, 2 folders
Now it’s time to roll back through every state of the test folder:
mkdir unzip
7z x -y test.7z -ounzip
ls -ld $(find unzip)
This is the latest state, with 1 absent, 2 at a beefy 13 bytes and 5/6 present:
drwxr-xr-x 3 nagimov nagimov 4096 Apr 5 21:16 unzip
drwx------ 4 nagimov nagimov 4096 Apr 5 21:16 unzip/test
-rw-r--r-- 1 nagimov nagimov 13 Apr 5 21:16 unzip/test/2
drwxr-xr-x 2 nagimov nagimov 4096 Apr 5 21:15 unzip/test/3
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 unzip/test/3/4
drwxr-xr-x 2 nagimov nagimov 4096 Apr 5 21:15 unzip/test/5
-rw-r--r-- 1 nagimov nagimov 9 Apr 5 21:15 unzip/test/5/6
Going back to state inc2:
7z x -y test_inc2.7z -ounzip
ls -ld $(find unzip)
Getting 1 undeleted and 2 thinned down to 9 bytes:
drwxr-xr-x 3 nagimov nagimov 4096 Apr 5 21:16 unzip
drwx------ 4 nagimov nagimov 4096 Apr 5 21:16 unzip/test
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 unzip/test/1
-rw-r--r-- 1 nagimov nagimov 9 Apr 5 21:15 unzip/test/2
drwxr-xr-x 2 nagimov nagimov 4096 Apr 5 21:15 unzip/test/3
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 unzip/test/3/4
drwxr-xr-x 2 nagimov nagimov 4096 Apr 5 21:15 unzip/test/5
-rw-r--r-- 1 nagimov nagimov 9 Apr 5 21:15 unzip/test/5/6
Going back to initial state:
7z x -y test_inc1.7z -ounzip
ls -ld $(find unzip)
Getting 5/6 uncreated and 2 reduced to 5 bytes:
drwxr-xr-x 3 nagimov nagimov 4096 Apr 5 21:16 unzip
drwx------ 3 nagimov nagimov 4096 Apr 5 21:17 unzip/test
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 unzip/test/1
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 unzip/test/2
drwxr-xr-x 2 nagimov nagimov 4096 Apr 5 21:15 unzip/test/3
-rw-r--r-- 1 nagimov nagimov 5 Apr 5 21:15 unzip/test/3/4