Commit a81ad9d

initial submission
1 parent c7c631f commit a81ad9d

File tree

4 files changed

+299
-0
lines changed

README.md

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
# paperbackup.py

Create a PDF with barcodes to back up text files on paper.
Designed to back up ASCII-armored GnuPG and SSH key files and ciphertext.

## How to use

###### Backup

```
gpg2 --armor --export "User Name" >key.asc
gpg2 --armor --export-secret-key "User Name" >>key.asc
paperbackup.py key.asc
paperrestore.sh key.asc.pdf | diff key.asc -
lpr key.asc.pdf
```

This exports the public and private key of "User Name", creates the PDF,
checks that the PDF restores to the original file, and prints it. The
private key is still encrypted with its passphrase, so make sure
you don't lose or forget it.

For sample output, see `example_output.pdf` in this repository.

###### Restore

1. Scan the papers.
2. Create one file containing all the pages. ZBar supports e.g. PDF, TIFF, PNG and JPG.
3. `paperrestore.sh scanned.pdf >key.asc`
4. `gpg2 --import key.asc`

If one or more barcodes cannot be decoded, try scanning them again. If that does
not work, type in the missing characters from the plaintext output at the end of the PDF.
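
To find out which barcodes failed to decode, the reassembly can also be sketched in Python. This is a minimal sketch and not part of the project: `reassemble` is a hypothetical helper, and it assumes that each decoded barcode is available as one string that begins with the `^<sequence number><space>` marker described under "Encoding and data format".

```python
import re

# Assumption: every decoded barcode looks like "^<sequence number> <payload>".
MARKER = re.compile(r"\^(\d+) ")


def reassemble(decoded):
    """Return (text, missing) for a list of decoded barcode strings.

    text is the payload of all barcodes joined in sequence order,
    missing lists the sequence numbers that are absent below the
    highest number seen.
    """
    chunks = {}
    for item in decoded:
        match = MARKER.match(item)
        if match is None:
            raise ValueError("no sequence marker in {!r}".format(item[:20]))
        # keep only the payload that follows the marker
        chunks[int(match.group(1))] = item[match.end():]
    if not chunks:
        return "", []
    highest = max(chunks)
    missing = [n for n in range(1, highest + 1) if n not in chunks]
    text = "".join(chunks[n] for n in sorted(chunks))
    return text, missing


text, missing = reassemble(["^2 world", "^1 hello "])
```

A gap at the very end cannot be detected this way, because the highest sequence number is only known from the barcodes that were read. The `(n/total)` label printed above each barcode gives the total.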

## Dependencies

- python 3 https://www.python.org/
- python3-pillow https://python-pillow.org/
- PyX http://pyx.sourceforge.net/
- LaTeX (required by PyX) https://www.latex-project.org/
- python3-qrencode https://github.com/Arachnid/pyqrencode
- enscript https://www.gnu.org/software/enscript/
- ghostscript https://www.ghostscript.com/
- ZBar http://zbar.sourceforge.net/

## Why back up on paper?

Some data, like GnuPG or SSH keys, can be so important that your whole
business relies on it. If that is the case, you should keep multiple backups in multiple
places.

I also think it is a good idea to use different media types for them. Hard disks, flash-based
media and CD-Rs are not only susceptible to heat, water and strong EM waves, but also degrade with age.

Paper, if properly stored, has proven to remain legible for centuries. It is also
quite resistant to fire if stored as a thick stack, like a book.

So I think it is a good idea to add a paper backup to the mix of locations and media
types of your important backups.

Storing the paper backup in a machine-readable format like barcodes makes it practical to restore
even large amounts of data in short order. If the paper is too damaged for the barcodes to be readable,
you still have the printed plaintext that paperbackup produces.

## Choice and error resilience of barcodes

Only 2D barcodes have the density to make key backup practical. QR Code and DataMatrix are
the most common 2D barcodes.

Several papers comparing QR Code and DataMatrix conclude that DataMatrix allows
a higher density and offers better means for error correction. I tested this and found
that the QR Code decoding programs available to me had better error
resilience than the ones for DataMatrix.

The toughest test I found, other than cutting complete parts from a code, was printing
the code, scanning it, printing the scanned image on a pure black-and-white printer,
and then repeating this several times. While the barcode still looks good to the human
eye, this process slightly deforms the barcode in an irregular pattern.

libdmtx was still able to decode a DataMatrix barcode after 3 repetitions of the above
procedure. An expensive commercial library was still able to decode it after 5 repetitions.

ZBar and the commercial library could still decode a QR Code after 7 repetitions.

A laser-printed QR Code, completely soaked in dirty water for a few hours, rinsed with
clean water, dried and then scanned, could be decoded by ZBar on the first try.

This is why I chose QR Code for this program.

## Encoding and data format

In my tests I found that larger QR codes are more at risk of becoming undecodable due to
wrinkles and deformations of the paper. So paperbackup limits each barcode to 140 bytes of data
and splits the input accordingly.
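
For a rough idea of how much paper a file needs, the arithmetic can be sketched as follows. This is an estimate under stated assumptions, not code from the project: `estimate_layout` is a hypothetical helper, the defaults are taken from the `max_bytes_in_barcode` and `barcodes_per_page` constants in `paperbackup.py`, and the few bytes each barcode spends on its sequence marker are ignored, so the real count can be slightly higher.

```python
import math


def estimate_layout(data_len, bytes_per_barcode=140, barcodes_per_page=6):
    """Return a lower-bound estimate (barcodes, pages) for data_len bytes."""
    # every barcode holds at most bytes_per_barcode bytes
    barcodes = math.ceil(data_len / bytes_per_barcode)
    # every page holds at most barcodes_per_page barcodes
    pages = math.ceil(barcodes / barcodes_per_page)
    return barcodes, pages


barcodes, pages = estimate_layout(3500)
```

The plaintext pages that are appended after the barcode pages are not included in this estimate.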

QR codes offer a feature to concatenate the data of several barcodes. As this is not supported
by all programs, I chose not to use it.

Each barcode begins with a start marker `^<sequence number><space>`, followed by the raw,
otherwise unencoded data. The marker counts toward the 140-byte limit.
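
The splitting scheme can be sketched in a few lines. The sketch below mirrors the chunking loop in `paperbackup.py` but leaves out the QR encoding; the function name `split_chunks` and the `max_len` parameter are chosen here for illustration only.

```python
def split_chunks(data, max_len=140):
    """Split data into strings of at most max_len characters.

    Each string starts with the marker "^<sequence number> ",
    and the marker counts toward max_len.
    """
    chunks = []
    current = "^1 "
    for char in data:
        if len(current) + 1 > max_len:
            # chunk is full, so start the next one with its own marker
            chunks.append(current)
            current = "^{} ".format(len(chunks) + 1)
        current += char
    # the last chunk is usually only partly filled
    chunks.append(current)
    return chunks


example = split_chunks("abcdefghij", max_len=8)
```

Because `^` is not among the characters that paperbackup accepts in the input, a `^` in the decoded output always marks the start of a barcode.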

## Changing the paper format

The program writes PDFs in A4 by default. To switch to US Letter, comment out the A4 lines
and uncomment the Letter lines in the constants section of the source.

## License

MIT X11 License

example_output.pdf

56 KB
Binary file not shown.

paperbackup.py

Lines changed: 159 additions & 0 deletions
@@ -0,0 +1,159 @@
#!/usr/bin/python3

#
# create a pdf with barcodes to backup text files on paper
# designed to backup ascii-armored key files and ciphertext
#

# Copyright 2017 by Intra2net AG, Germany
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
# LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
# OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
# WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
#

import os
import re
import sys
import shlex
import qrencode
from tempfile import mkstemp
from PIL import Image
from pyx import *

# constants for the size and layout of the barcodes on page
max_bytes_in_barcode = 140
barcodes_per_page = 6
barcode_height = 8
barcode_x_positions = [1.5, 11, 1.5, 11, 1.5, 11]
barcode_y_positions = [18.7, 18.7, 10, 10, 1.2, 1.2]
text_x_offset = 0
text_y_offset = 8.2

# the paperformat to use, activate the one you want
paperformat_obj = document.paperformat.A4
paperformat_str = "A4"
# paperformat_obj=document.paperformat.Letter
# paperformat_str="Letter"


# encode one chunk as a QR code image with the highest error correction level
def create_barcode(chunkdata):
    version, size, im = qrencode.encode(chunkdata,
                                        level=qrencode.QR_ECLEVEL_H,
                                        case_sensitive=True)
    return im


# label the canvas with its page number and append it to the pdf as a page
def finish_page(pdf, canv, pageno):
    canv.text(10, 0.6, "Page %i" % (pageno+1))
    pdf.append(document.page(canv, paperformat=paperformat_obj,
                             fittosize=0, centered=0))

# main code

if len(sys.argv) != 2:
    raise RuntimeError('Usage {} FILENAME.asc'.format(sys.argv[0]))

input_path = sys.argv[1]
if not os.path.isfile(input_path):
    raise RuntimeError('File {} not found'.format(input_path))
just_filename = os.path.basename(input_path)

with open(input_path) as inputfile:
    ascdata = inputfile.read()

# only allow some harmless characters
# this is much more strict than necessary, but good enough for key files
# you really need to forbid ^, NULL and anything that could upset enscript
allowedchars = re.compile(r"^[A-Za-z0-9/=+:., #@!()\n-]*")
allowedmatch = allowedchars.match(ascdata)
if allowedmatch.group() != ascdata:
    raise RuntimeError('Illegal char found at %d >%s<'
                       % (len(allowedmatch.group()),
                          ascdata[len(allowedmatch.group())]))

# split the ascdata into chunks of max_bytes_in_barcode size
# each chunk begins with ^<sequence number><space>
# this makes it easy to put them back together in the correct order
barcode_blocks = []
chunkdata = "^1 "
for char in list(ascdata):
    if len(chunkdata)+1 > max_bytes_in_barcode:
        # chunk is full -> create barcode from it
        barcode_blocks.append(create_barcode(chunkdata))
        chunkdata = "^" + str(len(barcode_blocks)+1) + " "

    chunkdata += char

# handle the last, non filled chunk too
barcode_blocks.append(create_barcode(chunkdata))

# init PyX
unit.set(defaultunit="cm")
pdf = document.document()

# place barcodes on pages
pgno = 0  # page number
ppos = 0  # position id on page
c = canvas.canvas()
for bc in range(len(barcode_blocks)):
    # page full?
    if ppos >= barcodes_per_page:
        finish_page(pdf, c, pgno)
        c = canvas.canvas()
        pgno += 1
        ppos = 0

    c.text(barcode_x_positions[ppos] + text_x_offset,
           barcode_y_positions[ppos] + text_y_offset,
           "%s (%i/%i)" % (text.escapestring(just_filename),
                           bc+1, len(barcode_blocks)))
    c.insert(bitmap.bitmap(barcode_x_positions[ppos],
                           barcode_y_positions[ppos],
                           barcode_blocks[bc], height=barcode_height))
    ppos += 1

finish_page(pdf, c, pgno)
pgno += 1

fd, temp_barcode_path = mkstemp('.pdf', 'qr_', '.')
os.close(fd)  # only the path is needed, PyX opens the file itself
# PyX writes pdf because the tmpfile has a .pdf suffix
pdf.writetofile(temp_barcode_path)

# use "enscript" to create postscript with the plaintext
fd, temp_text_path = mkstemp('.ps', 'text_', '.')
os.close(fd)  # only the path is needed, enscript writes the file
ret = os.system("enscript -p" + shlex.quote(temp_text_path) +
                " -f Courier12 -M" + paperformat_str +
                " " + shlex.quote(input_path))
if ret != 0:
    raise RuntimeError('error calling enscript')

# combine both files with ghostscript
ret = os.system("gs -dBATCH -dNOPAUSE -q -sDEVICE=pdfwrite -sOutputFile=" +
                shlex.quote(just_filename) +
                ".pdf " + shlex.quote(temp_barcode_path) + " " +
                shlex.quote(temp_text_path))
if ret != 0:
    raise RuntimeError('error calling ghostscript')

# using enscript and ghostscript to create the plaintext output is a hack,
# using PyX and LaTeX would be more elegant. But I could not find an easy
# solution to flow the text over several pages with PyX.

os.remove(temp_text_path)
os.remove(temp_barcode_path)

paperrestore.sh

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
#!/bin/bash

# restore data backed up with paperbackup.py

# give one file containing all qrcodes as parameter

SCANNEDFILE=$1

if [ -z "$SCANNEDFILE" ]; then
    echo "give one file containing all qrcodes as parameter"
    exit 1
fi

if ! [ -f "$SCANNEDFILE" ]; then
    echo "$SCANNEDFILE is not a file"
    exit 1
fi

if [ ! -x "/usr/bin/zbarimg" ]; then
    echo "/usr/bin/zbarimg missing"
    exit 2
fi

# zbarimg ends each scanned code with a newline

# each barcode content begins with ^<number><space>
# so convert that to \0<number><space>, so sort can sort on that
# then remove all \n\0<number><space> so we get the original without newlines added

/usr/bin/zbarimg --raw -Sdisable -Sqrcode.enable "$SCANNEDFILE" \
    | sed -e "s/\^/\x0/g" \
    | sort -z -n \
    | sed ':a;N;$!ba;s/\n\x0[0-9]* //g;s/\x0[0-9]* //g;s/\n\x0//g'
