Commit 9dd3846

Merge pull request #34 from trapexit/updates
append to db, maxtime limit, interrupted flag
2 parents 1bf4030 + 9e11db3 commit 9dd3846

2 files changed

+493
-268
lines changed

README.md

Lines changed: 87 additions & 69 deletions
@@ -7,64 +7,72 @@ scorch is a tool to catalog files and their hashes to help in discovering file c
77
```
88
usage: scorch [<options>] <instruction> [<directory>]
99
10-
scorch (Silent CORruption CHecker) is a tool to catalog files and hashes
11-
to help in discovering file corruption, missing files, duplicates, etc.
10+
scorch (Silent CORruption CHecker) is a tool to catalog files, hash
11+
digests, and other metadata to help in discovering file corruption,
12+
missing files, duplicates, etc.
1213
1314
positional arguments:
14-
instruction: * add: compute and store hashes for all found files
15-
* append: compute and store for newly found files
16-
* backup: backs up selected database
17-
* restore: restore backed up database
18-
* list-backups: list database backups
19-
* diff-backup: show diff between current & backup DB
20-
* hashes: print available hash functions
21-
* check: check stored hashes against files
22-
* update: update metadata of changed files
23-
* check+update: check and update if new
24-
* cleanup: remove hashes of missing files
25-
* delete: remove hashes for found files
26-
* list-dups: list files w/ dup hashes
27-
* list-missing: list files no longer on filesystem
28-
* list-solo: list files w/ no dup hashes
29-
* list-unhashed: list files not yet hashed
30-
* list: md5sum'ish compatible listing
31-
* in-db: show if hashed files exist in DB
32-
* found-in-db: print files found in DB
33-
* notfound-in-db: print files not found in DB
34-
directory: Directory or file to scan
15+
instruction: * add: compute & store digests for found files
16+
* append: compute & store digests for unhashed files
17+
* backup: backs up selected database
18+
* restore: restore backed up database
19+
* list-backups: list database backups
20+
* diff-backup: show diff between current & backup DB
21+
* hashes: print available hash functions
22+
* check: check stored info against files
23+
* update: update metadata of changed files
24+
* check+update: check and update if new
25+
* cleanup: remove info of missing files
26+
* delete: remove info for found files
27+
* list: md5sum'ish compatible listing
28+
* list-unhashed: list files not yet hashed
29+
* list-missing: list files no longer on filesystem
30+
* list-dups: list files w/ dup digests
31+
* list-solo: list files w/ no dup digests
32+
* list-failed: list files marked failed
33+
* list-changed: list files marked changed
34+
* in-db: show if files exist in DB
35+
* found-in-db: print files found in DB
36+
* notfound-in-db: print files not found in DB
37+
directory: Directory or file to scan.
3538
3639
optional arguments:
37-
-d, --db=: File to store hashes and other metadata in.
38-
(default: /var/tmp/scorch/scorch.db)
39-
-v, --verbose: Make `instruction` more verbose. Actual behavior
40-
depends on the instruction. Can be used multiple
41-
times.
42-
-q, --quote: Shell quote/escape filenames when printed.
43-
-r, --restrict=: * sticky: restrict scan to files with sticky bit
44-
* readonly: restrict scan to readonly files
45-
-f, --fnfilter=: Restrict actions to files which match regex
46-
-F, --negate-fnfilter Negate the fnfilter regex match
47-
-s, --sort=: Sorting routine on input & output (default: natural)
48-
* random: shuffled / random
49-
* natural: human-friendly sort, ascending
50-
* reverse-natural: human-friendly sort, descending
51-
* radix: RADIX sort, ascending
52-
* reverse-radix: RADIX sort, descending
53-
* time: sort by file mtime, ascending
54-
* reverse-time: sort by file mtime, descending
55-
-m, --maxactions=: Max actions to take before exiting (default: maxint)
56-
-M, --maxdata=: Max bytes to process before exiting (default: maxint)
57-
-b, --break-on-error: Any error or hash failure will exit
58-
-D, --diff-fields=: Fields to use to indicate a file has 'changed' and
59-
and should be rehashed. Combine with ','.
60-
(default: size)
61-
* size
62-
* inode
63-
* mtime
64-
* mode
65-
-H, --hash=: Hash algo. Use 'scorch hashes' get available algos.
66-
(default: md5)
67-
-h, --help: Print this message
40+
-d, --db=: File to store digests and other metadata in. See
41+
docs for info. (default: /var/tmp/scorch/scorch.db)
42+
-v, --verbose: Make `instruction` more verbose. Actual behavior
43+
depends on the instruction. Can be used multiple
44+
times.
45+
-q, --quote: Shell quote/escape filenames when printed.
46+
-r, --restrict=: * sticky: restrict scan to files with sticky bit
47+
* readonly: restrict scan to readonly files
48+
-f, --fnfilter=: Restrict actions to files which match regex.
49+
-F, --negate-fnfilter Negate the fnfilter regex match.
50+
-s, --sort=: Sorting routine on input & output. (default: natural)
51+
* random: shuffled / random
52+
* natural: human-friendly sort, ascending
53+
* natural-desc: human-friendly sort, descending
54+
* radix: RADIX sort, ascending
55+
* radix-desc: RADIX sort, descending
56+
* mtime: sort by file mtime, ascending
57+
* mtime-desc: sort by file mtime, descending
58+
* checked: sort by last time checked, ascending
59+
* checked-desc: sort by last time checked, descending
60+
-m, --maxactions=: Max actions before exiting. (default: maxint)
61+
-M, --maxdata=: Max bytes to process before exiting. (default: maxint)
62+
Can use 'K', 'M', 'G', 'T' suffix.
63+
-T, --maxtime=: Max time to process before exiting. (default: maxint)
64+
Can use 's', 'm', 'h', 'd' suffix.
65+
-b, --break-on-error: Any error or digest mismatch will cause an exit.
66+
-D, --diff-fields=: Fields to use to indicate a file has 'changed' (vs.
67+
bitrot / modified) and should be rehashed.
68+
Combine with ','. (default: size)
69+
* size
70+
* inode
71+
* mtime
72+
* mode
73+
-H, --hash=: Hash algo. Use 'scorch hashes' to get available algos.
74+
(default: md5)
75+
-h, --help: Print this message.
6876
6977
exit codes:
7078
* 0 : success, behavior executed, something found
@@ -73,6 +81,7 @@ exit codes:
7381
* 4 : hash mismatch
7482
* 8 : found
7583
* 16 : not found, nothing processed
84+
* 32 : interrupted
7685
```
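The `-M` and `-T` options in the usage text accept suffixed values such as `128G` or `2h`. The following is a minimal sketch of how such values could be converted to plain numbers; it is not taken from scorch itself. The function names are hypothetical, and the multipliers are assumptions: the help text lists the suffix letters but not their values, so 1024-based sizes are a guess.

```python
import re

# Assumed multipliers: the usage text names the suffixes but not their
# values, so binary (1024-based) sizes are a guess, not scorch's definition.
SIZE_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}
TIME_UNITS = {"": 1, "s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_with_suffix(text, units):
    """Convert a string such as '128G' or '2h' into an integer."""
    match = re.fullmatch(r"(\d+)([A-Za-z]?)", text.strip())
    if match is None or match.group(2) not in units:
        raise ValueError(f"unrecognised value: {text!r}")
    return int(match.group(1)) * units[match.group(2)]


def parse_maxdata(text):
    """Hypothetical helper: size limit in bytes."""
    return parse_with_suffix(text, SIZE_UNITS)


def parse_maxtime(text):
    """Hypothetical helper: time limit in seconds."""
    return parse_with_suffix(text, TIME_UNITS)
```

The lookup is case sensitive, which keeps `M` (a size suffix) distinct from `m` (minutes).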
7786

7887
### Database
@@ -82,14 +91,19 @@ exit codes:
8291
The file is simply CSV compressed with gzip.
8392

8493
```
85-
$ # file, hash digest, size, mode, mtime, inode
94+
$ # file, hash:digest, size, mode, mtime, inode, state, checked
8695
$ zcat /var/tmp/scorch/scorch.db
87-
/tmp/files/a,md5:d41d8cd98f00b204e9800998ecf8427e,0,33188,1546377833.3844686,123456
96+
/tmp/files/a,md5:d41d8cd98f00b204e9800998ecf8427e,0,33188,1546377833.3844686,123456,0,1588895022.6193066
8897
```
8998

99+
The 'state' value can be 'U' for unknown, 'C' for changed, 'F' for failed, or 'O' for OK.
100+
101+
The 'mtime' and 'checked' values are floating point seconds since epoch.
102+
103+
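Because the database is described as gzip-compressed CSV with the column order shown in the example, it can be read with the Python standard library alone. This is a rough sketch rather than scorch's own loader: `Record`, `parse_row` and `read_db` are hypothetical names, and it assumes UTF-8 text and that every row carries all eight fields.

```python
import csv
import gzip
from collections import namedtuple

# Field order follows the comment in the example above:
# file, hash:digest, size, mode, mtime, inode, state, checked
Record = namedtuple(
    "Record", "path algo digest size mode mtime inode state checked"
)


def parse_row(row):
    """Turn one CSV row into a Record (assumes exactly eight columns)."""
    path, hashed, size, mode, mtime, inode, state, checked = row
    algo, _, digest = hashed.partition(":")
    # 'state' is kept as an opaque string; 'mtime' and 'checked' are
    # floating point seconds since epoch.
    return Record(path, algo, digest, int(size), int(mode),
                  float(mtime), int(inode), state, float(checked))


def read_db(dbpath):
    """Yield Records from a gzip-compressed CSV database file."""
    # UTF-8 is an assumption; the README does not state the encoding.
    with gzip.open(dbpath, "rt", encoding="utf-8", newline="") as f:
        for row in csv.reader(f):
            yield parse_row(row)
```

For example, `[r.path for r in read_db("/var/tmp/scorch/scorch.db") if r.state == "F"]` would collect the paths currently marked as failed.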
90104
#### --db argument
91105

92-
The `--db` argument is takes more than a path.
106+
The `--db` argument can take more than a path.
93107

94108
* /tmp/test/myfiles.db : Full path. Used as is.
95109
* /tmp/test : If /tmp/test is a directory -> /tmp/test/scorch.db
@@ -101,11 +115,6 @@ The `--db` argument is takes more than a path.
101115
If there is no extension then `.db` will be added.
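The rules visible in this hunk can be sketched roughly as follows. This covers only the three cases shown here (a full path, a directory, and a missing extension); the rules hidden by the diff context are deliberately left out, and `resolve_db_path` is a hypothetical name, not a scorch function.

```python
import os


def resolve_db_path(arg):
    """Map a --db argument to a database file path (partial sketch)."""
    # A directory gets the default file name placed inside it.
    if os.path.isdir(arg):
        return os.path.join(arg, "scorch.db")
    # No extension: append '.db'.
    _, ext = os.path.splitext(arg)
    if not ext:
        return arg + ".db"
    # Otherwise the path is used as is.
    return arg
```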
102116

103117

104-
#### Upgrade
105-
106-
If you're using an older version of scorch with the default database in `/var/tmp/scorch.db` just copy/move the file to `/var/tmp/scorch/scorch.db`. The old format was not compressed but scorch will handle reading it uncompressed and compressing it on write.
107-
108-
109118
#### Backup / Restore
110119

111120
To simplify backing up the scorch database there is a backup command. With no directory given, the backup is stored in the same location as the database. If directories are passed as arguments, the backup is stored there instead.
@@ -149,10 +158,16 @@ $ scorch -v -d /tmp/hash.db list-unhashed /tmp/files
149158
/tmp/files/d
150159
151160
$ scorch -v -d /tmp/hash.db append /tmp/files
152-
1/1 /tmp/files/d: 2b00042f7481c7b056c4b410d28f33cf
161+
1/1 /tmp/files/d: md5:2b00042f7481c7b056c4b410d28f33cf
162+
163+
$ scorch -d /tmp/hash.db list-dups /tmp/files
164+
md5:d41d8cd98f00b204e9800998ecf8427e /tmp/files/a /tmp/files/b /tmp/files/c
153165
154166
$ scorch -v -d /tmp/hash.db list-dups /tmp/files
155-
d41d8cd98f00b204e9800998ecf8427e /tmp/files/a /tmp/files/b /tmp/files/c
167+
md5:d41d8cd98f00b204e9800998ecf8427e
168+
- /tmp/files/a
169+
- /tmp/files/b
170+
- /tmp/files/c
156171
157172
$ echo foo > /tmp/files/a
158173
$ scorch -v -d /tmp/hash.db check+update /tmp/files
@@ -179,7 +194,7 @@ A typical setup would probably be initialized manually by using **add** or **app
179194
```
180195
#!/bin/sh
181196
182-
scorch check+update /tmp/files
197+
scorch -M 128G -T 2h check+update /tmp/files
183198
scorch append /tmp/files
184199
scorch cleanup /tmp/files
185200
```
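A scheduled job may want to react to the new `32 : interrupted` exit code, for instance when the `-T` limit stops a run early. Below is a minimal sketch of a Python wrapper around the same three commands. The helper names are hypothetical, only the exit codes visible in the usage text are listed, and treating them as combinable bit flags is an assumption rather than documented behavior.

```python
import subprocess

# Only the codes visible in the usage text; combining them as bit flags
# is an assumption.
FLAGS = {
    4: "hash mismatch",
    8: "found",
    16: "not found, nothing processed",
    32: "interrupted",
}


def describe_exit(code):
    """Return a list of labels for a scorch exit code."""
    if code == 0:
        return ["success"]
    labels = [label for bit, label in FLAGS.items() if code & bit]
    leftover = code & ~sum(FLAGS)  # bits not listed in FLAGS
    if leftover:
        labels.append(f"other ({leftover})")
    return labels


def run_maintenance(root, maxdata="128G", maxtime="2h"):
    """Run the three steps from the script above and report each result."""
    steps = [
        ["scorch", "-M", maxdata, "-T", maxtime, "check+update", root],
        ["scorch", "append", root],
        ["scorch", "cleanup", root],
    ]
    results = []
    for cmd in steps:
        code = subprocess.run(cmd).returncode
        results.append((cmd[-2], describe_exit(code)))
    return results
```

Calling `run_maintenance("/tmp/files")` requires `scorch` to be on the `PATH`; `describe_exit` can be used on its own to interpret a code captured elsewhere.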
@@ -202,7 +217,10 @@ This software is free to use and released under a very liberal license. That sai
202217

203218
* PayPal: trapexit@spawn.link
204219
* Patreon: https://www.patreon.com/trapexit
205-
* Bitcoin (BTC): 12CdMhEPQVmjz3SSynkAEuD5q9JmhTDCZA
206-
* Bitcoin Cash (BCH): 1AjPqZZhu7GVEs6JFPjHmtsvmDL4euzMzp
207-
* Ethereum (ETH): 0x09A166B11fCC127324C7fc5f1B572255b3046E94
208-
* Litecoin (LTC): LXAsq6yc6zYU3EbcqyWtHBrH1Ypx4GjUjm
220+
* Bitcoin (BTC): 1DfoUd2m5WCxJAMvcFuvDpT4DR2gWX2PWb
221+
* Bitcoin Cash (BCH): qrf257j0l09yxty4kur8dk2uma8p5vntdcpks72l8z
222+
* Ethereum (ETH): 0xb486C0270fF75872Fc51d85879b9c15C380E66CA
223+
* Litecoin (LTC): LW1rvHRPWtm2NUEMhJpP4DjHZY1FaJ1WYs
224+
* Basic Attention Token (BAT): 0xE651d4900B4C305284Da43E2e182e9abE149A87A
225+
* Zcash (ZEC): t1ZwTgmbQF23DJrzqbAmw8kXWvU2xUkkhTt
226+
* Zcoin (XZC): a8L5Vz35KdCQe7Y7urK2pcCGau7JsqZ5Gw
