@@ -27,102 +27,131 @@ level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
2727.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
2828.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
2929..
30- .TH "BORG-CHECK" 1 "2023-06-11 " "" "borg backup tool"
30+ .TH "BORG-CHECK" 1 "2023-09-14 " "" "borg backup tool"
3131.SH NAME
3232borg-check \- Check repository consistency
3333.SH SYNOPSIS
3434.sp
3535borg [common options] check [options]
3636.SH DESCRIPTION
3737.sp
38- The check command verifies the consistency of a repository and the corresponding archives.
38+ The check command verifies the consistency of a repository and its archives.
39+ It consists of two major steps:
40+ .INDENT 0.0
41+ .IP 1. 3
42+ Checking the consistency of the repository itself. This includes checking
43+ the segment magic headers, and both the metadata and data of all objects in
44+ the segments. The read data is checked by size and CRC. Bit rot and other
45+ types of accidental damage can be detected this way. Running the repository
46+ check can be split into multiple partial checks using \fB\-\-max\-duration\fP\&.
47+ When checking a remote repository, please note that the checks run on the
48+ server and do not cause significant network traffic.
49+ .IP 2. 3
50+ Checking consistency and correctness of the archive metadata and optionally
51+ archive data (requires \fB\-\-verify\-data\fP). This includes ensuring that the
52+ repository manifest exists, the archive metadata chunk is present, and that
53+ all chunks referenced by files (items) in the archive exist. This requires
54+ reading archive and file metadata, but not data. To cryptographically verify
55+ the file (content) data integrity, pass \fB\-\-verify\-data\fP, but keep in mind
56+ that this requires reading all data and is hence very time consuming. When
57+ checking archives of a remote repository, archive checks run on the client
58+ machine because they require decrypting data and therefore the encryption
59+ key.
60+ .UNINDENT
3961.sp
40- check \-\- repair is a potentially dangerous function and might lead to data loss
41- (for kinds of corruption it is not capable of dealing with). BE VERY CAREFUL!
62+ Both steps can also be run independently. Pass \fB\-\-repository\-only\fP to run the
63+ repository checks only, or pass \fB\-\-archives\-only\fP to run the archive checks
64+ only.
4265.sp
43- Pursuant to the previous warning it is also highly recommended to test the
44- reliability of the hardware running this software with stress testing software
45- such as memory testers. Unreliable hardware can also lead to data loss especially
46- when this command is run in repair mode.
66+ The \fB\-\-max\-duration\fP option can be used to split a long\-running repository
67+ check into multiple partial checks. After the given number of seconds the check
68+ is interrupted. The next partial check will continue where the previous one
69+ stopped, until the full repository has been checked. If a complete check
70+ would take 7 hours, then running a daily check with \fB\-\-max\-duration=3600\fP
71+ (1 hour) would result in one full repository check per week. Doing a full
72+ repository check aborts any previous partial check; the next partial check will
73+ restart from the beginning. With partial repository checks you can neither run
74+ archive checks nor enable repair mode. Consequently, if you want to use
75+ \fB\-\-max\-duration\fP you must also pass \fB\-\-repository\-only\fP, and must not pass
76+ \fB\-\-archives\-only\fP or \fB\-\-repair\fP\&.
77+ .sp
78+ \fBWarning:\fP Please note that partial repository checks (i.e. running it with
79+ \fB\-\-max\-duration\fP) can only perform non\-cryptographic checksum checks on the
80+ segment files. A full repository check (i.e. without \fB\-\-max\-duration\fP) can
81+ also do a repository index check. Enabling partial repository checks excludes
82+ archive checks for the same reason. Therefore partial checks may be useful only
83+ with very large repositories where a full check would take too long.
84+ .sp
85+ The \fB\-\-verify\-data\fP option will perform a full integrity verification (as
86+ opposed to checking the CRC32 of the segment) of data, which means reading the
87+ data from the repository, decrypting and decompressing it. It is a complete
88+ cryptographic verification and hence very time consuming, but will detect any
89+ accidental and malicious corruption. Tamper\-resistance is only guaranteed for
90+ encrypted repositories against attackers without access to the keys. You
91+ cannot use \fB\-\-verify\-data\fP with \fB\-\-repository\-only\fP\&.
92+ .SS About repair mode
93+ .sp
94+ The check command is a read\-only task by default. If any corruption is found,
95+ Borg will report the issue and proceed with checking. To actually repair the
96+ issues found, pass \fB\-\-repair\fP\&.
4797.sp
48- First, the underlying repository data files are checked:
98+ \fBNOTE:\fP
4999.INDENT 0.0
50- .IP \(bu 2
51- For all segments, the segment magic header is checked.
52- .IP \(bu 2
53- For all objects stored in the segments, all metadata (e.g. CRC and size) and
54- all data is read. The read data is checked by size and CRC. Bit rot and other
55- types of accidental damage can be detected this way.
56- .IP \(bu 2
57- In repair mode, if an integrity error is detected in a segment, try to recover
58- as many objects from the segment as possible.
59- .IP \(bu 2
60- In repair mode, make sure that the index is consistent with the data stored in
61- the segments.
62- .IP \(bu 2
63- If checking a remote repo via \fBssh:\fP, the repo check is executed on the server
64- without causing significant network traffic.
65- .IP \(bu 2
66- The repository check can be skipped using the \fB\-\-archives\-only\fP option.
67- .IP \(bu 2
68- A repository check can be time consuming. Partial checks are possible with the
69- \fB\-\-max\-duration\fP option.
100+ .INDENT 3.5
101+ \fB\-\-repair\fP is a \fBPOTENTIALLY DANGEROUS FEATURE\fP and might lead to data
102+ loss! This does not just include data that was previously lost anyway, but
103+ might include more data for kinds of corruption it is not capable of
104+ dealing with. \fBBE VERY CAREFUL!\fP
70105.UNINDENT
106+ .UNINDENT
107+ .sp
108+ Pursuant to the previous warning it is also highly recommended to test the
109+ reliability of the hardware running Borg with stress testing software. This
110+ especially includes storage and memory testers. Unreliable hardware might lead
111+ to additional data loss.
71112.sp
72- Second, the consistency and correctness of the archive metadata is verified:
113+ It is highly recommended to create a backup of your repository before running
114+ in repair mode (i.e. running it with \fB\-\-repair\fP).
115+ .sp
116+ Repair mode will attempt to fix any corruption found. Fixing corruption does
117+ not mean recovering lost data: Borg cannot magically restore data lost due to
118+ e.g. a hardware failure. Repairing a repository means sacrificing some data
119+ for the sake of the repository as a whole and the remaining data. Hence it is,
120+ by definition, a potentially lossy task.
121+ .sp
122+ In practice, repair mode hooks into both the repository and archive checks:
73123.INDENT 0.0
74- .IP \(bu 2
75- Is the repo manifest present? If not, it is rebuilt from archive metadata
76- chunks (this requires reading and decrypting of all metadata and data).
77- .IP \(bu 2
78- Check if archive metadata chunk is present; if not, remove archive from manifest.
79- .IP \(bu 2
80- For all files (items) in the archive, for all chunks referenced by these
81- files, check if chunk is present. In repair mode, if a chunk is not present,
82- replace it with a same\-size replacement chunk of zeroes. If a previously lost
83- chunk reappears (e.g. via a later backup), in repair mode the all\-zero replacement
84- chunk will be replaced by the correct chunk. This requires reading of archive and
85- file metadata, but not data.
86- .IP \(bu 2
87- In repair mode, when all the archives were checked, orphaned chunks are deleted
88- from the repo. One cause of orphaned chunks are input file related errors (like
89- read errors) in the archive creation process.
90- .IP \(bu 2
91- In verify\-data mode, a complete cryptographic verification of the archive data
92- integrity is performed. This conflicts with \fB\-\-repository\-only\fP as this mode
93- only makes sense if the archive checks are enabled. The full details of this mode
94- are documented below.
95- .IP \(bu 2
96- If checking a remote repo via \fBssh:\fP, the archive check is executed on the
97- client machine because it requires decryption, and this is always done client\-side
98- as key access is needed.
99- .IP \(bu 2
100- The archive checks can be time consuming; they can be skipped using the
101- \fB\-\-repository\-only\fP option.
124+ .IP 1. 3
125+ When checking the repository\(aqs consistency, repair mode will try to recover
126+ as many objects from segments with integrity errors as possible, and ensure
127+ that the index is consistent with the data stored in the segments.
128+ .IP 2. 3
129+ When checking the consistency and correctness of archives, repair mode might
130+ remove whole archives from the manifest if their archive metadata chunk is
131+ corrupt or lost. On a chunk level (i.e. the contents of files), repair mode
132+ will replace corrupt or lost chunks with a same\-size replacement chunk of
133+ zeroes. If a previously zeroed chunk reappears, repair mode will restore
134+ this lost chunk using the new chunk. Lastly, repair mode will also delete
135+ orphaned chunks (e.g. caused by read errors while creating the archive).
102136.UNINDENT
103137.sp
104- The \fB\-\-max\-duration\fP option can be used to split a long\-running repository check
105- into multiple partial checks. After the given number of seconds the check is
106- interrupted. The next partial check will continue where the previous one stopped,
107- until the complete repository has been checked. Example: Assuming a complete check took 7
108- hours, then running a daily check with \-\-max\-duration=3600 (1 hour) resulted in one
109- completed check per week.
110- .sp
111- Attention: A partial \-\-repository\-only check can only do way less checking than a full
112- \-\-repository\-only check: only the non\-cryptographic checksum checks on segment file
113- entries are done, while a full \-\-repository\-only check would also do a repo index check.
114- A partial check cannot be combined with the \fB\-\-repair\fP option. Partial checks
115- may therefore be useful only with very large repositories where a full check would take
116- too long.
117- Doing a full repository check aborts a partial check; the next partial check will restart
118- from the beginning.
119- .sp
120- The \fB\-\-verify\-data\fP option will perform a full integrity verification (as opposed to
121- checking the CRC32 of the segment) of data, which means reading the data from the
122- repository, decrypting and decompressing it. This is a cryptographic verification,
123- which will detect (accidental) corruption. For encrypted repositories it is
124- tamper\-resistant as well, unless the attacker has access to the keys. It is also very
125- slow.
138+ Most steps taken by repair mode have a one\-time effect on the repository, like
139+ removing a lost archive from the repository. However, replacing a corrupt or
140+ lost chunk with an all\-zero replacement will have an ongoing effect on the
141+ repository: When attempting to extract a file referencing an all\-zero chunk,
142+ the \fBextract\fP command will distinctly warn about it. The FUSE filesystem
143+ created by the \fBmount\fP command will reject reading such a \(dqzero\-patched\(dq
144+ file unless a special mount option is given.
145+ .sp
146+ As mentioned earlier, Borg might be able to \(dqheal\(dq a \(dqzero\-patched\(dq file in
147+ repair mode, if all its previously lost chunks reappear (e.g. via a later
148+ backup). This is achieved by Borg not only keeping track of the all\-zero
149+ replacement chunks, but also by keeping metadata about the lost chunks. In
150+ repair mode Borg will check whether a previously lost chunk reappeared and will
151+ replace the all\-zero replacement chunk with the reappeared chunk. If all lost
152+ chunks of a \(dqzero\-patched\(dq file reappear, this effectively \(dqheals\(dq the file.
153+ Consequently, if lost chunks were repaired earlier, it is advised to run
154+ \fB\-\-repair\fP a second time after creating some new backups.
126155.SH OPTIONS
127156.sp
128157See \fIborg\-common(1)\fP for common options of Borg commands.
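The option rules in the new DESCRIPTION text can be summarised in a small sketch. This is not part of Borg: `build_check_args` and `partial_checks_needed` are hypothetical helper names, and the checks only mirror what the man page text states (partial checks need `--repository-only` and exclude `--archives-only` and `--repair`; `--verify-data` conflicts with `--repository-only`). The repository location is assumed to come from the common options or the environment, as in the SYNOPSIS.

```python
import math
import shlex


def build_check_args(repository_only=False, archives_only=False,
                     verify_data=False, repair=False, max_duration=None):
    """Build an argument list for `borg check` (hypothetical helper).

    Rejects the option combinations that the man page text above
    describes as invalid. Borg performs its own validation; this is
    only an illustration of the documented rules.
    """
    if verify_data and repository_only:
        raise ValueError("--verify-data cannot be used with --repository-only")
    if max_duration is not None:
        if max_duration <= 0:
            # Assumption: a partial check needs a positive number of seconds.
            raise ValueError("--max-duration must be a positive number of seconds")
        if not repository_only:
            raise ValueError("--max-duration requires --repository-only")
        if archives_only or repair:
            raise ValueError("--max-duration excludes --archives-only and --repair")

    args = ["borg", "check"]
    if repository_only:
        args.append("--repository-only")
    if archives_only:
        args.append("--archives-only")
    if verify_data:
        args.append("--verify-data")
    if repair:
        args.append("--repair")
    if max_duration is not None:
        args.append(f"--max-duration={max_duration}")
    return args


def partial_checks_needed(full_check_seconds, max_duration):
    """Number of partial runs until the whole repository has been checked."""
    return math.ceil(full_check_seconds / max_duration)


if __name__ == "__main__":
    # The man page example: a 7 hour check split into daily 1 hour runs.
    print(shlex.join(build_check_args(repository_only=True, max_duration=3600)))
    print(partial_checks_needed(7 * 3600, 3600), "daily runs per full check")
```

With the man page's own numbers (a 7 hour check, `--max-duration=3600`), the second helper yields seven runs, which matches the "one full repository check per week" statement for a daily schedule.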