Commit 86b008e

peff authored and gitster committed
t: add library for munging chunk-format files
When testing corruption of files using the chunk format (like
commit-graphs and midx files), it's helpful to be able to modify bytes
in specific chunks. This requires being able both to read the
table-of-contents (to find the chunk to modify) and to adjust it
(to account for size changes in the offsets of subsequent chunks).

We have some tests already which corrupt chunk files, but they have some
downsides:

  1. They are very brittle, as they manually compute the expected size
     of a particular instance of the file (e.g., see the definitions
     starting with NUM_OBJECTS in t5319).

  2. Because they rely on manual offsets and don't read the
     table-of-contents, they're limited to overwriting bytes. But there
     are many interesting corruptions that involve changing the sizes of
     chunks (especially smaller-than-expected ones).

This patch adds a perl script which makes such corruptions easy. We'll
use it in subsequent patches.

Note that we could get by with just a big "perl -e" inside the helper
function. I chose to put it in a separate script for two reasons. One,
so we don't have to worry about the extra layer of shell quoting. And
two, the script is kind of big, and running the tests with "-x" would
repeatedly dump it into the log output.

Signed-off-by: Jeff King <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
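To make the table-of-contents layout concrete, here is a minimal sketch in plain shell. It is not part of the patch. It assumes the layout described in Git's chunk-format documentation: a fixed-size header, then 12-byte entries (a 4-byte chunk ID followed by an 8-byte big-endian offset), ending with an all-zero ID whose offset marks the end of the last chunk. The file name `toy.chunk`, its 8-byte header, the chunk IDs `AAAA` and `BBBB`, and the `list_toc` helper are all hypothetical names made up for illustration.

```shell
# Build a toy chunk file (hypothetical layout, for illustration only):
# an 8-byte header, three 12-byte ToC entries, then the chunk data.
{
	# header, bytes 0-7
	printf 'TOYF\001\000\002\000' &&
	# ToC entry: chunk AAAA starts at offset 44 (octal 054)
	printf 'AAAA\000\000\000\000\000\000\000\054' &&
	# ToC entry: chunk BBBB starts at offset 48 (octal 060)
	printf 'BBBB\000\000\000\000\000\000\000\060' &&
	# terminating entry: zero ID, offset 54 (octal 066) is the end of BBBB
	printf '\000\000\000\000\000\000\000\000\000\000\000\066' &&
	# chunk data: 4 bytes for AAAA, 6 bytes for BBBB
	printf 'aaaa' &&
	printf 'bbbbbb'
} >toy.chunk

# list_toc <file> <header-size>
# Print each ToC entry as "<id in hex> <offset in decimal>".
list_toc () {
	file=$1 pos=$2
	while :
	do
		id=$(dd if="$file" bs=1 skip="$pos" count=4 2>/dev/null |
			od -An -tx1 | tr -d ' \n')
		off=$(dd if="$file" bs=1 skip=$((pos + 4)) count=8 2>/dev/null |
			od -An -tx1 | tr -d ' \n')
		# stop if we ran off the end of the file
		test -n "$id" && test -n "$off" || return 1
		echo "$id $((0x$off))"
		# the all-zero ID terminates the table
		test "$id" = 00000000 && break
		pos=$((pos + 12))
	done
}

list_toc toy.chunk 8
```

For this toy file the listing is `41414141 44`, `42424242 48`, `00000000 54`. The length of a chunk is not stored anywhere; it is the difference between its offset and the next entry's offset, which is why changing a chunk's size means rewriting every later offset.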
1 parent 570b8b8 commit 86b008e

2 files changed: +83 -0 lines changed

t/lib-chunk.sh

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Shell library for working with "chunk" files (commit-graph, midx, etc).
+
+# corrupt_chunk_file <fn> <chunk> <offset> <bytes>
+#
+# Corrupt a chunk-based file (like a commit-graph) by overwriting the bytes
+# found in the chunk specified by the 4-byte <chunk> identifier. If <offset> is
+# "clear", replace the chunk entirely. Otherwise, overwrite data <offset> bytes
+# into the chunk.
+#
+# The <bytes> are interpreted as pairs of hex digits (so "000000FE" would be
+# big-endian 254).
+corrupt_chunk_file () {
+	fn=$1; shift
+	perl "$TEST_DIRECTORY"/lib-chunk/corrupt-chunk-file.pl \
+		"$@" <"$fn" >"$fn.tmp" &&
+	mv "$fn.tmp" "$fn"
+}
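The `<bytes>` argument is easy to get wrong by hand, so a caller might build it with `printf`. The following is a small sketch only: `be32_hex` is a hypothetical helper that is not in the patch, and the commented-out call assumes a test script that has sourced lib-chunk.sh, a `$graph` variable pointing at a commit-graph file, and the commit-graph's `OIDF` chunk as the target.

```shell
# be32_hex <number>
# Hypothetical helper: print <number> as 8 hex digits, i.e. the
# <bytes> spelling of a 4-byte big-endian integer.
be32_hex () {
	printf '%08x\n' "$1"
}

be32_hex 254

# Possible use inside a test (assumed names, needs the test harness):
#   corrupt_chunk_file "$graph" OIDF 0 "$(be32_hex 254)"
```

The helper prints `000000fe` for 254. The lowercase digits should be fine because the script decodes each pair with Perl's `hex()`, which accepts either case.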

t/lib-chunk/corrupt-chunk-file.pl

Lines changed: 66 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,66 @@
+#!/usr/bin/perl
+
+my ($chunk, $seek, $bytes) = @ARGV;
+$bytes =~ s/../chr(hex($&))/ge;
+
+binmode STDIN;
+binmode STDOUT;
+
+# A few helpers to read bytes, or read and copy them to the
+# output.
+sub get {
+	my $n = shift;
+	return unless $n;
+	read(STDIN, my $buf, $n)
+		or die "read error or eof: $!\n";
+	return $buf;
+}
+sub copy {
+	my $buf = get(@_);
+	print $buf;
+	return $buf;
+}
+
+# read until we find table-of-contents entry for chunk;
+# note that we cheat a bit by assuming 4-byte alignment and
+# that no ToC entry will accidentally look like a header.
+#
+# If we don't find the entry, copy() will hit EOF and exit
+# (which should cause the caller to fail the test).
+while (copy(4) ne $chunk) { }
+my $offset = unpack("Q>", copy(8));
+
+# In clear mode, our length will change. So figure out
+# the length by comparing to the offset of the next chunk, and
+# then adjust that offset (and all subsequent ones).
+my $len;
+if ($seek eq "clear") {
+	my $id;
+	do {
+		$id = copy(4);
+		my $next = unpack("Q>", get(8));
+		if (!defined $len) {
+			$len = $next - $offset;
+		}
+		print pack("Q>", $next - $len + length($bytes));
+	} while (unpack("N", $id));
+}
+
+# and now copy up to our existing chunk data
+copy($offset - tell(STDIN));
+if ($seek eq "clear") {
+	# if clearing, skip past existing data
+	get($len);
+} else {
+	# otherwise, copy up to the requested offset,
+	# and skip past the overwritten bytes
+	copy($seek);
+	get(length($bytes));
+}
+
+# now write out the requested bytes, along
+# with any other remaining data
+print $bytes;
+while (read(STDIN, my $buf, 4096)) {
+	print $buf;
+}
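The offset adjustment in "clear" mode is plain arithmetic, and it can be checked in isolation. This is a minimal sketch rather than anything from the patch: `adjusted_offsets` is a hypothetical function, and the numbers describe an imagined file whose target chunk starts at offset 44 and whose following ToC entries point at 48 and 54.

```shell
# adjusted_offsets <chunk-offset> <replacement-length> <next-offset>...
# Hypothetical helper mirroring the script's "clear" arithmetic: print
# the rewritten value for each ToC offset that follows the cleared chunk.
adjusted_offsets () {
	offset=$1 newlen=$2
	shift 2
	len=
	for next
	do
		# the first following entry marks where the cleared chunk
		# ends, so it gives us the chunk's old length
		test -n "$len" || len=$((next - offset))
		echo $((next - len + newlen))
	done
}

# a 4-byte chunk at offset 44 is replaced by 2 bytes
adjusted_offsets 44 2 48 54
```

This prints 46 and 52. The old length is 48 - 44 = 4, so every later offset moves back by 4 and forward by the 2 replacement bytes. The first rewritten entry always comes out as the cleared chunk's own offset plus the replacement length.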
