Commit 6fcb384

Merge branch 'rt/zlib-smaller-window'

* rt/zlib-smaller-window:
  test: consolidate definition of $LF
  Tolerate zlib deflation with window size < 32Kb

2 parents 5245720 + 3f4ab62 commit 6fcb384

23 files changed: 99 additions, 13 deletions

sha1_file.c

Lines changed: 26 additions & 6 deletions
@@ -1217,14 +1217,34 @@ static int experimental_loose_object(unsigned char *map)
 	unsigned int word;
 
 	/*
-	 * Is it a zlib-compressed buffer? If so, the first byte
-	 * must be 0x78 (15-bit window size, deflated), and the
-	 * first 16-bit word is evenly divisible by 31. If so,
-	 * we are looking at the official format, not the experimental
-	 * one.
+	 * We must determine if the buffer contains the standard
+	 * zlib-deflated stream or the experimental format based
+	 * on the in-pack object format. Compare the header byte
+	 * for each format:
+	 *
+	 * RFC1950 zlib w/ deflate : 0www1000 : 0 <= www <= 7
+	 * Experimental pack-based : Stttssss : ttt = 1,2,3,4
+	 *
+	 * If bit 7 is clear and bits 0-3 equal 8, the buffer MUST be
+	 * in standard loose-object format, UNLESS it is a Git-pack
+	 * format object *exactly* 8 bytes in size when inflated.
+	 *
+	 * However, RFC1950 also specifies that the 1st 16-bit word
+	 * must be divisible by 31 - this checksum tells us our buffer
+	 * is in the standard format, giving a false positive only if
+	 * the 1st word of the Git-pack format object happens to be
+	 * divisible by 31, ie:
+	 * ((byte0 * 256) + byte1) % 31 = 0
+	 * => 0ttt10000www1000 % 31 = 0
+	 *
+	 * As it happens, this case can only arise for www=3 & ttt=1
+	 * - ie, a Commit object, which would have to be 8 bytes in
+	 * size. As no Commit can be that small, we find that the
+	 * combination of these two criteria (bitmask & checksum)
+	 * can always correctly determine the buffer format.
 	 */
 	word = (map[0] << 8) + map[1];
-	if (map[0] == 0x78 && !(word % 31))
+	if ((map[0] & 0x8F) == 0x08 && !(word % 31))
 		return 0;
 	else
 		return 1;
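
The claim in the new comment, that the bitmask and the checksum together admit only one false positive (ttt=1, www=3), can be checked by brute force. The following is a minimal sketch rather than code from this commit: the function names are hypothetical, and it assumes the byte layout given in the comment, where byte0 is an experimental header `0ttt1000` (size 8, no continuation bit) and byte1 is the first byte `0www1000` of the deflated payload that follows it.

```c
#include <assert.h>

/* Hypothetical helper: the same two-part test as the patched condition. */
int looks_like_zlib_header(unsigned char byte0, unsigned char byte1)
{
	unsigned int word = (byte0 << 8) + byte1;

	return (byte0 & 0x8F) == 0x08 && !(word % 31);
}

/*
 * Count experimental-format headers that pass the zlib test.
 *   byte0 = 0ttt1000  (object type ttt in 1..4, size 8)
 *   byte1 = 0www1000  (zlib CMF byte of the payload, www in 0..7)
 * The last matching pair is stored through hit_ttt / hit_www.
 */
int count_false_positives(int *hit_ttt, int *hit_www)
{
	int count = 0;

	for (int ttt = 1; ttt <= 4; ttt++) {
		for (int www = 0; www <= 7; www++) {
			unsigned char byte0 = (unsigned char)((ttt << 4) | 0x08);
			unsigned char byte1 = (unsigned char)((www << 4) | 0x08);

			if (looks_like_zlib_header(byte0, byte1)) {
				*hit_ttt = ttt;
				*hit_www = www;
				count++;
			}
		}
	}
	return count;
}

/* Asserts that exactly one collision exists: ttt=1 (commit), www=3. */
void check_comment_claim(void)
{
	int ttt = 0, www = 0;

	assert(count_false_positives(&ttt, &www) == 1);
	assert(ttt == 1 && www == 3);
}
```

Calling `check_comment_claim()` from a small `main` passes silently: the only colliding word is 0x1838 (6200 = 31 * 200), which would require an 8-byte commit object.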

t/t1013-loose-object-format.sh

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+#!/bin/sh
+#
+# Copyright (c) 2011 Roberto Tyley
+#
+
+test_description='Correctly identify and parse loose object headers
+
+There are two file formats for loose objects - the original standard
+format, and the experimental format introduced with Git v1.4.3, later
+deprecated with v1.5.3. Although Git no longer writes the
+experimental format, objects in both formats must be read, with the
+format for a given file being determined by the header.
+
+Detecting file format based on header is not entirely trivial, not
+least because the first byte of a zlib-deflated stream will vary
+depending on how much memory was allocated for the deflation window
+buffer when the object was written out (for example 4KB on Android,
+rather that 32KB on a normal PC).
+
+The loose objects used as test vectors have been generated with the
+following Git versions:
+
+standard format: Git v1.7.4.1
+experimental format: Git v1.4.3 (legacyheaders=false)
+standard format, deflated with 4KB window size: Agit/JGit on Android
+'
+
+. ./test-lib.sh
+
+assert_blob_equals() {
+	printf "%s" "$2" >expected &&
+	git cat-file -p "$1" >actual &&
+	test_cmp expected actual
+}
+
+test_expect_success setup '
+	cp -R "$TEST_DIRECTORY/t1013/objects" .git/
+	git --version
+'
+
+test_expect_success 'read standard-format loose objects' '
+	git cat-file tag 8d4e360d6c70fbd72411991c02a09c442cf7a9fa &&
+	git cat-file commit 6baee0540ea990d9761a3eb9ab183003a71c3696 &&
+	git ls-tree 7a37b887a73791d12d26c0d3e39568a8fb0fa6e8 &&
+	assert_blob_equals "257cc5642cb1a054f08cc83f2d943e56fd3ebe99" "foo$LF"
+'
+
+test_expect_success 'read experimental-format loose objects' '
+	git cat-file tag 76e7fa9941f4d5f97f64fea65a2cba436bc79cbb &&
+	git cat-file commit 7875c6237d3fcdd0ac2f0decc7d3fa6a50b66c09 &&
+	git ls-tree 95b1625de3ba8b2214d1e0d0591138aea733f64f &&
+	assert_blob_equals "2e65efe2a145dda7ee51d1741299f848e5bf752e" "a" &&
+	assert_blob_equals "9ae9e86b7bd6cb1472d9373702d8249973da0832" "ab" &&
+	assert_blob_equals "85df50785d62d3b05ab03d9cbf7e4a0b49449730" "abcd" &&
+	assert_blob_equals "1656f9233d999f61ef23ef390b9c71d75399f435" "abcdefgh" &&
+	assert_blob_equals "1e72a6b2c4a577ab0338860fa9fe87f761fc9bbd" "abcdefghi" &&
+	assert_blob_equals "70e6a83d8dcb26fc8bc0cf702e2ddeb6adca18fd" "abcdefghijklmnop" &&
+	assert_blob_equals "bd15045f6ce8ff75747562173640456a394412c8" "abcdefghijklmnopqrstuvwx"
+'
+
+test_expect_success 'read standard-format objects deflated with smaller window buffer' '
+	git cat-file tag f816d5255855ac160652ee5253b06cd8ee14165a &&
+	git cat-file tag 149cedb5c46929d18e0f118e9fa31927487af3b6
+'
+
+test_done
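
The test description notes that the first byte of a zlib stream depends on the deflation window size. As an illustration only (not part of this commit), the sketch below derives the two header bytes from the RFC 1950 field layout and compares the old and new detection conditions from sha1_file.c. All function names are hypothetical, and the FLEVEL value of 2 used in the usage note is an assumption about a typical default compression level.

```c
#include <assert.h>

/* CMF byte: CINFO = log2(window size) - 8 in bits 4-7, CM = 8 (deflate). */
unsigned char zlib_cmf(int window_bits)	/* 8..15 per RFC 1950 */
{
	return (unsigned char)(((window_bits - 8) << 4) | 0x08);
}

/*
 * FLG byte: FLEVEL in bits 6-7, FDICT (bit 5) clear, and FCHECK in
 * bits 0-4 chosen so that (CMF * 256 + FLG) is a multiple of 31.
 */
unsigned char zlib_flg(unsigned char cmf, int flevel)
{
	unsigned int partial = (cmf << 8) | ((flevel & 3) << 6);
	unsigned int fcheck = (31 - partial % 31) % 31;

	return (unsigned char)(((flevel & 3) << 6) | fcheck);
}

/* Old condition: only a 32KB window (first byte 0x78) is accepted. */
int is_standard_old(const unsigned char *map)
{
	unsigned int word = (map[0] << 8) + map[1];

	return map[0] == 0x78 && !(word % 31);
}

/* New condition: any window size 0www1000 is accepted. */
int is_standard_new(const unsigned char *map)
{
	unsigned int word = (map[0] << 8) + map[1];

	return (map[0] & 0x8F) == 0x08 && !(word % 31);
}
```

With `window_bits = 15` (32KB) and FLEVEL 2 these helpers yield the bytes 0x78 0x9C, which both conditions accept. With `window_bits = 12` (4KB) they yield 0x48 0x89, which only the new condition recognizes as the standard format.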
Binary files not shown (7 files).

t/t1013/objects/76/e7fa9941f4d5f97f64fea65a2cba436bc79cbb

Lines changed: 2 additions & 0 deletions
@@ -0,0 +1,2 @@
(binary loose-object data, not representable as text)
