Skip to content

Commit 0b1493c

Browse files
peffttaylorr
authored andcommitted
git_inflate(): skip zlib_post_call() sanity check on Z_NEED_DICT
This fixes a case where malformed object input can cause us to hit a BUG() call in the git-zlib.c code. The zlib format allows the use of preset dictionaries to reduce the size of deflated data. The checksum of the dictionary is computed by the deflate code and goes into the stream. On the inflating side, zlib sees the dictionary checksum and returns Z_NEED_DICT, asking the caller to provide the dictionary data via inflateSetDictionary(). This should never happen in Git, because we never provide a dictionary for deflating (and if we get a stream that mentions a dictionary, we have no idea how to provide it). So normally Z_NEED_DICT is a hard error for us. But something interesting happens if we _do_ happen to see it (e.g., because of a corrupt or malicious input). In git_inflate() as we loop over calls to zlib's inflate(), we translate between our large-integer git_zstream values and zlib's native z_stream types, copying in and out with zlib_pre_call() and zlib_post_call(). In zlib_post_call() we have a few sanity checks, including one that checks that the number of bytes consumed by zlib (as measured by it moving the "next_in" pointer) is equal to the movement of its "total_in" count. But these do not correspond when we see Z_NEED_DICT! Zlib consumes the bytes from the input buffer but it does not increment total_in. And so we hit the BUG("total_in mismatch") call. There are a few options here: - We could ditch that BUG() check. It is making too many assumptions about how zlib updates these values. But it does have value in most cases as a sanity check on the values we're copying. - We could skip the zlib_post_call() entirely when we see Z_NEED_DICT. We know that it's hard error for us, so we should just send the status up the stack and let the caller bail. The downside is that if we ever did want to support dictionaries, we couldn't (the git_zstream will be out of sync, since we never copied its values back from the z_stream). - We could continue to call zlib_post_call(), but skip just that BUG() check if the status is Z_NEED_DICT. This keeps git_inflate() as a thin wrapper around inflate(), and would let us later support dictionaries for some calls if we wanted to. This patch uses the third approach. It seems like the least-surprising thing to keep git_inflate() a close to inflate() as possible. And while it makes the diff a bit larger (since we have to pass the status down to to the zlib_post_call() function), it's a static local function, and every caller by definition will have just made a zlib call (and so will have a status integer). Co-authored-by: Taylor Blau <[email protected]> Signed-off-by: Jeff King <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent b748ddb commit 0b1493c

File tree

2 files changed

+48
-11
lines changed

2 files changed

+48
-11
lines changed

git-zlib.c

Lines changed: 16 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -45,7 +45,7 @@ static void zlib_pre_call(git_zstream *s)
4545
s->z.avail_out = zlib_buf_cap(s->avail_out);
4646
}
4747

48-
static void zlib_post_call(git_zstream *s)
48+
static void zlib_post_call(git_zstream *s, int status)
4949
{
5050
unsigned long bytes_consumed;
5151
unsigned long bytes_produced;
@@ -54,7 +54,12 @@ static void zlib_post_call(git_zstream *s)
5454
bytes_produced = s->z.next_out - s->next_out;
5555
if (s->z.total_out != s->total_out + bytes_produced)
5656
BUG("total_out mismatch");
57-
if (s->z.total_in != s->total_in + bytes_consumed)
57+
/*
58+
* zlib does not update total_in when it returns Z_NEED_DICT,
59+
* causing a mismatch here. Skip the sanity check in that case.
60+
*/
61+
if (status != Z_NEED_DICT &&
62+
s->z.total_in != s->total_in + bytes_consumed)
5863
BUG("total_in mismatch");
5964

6065
s->total_out = s->z.total_out;
@@ -71,7 +76,7 @@ void git_inflate_init(git_zstream *strm)
7176

7277
zlib_pre_call(strm);
7378
status = inflateInit(&strm->z);
74-
zlib_post_call(strm);
79+
zlib_post_call(strm, status);
7580
if (status == Z_OK)
7681
return;
7782
die("inflateInit: %s (%s)", zerr_to_string(status),
@@ -89,7 +94,7 @@ void git_inflate_init_gzip_only(git_zstream *strm)
8994

9095
zlib_pre_call(strm);
9196
status = inflateInit2(&strm->z, windowBits);
92-
zlib_post_call(strm);
97+
zlib_post_call(strm, status);
9398
if (status == Z_OK)
9499
return;
95100
die("inflateInit2: %s (%s)", zerr_to_string(status),
@@ -102,7 +107,7 @@ void git_inflate_end(git_zstream *strm)
102107

103108
zlib_pre_call(strm);
104109
status = inflateEnd(&strm->z);
105-
zlib_post_call(strm);
110+
zlib_post_call(strm, status);
106111
if (status == Z_OK)
107112
return;
108113
error("inflateEnd: %s (%s)", zerr_to_string(status),
@@ -121,7 +126,7 @@ int git_inflate(git_zstream *strm, int flush)
121126
? 0 : flush);
122127
if (status == Z_MEM_ERROR)
123128
die("inflate: out of memory");
124-
zlib_post_call(strm);
129+
zlib_post_call(strm, status);
125130

126131
/*
127132
* Let zlib work another round, while we can still
@@ -163,7 +168,7 @@ void git_deflate_init(git_zstream *strm, int level)
163168
memset(strm, 0, sizeof(*strm));
164169
zlib_pre_call(strm);
165170
status = deflateInit(&strm->z, level);
166-
zlib_post_call(strm);
171+
zlib_post_call(strm, status);
167172
if (status == Z_OK)
168173
return;
169174
die("deflateInit: %s (%s)", zerr_to_string(status),
@@ -179,7 +184,7 @@ static void do_git_deflate_init(git_zstream *strm, int level, int windowBits)
179184
status = deflateInit2(&strm->z, level,
180185
Z_DEFLATED, windowBits,
181186
8, Z_DEFAULT_STRATEGY);
182-
zlib_post_call(strm);
187+
zlib_post_call(strm, status);
183188
if (status == Z_OK)
184189
return;
185190
die("deflateInit2: %s (%s)", zerr_to_string(status),
@@ -210,7 +215,7 @@ int git_deflate_abort(git_zstream *strm)
210215

211216
zlib_pre_call(strm);
212217
status = deflateEnd(&strm->z);
213-
zlib_post_call(strm);
218+
zlib_post_call(strm, status);
214219
return status;
215220
}
216221

@@ -230,7 +235,7 @@ int git_deflate_end_gently(git_zstream *strm)
230235

231236
zlib_pre_call(strm);
232237
status = deflateEnd(&strm->z);
233-
zlib_post_call(strm);
238+
zlib_post_call(strm, status);
234239
return status;
235240
}
236241

@@ -247,7 +252,7 @@ int git_deflate(git_zstream *strm, int flush)
247252
? 0 : flush);
248253
if (status == Z_MEM_ERROR)
249254
die("deflate: out of memory");
250-
zlib_post_call(strm);
255+
zlib_post_call(strm, status);
251256

252257
/*
253258
* Let zlib work another round, while we can still

t/t1006-cat-file.sh

Lines changed: 32 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -836,6 +836,38 @@ test_expect_success 'truncated object with --allow-unknown-type' - <<\EOT
836836
test_grep "unable to unpack $blob header" err
837837
EOT
838838

839+
test_expect_success 'object reading handles zlib dictionary' - <<\EOT
840+
echo 'content that will be recompressed' >file &&
841+
blob=$(git hash-object -w file) &&
842+
objpath=.git/objects/$(test_oid_to_path "$blob") &&
843+
844+
# Recompress a loose object using a precomputed zlib dictionary.
845+
# This was originally done with:
846+
#
847+
# perl -MCompress::Raw::Zlib -e '
848+
# binmode STDIN;
849+
# binmode STDOUT;
850+
# my $data = do { local $/; <STDIN> };
851+
# my $in = new Compress::Raw::Zlib::Inflate;
852+
# my $de = new Compress::Raw::Zlib::Deflate(
853+
# -Dictionary => "anything"
854+
# );
855+
# $in->inflate($data, $raw);
856+
# $de->deflate($raw, $out);
857+
# print $out;
858+
# ' <obj.bak >$objpath
859+
#
860+
# but we do not want to require the perl module for all test runs (nor
861+
# carry a custom t/helper program that uses zlib features we don't
862+
# otherwise care about).
863+
mv "$objpath" obj.bak &&
864+
test_when_finished 'mv obj.bak "$objpath"' &&
865+
printf '\170\273\017\112\003\143' >$objpath &&
866+
867+
test_must_fail git cat-file blob $blob 2>err &&
868+
test_grep 'error: inflate: needs dictionary' err
869+
EOT
870+
839871
# Tests for git cat-file --follow-symlinks
840872
test_expect_success 'prep for symlink tests' '
841873
echo_without_newline "$hello_content" >morx &&

0 commit comments

Comments
 (0)