Skip to content

Commit aa962fe

Browse files
peffgitster
authored andcommitted
v0 protocol: fix infinite loop when parsing multi-valued capabilities
If Git's client-side parsing of an upload-pack response (so git-fetch or ls-remote) sees multiple instances of a single capability, it can enter an infinite loop due to a bug in advancing the "offset" parameter in the parser. This bug can't happen between a client and server of the same Git version. The client bug is in parse_feature_value() when the caller passes in an offset parameter. And that only happens when the v0 protocol is parsing "symref" and "object-format" capabilities, via next_server_feature_value(). But Git has never produced multiple object-format capabilities, and it stopped producing multiple symref values in d007dbf (Revert "upload-pack: send non-HEAD symbolic refs", 2013-11-18). However, upload-pack did produce multiple symref entries for a while, and they are valid. Plus other implementations, such as Dulwich will still do so. So we should handle them. And even if we do not expect it, it is obviously a bug for the parser to enter an infinite loop. The bug itself is pretty simple. Commit 2c6a403 (connect: add function to parse multiple v1 capability values, 2020-05-25) added the "offset" parameter, which is used as both an in- and out-parameter. When parsing the first "symref" capability, *offset will be 0 on input, and after parsing the capability, we set *offset to an index just past the value by taking a pointer difference "(value + end) - feature_list". But on the second call, now *offset is set to that larger index, which lets us skip past the first "symref" capability. However, we do so by incrementing feature_list. That means our pointer difference is now too small; it is counting from where we resumed parsing, not from the start of the original feature_list pointer. And because we incremented feature_list only inside our function, and not the caller, that increment is lost next time the function is called. One solution would be to account for those skipped bytes by incrementing *offset, rather than assigning to it. But wait, there's more! We also increment feature_list if we have a near-miss. Say we are looking for "symref" and find "almost-symref". In that case we'll point feature_list to the "y" in "almost-symref" and restart our search. But that again means our offset won't be correct, as it won't account for the bytes between the start of the string and that "y". So instead, let's just record the beginning of the feature_list string in a separate pointer that we never touch. That offset we take in and return is meant to be using that point as a base, and now we'll do so consistently. Since the bug can't be reproduced using the current version of git-upload-pack, we'll instead hard-code an input which triggers the problem. Before this patch it loops forever re-parsing the second symref entry. Now we check both that it finishes, and that it parses both entries correctly (a case we could not test at all before). We don't need to worry about testing v2 here; it communicates the capabilities in a completely different way, and doesn't use this code at all. There are tests earlier in t5512 that are meant to cover this (they don't, but we'll address that in a future patch). Reported-by: Jonas Haag <[email protected]> Signed-off-by: Jeff King <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 9857273 commit aa962fe

File tree

2 files changed

+55
-2
lines changed

2 files changed

+55
-2
lines changed

connect.c

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -597,6 +597,7 @@ struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
597597

598598
const char *parse_feature_value(const char *feature_list, const char *feature, int *lenp, int *offset)
599599
{
600+
const char *orig_start = feature_list;
600601
int len;
601602

602603
if (!feature_list)
@@ -616,7 +617,7 @@ const char *parse_feature_value(const char *feature_list, const char *feature, i
616617
if (lenp)
617618
*lenp = 0;
618619
if (offset)
619-
*offset = found + len - feature_list;
620+
*offset = found + len - orig_start;
620621
return value;
621622
}
622623
/* feature with a value (e.g., "agent=git/1.2.3") */
@@ -628,7 +629,7 @@ const char *parse_feature_value(const char *feature_list, const char *feature, i
628629
if (lenp)
629630
*lenp = end;
630631
if (offset)
631-
*offset = value + end - feature_list;
632+
*offset = value + end - orig_start;
632633
return value;
633634
}
634635
/*

t/t5512-ls-remote.sh

Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -15,6 +15,19 @@ generate_references () {
1515
done
1616
}
1717

18+
test_expect_success 'set up fake upload-pack' '
19+
# This can be used to simulate an upload-pack that just shows the
20+
# contents of the "input" file (prepared with the test-tool pkt-line
21+
# helper), and does not do any negotiation (since ls-remote does not
22+
# need it).
23+
write_script cat-input <<-\EOF
24+
# send our initial advertisement/response
25+
cat input
26+
# soak up the flush packet from the client
27+
cat
28+
EOF
29+
'
30+
1831
test_expect_success 'dies when no remote found' '
1932
test_must_fail git ls-remote
2033
'
@@ -360,4 +373,43 @@ test_expect_success 'ls-remote prefixes work with all protocol versions' '
360373
test_cmp expect actual.v2
361374
'
362375

376+
test_expect_success 'v0 clients can handle multiple symrefs' '
377+
# Modern versions of Git will not return multiple symref capabilities
378+
# for v0, so we have to hard-code the response. Note that we will
379+
# always use both v0 and object-format=sha1 here, as the hard-coded
380+
# response reflects a server that only supports those.
381+
oid=1234567890123456789012345678901234567890 &&
382+
symrefs="symref=refs/remotes/origin/HEAD:refs/remotes/origin/main" &&
383+
symrefs="$symrefs symref=HEAD:refs/heads/main" &&
384+
385+
# Likewise we want to make sure our parser is not fooled by the string
386+
# "symref" appearing as part of an earlier cap. But there is no way to
387+
# do that via upload-pack, as arbitrary strings can appear only in a
388+
# "symref" value itself (where we skip past the values as a whole)
389+
# and "agent" (which always appears after "symref", so putting our
390+
# parser in a confused state is less interesting).
391+
caps="some other caps including a-fake-symref-cap" &&
392+
393+
test-tool pkt-line pack >input.q <<-EOF &&
394+
$oid HEADQ$caps $symrefs
395+
$oid refs/heads/main
396+
$oid refs/remotes/origin/HEAD
397+
$oid refs/remotes/origin/main
398+
0000
399+
EOF
400+
q_to_nul <input.q >input &&
401+
402+
cat >expect <<-EOF &&
403+
ref: refs/heads/main HEAD
404+
$oid HEAD
405+
$oid refs/heads/main
406+
ref: refs/remotes/origin/main refs/remotes/origin/HEAD
407+
$oid refs/remotes/origin/HEAD
408+
$oid refs/remotes/origin/main
409+
EOF
410+
411+
git ls-remote --symref --upload-pack=./cat-input . >actual &&
412+
test_cmp expect actual
413+
'
414+
363415
test_done

0 commit comments

Comments
 (0)