Skip to content

Commit 74b052f

Browse files
matheustavaresgitster
authored andcommitted
packfile: fix race condition on unpack_entry()
The third phase of unpack_entry() performs the following sequence in a loop, until all the deltas enumerated in phase one are applied and the entry is fully reconstructed: 1. Add the current base entry to the delta base cache 2. Unpack the next delta 3. Patch the unpacked delta on top of the base When the optional object reading lock is enabled, the above steps will be performed while holding the lock. However, step 2. momentarily releases it so that inflation can be performed in parallel for increased performance. Because the `base` buffer inserted in the cache at 1. is not duplicated, another thread can potentially free() it while the lock is released at 2. (e.g. when there is no space left in the cache to insert another entry). In this case, the later attempt to dereference `base` at 3. will cause a segmentation fault. This problem was observed during a multithreaded git-grep execution on a repository with large objects. To fix the race condition (and later segmentation fault), let's reorder the aforementioned steps so that `base` is only added to the cache at the end. This will prevent the buffer from being released by another thread while it is still in use. An alternative solution which would not require the reordering would be to duplicate `base` before inserting it in the cache. However, as Phil Hord mentioned, memcpy()'ing large bases can negatively affect performance: in his experiments, this alternative approach slowed git-grep down by 10% to 20%. Reported-by: Phil Hord <[email protected]> Signed-off-by: Matheus Tavares <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
1 parent 47ae905 commit 74b052f

File tree

1 file changed

+24
-17
lines changed

1 file changed

+24
-17
lines changed

packfile.c

Lines changed: 24 additions & 17 deletions
Original file line numberDiff line numberDiff line change
@@ -1764,12 +1764,10 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
17641764
void *external_base = NULL;
17651765
unsigned long delta_size, base_size = size;
17661766
int i;
1767+
off_t base_obj_offset = obj_offset;
17671768

17681769
data = NULL;
17691770

1770-
if (base)
1771-
add_delta_base_cache(p, obj_offset, base, base_size, type);
1772-
17731771
if (!base) {
17741772
/*
17751773
* We're probably in deep shit, but let's try to fetch
@@ -1807,24 +1805,33 @@ void *unpack_entry(struct repository *r, struct packed_git *p, off_t obj_offset,
18071805
"at offset %"PRIuMAX" from %s",
18081806
(uintmax_t)curpos, p->pack_name);
18091807
data = NULL;
1810-
free(external_base);
1811-
continue;
1812-
}
1808+
} else {
1809+
data = patch_delta(base, base_size, delta_data,
1810+
delta_size, &size);
18131811

1814-
data = patch_delta(base, base_size,
1815-
delta_data, delta_size,
1816-
&size);
1812+
/*
1813+
* We could not apply the delta; warn the user, but
1814+
* keep going. Our failure will be noticed either in
1815+
* the next iteration of the loop, or if this is the
1816+
* final delta, in the caller when we return NULL.
1817+
* Those code paths will take care of making a more
1818+
* explicit warning and retrying with another copy of
1819+
* the object.
1820+
*/
1821+
if (!data)
1822+
error("failed to apply delta");
1823+
}
18171824

18181825
/*
1819-
* We could not apply the delta; warn the user, but keep going.
1820-
* Our failure will be noticed either in the next iteration of
1821-
* the loop, or if this is the final delta, in the caller when
1822-
* we return NULL. Those code paths will take care of making
1823-
* a more explicit warning and retrying with another copy of
1824-
* the object.
1826+
* We delay adding `base` to the cache until the end of the loop
1827+
* because unpack_compressed_entry() momentarily releases the
1828+
* obj_read_mutex, giving another thread the chance to access
1829+
* the cache. Therefore, if `base` was already there, this other
1830+
* thread could free() it (e.g. to make space for another entry)
1831+
* before we are done using it.
18251832
*/
1826-
if (!data)
1827-
error("failed to apply delta");
1833+
if (!external_base)
1834+
add_delta_base_cache(p, base_obj_offset, base, base_size, type);
18281835

18291836
free(delta_data);
18301837
free(external_base);

0 commit comments

Comments
 (0)