Commit 2bfc24e

ttaylorr authored and gitster committed
Documentation/technical: describe pseudo-merge bitmaps format
Prepare to implement pseudo-merge bitmaps over the next several commits by first describing the serialization format which will store the new pseudo-merge bitmaps themselves.

This format is implemented as an optional extension within the bitmap v1 format, making it compatible with previous versions of Git, as well as the original .bitmap implementation within JGit.

The format is described in detail in the patch contents below, but the high-level description is as follows:

  - An array of pseudo-merge bitmaps, each containing a pair of EWAH bitmaps: one describing the set of pseudo-merge "parents", and another describing the set of object(s) reachable from those parents.

  - A lookup table to determine which pseudo-merge(s) a given commit appears in. An optional extended lookup table follows when there is at least one commit which appears in multiple pseudo-merge groups.

  - Trailing metadata, including the number of pseudo-merge(s), number of unique parents, the offset within the .bitmap file for the pseudo-merge commit lookup table, and the size of the optional extension itself.

Signed-off-by: Taylor Blau <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 40864ac commit 2bfc24e

Documentation/technical/bitmap-format.txt

Lines changed: 132 additions & 0 deletions
@@ -255,3 +255,135 @@ triplet is -
xor_row (4 byte integer, network byte order): ::
The position of the triplet whose bitmap is used to compress
this one, or `0xffffffff` if no such bitmap exists.

Pseudo-merge bitmaps
--------------------

If the `BITMAP_OPT_PSEUDO_MERGES` flag is set, a variable number of
bytes (preceding the name-hash cache, commit lookup table, and trailing
checksum) of the `.bitmap` file is used to store pseudo-merge bitmaps.

For more information on what pseudo-merges are, why they are useful, and
how to configure them, see the information in linkgit:gitpacking[7].

=== File format

If enabled, pseudo-merge bitmaps are stored in an optional section at
the end of a `.bitmap` file. The format is as follows:

....
+-------------------------------------------+
|               .bitmap File                |
+-------------------------------------------+
|                                           |
|  Pseudo-merge bitmaps (Variable Length)   |
|  +---------------------------+            |
|  | commits_bitmap (EWAH)     |            |
|  +---------------------------+            |
|  | merge_bitmap (EWAH)       |            |
|  +---------------------------+            |
|                                           |
+-------------------------------------------+
|                                           |
|  Lookup Table                             |
|  +---------------------------+            |
|  | commit_pos (4 bytes)      |            |
|  +---------------------------+            |
|  | offset (8 bytes)          |            |
|  +------------+--------------+            |
|                                           |
|  Offset Cases:                            |
|  -------------                            |
|                                           |
|  1. MSB Unset: single pseudo-merge bitmap |
|     + offset to pseudo-merge bitmap       |
|                                           |
|  2. MSB Set: multiple pseudo-merges       |
|     + offset to extended lookup table     |
|                                           |
+-------------------------------------------+
|                                           |
|  Extended Lookup Table (Optional)         |
|  +----+----------+----------+----------+  |
|  | N  | Offset 1 |   ....   | Offset N |  |
|  +----+----------+----------+----------+  |
|  |    | 8 bytes  |   ....   | 8 bytes  |  |
|  +----+----------+----------+----------+  |
|                                           |
+-------------------------------------------+
|                                           |
|  Pseudo-merge Metadata                    |
|  +-----------------------------------+    |
|  | # pseudo-merges (4 bytes)         |    |
|  +-----------------------------------+    |
|  | # commits (4 bytes)               |    |
|  +-----------------------------------+    |
|  | Lookup offset (8 bytes)           |    |
|  +-----------------------------------+    |
|  | Extension size (8 bytes)          |    |
|  +-----------------------------------+    |
|                                           |
+-------------------------------------------+
....
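
The lookup-table entry and its two offset cases from the diagram can be
sketched in C roughly as follows. This is an illustrative sketch only, not
Git's implementation: every `pm_`-prefixed name is hypothetical. It assumes
that all multi-byte fields (including the extended table's `N` and offsets)
are in network byte order, that clearing the most-significant bit yields the
actual file offset, and that an extended offset points directly at the `N`
belonging to this commit. No bounds checking is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not Git's internal API. */
#define PM_OFFSET_EXTENDED ((uint64_t)1 << 63)

struct pm_lookup_entry {
	uint32_t commit_pos; /* bit position of the commit */
	uint64_t offset;     /* file offset, with the MSB cleared */
	int is_extended;     /* nonzero: offset refers to the extended table */
};

/* Read a 32-bit value stored in network (big-endian) byte order. */
static uint32_t pm_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a 64-bit value stored in network (big-endian) byte order. */
static uint64_t pm_get_be64(const unsigned char *p)
{
	return ((uint64_t)pm_get_be32(p) << 32) | pm_get_be32(p + 4);
}

/* Decode one 12-byte entry: 4-byte commit_pos, then 8-byte offset. */
static struct pm_lookup_entry pm_decode_lookup_entry(const unsigned char *p)
{
	struct pm_lookup_entry e;
	uint64_t raw;

	assert(p != NULL);
	e.commit_pos = pm_get_be32(p);
	raw = pm_get_be64(p + 4);
	/* Note the parentheses: `==` and `!=` bind tighter than `&` in C. */
	e.is_extended = (raw & PM_OFFSET_EXTENDED) != 0;
	e.offset = raw & ~PM_OFFSET_EXTENDED;
	return e;
}

/* Number of pseudo-merges containing the commit described by `e`. */
static uint32_t pm_entry_count(const unsigned char *file,
			       const struct pm_lookup_entry *e)
{
	return e->is_extended ? pm_get_be32(file + e->offset) : 1;
}

/* File offset of the n-th pseudo-merge bitmap for the commit in `e`. */
static uint64_t pm_entry_nth_offset(const unsigned char *file,
				    const struct pm_lookup_entry *e,
				    uint32_t n)
{
	if (!e->is_extended)
		return e->offset;
	/* Skip the 4-byte N, then index into the 8-byte offsets. */
	return pm_get_be64(file + e->offset + 4 + (size_t)n * 8);
}
```

Here `file` is assumed to point at the first byte of the `.bitmap` data, since
the offsets are relative to the beginning of the file. A real reader would
also validate every offset against the file size before dereferencing it.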

* One or more pseudo-merge bitmaps, each containing:

** `commits_bitmap`, an EWAH-compressed bitmap describing the set of
commits included in this pseudo-merge.

** `merge_bitmap`, an EWAH-compressed bitmap describing the union of
the set of objects reachable from all commits listed in the
`commits_bitmap`.

* A lookup table, mapping pseudo-merged commits to the pseudo-merges
they belong to. Entries appear in increasing order of each commit's
bit position. Each entry is 12 bytes wide, and consists of the
following:

** `commit_pos`, a 4-byte unsigned value (in network byte-order)
containing the bit position for this commit.

** `offset`, an 8-byte unsigned value (also in network byte-order)
containing one of two possible offsets, depending on whether or
not the most-significant bit is set.

*** If unset (i.e. `(offset & ((uint64_t)1<<63)) == 0`), the offset
(relative to the beginning of the `.bitmap` file) at which the
pseudo-merge bitmap for this commit can be read. This indicates
only a single pseudo-merge bitmap contains this commit.

*** If set (i.e. `(offset & ((uint64_t)1<<63)) != 0`), the offset
(again relative to the beginning of the `.bitmap` file) at which
the extended offset table can be located describing the set of
pseudo-merge bitmaps which contain this commit. This indicates
that multiple pseudo-merge bitmaps contain this commit.

* An (optional) extended lookup table (written if and only if there is
at least one commit which appears in more than one pseudo-merge).
There are as many entries as commits which appear in multiple
pseudo-merges. Each entry contains the following:

** `N`, a 4-byte unsigned value equal to the number of pseudo-merges
which contain a given commit.

** An array of `N` 8-byte unsigned values, each of which is
interpreted as an offset (relative to the beginning of the
`.bitmap` file) at which a pseudo-merge bitmap for this commit can
be read. These values occur in no particular order.

* Positions for all pseudo-merges, each stored as an 8-byte unsigned
value (in network byte-order) containing the offset (relative to the
beginning of the `.bitmap` file) of each consecutive pseudo-merge.

* A 4-byte unsigned value (in network byte-order) equal to the number of
pseudo-merges.

* A 4-byte unsigned value (in network byte-order) equal to the number of
unique commits which appear in any pseudo-merge.

* An 8-byte unsigned value (in network byte-order) equal to the number
of bytes between the start of the pseudo-merge section and the
beginning of the lookup table.

* An 8-byte unsigned value (in network byte-order) equal to the number
of bytes in the pseudo-merge section (including this field).
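
Because the last four fields have fixed widths, a reader can in principle
locate the section by working backwards from its end. The following C sketch
illustrates that idea under stated assumptions: all `pm_`-prefixed names are
hypothetical (not Git's internal API), and the caller is assumed to already
know where the pseudo-merge section ends within the `.bitmap` file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is not Git's internal API. */
#define PM_TRAILER_SIZE (4 + 4 + 8 + 8)

struct pm_trailer {
	uint32_t nr_pseudo_merges; /* number of pseudo-merges */
	uint32_t nr_commits;       /* unique commits in any pseudo-merge */
	uint64_t lookup_offset;    /* section start to lookup table */
	uint64_t section_size;     /* whole section, this field included */
};

/* Read a 32-bit value stored in network (big-endian) byte order. */
static uint32_t pm_get_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read a 64-bit value stored in network (big-endian) byte order. */
static uint64_t pm_get_be64(const unsigned char *p)
{
	return ((uint64_t)pm_get_be32(p) << 32) | pm_get_be32(p + 4);
}

/*
 * Parse the four trailing fields that end at `section_end`. `avail` is the
 * number of readable bytes before `section_end`. Returns 0 on success and
 * -1 when the recorded sizes cannot be valid.
 */
static int pm_read_trailer(const unsigned char *section_end, size_t avail,
			   struct pm_trailer *out)
{
	const unsigned char *p;

	if (avail < PM_TRAILER_SIZE)
		return -1;
	p = section_end - PM_TRAILER_SIZE;
	out->nr_pseudo_merges = pm_get_be32(p);
	out->nr_commits = pm_get_be32(p + 4);
	out->lookup_offset = pm_get_be64(p + 8);
	out->section_size = pm_get_be64(p + 16);

	if (out->section_size < PM_TRAILER_SIZE || out->section_size > avail)
		return -1;
	if (out->lookup_offset > out->section_size - PM_TRAILER_SIZE)
		return -1;
	return 0;
}

/* Start of the lookup table, derived from the two 8-byte fields. */
static const unsigned char *pm_lookup_table(const unsigned char *section_end,
					    const struct pm_trailer *t)
{
	assert(t->lookup_offset <= t->section_size);
	return section_end - t->section_size + t->lookup_offset;
}
```

The sanity checks in `pm_read_trailer` are a design choice of this sketch, not
something the format description mandates; they only reject sizes that could
not fit in the bytes available.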
