
Commit b949784

Merge branch 'tb/incremental-midx-part-1'
Incremental updates of multi-pack index files.

* tb/incremental-midx-part-1:
  midx: implement support for writing incremental MIDX chains
  t/t5313-pack-bounds-checks.sh: prepare for sub-directories
  t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP'
  midx: implement verification support for incremental MIDXs
  midx: support reading incremental MIDX chains
  midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs
  midx: teach `midx_preferred_pack()` about incremental MIDXs
  midx: teach `midx_contains_pack()` about incremental MIDXs
  midx: remove unused `midx_locate_pack()`
  midx: teach `fill_midx_entry()` about incremental MIDXs
  midx: teach `nth_midxed_offset()` about incremental MIDXs
  midx: teach `bsearch_midx()` about incremental MIDXs
  midx: introduce `bsearch_one_midx()`
  midx: teach `nth_bitmapped_pack()` about incremental MIDXs
  midx: teach `nth_midxed_object_oid()` about incremental MIDXs
  midx: teach `prepare_midx_pack()` about incremental MIDXs
  midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs
  midx: add new fields for incremental MIDX chains
  Documentation: describe incremental MIDX format
2 parents 53129a0 + fcb2205 commit b949784

24 files changed: +958 -259 lines changed

Documentation/git-multi-pack-index.txt

Lines changed: 10 additions & 1 deletion
@@ -64,6 +64,12 @@ The file given at `<path>` is expected to be readable, and can contain
 duplicates. (If a given OID is given more than once, it is marked as
 preferred if at least one instance of it begins with the special `+`
 marker).
+
+--incremental::
+Write an incremental MIDX file containing only objects
+and packs not present in an existing MIDX layer.
+Migrates non-incremental MIDXs to incremental ones when
+necessary. Incompatible with `--bitmap`.
 --
 
 verify::
@@ -74,6 +80,8 @@ expire::
 have no objects referenced by the MIDX (with the exception of
 `.keep` packs and cruft packs). Rewrite the MIDX file afterward
 to remove all references to these pack-files.
++
+NOTE: this mode is incompatible with incremental MIDX files.
 
 repack::
 Create a new pack-file containing objects in small pack-files
@@ -95,7 +103,8 @@ repack::
 +
 If `repack.packKeptObjects` is `false`, then any pack-files with an
 associated `.keep` file will not be selected for the batch to repack.
-
++
+NOTE: this mode is incompatible with incremental MIDX files.
 
 EXAMPLES
 --------

Documentation/technical/multi-pack-index.txt

Lines changed: 103 additions & 0 deletions
@@ -61,6 +61,109 @@ Design Details
 - The MIDX file format uses a chunk-based approach (similar to the
 commit-graph file) that allows optional data to be added.
 
+Incremental multi-pack indexes
+------------------------------
+
+As repositories grow in size, it becomes more expensive to write a
+multi-pack index (MIDX) that includes all packfiles. To accommodate
+this, the "incremental multi-pack indexes" feature allows for combining
+a "chain" of multi-pack indexes.
+
+Each individual component of the chain need only contain a small number
+of packfiles. Appending to the chain does not invalidate earlier parts
+of the chain, so repositories can control how much time is spent
+updating the MIDX chain by determining the number of packs in each layer
+of the MIDX chain.
+
+=== Design state
+
+At present, the incremental multi-pack indexes feature is missing two
+important components:
+
+- The ability to rewrite earlier portions of the MIDX chain (i.e., to
+"compact" some collection of adjacent MIDX layers into a single
+MIDX). At present the only supported way of shrinking a MIDX chain
+is to rewrite the entire chain from scratch without the `--split`
+flag.
++
+There are no fundamental limitations that stand in the way of being able
+to implement this feature. It is omitted from the initial implementation
+in order to reduce the complexity, but will be added later.
+
+- Support for reachability bitmaps. The classic single MIDX
+implementation does support reachability bitmaps (see the section
+titled "multi-pack-index reverse indexes" in
+linkgit:gitformat-pack[5] for more details).
++
+As above, there are no fundamental limitations that stand in the way of
+extending the incremental MIDX format to support reachability bitmaps.
+The design below specifically takes this into account, and support for
+reachability bitmaps will be added in a future patch series. It is
+omitted from the current implementation for the same reason as above.
++
+In brief, to support reachability bitmaps with the incremental MIDX
+feature, the concept of the pseudo-pack order is extended across each
+layer of the incremental MIDX chain to form a concatenated pseudo-pack
+order. This concatenation takes place in the same order as the chain
+itself (in other words, the concatenated pseudo-pack order for a chain
+`{$H1, $H2, $H3}` would be the pseudo-pack order for `$H1`, followed by
+the pseudo-pack order for `$H2`, followed by the pseudo-pack order for
+`$H3`).
++
+The layout will then be extended so that each layer of the incremental
+MIDX chain can write a `*.bitmap`. The objects in each layer's bitmap
+are offset by the number of objects in the previous layers of the chain.
+
+=== File layout
+
+Instead of storing a single `multi-pack-index` file (with an optional
+`.rev` and `.bitmap` extension) in `$GIT_DIR/objects/pack`, incremental
+MIDXs are stored in the following layout:
+
+----
+$GIT_DIR/objects/pack/multi-pack-index.d/
+$GIT_DIR/objects/pack/multi-pack-index.d/multi-pack-index-chain
+$GIT_DIR/objects/pack/multi-pack-index.d/multi-pack-index-$H1.midx
+$GIT_DIR/objects/pack/multi-pack-index.d/multi-pack-index-$H2.midx
+$GIT_DIR/objects/pack/multi-pack-index.d/multi-pack-index-$H3.midx
+----
+
+The `multi-pack-index-chain` file contains a list of the incremental
+MIDX files in the chain, in order. The above example shows a chain whose
+`multi-pack-index-chain` file would contain the following lines:
+
+----
+$H1
+$H2
+$H3
+----
+
+The `multi-pack-index-$H1.midx` file contains the first layer of the
+multi-pack-index chain. The `multi-pack-index-$H2.midx` file contains
+the second layer of the chain, and so on.
+
+When both an incremental- and non-incremental MIDX are present, the
+non-incremental MIDX is always read first.
+
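Inline note on the "File layout" section: the mapping from a line of `multi-pack-index-chain` to a layer file can be sketched in C as follows. This is a minimal illustration, not Git's implementation. The helper names `midx_chain_nth_hash()` and `midx_layer_path()` are hypothetical, and the sketch assumes the chain file holds one hash per `\n`-terminated line with no blank lines in between.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): format the path of one MIDX
 * layer from the pack directory and a hash taken from the
 * `multi-pack-index-chain` file. Returns 0 on success, or -1 if `out`
 * is too small to hold the path.
 */
int midx_layer_path(char *out, size_t outlen,
		    const char *pack_dir, const char *hash)
{
	int n = snprintf(out, outlen,
			 "%s/multi-pack-index.d/multi-pack-index-%s.midx",
			 pack_dir, hash);

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Hypothetical helper: copy the hash found on line `layer` (0-based) of
 * the chain file contents `chain` into `hash`. Lines are assumed to be
 * listed from the first layer of the chain onward. Returns 0 on success,
 * or -1 if the chain has no such line or the hash does not fit.
 */
int midx_chain_nth_hash(const char *chain, size_t layer,
			char *hash, size_t hashlen)
{
	const char *line = chain;

	assert(chain && hash && hashlen > 0);
	for (;;) {
		size_t len = strcspn(line, "\n");

		if (!len)
			return -1; /* ran off the end of the chain */
		if (!layer) {
			if (len >= hashlen)
				return -1;
			memcpy(hash, line, len);
			hash[len] = '\0';
			return 0;
		}
		layer--;
		line += len;
		if (*line == '\n')
			line++;
	}
}
```

For the three-layer example above, asking for layer 1 yields `$H2`, and passing that hash to `midx_layer_path()` together with `$GIT_DIR/objects/pack` yields the `multi-pack-index-$H2.midx` path shown in the layout.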
+=== Object positions for incremental MIDXs
+
+In the original multi-pack-index design, we refer to objects via their
+lexicographic position (by object IDs) within the repository's singular
+multi-pack-index. In the incremental multi-pack-index design, we refer
+to objects via their index into a concatenated lexicographic ordering
+among each component in the MIDX chain.
+
+If `objects_nr()` is a function that returns the number of objects in a
+given MIDX layer, then the index of an object at lexicographic position
+`i` within, say, $H3 is defined as:
+
+----
+objects_nr($H2) + objects_nr($H1) + i
+----
+
+(in the C implementation, this is often computed as `i +
+m->num_objects_in_base`).
+
 Future Work
 -----------

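Inline note on "Object positions for incremental MIDXs": the concatenated ordering can be sketched in C as below. This is an illustration under stated assumptions, not the code added by this series. The per-layer object counts are passed as a plain array (a hypothetical stand-in for `objects_nr($H1)`, `objects_nr($H2)`, and so on), the function names are invented for the example, and 32-bit overflow of the running total is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: global position of the object at lexicographic
 * position `i` within layer `layer` (0-based). `objects_nr[k]` is the
 * number of objects in layer k. The result is the sum of the object
 * counts of all earlier layers, plus `i`.
 */
uint32_t midx_global_pos(const uint32_t *objects_nr, size_t layer, uint32_t i)
{
	uint32_t base = 0; /* plays the role of num_objects_in_base */
	size_t k;

	assert(i < objects_nr[layer]);
	for (k = 0; k < layer; k++)
		base += objects_nr[k];
	return base + i;
}

/*
 * Hypothetical inverse: find the layer that contains global position
 * `pos` and store the layer-local position in `*local`. Returns the
 * 0-based layer index, or -1 if `pos` is past the end of the chain.
 */
int midx_locate_pos(const uint32_t *objects_nr, size_t nr_layers,
		    uint32_t pos, uint32_t *local)
{
	size_t k;

	for (k = 0; k < nr_layers; k++) {
		if (pos < objects_nr[k]) {
			*local = pos;
			return (int)k;
		}
		pos -= objects_nr[k];
	}
	return -1;
}
```

With layers holding 5, 3, and 4 objects, the object at position 1 in the third layer has global position 5 + 3 + 1 = 9, matching `objects_nr($H2) + objects_nr($H1) + i` from the text.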

builtin/multi-pack-index.c

Lines changed: 2 additions & 0 deletions
@@ -129,6 +129,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv,
 			  MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX),
 		OPT_BIT(0, "progress", &opts.flags,
 			N_("force progress reporting"), MIDX_PROGRESS),
+		OPT_BIT(0, "incremental", &opts.flags,
+			N_("write a new incremental MIDX"), MIDX_WRITE_INCREMENTAL),
 		OPT_BOOL(0, "stdin-packs", &opts.stdin_packs,
 			 N_("write multi-pack index containing only given indexes")),
 		OPT_FILENAME(0, "refs-snapshot", &opts.refs_snapshot,

builtin/repack.c

Lines changed: 2 additions & 6 deletions
@@ -1218,10 +1218,6 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		if (!write_midx &&
 		    (!(pack_everything & ALL_INTO_ONE) || !is_bare_repository()))
 			write_bitmaps = 0;
-	} else if (write_bitmaps &&
-		   git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0) &&
-		   git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0)) {
-		write_bitmaps = 0;
 	}
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps > 0 && !write_midx;
@@ -1521,8 +1517,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 
 	if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) {
 		unsigned flags = 0;
-		if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0))
-			flags |= MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX;
+		if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL, 0))
+			flags |= MIDX_WRITE_INCREMENTAL;
 		write_midx_file(get_object_directory(), NULL, NULL, flags);
 	}

ci/run-build-and-tests.sh

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ linux-TEST-vars)
 	export GIT_TEST_COMMIT_GRAPH=1
 	export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1
 	export GIT_TEST_MULTI_PACK_INDEX=1
-	export GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=1
+	export GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=1
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 	export GIT_TEST_NO_WRITE_REV_INDEX=1
 	export GIT_TEST_CHECKOUT_WORKERS=2
