
Commit faf558b

ttaylorr authored and gitster committed
pseudo-merge: implement support for selecting pseudo-merge commits
Teach the new pseudo-merge machinery how to select non-bitmapped commits for inclusion in different pseudo-merge group(s) based on a handful of criteria.

Note that the selected pseudo-merge commits aren't actually used or written anywhere yet. This will be done in the following commit.

Signed-off-by: Taylor Blau <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 5831f8a commit faf558b

7 files changed

+747
-0
lines changed

Documentation/config.txt

Lines changed: 2 additions & 0 deletions
@@ -383,6 +383,8 @@ include::config/apply.txt[]
 
 include::config/attr.txt[]
 
+include::config/bitmap-pseudo-merge.txt[]
+
 include::config/blame.txt[]
 
 include::config/branch.txt[]

Documentation/config/bitmap-pseudo-merge.txt

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
+NOTE: The configuration options in `bitmapPseudoMerge.*` are considered
+EXPERIMENTAL and may be subject to change or be removed entirely in the
+future. For more information about the pseudo-merge bitmap feature, see
+the "Pseudo-merge bitmaps" section of linkgit:gitpacking[7].
+
+bitmapPseudoMerge.<name>.pattern::
+	Regular expression used to match reference names. Commits
+	pointed to by references matching this pattern (and meeting
+	the below criteria, like `bitmapPseudoMerge.<name>.sampleRate`
+	and `bitmapPseudoMerge.<name>.threshold`) will be considered
+	for inclusion in a pseudo-merge bitmap.
++
+Commits are grouped into pseudo-merge groups based on whether or not
+any reference(s) that point at a given commit match the pattern, which
+is an extended regular expression.
++
+Within a pseudo-merge group, commits may be further grouped into
+sub-groups based on the capture groups in the pattern. These
+sub-groupings are formed from the regular expressions by concatenating
+any capture groups from the regular expression, with a '-' dash in
+between.
++
+For example, if the pattern is `refs/tags/`, then all tags (provided
+they meet the below criteria) will be considered candidates for the
+same pseudo-merge group. However, if the pattern is instead
+`refs/remotes/([0-9]+)/tags/`, then tags from different remotes will
+be grouped into separate pseudo-merge groups, based on the remote
+number.
+
+bitmapPseudoMerge.<name>.decay::
+	Determines the rate at which consecutive pseudo-merge bitmap
+	groups decrease in size. Must be non-negative. This parameter
+	can be thought of as `k` in the function `f(n) = C * n^-k`,
+	where `f(n)` is the size of the `n`th group.
++
+Setting the decay rate equal to `0` will cause all groups to be the
+same size. Setting the decay rate equal to `1` will cause the `n`th
+group to be `1/n` the size of the initial group. Higher values of the
+decay rate cause consecutive groups to shrink at an increasing rate.
+The default is `1`.
++
+If all groups are the same size, it is possible that groups containing
+newer commits will be able to be used less often than earlier groups,
+since it is more likely that the references pointing at newer commits
+will be updated more often than a reference pointing at an old commit.
+
+bitmapPseudoMerge.<name>.sampleRate::
+	Determines the proportion of non-bitmapped commits (among
+	reference tips) which are selected for inclusion in an
+	unstable pseudo-merge bitmap. Must be between `0` and `1`
+	(inclusive). The default is `1`.
+
+bitmapPseudoMerge.<name>.threshold::
+	Determines the minimum age of non-bitmapped commits (among
+	reference tips, as above) which are candidates for inclusion
+	in an unstable pseudo-merge bitmap. The default is
+	`1.week.ago`.
+
+bitmapPseudoMerge.<name>.maxMerges::
+	Determines the maximum number of pseudo-merge commits among
+	which commits may be distributed.
++
+For pseudo-merge groups whose pattern does not contain any capture
+groups, this setting is applied for all commits matching the regular
+expression. For patterns that have one or more capture groups, this
+setting is applied for each distinct capture group.
++
+For example, if your pattern is `refs/tags/`, then this setting
+will distribute all tags into a maximum of `maxMerges` pseudo-merge
+commits. However, if your pattern is, say,
+`refs/remotes/([0-9]+)/tags/`, then this setting will be applied to
+each remote's set of tags individually.
++
+Must be non-negative. The default value is 64.
+
+bitmapPseudoMerge.<name>.stableThreshold::
+	Determines the minimum age of commits (among reference tips,
+	as above, however stable commits are still considered
+	candidates even when they have been covered by a bitmap) which
+	are candidates for a stable pseudo-merge bitmap. The default
+	is `1.month.ago`.
++
+Setting this threshold to a smaller value (e.g., 1.week.ago) will cause
+more stable groups to be generated (which impose a one-time generation
+cost) but those groups will likely become stale over time. Using a
+larger value incurs the opposite penalty (fewer stable groups which are
+more useful).
+
+bitmapPseudoMerge.<name>.stableSize::
+	Determines the size (in number of commits) of a stable
+	pseudo-merge bitmap. The default is `512`.
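
Editor's sketch (not part of the commit): the sub-group naming described under `bitmapPseudoMerge.<name>.pattern` can be illustrated with the POSIX `regex.h` interface. The helper name `subgroup_key`, its return convention, and the limit of nine capture groups are assumptions made for this example only; they are not taken from Git's implementation, which is not shown in this part of the diff.

```c
#include <assert.h>
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper (not Git's code): match `refname` against the
 * extended regular expression `pattern` and write the sub-group key,
 * i.e. every capture group joined by '-', into `out`.
 *
 * Returns 1 on a match, 0 if the ref does not match, and -1 if the
 * pattern fails to compile or the key does not fit into `out`.
 */
static int subgroup_key(const char *pattern, const char *refname,
			char *out, size_t outlen)
{
	regex_t re;
	regmatch_t m[10]; /* m[0] is the whole match, m[1..9] the captures */
	size_t used = 0, i;
	int ret = 1;

	assert(out && outlen > 0);
	out[0] = '\0';

	if (regcomp(&re, pattern, REG_EXTENDED))
		return -1;

	if (regexec(&re, refname, 10, m, 0)) {
		ret = 0; /* ref is not part of this pseudo-merge group */
		goto done;
	}

	for (i = 1; i <= re.re_nsub && i < 10; i++) {
		size_t len;

		if (m[i].rm_so < 0)
			continue; /* this capture group did not participate */
		len = (size_t)(m[i].rm_eo - m[i].rm_so);
		if (used + len + 2 > outlen) { /* room for '-' and NUL */
			ret = -1;
			goto done;
		}
		if (used)
			out[used++] = '-';
		memcpy(out + used, refname + m[i].rm_so, len);
		used += len;
		out[used] = '\0';
	}
done:
	regfree(&re);
	return ret;
}
```

Under these assumptions, the pattern `refs/virtual/([0-9]+)/(heads|tags)/` applied to `refs/virtual/1234/heads/main` yields the key `1234-heads`, while a pattern without capture groups such as `refs/tags/` yields an empty key, so all matching refs share one group.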

Documentation/gitpacking.txt

Lines changed: 83 additions & 0 deletions
@@ -96,6 +96,89 @@ can take advantage of the fact that we only care about the union of
 objects reachable from all of those tags, and answer the query much
 faster.
 
+=== Configuration
+
+Reference tips are grouped into different pseudo-merge groups according
+to two criteria. A reference name matches one or more of the defined
+pseudo-merge patterns, and optionally one or more capture groups within
+that pattern which further partition the group.
+
+Within a group, commits may be considered "stable", or "unstable"
+depending on their age. These are adjusted by setting the
+`bitmapPseudoMerge.<name>.stableThreshold` and
+`bitmapPseudoMerge.<name>.threshold` configuration values, respectively.
+
+All stable commits are grouped into pseudo-merges of equal size
+(`bitmapPseudoMerge.<name>.stableSize`). If the `stableSize`
+configuration is set to, say, 100, then the first 100 commits (ordered
+by committer date) which are older than the `stableThreshold` value will
+form one group, the next 100 commits will form another group, and so on.
+
+Among unstable commits, the pseudo-merge machinery will attempt to
+combine older commits into large groups as opposed to newer commits
+which will appear in smaller groups. This is based on the heuristic that
+references whose tip commit is older are less likely to be modified to
+point at a different commit than a reference whose tip commit is newer.
+
+The size of groups is determined by a power-law decay function, and the
+decay parameter roughly corresponds to "k" in `f(n) = C*n^(-k)`,
+where `f(n)` describes the size of the `n`-th pseudo-merge group. The
+sample rate controls what proportion of eligible commits are considered
+as candidates. The threshold parameter indicates the minimum age (so as
+to avoid including too-recent commits in a pseudo-merge group, making it
+less likely to be valid). The "maxMerges" parameter sets an upper-bound
+on the number of pseudo-merge commits an individual group may contain.
+
+The "stable"-related parameters control "stable" pseudo-merge groups,
+comprised of a fixed number of commits which are older than the
+configured "stable threshold" value and may be grouped together in
+chunks of "stableSize" in order of age.
+
+The exact configuration for pseudo-merges is as follows:
+
+include::config/bitmap-pseudo-merge.txt[]
+
+=== Examples
+
+Suppose that you have a repository with a large number of references,
+and you want a bare-bones configuration of pseudo-merge bitmaps that
+will enhance bitmap coverage of the `refs/` namespace. You may start
+with a configuration like so:
+
+	[bitmapPseudoMerge "all"]
+		pattern = "refs/"
+		threshold = now
+		stableThreshold = never
+		sampleRate = 1
+		maxMerges = 64
+
+This will create pseudo-merge bitmaps for all references, regardless of
+their age, and group them into 64 pseudo-merge commits.
+
+If you wanted to separate tags from branches when generating
+pseudo-merge commits, you would instead define the pattern with a
+capture group, like so:
+
+	[bitmapPseudoMerge "all"]
+		pattern = "refs/(heads|tags)/"
+
+Suppose instead that you are working in a fork-network repository, with
+each fork specified by some numeric ID, and whose refs reside in
+`refs/virtual/NNN/` (where `NNN` is the numeric ID corresponding to some
+fork) in the network. In this instance, you may instead write something
+like:
+
+	[bitmapPseudoMerge "all"]
+		pattern = "refs/virtual/([0-9]+)/(heads|tags)/"
+		threshold = now
+		stableThreshold = never
+		sampleRate = 1
+		maxMerges = 64
+
+Which would generate pseudo-merge group identifiers like "1234-heads",
+and "5678-tags" (for branches in fork "1234", and tags in fork "5678",
+respectively).
+
 SEE ALSO
 --------
 linkgit:git-pack-objects[1]
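
Editor's sketch (not part of the commit): the sizing rules described in the Configuration section above can be worked through numerically. The function names below are hypothetical, and two simplifications are assumed: the scaling constant `C` is chosen so that the sizes of all `maxMerges` groups add up to the number of unstable commits, and the decay `k` is restricted to whole numbers so the example needs no math library. Neither assumption is confirmed by the documentation in this diff.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of stable pseudo-merges needed to hold `nr` stable commits when
 * each one takes at most `stable_size` commits (hypothetical helper).
 */
static size_t stable_chunks(size_t nr, size_t stable_size)
{
	assert(stable_size > 0);
	return (nr + stable_size - 1) / stable_size; /* round up */
}

/* n^k for a whole-number exponent k, computed by repeated multiplication. */
static double ipow(double n, unsigned k)
{
	double r = 1.0;

	while (k--)
		r *= n;
	return r;
}

/*
 * Size of the n-th (1-based) unstable group under f(n) = C * n^-k.
 *
 * Assumption of this sketch: C is picked so that f(1) + ... +
 * f(max_merges) equals `total`, the number of unstable commits.
 */
static double unstable_group_size(size_t total, unsigned max_merges,
				  unsigned k, unsigned n)
{
	double norm = 0.0;
	unsigned i;

	assert(max_merges > 0 && n >= 1 && n <= max_merges);
	for (i = 1; i <= max_merges; i++)
		norm += 1.0 / ipow(i, k);
	return ((double)total / norm) / ipow(n, k);
}
```

With `stableSize = 100`, 250 stable commits would need three stable pseudo-merges. With a decay of `0`, 640 unstable commits spread over 64 groups give 10 commits per group; with a decay of `1` and only two groups, 300 commits split into groups of 200 and 100, so the second group is half the size of the first.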

pack-bitmap-write.c

Lines changed: 21 additions & 0 deletions
@@ -17,6 +17,7 @@
 #include "trace2.h"
 #include "tree.h"
 #include "tree-walk.h"
+#include "pseudo-merge.h"
 
 struct bitmapped_commit {
 	struct commit *commit;
@@ -39,11 +40,25 @@ void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r)
 	if (writer->bitmaps)
 		BUG("bitmap writer already initialized");
 	writer->bitmaps = kh_init_oid_map();
+	writer->pseudo_merge_commits = kh_init_oid_map();
+
+	string_list_init_dup(&writer->pseudo_merge_groups);
+
+	load_pseudo_merges_from_config(&writer->pseudo_merge_groups);
+}
+
+static void free_pseudo_merge_commit_idx(struct pseudo_merge_commit_idx *idx)
+{
+	if (!idx)
+		return;
+	free(idx->pseudo_merge);
+	free(idx);
 }
 
 void bitmap_writer_free(struct bitmap_writer *writer)
 {
 	uint32_t i;
+	struct pseudo_merge_commit_idx *idx;
 
 	if (!writer)
 		return;
@@ -55,6 +70,10 @@ void bitmap_writer_free(struct bitmap_writer *writer)
 
 	kh_destroy_oid_map(writer->bitmaps);
 
+	kh_foreach_value(writer->pseudo_merge_commits, idx,
+			 free_pseudo_merge_commit_idx(idx));
+	kh_destroy_oid_map(writer->pseudo_merge_commits);
+
 	for (i = 0; i < writer->selected_nr; i++) {
 		struct bitmapped_commit *bc = &writer->selected[i];
 		if (bc->write_as != bc->bitmap)
@@ -703,6 +722,8 @@ void bitmap_writer_select_commits(struct bitmap_writer *writer,
 	}
 
 	stop_progress(&writer->progress);
+
+	select_pseudo_merges(writer, indexed_commits, indexed_commits_nr);
 }
 
 

pack-bitmap.h

Lines changed: 2 additions & 0 deletions
@@ -110,6 +110,8 @@ struct bitmap_writer {
 	struct bitmapped_commit *selected;
 	unsigned int selected_nr, selected_alloc;
 
+	struct string_list pseudo_merge_groups;
+	kh_oid_map_t *pseudo_merge_commits; /* oid -> pseudo merge(s) */
 	uint32_t pseudo_merges_nr;
 
 	struct progress *progress;
