Commit 42e686d

Merge branch 'jh/experimental-survey'

This topic branch brings in a new, experimental built-in command to assess the dimensions of a local repository.

It is experimental and subject to change! It might grow new options, change its output, or even be moved into `git diagnose --analyze` or something like that.

The hope is that this command, which was inspired by `git sizer` (https://github.com/github/git-sizer), will be helpful not only in diagnosing issues with large repositories, but also in modeling what shapes and sizes of repositories can be handled by Git (and as a corollary: where Git needs to improve to be able to accommodate the natural growth of repositories).

Signed-off-by: Johannes Schindelin <[email protected]>

2 parents b105301 + 79f3731 commit 42e686d

11 files changed: +3210 -0 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -164,6 +164,7 @@
 /git-submodule
 /git-submodule--helper
 /git-subtree
+/git-survey
 /git-svn
 /git-switch
 /git-symbolic-ref

Documentation/config.txt

Lines changed: 2 additions & 0 deletions
@@ -536,6 +536,8 @@ include::config/status.txt[]
 
 include::config/submodule.txt[]
 
+include::config/survey.txt[]
+
 include::config/tag.txt[]
 
 include::config/tar.txt[]

Documentation/config/survey.txt

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+survey.namerev::
+Boolean to show/hide `git name-rev` information for
+each reported commit and the containing commit of each
+reported tree and blob.
+
+survey.progress::
+Boolean to show/hide progress information. Defaults to
+true when interactive (stderr is bound to a TTY).
+
+survey.showBlobSizes::
+A non-negative integer value. Requests details on the <n>
+largest file blobs by size in bytes. Provides a default
+value for `--blob-sizes=<n>` in linkgit:git-survey[1].
+
+survey.showCommitParents::
+A non-negative integer value. Requests details on the <n>
+commits with the most parents. Provides a default
+value for `--commit-parents=<n>` in linkgit:git-survey[1].
+
+survey.showCommitSizes::
+A non-negative integer value. Requests details on the <n>
+largest commits by size in bytes. Generally, these are the
+commits with the largest commit messages. Provides a default
+value for `--commit-sizes=<n>` in linkgit:git-survey[1].
+
+survey.showTreeEntries::
+A non-negative integer value. Requests details on the <n>
+trees (directories) with the most entries (files
+and subdirectories). Provides a default value for
+`--tree-entries=<n>` in linkgit:git-survey[1].
+
+survey.showTreeSizes::
+A non-negative integer value. Requests details on the <n>
+largest trees (directories) by size in bytes. This
+set will usually be equal to the `survey.showTreeEntries`
+set, but may be skewed by very long file or subdirectory
+entry names. Provides a default value for
+`--tree-sizes=<n>` in linkgit:git-survey[1].
+
+survey.verbose::
+Boolean to show/hide verbose output. Defaults to false.
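
Each `survey.show*` setting above supplies a default `<n>` for the matching command-line option. The following is a minimal Python sketch of how such a value might be resolved and then used to pick the largest items. It is a model of the documented behaviour, not the command's C implementation. The precedence (command line, then config, then a built-in fallback) is an assumption based on how Git options usually interact with config, the fallback of 10 is taken from the `git-survey` documentation in this commit, and the names `effective_count` and `largest_by_size` are hypothetical.

```python
import heapq
from typing import Iterable, List, Optional, Tuple


def effective_count(cli_value: Optional[int],
                    config_value: Optional[int],
                    builtin_default: int = 10) -> int:
    """Pick <n>: command line first, then config, then the fallback.

    The precedence is an assumption for illustration; it is not
    taken from the survey implementation.
    """
    for candidate in (cli_value, config_value):
        if candidate is not None:
            if candidate < 0:
                raise ValueError("<n> must be a non-negative integer")
            return candidate
    return builtin_default


def largest_by_size(objects: Iterable[Tuple[str, int]],
                    n: int) -> List[Tuple[str, int]]:
    """Return the n largest (oid, size) pairs, largest first."""
    return heapq.nlargest(n, objects, key=lambda pair: pair[1])


# Hypothetical blob OIDs and sizes in bytes, for illustration only.
blobs = [("aaaa", 120), ("bbbb", 4096), ("cccc", 17), ("dddd", 900)]

# No command-line value given, config (e.g. survey.showBlobSizes) is 2.
n = effective_count(cli_value=None, config_value=2)
top = largest_by_size(blobs, n)
```

`heapq.nlargest` avoids sorting the whole list, which matters when `n` is small compared with the number of objects visited.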

Documentation/git-survey.txt

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
+git-survey(1)
+=============
+
+NAME
+----
+git-survey - EXPERIMENTAL: Measure various repository dimensions of scale
+
+SYNOPSIS
+--------
+[verse]
+(EXPERIMENTAL!) `git survey` <options>
+
+DESCRIPTION
+-----------
+
+Survey the repository and measure various dimensions of scale.
+
+As repositories grow to "monorepo" size, certain data shapes can cause
+performance problems. `git-survey` attempts to measure and report on
+known problem areas.
+
+Ref Selection and Reachable Objects
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In this first analysis phase, `git survey` will iterate over the set of
+requested branches, tags, and other refs, walk all of the reachable
+commits, trees, and blobs, and generate various statistics.
+
+OPTIONS
+-------
+
+--progress::
+Show progress. This is automatically enabled when interactive.
+
+--json::
+Print results in JSON rather than in a human-friendly format.
+
+--[no-]name-rev::
+Print `git name-rev` output for each commit, tree, and blob.
+Defaults to true.
+
+Ref Selection
+~~~~~~~~~~~~~
+
+The following options control the set of refs that `git survey` will examine.
+By default, `git survey` will look at tags, local branches, and remote refs.
+If any of the following options are given, the default set is cleared and
+only refs for the given options are added.
+
+--all-refs::
+Use all refs. This includes local branches, tags, remote refs,
+notes, and stashes. This option overrides all of the following.
+
+--branches::
+Add local branches (`refs/heads/`) to the set.
+
+--tags::
+Add tags (`refs/tags/`) to the set.
+
+--remotes::
+Add remote branches (`refs/remotes/`) to the set.
+
+--detached::
+Add HEAD to the set.
+
+--other::
+Add notes (`refs/notes/`) and stashes (`refs/stash`) to the set.
+
+Large Item Selection
+~~~~~~~~~~~~~~~~~~~~
+
+The following options control the optional display of large items under
+various dimensions of scale. The OIDs of the `n` largest objects will be
+displayed in reverse sorted order. For each, `n` defaults to 10.
+
+--commit-parents::
+Shows the OIDs of the commits with the most parent commits.
+
+--commit-sizes::
+Shows the OIDs of the largest commits by size in bytes. These are
+usually the ones with the largest commit messages.
+
+--tree-entries::
+Shows the OIDs of the trees with the most entries. These
+are the directories with the most files or subdirectories.
+
+--tree-sizes::
+Shows the OIDs of the largest trees by size in bytes. This set
+will usually be the same as the set reported by `--tree-entries`
+unless skewed by very long entry names.
+
+--blob-sizes::
+Shows the OIDs of the largest blobs by size in bytes.
+
+OUTPUT
+------
+
+By default, `git survey` will print information about the repository in a
+human-readable format that includes overviews and tables.
+
+CONFIGURATION
+-------------
+
+include::config/survey.txt[]
+
+GIT
+---
+Part of the linkgit:git[1] suite
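
The ref selection rules documented above (a default set that is cleared as soon as any selection option is given, and `--all-refs` overriding the others) can be modelled in a few lines. This is a hedged Python sketch of the documented behaviour only, not the command's C implementation. The function name `selected_ref_kinds` and the kind labels are hypothetical. The documentation does not say whether `--all-refs` also covers a detached HEAD, so the sketch leaves it out.

```python
from typing import Iterable, Set

# Default set per the documentation: tags, local branches, remote refs.
DEFAULT_KINDS = frozenset({"branches", "tags", "remotes"})

# --all-refs per the documentation: branches, tags, remote refs, plus
# notes and stashes (grouped here under the label "other").
ALL_REFS_KINDS = frozenset({"branches", "tags", "remotes", "other"})

# Labels for the individual selection options (hypothetical names).
SELECTABLE = frozenset({"branches", "tags", "remotes", "detached", "other"})


def selected_ref_kinds(options: Iterable[str]) -> Set[str]:
    """Return the kinds of refs to examine for the given options."""
    opts = set(options)
    unknown = opts - SELECTABLE - {"all-refs"}
    if unknown:
        raise ValueError(f"unknown selection option(s): {sorted(unknown)}")
    if "all-refs" in opts:
        # --all-refs overrides every other selection option.
        return set(ALL_REFS_KINDS)
    if not opts:
        # Nothing requested: fall back to the default set.
        return set(DEFAULT_KINDS)
    # Any explicit option clears the defaults; only requested kinds remain.
    return opts


print(sorted(selected_ref_kinds(["tags"])))
```

Under this model, asking for `--tags` alone drops branches and remote refs from the survey, which is easy to overlook when comparing runs.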

Makefile

Lines changed: 1 addition & 0 deletions
@@ -1309,6 +1309,7 @@ BUILTIN_OBJS += builtin/sparse-checkout.o
 BUILTIN_OBJS += builtin/stash.o
 BUILTIN_OBJS += builtin/stripspace.o
 BUILTIN_OBJS += builtin/submodule--helper.o
+BUILTIN_OBJS += builtin/survey.o
 BUILTIN_OBJS += builtin/symbolic-ref.o
 BUILTIN_OBJS += builtin/tag.o
 BUILTIN_OBJS += builtin/unpack-file.o

builtin.h

Lines changed: 1 addition & 0 deletions
@@ -238,6 +238,7 @@ int cmd_status(int argc, const char **argv, const char *prefix);
 int cmd_stash(int argc, const char **argv, const char *prefix);
 int cmd_stripspace(int argc, const char **argv, const char *prefix);
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix);
+int cmd_survey(int argc, const char **argv, const char *prefix);
 int cmd_switch(int argc, const char **argv, const char *prefix);
 int cmd_symbolic_ref(int argc, const char **argv, const char *prefix);
 int cmd_tag(int argc, const char **argv, const char *prefix);
