Commit dc1d73c

jeffhostetler authored and dscho committed
survey: started TODO list at bottom of source file
1 parent 675fd70 commit dc1d73c

1 file changed, +46 -0 lines changed

builtin/survey.c

Lines changed: 46 additions & 0 deletions
@@ -1688,3 +1688,49 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 	clear_survey_context(&ctx);
 	return 0;
 }
+
+/*
+ * NEEDSWORK: The following is a bit of a laundry list of things
+ * that I'd like to add.
+ *
+ * [] Dump stats on all of the packfiles. The number and size of each.
+ *    Whether each is in the .git directory or in an alternate. The state
+ *    of the IDX or MIDX files and etc. Delta chain stats. All of this
+ *    data is relative to the "lived-in" state of the repository. Stuff
+ *    that may change after a GC or repack.
+ *
+ * [] Dump stats on each remote. When we fetch from a remote the size
+ *    of the response is related to the set of haves on the server. You
+ *    can see this in `GIT_TRACE_CURL=1 git fetch`. We get a `ls-refs`
+ *    payload that lists all of the branches and tags on the server, so
+ *    at a minimum the RefName and SHA for each. But for annotated tags
+ *    we also get the peeled SHA. The size of this overhead on every
+ *    fetch is proportional to the size of the `git ls-remote` response
+ *    (roughly, although the latter repeats the RefName of the peeled
+ *    tag). If, for example, you have 500K refs on a remote, you're
+ *    going to have a long "haves" message, so every fetch will be slow
+ *    just because of that overhead (not counting new objects to be
+ *    downloaded).
+ *
+ *    Note that the local set of tags in "refs/tags/" is a union over all
+ *    remotes. However, since most people only have one remote, we can
+ *    probably estimate the overhead value directly from the size of the
+ *    set of "refs/tags/" that we visited while building the `ref_info`
+ *    and `ref_array` and not need to ask the remote.
+ *
+ * [] Dump info on the complexity of the DAG. Criss-cross merges.
+ *    The number of edges that must be touched to compute merge bases.
+ *    Edge length. The number of parallel lanes in the history that must
+ *    be navigated to get to the merge base. What affects the cost of
+ *    the Ahead/Behind computation? How often do criss-crosses occur and
+ *    do they cause various operations to slow down?
+ *
+ * [] If there are primary branches (like "main" or "master") are they
+ *    always on the left side of merges? Does the graph have a clean
+ *    left edge? Or are there normal and "backwards" merges? Do these
+ *    cause problems at scale?
+ *
+ * [] If we have a hierarchy of FI/RI branches like "L1", "L2", ...,
+ *    can we learn anything about the shape of the repo around these FI
+ *    and RI integrations?
+ */
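
The second TODO item argues that per-fetch overhead grows with the number of refs the remote advertises. A minimal sketch of that estimate follows; it is not part of this commit. It assumes protocol v2 `ls-refs` framing, where each ref is one pkt-line (a 4-hex-digit length prefix, the object ID, a space, the ref name, an optional ` peeled:<oid>` attribute for annotated tags, and a newline), with a flush-pkt terminating the list. `struct ref_entry` and both function names are hypothetical and do not correspond to the `ref_info` or `ref_array` data in builtin/survey.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for illustration; not a type from builtin/survey.c. */
struct ref_entry {
	const char *refname;  /* e.g. "refs/tags/v1.0" */
	int is_annotated_tag; /* nonzero if a peeled OID would be advertised */
};

/* A pkt-line starts with four hex digits giving its total length. */
#define PKT_LEN_PREFIX 4

/*
 * Estimated size of one advertised ref line (assumed framing):
 *   <len> <oid> SP <refname> [SP "peeled:" <oid>] LF
 * hexsz is the hex length of an object ID (40 for SHA-1, 64 for SHA-256).
 */
static size_t ref_line_size(const struct ref_entry *ref, size_t hexsz)
{
	size_t n = PKT_LEN_PREFIX + hexsz + 1 + strlen(ref->refname) + 1;

	if (ref->is_annotated_tag)
		n += 1 + strlen("peeled:") + hexsz;
	return n;
}

/* Sum the per-ref lines and add the terminating flush-pkt ("0000"). */
static size_t estimate_ref_advertisement(const struct ref_entry *refs,
					 size_t nr, size_t hexsz)
{
	size_t total = 0;
	size_t i;

	assert(refs || !nr);
	for (i = 0; i < nr; i++)
		total += ref_line_size(&refs[i], hexsz);
	return total + PKT_LEN_PREFIX;
}
```

Calling `estimate_ref_advertisement()` over the refs already visited during the survey would give a rough lower bound on the bytes a remote sends before any object negotiation begins. It ignores HTTP framing, symref attributes, and compression, so treat the result as an order-of-magnitude figure only.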
