@@ -1688,3 +1688,49 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 	clear_survey_context(&ctx);
 	return 0;
 }
+
+/*
+ * NEEDSWORK: The following is a bit of a laundry list of things
+ * that I'd like to add.
+ *
+ * [] Dump stats on all of the packfiles. The number and size of each.
+ * Whether each is in the .git directory or in an alternate. The state
+ * of the IDX or MIDX files, etc. Delta chain stats. All of this
+ * data is relative to the "lived-in" state of the repository, that
+ * is, stuff that may change after a GC or repack.
+ *
+ * [] Dump stats on each remote. When we fetch from a remote the size
+ * of the response is related to the set of haves on the server. You
+ * can see this in `GIT_TRACE_CURL=1 git fetch`. We get a `ls-refs`
+ * payload that lists all of the branches and tags on the server, so
+ * at a minimum the RefName and SHA for each. But for annotated tags
+ * we also get the peeled SHA. The size of this overhead on every
+ * fetch is proportional to the size of the `git ls-remote` response
+ * (roughly, although the latter repeats the RefName of the peeled
+ * tag). If, for example, you have 500K refs on a remote, you're
+ * going to have a long "haves" message, so every fetch will be slow
+ * just because of that overhead (not counting new objects to be
+ * downloaded).
+ *
+ * Note that the local set of tags in "refs/tags/" is a union over all
+ * remotes. However, since most people only have one remote, we can
+ * probably estimate the overhead value directly from the size of the
+ * set of "refs/tags/" that we visited while building the `ref_info`
+ * and `ref_array` and not need to ask the remote.
+ *
+ * [] Dump info on the complexity of the DAG. Criss-cross merges.
+ * The number of edges that must be touched to compute merge bases.
+ * Edge length. The number of parallel lanes in the history that must
+ * be navigated to get to the merge base. What affects the cost of
+ * the Ahead/Behind computation? How often do criss-crosses occur and
+ * do they cause various operations to slow down?
+ *
+ * [] If there are primary branches (like "main" or "master"), are they
+ * always on the left side of merges? Does the graph have a clean
+ * left edge? Or are there normal and "backwards" merges? Do these
+ * cause problems at scale?
+ *
+ * [] If we have a hierarchy of FI/RI branches like "L1", "L2", ...,
+ * can we learn anything about the shape of the repo around these FI
+ * and RI integrations?
+ */
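
The per-remote item above argues that the fixed cost of each fetch grows with the number of refs the server lists. A minimal sketch of that estimate follows. It is not part of this patch: `struct ref_sample`, `estimate_ls_refs_line()`, and `estimate_ls_refs_payload()` are hypothetical names, and the sketch assumes SHA-1 object IDs (40 hex digits) and protocol v2 `ls-refs` framing, where each ref is one pkt-line holding the object ID, a space, the refname, an optional ` peeled:<oid>` attribute, and a newline.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: SHA-1 repository, so an object ID is 40 hex digits. */
#define HEX_OID_LEN 40
/* Every pkt-line starts with a 4-digit hex length prefix. */
#define PKT_HEADER_LEN 4

/* Hypothetical stand-in for whatever the survey records per ref. */
struct ref_sample {
	const char *name;      /* full refname, e.g. "refs/tags/v1.0" */
	int is_annotated_tag;  /* nonzero if the server also sends a peeled OID */
};

/* Estimated bytes for one ref in an `ls-refs` response. */
static size_t estimate_ls_refs_line(const struct ref_sample *ref)
{
	/* header + oid + SP + refname + LF */
	size_t n = PKT_HEADER_LEN + HEX_OID_LEN + 1 + strlen(ref->name) + 1;

	/* annotated tags add: SP + "peeled:" + oid */
	if (ref->is_annotated_tag)
		n += 1 + strlen("peeled:") + HEX_OID_LEN;
	return n;
}

/* Estimated bytes for the whole response, including the final flush-pkt. */
static size_t estimate_ls_refs_payload(const struct ref_sample *refs, size_t nr)
{
	size_t total = PKT_HEADER_LEN; /* trailing "0000" flush-pkt */
	size_t i;

	for (i = 0; i < nr; i++)
		total += estimate_ls_refs_line(&refs[i]);
	return total;
}
```

Under these assumptions a branch such as "refs/heads/main" costs 4 + 40 + 1 + 15 + 1 = 61 bytes and an annotated tag such as "refs/tags/v1.0" costs 108 bytes. At 500K refs the listing alone would run to tens of megabytes before any objects are transferred. Feeding the function from the locally visited "refs/tags/" set, as the comment suggests, would only approximate what a given remote actually sends.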