Commit 3a03633

Merge branch 'vd/scalar-doc'
Doc update.

* vd/scalar-doc:
  scalar: convert README.md into a technical design doc
  scalar: reword command documentation to clarify purpose
2 parents 0c5222b + 72d3a5d commit 3a03633

3 files changed: +131 -87 lines changed

Documentation/technical/scalar.txt

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
+Scalar
+======
+
+Scalar is a repository management tool that optimizes Git for use in large
+repositories. It accomplishes this by helping users to take advantage of
+advanced performance features in Git. Unlike most other Git built-in commands,
+Scalar is not executed as a subcommand of 'git'; rather, it is built as a
+separate executable containing its own series of subcommands.
+
+Background
+----------
+
+Scalar was originally designed as an add-on to Git and implemented as a .NET
+Core application. It was created based on the learnings from the VFS for Git
+project (another application aimed at improving the experience of working with
+large repositories). As part of its initial implementation, Scalar relied on
+custom features in the Microsoft fork of Git that have since been integrated
+into core Git:
+
+* partial clone,
+* commit graphs,
+* multi-pack index,
+* sparse checkout (cone mode),
+* scheduled background maintenance,
+* etc
+
+With the requisite Git functionality in place and a desire to bring the benefits
+of Scalar to the larger Git community, the Scalar application itself was ported
+from C# to C and integrated upstream.
+
+Features
+--------
+
+Scalar is comprised of two major pieces of functionality: automatically
+configuring built-in Git performance features and managing repository
+enlistments.
+
+The Git performance features configured by Scalar (see "Background" for
+examples) confer substantial performance benefits to large repositories, but are
+either too experimental to enable for all of Git yet, or only benefit large
+repositories. As new features are introduced, Scalar should be updated
+accordingly to incorporate them. This will prevent the tool from becoming stale
+while also providing a path for more easily bringing features to the appropriate
+users.
+
+Enlistments are how Scalar knows which repositories on a user's system should
+utilize Scalar-configured features. This allows it to update performance
+settings when new ones are added to the tool, as well as centrally manage
+repository maintenance. The enlistment structure - a root directory with a
+`src/` subdirectory containing the cloned repository itself - is designed to
+encourage users to route build outputs outside of the repository to avoid the
+performance-limiting overhead of ignoring those files in Git.
+
+Design
+------
+
+Scalar is implemented in C and interacts with Git via a mix of child process
+invocations of Git and direct usage of `libgit.a`. Internally, it is structured
+much like other built-ins with subcommands (e.g., `git stash`), containing a
+`cmd_<subcommand>()` function for each subcommand, routed through a `cmd_main()`
+function. Most options are unique to each subcommand, with `scalar` respecting
+some "global" `git` options (e.g., `-c` and `-C`).
+
+Because `scalar` is not invoked as a Git subcommand (like `git scalar`), it is
+built and installed as its own executable in the `bin/` directory, alongside
+`git`, `git-gui`, etc.
+
+Roadmap
+-------
+
+NOTE: this section will be removed once the remaining tasks outlined in this
+roadmap are complete.
+
+Scalar is a large enough project that it is being upstreamed incrementally,
+living in `contrib/` until it is feature-complete. So far, the following patch
+series have been accepted:
+
+- `scalar-the-beginning`: The initial patch series which sets up
+  `contrib/scalar/` and populates it with a minimal `scalar` command that
+  demonstrates the fundamental ideas.
+
+- `scalar-c-and-C`: The `scalar` command learns about two options that can be
+  specified before the command, `-c <key>=<value>` and `-C <directory>`.
+
+- `scalar-diagnose`: The `scalar` command is taught the `diagnose` subcommand.
+
+Roughly speaking (and subject to change), the following series are needed to
+"finish" this initial version of Scalar:
+
+- Finish Scalar features: Enable the built-in FSMonitor in Scalar enlistments
+  and implement `scalar help`. At the end of this series, Scalar should be
+  feature-complete from the perspective of a user.
+
+- Generalize features not specific to Scalar: In the spirit of making Scalar
+  configure only what is needed for large repo performance, move common
+  utilities into other parts of Git. Some of this will be internal-only, but one
+  major change will be generalizing `scalar diagnose` for use with any Git
+  repository.
+
+- Move Scalar to toplevel: Move Scalar out of `contrib/` and into the root of
+  `git`, including updates to build and install it with the rest of Git. This
+  change will incorporate Scalar into the Git CI and test framework, as well as
+  expand regression and performance testing to ensure the tool is stable.
+
+Finally, there are two additional patch series that exist in Microsoft's fork of
+Git, but there is no current plan to upstream them. There are some interesting
+ideas there, but the implementation is too specific to Azure Repos and/or VFS
+for Git to be of much help in general.
+
+These still exist mainly because the GVFS protocol is what Azure Repos has
+instead of partial clone, while Git is focused on improving partial clone:
+
+- `scalar-with-gvfs`: The primary purpose of this patch series is to support
+  existing Scalar users whose repositories are hosted in Azure Repos (which does
+  not support Git's partial clones, but supports its predecessor, the GVFS
+  protocol, which is used by Scalar to emulate the partial clone).
+
+  Since the GVFS protocol will never be supported by core Git, this patch series
+  will remain in Microsoft's fork of Git.
+
+- `run-scalar-functional-tests`: The Scalar project developed a quite
+  comprehensive set of integration tests (or, "Functional Tests"). They are the
+  sole remaining part of the original C#-based Scalar project, and this patch
+  adds a GitHub workflow that runs them all.
+
+  Since the tests partially depend on features that are only provided in the
+  `scalar-with-gvfs` patch series, this patch cannot be upstreamed.
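
The "Design" section of the new technical document describes one `cmd_<subcommand>()` function per subcommand, routed through `cmd_main()`, with the "global" options `-c` and `-C` accepted before the subcommand name. The following is a minimal sketch of that shape, not the code in `scalar.c`: the subcommand bodies, the lookup table, and the `start_dir` and `config_overrides` variables are all hypothetical, and the sketch only records the global options instead of acting on them. As in Git, `main()` is assumed to live elsewhere and call `cmd_main()`.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative only; Git's usage errors conventionally exit with 129. */
#define EXIT_USAGE 129

/* Hypothetical state recorded by the global options (not Scalar's real code). */
static const char *start_dir;	/* value of the last -C <directory> */
static int config_overrides;	/* number of -c <key>=<value> options seen */

/* Placeholder subcommand: a real one would do the actual work. */
static int cmd_list(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	puts("list: would print the registered enlistments");
	return 0;
}

/* Placeholder subcommand. */
static int cmd_diagnose(int argc, const char **argv)
{
	(void)argc;
	(void)argv;
	puts("diagnose: would collect diagnostic data");
	return 0;
}

/* One entry per subcommand, looked up by name. */
static const struct {
	const char *name;
	int (*fn)(int argc, const char **argv);
} subcommands[] = {
	{ "list", cmd_list },
	{ "diagnose", cmd_diagnose },
};

int cmd_main(int argc, const char **argv)
{
	size_t i;

	start_dir = NULL;
	config_overrides = 0;

	/* Consume the "global" options that may precede the subcommand. */
	while (argc > 2 && argv[1][0] == '-') {
		if (!strcmp(argv[1], "-C"))
			start_dir = argv[2];
		else if (!strcmp(argv[1], "-c"))
			config_overrides++;
		else
			break;
		argc -= 2;
		argv += 2;
	}

	if (argc < 2) {
		fprintf(stderr, "usage: scalar [-C <directory>] "
			"[-c <key>=<value>] <command>\n");
		return EXIT_USAGE;
	}

	/* Route to the matching cmd_<subcommand>() function. */
	for (i = 0; i < sizeof(subcommands) / sizeof(subcommands[0]); i++)
		if (!strcmp(argv[1], subcommands[i].name))
			return subcommands[i].fn(argc - 1, argv + 1);

	fprintf(stderr, "scalar: unknown command '%s'\n", argv[1]);
	return EXIT_USAGE;
}
```

To try the sketch as a standalone program, add a one-line `main()` that returns `cmd_main(argc, argv)`. A table lookup is used here only to keep the routing compact; a chain of `strcmp()` calls would serve equally well.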

contrib/scalar/README.md

Lines changed: 0 additions & 82 deletions
This file was deleted.

contrib/scalar/scalar.txt

Lines changed: 4 additions & 5 deletions
@@ -3,7 +3,7 @@ scalar(1)
 
 NAME
 ----
-scalar - an opinionated repository management tool
+scalar - A tool for managing large Git repositories
 
 SYNOPSIS
 --------
@@ -20,10 +20,9 @@ scalar delete <enlistment>
 DESCRIPTION
 -----------
 
-Scalar is an opinionated repository management tool. By creating new
-repositories or registering existing repositories with Scalar, your Git
-experience will speed up. Scalar sets advanced Git config settings,
-maintains your repositories in the background, and helps reduce data sent
+Scalar is a repository management tool that optimizes Git for use in large
+repositories. Scalar improves performance by configuring advanced Git settings,
+maintaining repositories in the background, and helping to reduce data sent
 across the network.
 
 An important Scalar concept is the enlistment: this is the top-level directory
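
The enlistment layout mentioned here is spelled out in the technical document added by this commit: a root directory with a `src/` subdirectory that contains the cloned repository. As a small illustration of that layout, the sketch below derives the repository path from an enlistment root. The helper name `enlistment_worktree()` is hypothetical and is not part of Scalar.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "<root>/src" into `out`.
 * Returns 0 on success, -1 if `root` is empty or `out` is too small.
 */
static int enlistment_worktree(const char *root, char *out, size_t outlen)
{
	size_t len = strlen(root);
	int n;

	if (!len)
		return -1;

	/* Drop trailing slashes so "/repos/big/" and "/repos/big" agree. */
	while (len > 0 && root[len - 1] == '/')
		len--;

	n = snprintf(out, outlen, "%.*s/src", (int)len, root);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this layout, build outputs can be placed next to `src/` (for example in a sibling directory under the enlistment root) so that they never appear inside the Git working tree.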
