Commit 6bbd103

Merge branch 'jh/partial-clone-doc'
Doc updates.

* jh/partial-clone-doc:
  partial-clone: render design doc using asciidoc
2 parents 4601516 + 5641eb9 commit 6bbd103

2 files changed: 105 additions, 104 deletions

Documentation/Makefile

Lines changed: 1 addition & 0 deletions
@@ -76,6 +76,7 @@ TECH_DOCS += technical/long-running-process-protocol
 TECH_DOCS += technical/pack-format
 TECH_DOCS += technical/pack-heuristics
 TECH_DOCS += technical/pack-protocol
+TECH_DOCS += technical/partial-clone
 TECH_DOCS += technical/protocol-capabilities
 TECH_DOCS += technical/protocol-common
 TECH_DOCS += technical/protocol-v2

Documentation/technical/partial-clone.txt

Lines changed: 104 additions & 104 deletions
@@ -69,24 +69,24 @@ Design Details

 - A new pack-protocol capability "filter" is added to the fetch-pack and
 upload-pack negotiation.
-
-This uses the existing capability discovery mechanism.
-See "filter" in Documentation/technical/pack-protocol.txt.
++
+This uses the existing capability discovery mechanism.
+See "filter" in Documentation/technical/pack-protocol.txt.

 - Clients pass a "filter-spec" to clone and fetch which is passed to the
 server to request filtering during packfile construction.
-
-There are various filters available to accommodate different situations.
-See "--filter=<filter-spec>" in Documentation/rev-list-options.txt.
++
+There are various filters available to accommodate different situations.
+See "--filter=<filter-spec>" in Documentation/rev-list-options.txt.

 - On the server pack-objects applies the requested filter-spec as it
 creates "filtered" packfiles for the client.
-
-These filtered packfiles are *incomplete* in the traditional sense because
-they may contain objects that reference objects not contained in the
-packfile and that the client doesn't already have. For example, the
-filtered packfile may contain trees or tags that reference missing blobs
-or commits that reference missing trees.
++
+These filtered packfiles are *incomplete* in the traditional sense because
+they may contain objects that reference objects not contained in the
+packfile and that the client doesn't already have. For example, the
+filtered packfile may contain trees or tags that reference missing blobs
+or commits that reference missing trees.

 - On the client these incomplete packfiles are marked as "promisor packfiles"
 and treated differently by various commands.
@@ -104,47 +104,47 @@ Handling Missing Objects
 to repository corruption. To differentiate these cases, the local
 repository specially indicates such filtered packfiles obtained from the
 promisor remote as "promisor packfiles".
-
-These promisor packfiles consist of a "<name>.promisor" file with
-arbitrary contents (like the "<name>.keep" files), in addition to
-their "<name>.pack" and "<name>.idx" files.
++
+These promisor packfiles consist of a "<name>.promisor" file with
+arbitrary contents (like the "<name>.keep" files), in addition to
+their "<name>.pack" and "<name>.idx" files.

 - The local repository considers a "promisor object" to be an object that
 it knows (to the best of its ability) that the promisor remote has promised
 that it has, either because the local repository has that object in one of
 its promisor packfiles, or because another promisor object refers to it.
-
-When Git encounters a missing object, Git can see if it a promisor object
-and handle it appropriately. If not, Git can report a corruption.
-
-This means that there is no need for the client to explicitly maintain an
-expensive-to-modify list of missing objects.[a]
++
+When Git encounters a missing object, Git can see if it a promisor object
+and handle it appropriately. If not, Git can report a corruption.
++
+This means that there is no need for the client to explicitly maintain an
+expensive-to-modify list of missing objects.[a]

 - Since almost all Git code currently expects any referenced object to be
 present locally and because we do not want to force every command to do
 a dry-run first, a fallback mechanism is added to allow Git to attempt
 to dynamically fetch missing objects from the promisor remote.
-
-When the normal object lookup fails to find an object, Git invokes
-fetch-object to try to get the object from the server and then retry
-the object lookup. This allows objects to be "faulted in" without
-complicated prediction algorithms.
-
-For efficiency reasons, no check as to whether the missing object is
-actually a promisor object is performed.
-
-Dynamic object fetching tends to be slow as objects are fetched one at
-a time.
++
+When the normal object lookup fails to find an object, Git invokes
+fetch-object to try to get the object from the server and then retry
+the object lookup. This allows objects to be "faulted in" without
+complicated prediction algorithms.
++
+For efficiency reasons, no check as to whether the missing object is
+actually a promisor object is performed.
++
+Dynamic object fetching tends to be slow as objects are fetched one at
+a time.

 - `checkout` (and any other command using `unpack-trees`) has been taught
 to bulk pre-fetch all required missing blobs in a single batch.

 - `rev-list` has been taught to print missing objects.
-
-This can be used by other commands to bulk prefetch objects.
-For example, a "git log -p A..B" may internally want to first do
-something like "git rev-list --objects --quiet --missing=print A..B"
-and prefetch those objects in bulk.
++
+This can be used by other commands to bulk prefetch objects.
+For example, a "git log -p A..B" may internally want to first do
+something like "git rev-list --objects --quiet --missing=print A..B"
+and prefetch those objects in bulk.

 - `fsck` has been updated to be fully aware of promisor objects.
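
The promisor-object rule and the fault-in fallback described in this hunk can be sketched as a toy model in Python. Everything below is hypothetical: the `Repo` class, its fields, and the `fetch_from_promisor` hook are illustration only, and Git's real implementation is in C and works on packfiles rather than dictionaries. The sketch follows the wording of the document: an object counts as promised if it sits in a promisor packfile or if another promisor object refers to it, and the lookup path tries a fetch without first checking whether the object is a promisor object.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


class MissingObjectError(Exception):
    """Raised when an object is absent locally and cannot be faulted in."""


@dataclass
class Repo:
    # oid -> oids it refers to (toy stand-in for the local object store)
    objects: Dict[str, List[str]] = field(default_factory=dict)
    # oids stored in packfiles that carry a "<name>.promisor" marker
    in_promisor_pack: Set[str] = field(default_factory=set)
    # hypothetical hook standing in for a fetch from the promisor remote;
    # returns the fetched object's references, or None if the fetch fails
    fetch_from_promisor: Optional[Callable[[str], Optional[List[str]]]] = None
    # mirrors the role of the "fetch_if_missing" switch in the document
    fetch_if_missing: bool = True

    def promisor_objects(self) -> Set[str]:
        """Objects in promisor packs, plus everything promisor objects refer to."""
        seen: Set[str] = set()
        stack = list(self.in_promisor_pack)
        while stack:
            oid = stack.pop()
            if oid in seen:
                continue
            seen.add(oid)
            # A missing object has no local entry, so it adds no references.
            stack.extend(self.objects.get(oid, []))
        return seen

    def classify_missing(self, oid: str) -> str:
        """Return 'present', 'promised', or 'corrupt' for an object id."""
        if oid in self.objects:
            return "present"
        return "promised" if oid in self.promisor_objects() else "corrupt"

    def lookup(self, oid: str) -> List[str]:
        """Look up an object, faulting it in from the promisor remote if needed."""
        if oid in self.objects:
            return self.objects[oid]
        # As in the document, no is-promisor check is made before fetching.
        if self.fetch_if_missing and self.fetch_from_promisor is not None:
            fetched = self.fetch_from_promisor(oid)
            if fetched is not None:
                self.objects[oid] = fetched
                return fetched
        raise MissingObjectError(oid)


repo = Repo(
    objects={"commit1": ["tree1"], "tree1": ["blob1"], "local": ["blob2"]},
    in_promisor_pack={"commit1", "tree1"},
    fetch_from_promisor=lambda oid: [] if oid == "blob1" else None,
)
print(repo.classify_missing("blob1"))  # promised: referenced by a promisor tree
print(repo.classify_missing("blob2"))  # corrupt: only a local object refers to it
```

The closure is recomputed on every call to keep the sketch short; that cost is exactly what a real implementation would need to avoid.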

@@ -154,11 +154,11 @@ Handling Missing Objects
 - The global variable "fetch_if_missing" is used to control whether an
 object lookup will attempt to dynamically fetch a missing object or
 report an error.
-
-We are not happy with this global variable and would like to remove it,
-but that requires significant refactoring of the object code to pass an
-additional flag. We hope that concurrent efforts to add an ODB API can
-encompass this.
++
+We are not happy with this global variable and would like to remove it,
+but that requires significant refactoring of the object code to pass an
+additional flag. We hope that concurrent efforts to add an ODB API can
+encompass this.


 Fetching Missing Objects
@@ -168,10 +168,10 @@ Fetching Missing Objects
 transport_fetch_refs(), setting a new transport option
 TRANS_OPT_NO_DEPENDENTS to indicate that only the objects themselves are
 desired, not any object that they refer to.
-
-Because some transports invoke fetch_pack() in the same process, fetch_pack()
-has been updated to not use any object flags when the corresponding argument
-(no_dependents) is set.
++
+Because some transports invoke fetch_pack() in the same process, fetch_pack()
+has been updated to not use any object flags when the corresponding argument
+(no_dependents) is set.

 - The local repository sends a request with the hashes of all requested
 objects as "want" lines, and does not perform any packfile negotiation.
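
The request shape in the last bullet (one "want" line per missing object, no negotiation) might look roughly like this in pkt-line framing. This is a simplified sketch under stated assumptions: it assumes 40-hex-digit SHA-1 object names, omits the capability list that a real client attaches to the first "want" line, and the helper names `pkt_line` and `build_want_request` are hypothetical rather than anything in Git's code.

```python
from typing import Iterable

FLUSH_PKT = b"0000"  # pkt-line flush marker
MAX_PKT_LEN = 65520  # maximum total pkt-line length, including the length field


def pkt_line(payload: bytes) -> bytes:
    """Frame a payload as a pkt-line: 4 hex digits of total length, then data."""
    length = len(payload) + 4  # the length field counts its own 4 bytes
    if length > MAX_PKT_LEN:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % length + payload


def build_want_request(oids: Iterable[str]) -> bytes:
    """One "want" line per object, then flush and "done"; no "have" lines."""
    lines = []
    for oid in oids:
        if len(oid) != 40 or any(c not in "0123456789abcdef" for c in oid):
            raise ValueError(f"not a 40-hex-digit object name: {oid!r}")
        lines.append(pkt_line(f"want {oid}\n".encode("ascii")))
    return b"".join(lines) + FLUSH_PKT + pkt_line(b"done\n")


request = build_want_request(["74730d410fcb6603ace96f1dc55ea6196122532d"])
print(request.decode("ascii"))
```

Skipping the "have" lines is what "does not perform any packfile negotiation" amounts to in this model: the client names exactly the objects it lacks and nothing else.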
@@ -187,13 +187,13 @@ Current Limitations

 - The remote used for a partial clone (or the first partial fetch
 following a regular clone) is marked as the "promisor remote".
-
-We are currently limited to a single promisor remote and only that
-remote may be used for subsequent partial fetches.
-
-We accept this limitation because we believe initial users of this
-feature will be using it on repositories with a strong single central
-server.
++
+We are currently limited to a single promisor remote and only that
+remote may be used for subsequent partial fetches.
++
+We accept this limitation because we believe initial users of this
+feature will be using it on repositories with a strong single central
+server.

 - Dynamic object fetching will only ask the promisor remote for missing
 objects. We assume that the promisor remote has a complete view of the
@@ -221,13 +221,13 @@ Future Work
 - Allow more than one promisor remote and define a strategy for fetching
 missing objects from specific promisor remotes or of iterating over the
 set of promisor remotes until a missing object is found.
-
-A user might want to have multiple geographically-close cache servers
-for fetching missing blobs while continuing to do filtered `git-fetch`
-commands from the central server, for example.
-
-Or the user might want to work in a triangular work flow with multiple
-promisor remotes that each have an incomplete view of the repository.
++
+A user might want to have multiple geographically-close cache servers
+for fetching missing blobs while continuing to do filtered `git-fetch`
+commands from the central server, for example.
++
+Or the user might want to work in a triangular work flow with multiple
+promisor remotes that each have an incomplete view of the repository.

 - Allow repack to work on promisor packfiles (while keeping them distinct
 from non-promisor packfiles).
@@ -238,25 +238,25 @@ Future Work
 - Investigate use of a long-running process to dynamically fetch a series
 of objects, such as proposed in [5,6] to reduce process startup and
 overhead costs.
-
-It would be nice if pack protocol V2 could allow that long-running
-process to make a series of requests over a single long-running
-connection.
++
+It would be nice if pack protocol V2 could allow that long-running
+process to make a series of requests over a single long-running
+connection.

 - Investigate pack protocol V2 to avoid the info/refs broadcast on
 each connection with the server to dynamically fetch missing objects.

 - Investigate the need to handle loose promisor objects.
-
-Objects in promisor packfiles are allowed to reference missing objects
-that can be dynamically fetched from the server. An assumption was
-made that loose objects are only created locally and therefore should
-not reference a missing object. We may need to revisit that assumption
-if, for example, we dynamically fetch a missing tree and store it as a
-loose object rather than a single object packfile.
-
-This does not necessarily mean we need to mark loose objects as promisor;
-it may be sufficient to relax the object lookup or is-promisor functions.
++
+Objects in promisor packfiles are allowed to reference missing objects
+that can be dynamically fetched from the server. An assumption was
+made that loose objects are only created locally and therefore should
+not reference a missing object. We may need to revisit that assumption
+if, for example, we dynamically fetch a missing tree and store it as a
+loose object rather than a single object packfile.
++
+This does not necessarily mean we need to mark loose objects as promisor;
+it may be sufficient to relax the object lookup or is-promisor functions.


 Non-Tasks
@@ -265,13 +265,13 @@ Non-Tasks
 - Every time the subject of "demand loading blobs" comes up it seems
 that someone suggests that the server be allowed to "guess" and send
 additional objects that may be related to the requested objects.
-
-No work has gone into actually doing that; we're just documenting that
-it is a common suggestion. We're not sure how it would work and have
-no plans to work on it.
-
-It is valid for the server to send more objects than requested (even
-for a dynamic object fetch), but we are not building on that.
++
+No work has gone into actually doing that; we're just documenting that
+it is a common suggestion. We're not sure how it would work and have
+no plans to work on it.
++
+It is valid for the server to send more objects than requested (even
+for a dynamic object fetch), but we are not building on that.


 Footnotes
@@ -282,43 +282,43 @@ Footnotes
 This would essentially be a sorted linear list of OIDs that the were
 omitted by the server during a clone or subsequent fetches.

-This file would need to be loaded into memory on every object lookup.
-It would need to be read, updated, and re-written (like the .git/index)
-on every explicit "git fetch" command *and* on any dynamic object fetch.
+This file would need to be loaded into memory on every object lookup.
+It would need to be read, updated, and re-written (like the .git/index)
+on every explicit "git fetch" command *and* on any dynamic object fetch.

-The cost to read, update, and write this file could add significant
-overhead to every command if there are many missing objects. For example,
-if there are 100M missing blobs, this file would be at least 2GiB on disk.
+The cost to read, update, and write this file could add significant
+overhead to every command if there are many missing objects. For example,
+if there are 100M missing blobs, this file would be at least 2GiB on disk.

-With the "promisor" concept, we *infer* a missing object based upon the
-type of packfile that references it.
+With the "promisor" concept, we *infer* a missing object based upon the
+type of packfile that references it.


 Related Links
 -------------
-[0] https://bugs.chromium.org/p/git/issues/detail?id=2
-Chromium work item for: Partial Clone
+[0] https://crbug.com/git/2
+Bug#2: Partial Clone

-[1] https://public-inbox.org/git/[email protected]/
-Subject: [RFC] Add support for downloading blobs on demand
+[1] https://public-inbox.org/git/[email protected]/ +
+Subject: [RFC] Add support for downloading blobs on demand +
 Date: Fri, 13 Jan 2017 10:52:53 -0500

-[2] https://public-inbox.org/git/[email protected]/
-Subject: [PATCH 00/18] Partial clone (from clone to lazy fetch in 18 patches)
+[2] https://public-inbox.org/git/[email protected]/ +
+Subject: [PATCH 00/18] Partial clone (from clone to lazy fetch in 18 patches) +
 Date: Fri, 29 Sep 2017 13:11:36 -0700

-[3] https://public-inbox.org/git/[email protected]/
-Subject: Proposal for missing blob support in Git repos
+[3] https://public-inbox.org/git/[email protected]/ +
+Subject: Proposal for missing blob support in Git repos +
 Date: Wed, 26 Apr 2017 15:13:46 -0700

-[4] https://public-inbox.org/git/[email protected]/
-Subject: [PATCH 00/10] RFC Partial Clone and Fetch
+[4] https://public-inbox.org/git/[email protected]/ +
+Subject: [PATCH 00/10] RFC Partial Clone and Fetch +
 Date: Wed, 8 Mar 2017 18:50:29 +0000

-[5] https://public-inbox.org/git/[email protected]/
-Subject: [PATCH v7 00/10] refactor the filter process code into a reusable module
+[5] https://public-inbox.org/git/[email protected]/ +
+Subject: [PATCH v7 00/10] refactor the filter process code into a reusable module +
 Date: Fri, 5 May 2017 11:27:52 -0400

-[6] https://public-inbox.org/git/[email protected]/
-Subject: [RFC/PATCH v2 0/1] Add support for downloading blobs on demand
+[6] https://public-inbox.org/git/[email protected]/ +
+Subject: [RFC/PATCH v2 0/1] Add support for downloading blobs on demand +
 Date: Fri, 14 Jul 2017 09:26:50 -0400
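
The size estimate in footnote [a] can be checked with quick arithmetic. The sketch below assumes each entry in the hypothetical missing-object list is a raw 20-byte SHA-1 object name with no header, index, or per-entry overhead. The footnote does not state those assumptions, so treat the result as a lower bound under them.

```python
SHA1_RAW_BYTES = 20  # assumption: binary SHA-1 object names, no per-entry overhead


def missing_list_size_bytes(n_missing: int, oid_bytes: int = SHA1_RAW_BYTES) -> int:
    """Lower bound on the on-disk size of a flat list of missing object names."""
    if n_missing < 0:
        raise ValueError("n_missing must be non-negative")
    return n_missing * oid_bytes


size = missing_list_size_bytes(100_000_000)
print(f"{size:,} bytes = {size / 10**9:.2f} GB = {size / 2**30:.2f} GiB")
# prints "2,000,000,000 bytes = 2.00 GB = 1.86 GiB"
```

Under these assumptions the raw names alone come to about 2 GB, and a file of that size would have to be read and rewritten on every fetch, which is the cost the promisor approach avoids.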
