@@ -107,15 +107,15 @@ mapping to allow naming objects using either their SHA-1 and SHA-256 names
  interchangeably.

  "git cat-file" and "git hash-object" gain options to display an object
- in its sha1 form and write an object given its sha1 form. This
+ in its SHA-1 form and write an object given its SHA-1 form. This
  requires all objects referenced by that object to be present in the
  object database so that they can be named using the appropriate name
  (using the bidirectional hash mapping).

  Fetches from a SHA-1 based server convert the fetched objects into
  SHA-256 form and record the mapping in the bidirectional mapping table
  (see below for details). Pushes to a SHA-1 based server convert the
- objects being pushed into sha1 form so the server does not have to be
+ objects being pushed into SHA-1 form so the server does not have to be
  aware of the hash function the client is using.

  Detailed Design
@@ -151,38 +151,38 @@ repository extensions.

  Object names
  ~~~~~~~~~~~~
- Objects can be named by their 40 hexadecimal digit sha1- name or 64
- hexadecimal digit sha256- name, plus names derived from those (see
+ Objects can be named by their 40 hexadecimal digit SHA-1 name or 64
+ hexadecimal digit SHA-256 name, plus names derived from those (see
  gitrevisions(7)).

- The sha1- name of an object is the SHA-1 of the concatenation of its
- type, length, a nul byte, and the object's sha1- content. This is the
+ The SHA-1 name of an object is the SHA-1 of the concatenation of its
+ type, length, a nul byte, and the object's SHA-1 content. This is the
  traditional <sha1> used in Git to name objects.

- The sha256- name of an object is the SHA-256 of the concatenation of its
- type, length, a nul byte, and the object's sha256- content.
+ The SHA-256 name of an object is the SHA-256 of the concatenation of its
+ type, length, a nul byte, and the object's SHA-256 content.
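
Reviewer's illustration (not part of the patch): the naming rule in this hunk can be sketched in Python. The sketch assumes the conventional header encoding, that is, the type, a space, the decimal length, and a NUL byte, which the text summarizes as "type, length, a nul byte". The helper name `object_name` is hypothetical. A blob is used because it refers to no other object, so the same bytes serve as both its SHA-1 content and its SHA-256 content.

```python
import hashlib


def object_name(obj_type: str, content: bytes, algo: str = "sha1") -> str:
    # Hypothetical helper: hash "<type> <length>" + NUL + content.
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.new(algo, header + content).hexdigest()


blob = b"hello\n"
sha1_name = object_name("blob", blob, "sha1")      # 40 hexadecimal digits
sha256_name = object_name("blob", blob, "sha256")  # 64 hexadecimal digits
print(sha1_name)
print(sha256_name)
```

For a commit, tag, or tree the two names would be computed over different byte sequences, since each form refers to other objects by names of its own kind.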

  Object format
  ~~~~~~~~~~~~~
  The content as a byte sequence of a tag, commit, or tree object named
- by sha1 and sha256 differ because an object named by sha256- name refers to
- other objects by their sha256- names and an object named by sha1- name
- refers to other objects by their sha1- names.
+ by SHA-1 and SHA-256 differ because an object named by SHA-256 name refers to
+ other objects by their SHA-256 names and an object named by SHA-1 name
+ refers to other objects by their SHA-1 names.

- The sha256- content of an object is the same as its sha1- content, except
- that objects referenced by the object are named using their sha256- names
- instead of sha1- names. Because a blob object does not refer to any
- other object, its sha1- content and sha256- content are the same.
+ The SHA-256 content of an object is the same as its SHA-1 content, except
+ that objects referenced by the object are named using their SHA-256 names
+ instead of SHA-1 names. Because a blob object does not refer to any
+ other object, its SHA-1 content and SHA-256 content are the same.

- The format allows round-trip conversion between sha256- content and
- sha1- content.
+ The format allows round-trip conversion between SHA-256 content and
+ SHA-1 content.

  Object storage
  ~~~~~~~~~~~~~~
  Loose objects use zlib compression and packed objects use the packed
  format described in Documentation/technical/pack-format.txt, just like
- today. The content that is compressed and stored uses sha256- content
- instead of sha1- content.
+ today. The content that is compressed and stored uses SHA-256 content
+ instead of SHA-1 content.

  Pack index
  ~~~~~~~~~~
@@ -287,18 +287,18 @@ To remove entries (e.g. in "git pack-refs" or "git-prune"):

  Translation table
  ~~~~~~~~~~~~~~~~~
- The index files support a bidirectional mapping between sha1- names
- and sha256- names. The lookup proceeds similarly to ordinary object
- lookups. For example, to convert a sha1- name to a sha256- name:
+ The index files support a bidirectional mapping between SHA-1 names
+ and SHA-256 names. The lookup proceeds similarly to ordinary object
+ lookups. For example, to convert a SHA-1 name to a SHA-256 name:

  1. Look for the object in idx files. If a match is present in the
- idx's sorted list of truncated sha1- names, then:
- a. Read the corresponding entry in the sha1- name order to pack
+ idx's sorted list of truncated SHA-1 names, then:
+ a. Read the corresponding entry in the SHA-1 name order to pack
  name order mapping.
- b. Read the corresponding entry in the full sha1- name table to
+ b. Read the corresponding entry in the full SHA-1 name table to
  verify we found the right object. If it is, then
- c. Read the corresponding entry in the full sha256- name table.
- That is the object's SHA-256 name.
+ c. Read the corresponding entry in the full SHA-256 name table.
+ That is the object's SHA-256 name.
  2. Check for a loose object. Read lines from loose-object-idx until
  we find a match.
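
Reviewer's illustration (not part of the patch): the lookup steps above can be sketched over in-memory tables. Everything here is an assumption made for illustration: the `PackIdx` class, its field names, the 8-digit truncation length, and the "SHA-256 name, space, SHA-1 name" layout of each loose-object-idx line. The real idx file is a binary format that this sketch does not attempt to parse.

```python
import bisect


class PackIdx:
    """Hypothetical in-memory stand-in for one idx file."""

    def __init__(self, pairs, abbrev_len=8):
        # pairs: (sha1_hex, sha256_hex) tuples listed in pack order.
        self.abbrev_len = abbrev_len
        self.full_sha1 = [sha1 for sha1, _ in pairs]
        self.full_sha256 = [sha256 for _, sha256 in pairs]
        # SHA-1 name order to pack name order mapping.
        self.sha1_order_to_pack_order = sorted(
            range(len(pairs)), key=lambda i: self.full_sha1[i]
        )
        # Sorted list of truncated SHA-1 names.
        self.truncated_sha1 = [
            self.full_sha1[i][:abbrev_len] for i in self.sha1_order_to_pack_order
        ]

    def sha1_to_sha256(self, sha1_hex):
        prefix = sha1_hex[: self.abbrev_len]
        pos = bisect.bisect_left(self.truncated_sha1, prefix)
        while pos < len(self.truncated_sha1) and self.truncated_sha1[pos] == prefix:
            pack_pos = self.sha1_order_to_pack_order[pos]  # step a
            if self.full_sha1[pack_pos] == sha1_hex:       # step b
                return self.full_sha256[pack_pos]          # step c
            pos += 1
        return None


def lookup(sha1_hex, idx_files, loose_object_idx_lines):
    # Step 1: try every idx file.
    for idx in idx_files:
        found = idx.sha1_to_sha256(sha1_hex)
        if found is not None:
            return found
    # Step 2: scan loose-object-idx lines until one matches.
    for line in loose_object_idx_lines:
        sha256_hex, loose_sha1 = line.split()
        if loose_sha1 == sha1_hex:
            return sha256_hex
    return None
```

The binary search over truncated names only narrows the candidates. The comparison against the full SHA-1 name in step b is what confirms the match.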

@@ -312,10 +312,10 @@ Since all operations that make new objects (e.g., "git commit") add
  the new objects to the corresponding index, this mapping is possible
  for all objects in the object store.

- Reading an object's sha1- content
- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- The sha1- content of an object can be read by converting all sha256- names
- its sha256- content references to sha1- names using the translation table.
+ Reading an object's SHA-1 content
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ The SHA-1 content of an object can be read by converting all SHA-256 names
+ its SHA-256 content references to SHA-1 names using the translation table.
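
Reviewer's illustration (not part of the patch): for a commit, the conversion described in this hunk could look roughly like the sketch below. It is deliberately simplified. It rewrites only the "tree" and "parent" header fields, it ignores other fields that can carry object names (for example mergetag), and it uses a plain dict as a stand-in for the translation table. The function name is hypothetical.

```python
def commit_sha256_to_sha1_content(content: bytes, sha256_to_sha1: dict) -> bytes:
    # Headers end at the first empty line; the message is left untouched.
    headers, sep, message = content.partition(b"\n\n")
    converted = []
    for line in headers.split(b"\n"):
        field, _, value = line.partition(b" ")
        if field in (b"tree", b"parent"):
            # Swap the SHA-256 name for the SHA-1 name of the same object.
            sha1_hex = sha256_to_sha1[value.decode("ascii")]
            line = field + b" " + sha1_hex.encode("ascii")
        converted.append(line)
    return b"\n".join(converted) + sep + message
```

A missing table entry raises `KeyError`, which matches the requirement that every referenced object already has a mapping.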

  Fetch
  ~~~~~
@@ -338,7 +338,7 @@ the following steps:
  1. index-pack: inflate each object in the packfile and compute its
  SHA-1. Objects can contain deltas in OBJ_REF_DELTA format against
  objects the client has locally. These objects can be looked up
- using the translation table and their sha1- content read as
+ using the translation table and their SHA-1 content read as
  described above to resolve the deltas.
  2. topological sort: starting at the "want"s from the negotiation
  phase, walk through objects in the pack and emit a list of them,
@@ -347,12 +347,12 @@ the following steps:
  (This list only contains objects reachable from the "wants". If the
  pack from the server contained additional extraneous objects, then
  they will be discarded.)
- 3. convert to sha256 : open a new (sha256) packfile. Read the topologically
+ 3. convert to SHA-256 : open a new SHA-256 packfile. Read the topologically
  sorted list just generated. For each object, inflate its
- sha1- content, convert to sha256- content, and write it to the sha256
- pack. Record the new sha1 <-->sha256 mapping entry for use in the idx.
+ SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
+ pack. Record the new SHA-1 <-->SHA-256 mapping entry for use in the idx.
  4. sort: reorder entries in the new pack to match the order of objects
- in the pack the server generated and include blobs. Write a sha256 idx
+ in the pack the server generated and include blobs. Write a SHA-256 idx
  file
  5. clean up: remove the SHA-1 based pack file, index, and
  topologically sorted list obtained from the server in steps 1
@@ -377,16 +377,16 @@ experimenting to get this to perform well.
  Push
  ~~~~
  Push is simpler than fetch because the objects referenced by the
- pushed objects are already in the translation table. The sha1- content
+ pushed objects are already in the translation table. The SHA-1 content
  of each object being pushed can be read as described in the "Reading
- an object's sha1- content" section to generate the pack written by git
+ an object's SHA-1 content" section to generate the pack written by git
  send-pack.

  Signed Commits
  ~~~~~~~~~~~~~~
  We add a new field "gpgsig-sha256" to the commit object format to allow
  signing commits without relying on SHA-1. It is similar to the
- existing "gpgsig" field. Its signed payload is the sha256- content of the
+ existing "gpgsig" field. Its signed payload is the SHA-256 content of the
  commit object with any "gpgsig" and "gpgsig-sha256" fields removed.
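
Reviewer's illustration (not part of the patch): a minimal sketch of deriving the signed payload from a commit's SHA-256 content. It assumes that a multi-line header field continues on lines that begin with a single space, so that a signature block stays inside the header. The function name `signed_payload` is hypothetical.

```python
def signed_payload(commit_content: bytes) -> bytes:
    # Headers end at the first empty line; the message is kept verbatim.
    headers, sep, message = commit_content.partition(b"\n\n")
    kept = []
    skipping = False
    for line in headers.split(b"\n"):
        if line.startswith(b" "):
            # Continuation line: it shares the fate of the field it extends.
            if not skipping:
                kept.append(line)
            continue
        field = line.split(b" ", 1)[0]
        skipping = field in (b"gpgsig", b"gpgsig-sha256")
        if not skipping:
            kept.append(line)
    return b"\n".join(kept) + sep + message
```

Both signature fields are dropped, so the payload does not depend on whether a "gpgsig" signature is also present.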

  This means commits can be signed
@@ -404,7 +404,7 @@ Signed Tags
  ~~~~~~~~~~~
  We add a new field "gpgsig-sha256" to the tag object format to allow
  signing tags without relying on SHA-1. Its signed payload is the
- sha256- content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
+ SHA-256 content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
  SIGNATURE-----" delimited in-body signature removed.

  This means tags can be signed
@@ -416,11 +416,11 @@ This means tags can be signed

  Mergetag embedding
  ~~~~~~~~~~~~~~~~~~
- The mergetag field in the sha1- content of a commit contains the
- sha1- content of a tag that was merged by that commit.
+ The mergetag field in the SHA-1 content of a commit contains the
+ SHA-1 content of a tag that was merged by that commit.

- The mergetag field in the sha256- content of the same commit contains the
- sha256- content of the same tag.
+ The mergetag field in the SHA-256 content of the same commit contains the
+ SHA-256 content of the same tag.

  Submodules
  ~~~~~~~~~~
@@ -495,7 +495,7 @@ Caveats
  -------
  Invalid objects
  ~~~~~~~~~~~~~~~
- The conversion from sha1- content to sha256- content retains any
+ The conversion from SHA-1 content to SHA-256 content retains any
  brokenness in the original object (e.g., tree entry modes encoded with
  leading 0, tree objects whose paths are not sorted correctly, and
  commit objects without an author or committer). This is a deliberate
@@ -514,15 +514,15 @@ allow lifting this restriction.

  Alternates
  ~~~~~~~~~~
- For the same reason, a sha256 repository cannot borrow objects from a
- sha1 repository using objects/info/alternates or
+ For the same reason, a SHA-256 repository cannot borrow objects from a
+ SHA-1 repository using objects/info/alternates or
  $GIT_ALTERNATE_OBJECT_REPOSITORIES.

  git notes
  ~~~~~~~~~
- The "git notes" tool annotates objects using their sha1- name as key.
+ The "git notes" tool annotates objects using their SHA-1 name as key.
  This design does not describe a way to migrate notes trees to use
- sha256- names. That migration is expected to happen separately (for
+ SHA-256 names. That migration is expected to happen separately (for
  example using a file at the root of the notes tree to describe which
  hash it uses).

@@ -556,7 +556,7 @@ unclear:

  Git 2.12

- Does this mean Git v2.12.0 is the commit with sha1- name
+ Does this mean Git v2.12.0 is the commit with SHA-1 name
  e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7 or the commit with
  new-40-digit-hash-name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7?

@@ -676,7 +676,7 @@ The next step is supporting fetches and pushes to SHA-1 repositories:
  - allow pushes to a repository using the compat format
  - generate a topologically sorted list of the SHA-1 names of fetched
  objects
- - convert the fetched packfile to sha256 format and generate an idx
+ - convert the fetched packfile to SHA-256 format and generate an idx
  file
  - re-sort to match the order of objects in the fetched packfile

@@ -748,38 +748,38 @@ using the old hash function.
  Signed objects with multiple hashes
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Instead of introducing the gpgsig-sha256 field in commit and tag objects
- for sha256- content based signatures, an earlier version of this design
- added "hash sha256 <sha256- name>" fields to strengthen the existing
- sha1- content based signatures.
+ for SHA-256 content based signatures, an earlier version of this design
+ added "hash sha256 <SHA-256 name>" fields to strengthen the existing
+ SHA-1 content based signatures.

  In other words, a single signature was used to attest to the object
  content using both hash functions. This had some advantages:

  * Using one signature instead of two speeds up the signing process.
  * Having one signed payload with both hashes allows the signer to
- attest to the sha1- name and sha256- name referring to the same object.
+ attest to the SHA-1 name and SHA-256 name referring to the same object.
  * All users consume the same signature. Broken signatures are likely
  to be detected quickly using current versions of git.

  However, it also came with disadvantages:

- * Verifying a signed object requires access to the sha1- names of all
+ * Verifying a signed object requires access to the SHA-1 names of all
  objects it references, even after the transition is complete and
  translation table is no longer needed for anything else. To support
- this, the design added fields such as "hash sha1 tree <sha1- name>"
- and "hash sha1 parent <sha1- name>" to the sha256- content of a signed
+ this, the design added fields such as "hash sha1 tree <SHA-1 name>"
+ and "hash sha1 parent <SHA-1 name>" to the SHA-256 content of a signed
  commit, complicating the conversion process.
- * Allowing signed objects without a sha1 (for after the transition is
+ * Allowing signed objects without a SHA-1 (for after the transition is
  complete) complicated the design further, requiring a "nohash sha1"
- field to suppress including "hash sha1" fields in the sha256- content
+ field to suppress including "hash sha1" fields in the SHA-256 content
  and signed payload.

  Lazily populated translation table
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Some of the work of building the translation table could be deferred to
  push time, but that would significantly complicate and slow down pushes.
- Calculating the sha1- name at object creation time at the same time it is
- being streamed to disk and having its sha256- name calculated should be
+ Calculating the SHA-1 name at object creation time at the same time it is
+ being streamed to disk and having its SHA-256 name calculated should be
  an acceptable cost.

  Document History
@@ -801,7 +801,7 @@ Incorporated suggestions from jonathantanmy and sbeller:


  * Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
- * Make sha3 -based signatures a separate field, avoiding the need for
+ * Make SHA3 -based signatures a separate field, avoiding the need for
  "hash" and "nohash" fields (thanks to peff[3]).
  * Add a sorting phase to fetch (thanks to Junio for noticing the need
  for this).