Commit af9b1e9

tacker66 authored and gitster committed
doc hash-function-transition: use SHA-1 and SHA-256 consistently
Use SHA-1 and SHA-256 instead of sha1 and sha256 when referring to the hash type.

Signed-off-by: Thomas Ackermann <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent de82095 commit af9b1e9

1 file changed: +63 -63 lines changed

Documentation/technical/hash-function-transition.txt

Lines changed: 63 additions & 63 deletions
@@ -107,15 +107,15 @@ mapping to allow naming objects using either their SHA-1 and SHA-256 names
 interchangeably.
 
 "git cat-file" and "git hash-object" gain options to display an object
-in its sha1 form and write an object given its sha1 form. This
+in its SHA-1 form and write an object given its SHA-1 form. This
 requires all objects referenced by that object to be present in the
 object database so that they can be named using the appropriate name
 (using the bidirectional hash mapping).
 
 Fetches from a SHA-1 based server convert the fetched objects into
 SHA-256 form and record the mapping in the bidirectional mapping table
 (see below for details). Pushes to a SHA-1 based server convert the
-objects being pushed into sha1 form so the server does not have to be
+objects being pushed into SHA-1 form so the server does not have to be
 aware of the hash function the client is using.
 
 Detailed Design
@@ -151,38 +151,38 @@ repository extensions.
 
 Object names
 ~~~~~~~~~~~~
-Objects can be named by their 40 hexadecimal digit sha1-name or 64
-hexadecimal digit sha256-name, plus names derived from those (see
+Objects can be named by their 40 hexadecimal digit SHA-1 name or 64
+hexadecimal digit SHA-256 name, plus names derived from those (see
 gitrevisions(7)).
 
-The sha1-name of an object is the SHA-1 of the concatenation of its
-type, length, a nul byte, and the object's sha1-content. This is the
+The SHA-1 name of an object is the SHA-1 of the concatenation of its
+type, length, a nul byte, and the object's SHA-1 content. This is the
 traditional <sha1> used in Git to name objects.
 
-The sha256-name of an object is the SHA-256 of the concatenation of its
-type, length, a nul byte, and the object's sha256-content.
+The SHA-256 name of an object is the SHA-256 of the concatenation of its
+type, length, a nul byte, and the object's SHA-256 content.
 
 Object format
 ~~~~~~~~~~~~~
 The content as a byte sequence of a tag, commit, or tree object named
-by sha1 and sha256 differ because an object named by sha256-name refers to
-other objects by their sha256-names and an object named by sha1-name
-refers to other objects by their sha1-names.
+by SHA-1 and SHA-256 differ because an object named by SHA-256 name refers to
+other objects by their SHA-256 names and an object named by SHA-1 name
+refers to other objects by their SHA-1 names.
 
-The sha256-content of an object is the same as its sha1-content, except
-that objects referenced by the object are named using their sha256-names
-instead of sha1-names. Because a blob object does not refer to any
-other object, its sha1-content and sha256-content are the same.
+The SHA-256 content of an object is the same as its SHA-1 content, except
+that objects referenced by the object are named using their SHA-256 names
+instead of SHA-1 names. Because a blob object does not refer to any
+other object, its SHA-1 content and SHA-256 content are the same.
 
-The format allows round-trip conversion between sha256-content and
-sha1-content.
+The format allows round-trip conversion between SHA-256 content and
+SHA-1 content.
 
 Object storage
 ~~~~~~~~~~~~~~
 Loose objects use zlib compression and packed objects use the packed
 format described in Documentation/technical/pack-format.txt, just like
-today. The content that is compressed and stored uses sha256-content
-instead of sha1-content.
+today. The content that is compressed and stored uses SHA-256 content
+instead of SHA-1 content.
 
 Pack index
 ~~~~~~~~~~
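
The naming rule in the "Object names" text of the hunk above can be tried out with a short Python sketch. This is only an illustration, not code from the patch: the function name `object_name` is hypothetical, and the header layout (type, a space, the decimal length, then a nul byte) is the one Git uses for object headers.

```python
import hashlib


def object_name(obj_type: str, content: bytes, algo: str = "sha1") -> str:
    """Hash "<type> <length>\\0<content>" with the chosen hash function.

    `content` must already be in the matching form: SHA-1 content when
    algo is "sha1", SHA-256 content when algo is "sha256".
    """
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\0"
    return hashlib.new(algo, header + content).hexdigest()


# A blob refers to no other object, so the same bytes serve as both its
# SHA-1 content and its SHA-256 content; only the resulting names differ.
blob = b"hello\n"
sha1_name = object_name("blob", blob, "sha1")      # 40 hex digits
sha256_name = object_name("blob", blob, "sha256")  # 64 hex digits
```

For a tree, commit, or tag the two calls would need different input bytes, because the embedded references differ between the two content forms.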
@@ -287,18 +287,18 @@ To remove entries (e.g. in "git pack-refs" or "git-prune"):
 
 Translation table
 ~~~~~~~~~~~~~~~~~
-The index files support a bidirectional mapping between sha1-names
-and sha256-names. The lookup proceeds similarly to ordinary object
-lookups. For example, to convert a sha1-name to a sha256-name:
+The index files support a bidirectional mapping between SHA-1 names
+and SHA-256 names. The lookup proceeds similarly to ordinary object
+lookups. For example, to convert a SHA-1 name to a SHA-256 name:
 
 1. Look for the object in idx files. If a match is present in the
-idx's sorted list of truncated sha1-names, then:
-a. Read the corresponding entry in the sha1-name order to pack
+idx's sorted list of truncated SHA-1 names, then:
+a. Read the corresponding entry in the SHA-1 name order to pack
 name order mapping.
-b. Read the corresponding entry in the full sha1-name table to
+b. Read the corresponding entry in the full SHA-1 name table to
 verify we found the right object. If it is, then
-c. Read the corresponding entry in the full sha256-name table.
-That is the object's sha256-name.
+c. Read the corresponding entry in the full SHA-256 name table.
+That is the object's SHA-256 name.
 2. Check for a loose object. Read lines from loose-object-idx until
 we find a match.
 
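
The lookup steps in the hunk above can be sketched in Python as follows. Everything here is an assumption made for illustration: `PackIndex` is a hypothetical in-memory stand-in rather than the on-disk idx layout, the 8-digit truncation is arbitrary, and loose-object-idx lines are assumed to hold a SHA-256 name followed by a SHA-1 name.

```python
import bisect
from dataclasses import dataclass


@dataclass
class PackIndex:
    """Hypothetical in-memory model of one idx file."""
    short_sha1: list          # sorted list of truncated SHA-1 names
    sha1_to_pack_order: list  # slot i of short_sha1 -> pack-order position
    full_sha1: list           # full SHA-1 names, in pack order
    full_sha256: list         # full SHA-256 names, in pack order


def sha1_to_sha256(name, indexes, loose_lines, abbrev=8):
    key = name[:abbrev]
    # Step 1: search each idx's sorted list of truncated SHA-1 names.
    for idx in indexes:
        i = bisect.bisect_left(idx.short_sha1, key)
        while i < len(idx.short_sha1) and idx.short_sha1[i] == key:
            pos = idx.sha1_to_pack_order[i]   # step 1a
            if idx.full_sha1[pos] == name:    # step 1b: verify the match
                return idx.full_sha256[pos]   # step 1c
            i += 1
    # Step 2: fall back to scanning the loose object index.
    for line in loose_lines:
        if line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) == 2 and fields[1] == name:
            return fields[0]
    return None
```

The reverse direction (SHA-256 name to SHA-1 name) would follow the same shape with the roles of the two name tables swapped.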
@@ -312,10 +312,10 @@ Since all operations that make new objects (e.g., "git commit") add
 the new objects to the corresponding index, this mapping is possible
 for all objects in the object store.
 
-Reading an object's sha1-content
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The sha1-content of an object can be read by converting all sha256-names
-its sha256-content references to sha1-names using the translation table.
+Reading an object's SHA-1 content
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The SHA-1 content of an object can be read by converting all SHA-256 names
+its SHA-256 content references to SHA-1 names using the translation table.
 
 Fetch
 ~~~~~
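
For a tree object, the conversion described in the hunk above amounts to swapping the raw hash that follows each "mode name" entry. The Python sketch below illustrates this under stated assumptions: `convert_tree` is a hypothetical helper, and a plain dict keyed by raw hash bytes stands in for the translation table. The entry layout (mode, space, name, nul byte, raw hash) is the one Git uses for trees.

```python
SHA1_LEN = 20    # raw bytes in a SHA-1 name
SHA256_LEN = 32  # raw bytes in a SHA-256 name


def convert_tree(content: bytes, table: dict, src_len: int) -> bytes:
    """Rewrite every entry's raw hash using `table` (source hash -> target hash)."""
    out = bytearray()
    pos = 0
    while pos < len(content):
        nul = content.index(b"\0", pos)
        out += content[pos:nul + 1]              # "mode name\0" is copied unchanged
        src = content[nul + 1:nul + 1 + src_len]
        out += table[src]                        # KeyError if the object is unknown
        pos = nul + 1 + src_len
    return bytes(out)


# Round trip with made-up hash values.
sha256_raw = bytes(range(32))
sha1_raw = bytes(range(20))
tree_sha256 = b"100644 README\0" + sha256_raw
tree_sha1 = convert_tree(tree_sha256, {sha256_raw: sha1_raw}, SHA256_LEN)
restored = convert_tree(tree_sha1, {sha1_raw: sha256_raw}, SHA1_LEN)
```

Because only the hash fields change, applying the inverse mapping restores the original bytes, which is the round-trip property the design relies on. Commits and tags would need their own field-by-field handling, which this sketch does not cover.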
@@ -338,7 +338,7 @@ the following steps:
 1. index-pack: inflate each object in the packfile and compute its
 SHA-1. Objects can contain deltas in OBJ_REF_DELTA format against
 objects the client has locally. These objects can be looked up
-using the translation table and their sha1-content read as
+using the translation table and their SHA-1 content read as
 described above to resolve the deltas.
 2. topological sort: starting at the "want"s from the negotiation
 phase, walk through objects in the pack and emit a list of them,
@@ -347,12 +347,12 @@ the following steps:
 (This list only contains objects reachable from the "wants". If the
 pack from the server contained additional extraneous objects, then
 they will be discarded.)
-3. convert to sha256: open a new (sha256) packfile. Read the topologically
+3. convert to SHA-256: open a new SHA-256 packfile. Read the topologically
 sorted list just generated. For each object, inflate its
-sha1-content, convert to sha256-content, and write it to the sha256
-pack. Record the new sha1<-->sha256 mapping entry for use in the idx.
+SHA-1 content, convert to SHA-256 content, and write it to the SHA-256
+pack. Record the new SHA-1<-->SHA-256 mapping entry for use in the idx.
 4. sort: reorder entries in the new pack to match the order of objects
-in the pack the server generated and include blobs. Write a sha256 idx
+in the pack the server generated and include blobs. Write a SHA-256 idx
 file
 5. clean up: remove the SHA-1 based pack file, index, and
 topologically sorted list obtained from the server in steps 1
@@ -377,16 +377,16 @@ experimenting to get this to perform well.
 Push
 ~~~~
 Push is simpler than fetch because the objects referenced by the
-pushed objects are already in the translation table. The sha1-content
+pushed objects are already in the translation table. The SHA-1 content
 of each object being pushed can be read as described in the "Reading
-an object's sha1-content" section to generate the pack written by git
+an object's SHA-1 content" section to generate the pack written by git
 send-pack.
 
 Signed Commits
 ~~~~~~~~~~~~~~
 We add a new field "gpgsig-sha256" to the commit object format to allow
 signing commits without relying on SHA-1. It is similar to the
-existing "gpgsig" field. Its signed payload is the sha256-content of the
+existing "gpgsig" field. Its signed payload is the SHA-256 content of the
 commit object with any "gpgsig" and "gpgsig-sha256" fields removed.
 
 This means commits can be signed
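
The signed-payload rule in the "Signed Commits" text of the hunk above (the commit's SHA-256 content with any "gpgsig" and "gpgsig-sha256" fields removed) can be sketched in Python. This is an illustrative helper with a hypothetical name, not Git's implementation. It assumes the usual commit layout: header lines up to the first empty line, with multi-line header values continued on lines that start with a space.

```python
SIGNATURE_FIELDS = (b"gpgsig", b"gpgsig-sha256")


def signed_payload(commit_content: bytes) -> bytes:
    """Return the commit content without its signature header fields."""
    headers, sep, message = commit_content.partition(b"\n\n")
    kept = []
    skipping = False
    for line in headers.split(b"\n"):
        if line.startswith(b" "):
            # Continuation line: it shares the fate of the field it extends.
            if not skipping:
                kept.append(line)
            continue
        field = line.split(b" ", 1)[0]
        skipping = field in SIGNATURE_FIELDS
        if not skipping:
            kept.append(line)
    return b"\n".join(kept) + sep + message
```

The commit message itself is passed through untouched, so only header fields are ever dropped.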
@@ -404,7 +404,7 @@ Signed Tags
 ~~~~~~~~~~~
 We add a new field "gpgsig-sha256" to the tag object format to allow
 signing tags without relying on SHA-1. Its signed payload is the
-sha256-content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
+SHA-256 content of the tag with its gpgsig-sha256 field and "-----BEGIN PGP
 SIGNATURE-----" delimited in-body signature removed.
 
 This means tags can be signed
@@ -416,11 +416,11 @@ This means tags can be signed
 
 Mergetag embedding
 ~~~~~~~~~~~~~~~~~~
-The mergetag field in the sha1-content of a commit contains the
-sha1-content of a tag that was merged by that commit.
+The mergetag field in the SHA-1 content of a commit contains the
+SHA-1 content of a tag that was merged by that commit.
 
-The mergetag field in the sha256-content of the same commit contains the
-sha256-content of the same tag.
+The mergetag field in the SHA-256 content of the same commit contains the
+SHA-256 content of the same tag.
 
 Submodules
 ~~~~~~~~~~
@@ -495,7 +495,7 @@ Caveats
 -------
 Invalid objects
 ~~~~~~~~~~~~~~~
-The conversion from sha1-content to sha256-content retains any
+The conversion from SHA-1 content to SHA-256 content retains any
 brokenness in the original object (e.g., tree entry modes encoded with
 leading 0, tree objects whose paths are not sorted correctly, and
 commit objects without an author or committer). This is a deliberate
@@ -514,15 +514,15 @@ allow lifting this restriction.
 
 Alternates
 ~~~~~~~~~~
-For the same reason, a sha256 repository cannot borrow objects from a
-sha1 repository using objects/info/alternates or
+For the same reason, a SHA-256 repository cannot borrow objects from a
+SHA-1 repository using objects/info/alternates or
 $GIT_ALTERNATE_OBJECT_REPOSITORIES.
 
 git notes
 ~~~~~~~~~
-The "git notes" tool annotates objects using their sha1-name as key.
+The "git notes" tool annotates objects using their SHA-1 name as key.
 This design does not describe a way to migrate notes trees to use
-sha256-names. That migration is expected to happen separately (for
+SHA-256 names. That migration is expected to happen separately (for
 example using a file at the root of the notes tree to describe which
 hash it uses).
 
@@ -556,7 +556,7 @@ unclear:
 
 Git 2.12
 
-Does this mean Git v2.12.0 is the commit with sha1-name
+Does this mean Git v2.12.0 is the commit with SHA-1 name
 e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7 or the commit with
 new-40-digit-hash-name e7e07d5a4fcc2a203d9873968ad3e6bd4d7419d7?
 
@@ -676,7 +676,7 @@ The next step is supporting fetches and pushes to SHA-1 repositories:
 - allow pushes to a repository using the compat format
 - generate a topologically sorted list of the SHA-1 names of fetched
 objects
-- convert the fetched packfile to sha256 format and generate an idx
+- convert the fetched packfile to SHA-256 format and generate an idx
 file
 - re-sort to match the order of objects in the fetched packfile
 
@@ -748,38 +748,38 @@ using the old hash function.
 Signed objects with multiple hashes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Instead of introducing the gpgsig-sha256 field in commit and tag objects
-for sha256-content based signatures, an earlier version of this design
-added "hash sha256 <sha256-name>" fields to strengthen the existing
-sha1-content based signatures.
+for SHA-256 content based signatures, an earlier version of this design
+added "hash sha256 <SHA-256 name>" fields to strengthen the existing
+SHA-1 content based signatures.
 
 In other words, a single signature was used to attest to the object
 content using both hash functions. This had some advantages:
 
 * Using one signature instead of two speeds up the signing process.
 * Having one signed payload with both hashes allows the signer to
-attest to the sha1-name and sha256-name referring to the same object.
+attest to the SHA-1 name and SHA-256 name referring to the same object.
 * All users consume the same signature. Broken signatures are likely
 to be detected quickly using current versions of git.
 
 However, it also came with disadvantages:
 
-* Verifying a signed object requires access to the sha1-names of all
+* Verifying a signed object requires access to the SHA-1 names of all
 objects it references, even after the transition is complete and
 translation table is no longer needed for anything else. To support
-this, the design added fields such as "hash sha1 tree <sha1-name>"
-and "hash sha1 parent <sha1-name>" to the sha256-content of a signed
+this, the design added fields such as "hash sha1 tree <SHA-1 name>"
+and "hash sha1 parent <SHA-1 name>" to the SHA-256 content of a signed
 commit, complicating the conversion process.
-* Allowing signed objects without a sha1 (for after the transition is
+* Allowing signed objects without a SHA-1 (for after the transition is
 complete) complicated the design further, requiring a "nohash sha1"
-field to suppress including "hash sha1" fields in the sha256-content
+field to suppress including "hash sha1" fields in the SHA-256 content
 and signed payload.
 
 Lazily populated translation table
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Some of the work of building the translation table could be deferred to
 push time, but that would significantly complicate and slow down pushes.
-Calculating the sha1-name at object creation time at the same time it is
-being streamed to disk and having its sha256-name calculated should be
+Calculating the SHA-1 name at object creation time at the same time it is
+being streamed to disk and having its SHA-256 name calculated should be
 an acceptable cost.
 
 Document History
@@ -801,7 +801,7 @@ Incorporated suggestions from jonathantanmy and sbeller:
 
 
 * Use SHA3-256 instead of SHA2 (thanks, Linus and brian m. carlson).[1][2]
-* Make sha3-based signatures a separate field, avoiding the need for
+* Make SHA3-based signatures a separate field, avoiding the need for
 "hash" and "nohash" fields (thanks to peff[3]).
 * Add a sorting phase to fetch (thanks to Junio for noticing the need
 for this).
