extensions/sha2/circuit/README.md
+36 -19 (36 additions & 19 deletions)
@@ -1,26 +1,42 @@
 # SHA-2 VM Extension

-This crate contains the circuit for the SHA-2 VM extension.
+This crate contains circuits for the SHA-2 family of hash functions.
+We support SHA-256, SHA-512, and SHA-384.

-## SHA-256 Algorithm Summary
+## SHA-2 Algorithms Summary

-See the [FIPS standard](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf), in particular, section 6.2 for reference.
+The SHA-256, SHA-512, and SHA-384 algorithms are similar in structure.
+We will first describe the SHA-256 algorithm, and then describe the differences between the three algorithms.
+
+See the [FIPS standard](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.180-4.pdf) for reference, in particular sections 6.2, 6.4, and 6.5.

 In short, the SHA-256 algorithm works as follows.
 1. Pad the message to a multiple of 512 bits and split it into 512-bit 'blocks'.
-2. Initialize a hash state consisting of eight 32-bit words.
+2. Initialize a hash state consisting of eight 32-bit words to a specific constant value.
 3. For each block,
-    1. split the message into 16 32-bit words and produce 48 more 'message schedule' words based on them.
-    2. apply 64 'rounds' to update the hash state based on the message schedule.
+    1. split the block into 16 32-bit words and produce 48 more words based on them. The 16 message words together with the 48 additional words are called the 'message schedule'.
+    2. apply a scrambling function 64 times to the hash state to update it based on the message schedule. We call each update a 'round'.
     3. add the previous block's final hash state to the current hash state (modulo `2^32`).
 4. The output is the final hash state.

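The steps above can be sketched in plain Python. This is only an illustration of the FIPS 180-4 algorithm, not the circuit in this crate: the function and helper names are made up for the example, and the constants are derived from the fractional parts of square and cube roots of the first primes (as the standard defines them) instead of being copied from a table.

```python
import hashlib
import math

MASK = 0xFFFFFFFF  # all word arithmetic is modulo 2^32


def _primes(n):
    """Return the first n primes by trial division."""
    found = []
    candidate = 2
    while len(found) < n:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def _icbrt(x):
    """Integer cube root: the largest r with r**3 <= x."""
    lo, hi = 0, 1 << (x.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= x:
            lo = mid
        else:
            hi = mid - 1
    return lo


# First 32 fractional bits of the square roots of the first 8 primes.
H0 = [math.isqrt(p << 64) & MASK for p in _primes(8)]
# First 32 fractional bits of the cube roots of the first 64 primes.
K = [_icbrt(p << 96) & MASK for p in _primes(64)]


def _rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def sha256_sketch(message: bytes) -> bytes:
    # Step 1: pad with 0x80, zeros, and the 64-bit message length in bits.
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += (8 * len(message)).to_bytes(8, "big")

    state = list(H0)  # Step 2: initial hash state
    for off in range(0, len(padded), 64):  # Step 3: one 512-bit block at a time
        block = padded[off:off + 64]
        # Step 3.1: 16 message words plus 48 derived words
        w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
        for t in range(16, 64):
            s0 = _rotr(w[t - 15], 7) ^ _rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
            s1 = _rotr(w[t - 2], 17) ^ _rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
            w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
        # Step 3.2: 64 rounds over the working variables
        a, b, c, d, e, f, g, h = state
        for t in range(64):
            big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = (h + big_s1 + ch + K[t] + w[t]) & MASK
            big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (big_s0 + maj) & MASK
            a, b, c, d, e, f, g, h = (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g
        # Step 3.3: add the previous hash state word by word
        state = [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]
    # Step 4: the output is the final hash state
    return b"".join(s.to_bytes(4, "big") for s in state)


# Cross-check against the standard library implementation.
assert sha256_sketch(b"abc") == hashlib.sha256(b"abc").digest()
```

Steps 3.1 to 3.3 inside the loop correspond to what the round rows and the digest row constrain per block.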
+SHA-512 differs from SHA-256 in that:
+- it uses 64-bit words, 1024-bit blocks, performs 80 rounds, and produces a 512-bit output.
+- all the arithmetic is done modulo `2^64`.
+- the initial hash state is different.
+
+The SHA-384 algorithm is almost exactly a truncation of the SHA-512 output to 384 bits.
+The only difference is that the initial hash state is different.
+
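As a rough illustration of these differences, the per-variant constants can be collected in one small record. The `Sha2Params` name and its fields are hypothetical and are not the crate's actual types; the check at the end uses Python's `hashlib` to show that SHA-384 is not simply a truncated SHA-512 digest, because the initial hash state differs.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Sha2Params:
    """Hypothetical parameter record for one SHA-2 variant."""
    word_bits: int
    block_bits: int
    rounds: int
    digest_bits: int


SHA256 = Sha2Params(word_bits=32, block_bits=512, rounds=64, digest_bits=256)
SHA512 = Sha2Params(word_bits=64, block_bits=1024, rounds=80, digest_bits=512)
SHA384 = Sha2Params(word_bits=64, block_bits=1024, rounds=80, digest_bits=384)


def is_truncated_sha512(message: bytes) -> bool:
    """Return True if SHA-384(message) equals the first 384 bits of SHA-512(message)."""
    truncated = hashlib.sha512(message).digest()[: SHA384.digest_bits // 8]
    return hashlib.sha384(message).digest() == truncated


# The digests differ because the two algorithms start from different hash states.
assert not is_truncated_sha512(b"abc")
```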
 ## Design Overview

-This chip produces an AIR that consists of 17 rows for each block (512 bits) in the message, and no more rows.
-The first 16 rows of each block are called 'round rows', and each of them represents four rounds of the SHA-256 algorithm.
-Each row constrains updates to the working variables on each round, and it also constrains the message schedule words based on previous rounds.
-The final row is called a 'digest row' and it produces a final hash for the block, computed as the sum of the working variables and the previous block's final hash.
+We re-use the same AIR code to produce circuits for all three algorithms.
+To achieve this, we parameterize the AIR by constants (such as the word size, number of rounds, and block size) that are specific to each algorithm.
+
+This chip produces an AIR that consists of $R+1$ rows for each block of the message, and no more rows
+(for SHA-256, $R = 16$ and for SHA-512 and SHA-384, $R = 20$).
+The first $R$ rows of each block are called 'round rows', and each of them constrains four rounds of the hash algorithm.
+Each row constrains updates to the working variables on each round, and also constrains the message schedule words based on previous rounds.
+The final row of each block is called a 'digest row' and it produces a final hash for the block, computed as the sum of the working variables and the previous block's final hash.

 Note that this chip only supports messages of length less than `2^29` bytes.

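The row counts can be worked out with a few lines of arithmetic. The sketch below assumes that a block occupies `rounds / 4 + 1` rows, that the number of blocks follows the FIPS 180-4 padding rule (one `0x80` byte plus a length field of 8 bytes for SHA-256 or 16 bytes for SHA-512 and SHA-384), and that the trace is filled with dummy rows up to the next power of two. The function names are made up for this example.

```python
ROUNDS_PER_ROW = 4  # each round row constrains four rounds


def rows_per_block(rounds: int) -> int:
    """R round rows plus one digest row, where R = rounds / 4."""
    return rounds // ROUNDS_PER_ROW + 1


def num_blocks(message_len: int, block_bytes: int, length_field_bytes: int) -> int:
    """Blocks needed after padding: message, one 0x80 byte, and the length field."""
    padded_min = message_len + 1 + length_field_bytes
    return -(-padded_min // block_bytes)  # ceiling division


def trace_height(block_counts, rounds: int) -> int:
    """Rows used by all messages, rounded up to the next power of two."""
    used = sum(block_counts) * rows_per_block(rounds)
    return 1 << max(used - 1, 0).bit_length()


# SHA-256: 64 rounds, so 17 rows per block; SHA-512 and SHA-384: 80 rounds, so 21.
assert rows_per_block(64) == 17
assert rows_per_block(80) == 21
```

For instance, a single one-block SHA-256 message uses 17 rows, so under these assumptions the trace would hold 32 rows, 15 of which are dummy rows.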
@@ -50,7 +66,7 @@ Since we can reliably constrain values from four rounds ago, we can build up `in

 The last block of every message should have the `is_last_block` flag set to `1`.
 Note that `is_last_block` is not constrained to be true for the last block of every message; instead, it *defines* what the last block of a message is.
-For instance, if we produce an air with 10 blocks and only the last block has `is_last_block = 1` then the constraints will interpret it as a single message of length 10 blocks.
+For instance, if we produce a trace with 10 blocks and only the last block has `is_last_block = 1`, then the constraints will interpret it as a single message of length 10 blocks.
 If, however, we also set `is_last_block` to true for the 5th block, the trace will be interpreted as hashing two messages, each of length 5 blocks.

 Note that we do constrain, however, that the very last block of the trace has `is_last_block = 1`.
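To make the "flag defines the boundary" idea concrete, here is a small sketch that reads a list of per-block `is_last_block` flags and reports how many blocks each message spans. The `message_lengths` helper is hypothetical and only mirrors the interpretation described above; it is not code from the chip.

```python
def message_lengths(is_last_block):
    """Split a sequence of per-block flags into message lengths (in blocks)."""
    lengths = []
    current = 0
    for flag in is_last_block:
        current += 1
        if flag:
            # This block closes the current message.
            lengths.append(current)
            current = 0
    if current:
        # Mirrors the constraint that the very last block must have the flag set.
        raise ValueError("the very last block must have is_last_block = 1")
    return lengths


# Ten blocks, flag only on the last one: a single 10-block message.
assert message_lengths([0] * 9 + [1]) == [10]
# Flag on the 5th and 10th blocks: two messages of 5 blocks each.
assert message_lengths([0, 0, 0, 0, 1, 0, 0, 0, 0, 1]) == [5, 5]
```

Setting the flag on the 6th block instead would give messages of 6 and 4 blocks.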
@@ -63,11 +79,11 @@ We use this trick in several places in this chip.

 ### Block index counter variables

-There are two "block index" counter variables in each row of the air named `global_block_idx` and `local_block_idx`.
-Both of these variables take on the same value on all 17 rows in a block.
+There are two "block index" counter variables in each row named `global_block_idx` and `local_block_idx`.
+Both of these variables take on the same value on all $R+1$ rows in a block.

 The `global_block_idx` is the index of the block in the entire trace.
-The very first 17 rows in the trace will have `global_block_idx = 1` and the counter will increment by 1 between blocks.
+The very first block in the trace will have `global_block_idx = 1` on each row and the counter will increment by 1 between blocks.
 The padding rows will all have `global_block_idx = 0`.
 The `global_block_idx` is used in interaction constraints to constrain the value of `hash` between blocks.

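As an illustration of the `global_block_idx` description, the sketch below builds that single column for a whole trace. It only models the values stated above (1-based per block, constant across a block's rows, 0 on padding rows); the helper name and the idea of materialising the column as a Python list are assumptions made for this example.

```python
def global_block_idx_column(num_blocks: int, rows_per_block: int, trace_height: int):
    """Return the global_block_idx value for every row of the trace."""
    column = []
    for block in range(1, num_blocks + 1):
        # Every row of a block carries the same 1-based block index.
        column.extend([block] * rows_per_block)
    if len(column) > trace_height:
        raise ValueError("trace_height is too small for the given blocks")
    # Padding (dummy) rows all carry index 0.
    column.extend([0] * (trace_height - len(column)))
    return column


# Three SHA-256 blocks (17 rows each) in a trace of height 64.
column = global_block_idx_column(3, 17, 64)
assert column[:17] == [1] * 17
assert column[51:] == [0] * 13
```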
@@ -79,15 +95,16 @@ The `local_block_idx` is used to calculate the length of the message processed s

 ### VM air vs SubAir

-The SHA-256 VM extension chip uses the `Sha256Air` SubAir to help constrain the SHA-256 hash.
-The VM extension air constrains the correctness of the SHA message padding, while the SubAir adds all other constraints related to the hash algorithm.
+The SHA-2 VM extension chip uses the `Sha2Air` SubAir to help constrain the appropriate SHA-2 hash algorithm.
+The SubAir is also parameterized by the specific SHA-2 variant's constants.
+The VM extension AIR constrains the correctness of the message padding, while the SubAir adds all other constraints related to the hash algorithm.
 The VM extension air also constrains memory reads and writes.

 ### A gotcha about padding rows

 There are two senses of the word padding used in the context of this chip and this can be confusing.
-First, we use padding to refer to the extra bits added to the message that is input to the SHA-256 algorithm in order to make the input's length a multiple of 512 bits.
-So, we may use the term 'padding rows' to refer to round rows that correspond to the padded bits of a message (as in `Sha256VmAir::eval_padding_row`).
+First, we use padding to refer to the extra bits added to the message that is input to the hash algorithm in order to make the input's length a multiple of the block size.
+So, we may use the term 'padding rows' to refer to round rows that correspond to the padded bits of a message (as in `Sha2VmAir::eval_padding_row`).
 Second, the dummy rows that are added to the trace to make the trace height a power of 2 are also called padding rows (see the `is_padding_row` flag).
 In the SubAir, padding row probably means dummy row.
-In the VM air, it probably refers to SHA-256 padding.
+In the VM air, it probably refers to the message padding.