[otbn/hw] Add HPC gadgets to prims #29642
This commit adds two new HPC gadgets that implement a secure AND gate operating on masked bits. Both the HPC2 and HPC3 modules implement two versions of the gadget: a Toffoli version, where the output is the masked form of Z = W ^ (X & Y), and a straightforward version, where the output is the masked form of Z = X & Y. Signed-off-by: Hakim Filali <hfilali@lowrisc.org>
This commit adds a wrapper file per HPC gadget and mode. These wrappers are needed for leakage analysis. Signed-off-by: Hakim Filali <hfilali@lowrisc.org>
This commit adds the synthesis scripts for the HPC gadgets. These scripts are needed for leakage analysis. Signed-off-by: Hakim Filali <hfilali@lowrisc.org>
This commit adds all the necessary scripts to run leakage analysis of the HPC gadgets on Coco Alma. Signed-off-by: Hakim Filali <hfilali@lowrisc.org>
Signed-off-by: Hakim Filali <hfilali@lowrisc.org>
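The two gadget modes described in the first commit can be pinned down with a small reference check. The following is a minimal sketch, not code from this PR: it assumes 2-share Boolean masking (each secret bit is the XOR of its shares, matching `NumShares = 2` in the diff), and the names `hpc_ref_check`, `unmask`, and `expected_z` are hypothetical. It only states what the recombined output should equal; it does not model the gadget internals or their registers.

```systemverilog
// Hypothetical reference check, for illustration only; not part of the PR.
module hpc_ref_check;
  localparam int NumShares = 2;

  // Recombine Boolean shares: the secret bit is the XOR of all shares.
  function automatic logic unmask(input logic [NumShares-1:0] shares);
    return ^shares;
  endfunction

  // Expected unmasked output of a gadget.
  //   en_w = 0: plain AND,    Z = X & Y
  //   en_w = 1: Toffoli form, Z = W ^ (X & Y)
  function automatic logic expected_z(input logic [NumShares-1:0] x,
                                      input logic [NumShares-1:0] y,
                                      input logic [NumShares-1:0] w,
                                      input bit                   en_w);
    logic xy;
    xy = unmask(x) & unmask(y);
    return en_w ? (unmask(w) ^ xy) : xy;
  endfunction

  initial begin
    // x = 2'b10 -> X = 1, y = 2'b01 -> Y = 1, w = 2'b11 -> W = 0.
    assert (expected_z(2'b10, 2'b01, 2'b11, 1'b0) == 1'b1)
      else $fatal(1, "AND mode mismatch");
    assert (expected_z(2'b10, 2'b01, 2'b11, 1'b1) == 1'b1)
      else $fatal(1, "Toffoli mode mismatch (W = 0)");
    // w = 2'b01 -> W = 1, so the Toffoli output flips to 0.
    assert (expected_z(2'b10, 2'b01, 2'b01, 1'b1) == 1'b0)
      else $fatal(1, "Toffoli mode mismatch (W = 1)");
    $display("reference checks passed");
  end
endmodule
```

In a testbench, the XOR of the gadget's output shares could be compared against `expected_z` once the gadget's latency has elapsed.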
9e20a8b to edfade4
etterli left a comment
Would it be possible to add some ASCII diagrams showing what is computed where? I have a hard time mapping the RTL to the gadget algorithms.
// halt data propagation.
//
// For details, see the following paper:
// Cassiers, Gaëtan, et al. "Compress: Generate small and fast masked pipelined circuits." available at
parameter bit EnW = 1'b0,
localparam int NumShares = 2
prim_flop_en #(
  .Width (2),
  .ResetValue ('0)
) u_prim_flop_en_y_masked (
  .clk_i(clk_i),
  .rst_ni(rst_ni),
  .en_i(en1_i),
  .d_i('{y_masked_d[0], y_masked_d[1]}),
  .q_o('{y_masked_q[0], y_masked_q[1]})
);
Shouldn't the width here be defined using NumShares, to avoid a mismatch between the localparam and the actual width? The same goes for the inputs and outputs.
NIT: The alignment of the () should be the other way around: parameters have no space before the (), and the () of the signals are aligned.
There are several other flops which I think should be NumShares wide. I did not comment on all of them.
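The suggestion above can be sketched as follows. This is a minimal illustration rather than the PR's code: the wrapper module `hpc_share_reg_sketch` is hypothetical, and it assumes that the shares are held in packed vectors and that OpenTitan's `prim_flop_en` takes a `Width`-bit packed `d_i`/`q_o`, as the instantiations in the diff suggest.

```systemverilog
// Hypothetical wrapper, for illustration only; not code from this PR.
// Assumes OpenTitan's prim_flop_en with packed d_i/q_o of Width bits.
module hpc_share_reg_sketch #(
  parameter int NumShares = 2
) (
  input  logic                 clk_i,
  input  logic                 rst_ni,
  input  logic                 en1_i,
  input  logic [NumShares-1:0] y_masked_d,
  output logic [NumShares-1:0] y_masked_q
);

  // The flop width follows NumShares, so it cannot drift from the parameter.
  // Parameters have no space before the (); the () of the ports are aligned.
  prim_flop_en #(
    .Width(NumShares),
    .ResetValue('0)
  ) u_prim_flop_en_y_masked (
    .clk_i (clk_i),
    .rst_ni(rst_ni),
    .en_i  (en1_i),
    .d_i   (y_masked_d),
    .q_o   (y_masked_q)
  );

endmodule
```

If the shares stay in unpacked arrays, as the `'{...}` patterns in the diff suggest, a per-share `generate` loop would be another way to avoid the hard-coded width.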
prim_flop_en #(
  .Width (1),
  .ResetValue ('0)
) u_prim_flop_en_r (
  .clk_i(clk_i),
  .rst_ni(rst_ni),
  .en_i(en1_i),
  .d_i(r_i),
  .q_o(r_q)
);
This also applies to the other module instantiations.
// HPC2o (EnW=1): XOR w and x_masked into the inner term before registering.
// HPC2 (EnW=0): register the inner term and x_masked separately, XOR after.
if (EnW) begin : gen_xor_w
  logic xyw_masked[NumShares]; // x[i] & y_q[i] ^ w[i]
Is xy_masked = x[i] & y_q[i]? If so, writing (x[i] & y_q[i]) ^ w[i] would be helpful.
always_comb begin
  unused_w = 1'b0;
  for (int i = 0; i < NumShares; i++) unused_w ^= w_i[i];
end
Replace this with the reduction XOR operator, which combines all bits of w_i:
assign unused_w = ^w_i;
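As a stand-alone illustration of the suggested form, here is a minimal sketch. The module `unused_w_sketch` and the port name `unused_w_o` are hypothetical, and the sketch assumes `w_i` is a packed vector, because SystemVerilog reduction operators need an integral operand.

```systemverilog
// Hypothetical stand-alone module, for illustration only.
// Assumes w_i is a packed vector; reduction operators need an integral operand.
module unused_w_sketch #(
  parameter int NumShares = 2
) (
  input  logic [NumShares-1:0] w_i,
  output logic                 unused_w_o
);

  // Reduction XOR folds all bits of w_i into a single bit.
  assign unused_w_o = ^w_i;

endmodule
```

If `w_i` is instead declared as an unpacked array (for example `logic w_i[NumShares]`), the reduction operator does not apply to it directly; the existing loop or the array reduction method `w_i.xor()` would be the alternatives, subject to tool support.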
// - Symmetric Latency: All inputs (x_i, y_i, w_i, r_i, rp_i) share a uniform 1-cycle latency.
//   Unlike HPC2, no input needs to be presented early.
// - Stall Support: To support stallable pipelines, the `en_i` flip-flop
//   enable signals can be safely deasserted to freeze the internal registers and
NIT: "enable signalS" -> "enable signal"
prim_flop_en #(
  .Width (2),
  .ResetValue ('0)
) u_prim_flop_en_y_masked (
  .clk_i(clk_i),
  .rst_ni(rst_ni),
  .en_i(en_i),
  .d_i('{y_masked_d[0], y_masked_d[1]}),
  .q_o('{y_masked_q[0], y_masked_q[1]})
);
Same remark as in prim_hpc2.sv: use NumShares instead of 2 here. This applies to all flops in HPC3.
// HPC3o (EnW=1): XOR w into the inner term before registering.
// HPC3 (EnW=0): inner term is just (x&y_masked) ^ rp.
if (EnW) begin : gen_xor_w
  logic xyw_masked[NumShares]; // x[i] & (y[i] ^ r) ^ w[i]
Add parentheses to make the order of operations explicit?
always_comb begin
  unused_w = 1'b0;
  for (int i = 0; i < NumShares; i++) unused_w ^= w_i[i];
end
Use assign unused_w = ^w_i;
Description
This PR adds HPC gadgets to the OpenTitan primitives.
For details, see the following paper:
Cassiers, Gaëtan, et al. "Compress: Generate small and fast masked pipelined circuits." available here.
The first commit adds the gadgets themselves.
The following commits add scripts that are needed to evaluate the gadgets for leakage.
This PR provides the scripts to analyze the design using Coco Alma and PROLEAD.
In the following instructions, <gadget> can be one of the following:
Run Synthesis
To synthesize the design, run the following commands:
Run Leakage Analysis with Alma
Make sure you have Alma installed. A guide on how to install Alma can be found in the Prerequisites section here.
To start the leakage analysis using Alma, run the following commands in your nix shell:
Run Leakage Analysis with PROLEAD
To start the leakage analysis using PROLEAD, run the following commands: