[otbn/hw] Add HPC gadgets to prims #29642

Open
h-filali wants to merge 5 commits into lowRISC:master from h-filali:otbn-hpc-gadgets

@h-filali

Description

This PR adds HPC gadgets to the OpenTitan primitives.

For details, see the following paper:
Cassiers, Gaëtan, et al. "Compress: Generate small and fast masked pipelined circuits." available here.

The first commit adds the gadgets themselves.
The following commits add scripts that are needed to evaluate the gadgets for leakage.

This PR provides the scripts to analyze the design using Coco Alma and PROLEAD.

In the following instructions, <gadget> can be one of:

  • hpc2o
  • hpc2_and
  • hpc3o
  • hpc3_and

Run Synthesis

To synthesize the design, run the following commands:

cd <REPO_TOP>/hw/ip/otbn/pre_syn
./syn_yosys_sec_add.sh <gadget>
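
To synthesize all four gadgets in one go, the command above can be wrapped in a loop. This is a minimal sketch, assuming the script takes exactly one gadget name per invocation as shown above. The `synth_all` helper and the `DRY_RUN` switch are hypothetical conveniences of this sketch and are not part of the PR.

```shell
# Gadget names as listed in the PR description.
GADGETS="hpc2o hpc2_and hpc3o hpc3_and"

# synth_all: hypothetical helper (not part of the PR).
# With DRY_RUN=1 it only prints the commands it would run.
# Otherwise it runs synthesis per gadget and stops at the first failure.
synth_all() {
  for gadget in $GADGETS; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "./syn_yosys_sec_add.sh $gadget"
    else
      ./syn_yosys_sec_add.sh "$gadget" || return 1
    fi
  done
}
```

Called from <REPO_TOP>/hw/ip/otbn/pre_syn, `synth_all` runs synthesis for each gadget in turn, while `DRY_RUN=1 synth_all` prints the four commands without executing them.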

Run Leakage Analysis with Alma

Make sure you have Alma installed. A guide on how to install Alma can be found in the Prerequisites section here.

To start the leakage analysis using Alma, run the following commands in your nix shell:

cd <REPO_TOP>
source util/build_consts.sh
cd ~/alma
source dev/bin/activate
~/opentitan/hw/ip/otbn/pre_sca/alma/verify_sec_add.sh <gadget>

Run Leakage Analysis with PROLEAD

To start the leakage analysis using PROLEAD, run the following commands:

# Clone the PROLEAD repo
git clone https://github.com/ChairImpSec/PROLEAD.git

# Start the PROLEAD dev shell
cd PROLEAD
nix-shell
make release

# Make sure package metadata is up to date
nix-channel --update

# Set up shell variables
source ../opentitan/util/build_consts.sh
cd ../opentitan/hw/ip/otbn/pre_sca/prolead

# Run analysis
./evaluate.sh <gadget>

This commit adds two new HPC gadgets that implement a secure AND
gate operating on masked bits.

The HPC2 and HPC3 modules each implement two versions of the gadget:
a Toffoli version, whose output is the masked result of
Z = W ^ (X & Y)

and a straightforward version, whose output is the masked result of
Z = X & Y

Signed-off-by: Hakim Filali <hfilali@lowrisc.org>
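
The functional relationship described in this commit message can be checked exhaustively for two shares. The sketch below is a plain software model, assuming the textbook HPC2-style share equations with a single fresh random bit r. It is not derived from the RTL in this PR, it omits all registers, and it says nothing about leakage. All function and variable names are hypothetical.

```shell
# masked_toffoli x0 x1 y0 y1 w0 w1 r
# Prints the two output shares "z0 z1" of Z = W ^ (X & Y), where
# X = x0 ^ x1, Y = y0 ^ y1, W = w0 ^ w1 and r is one fresh random bit.
# Assumed share equation (HPC2-style, registers omitted):
#   z_i = w_i ^ (x_i & y_i) ^ (!x_i & r) ^ (x_i & (y_j ^ r)),  j != i
masked_toffoli() {
  x0=$1; x1=$2; y0=$3; y1=$4; w0=$5; w1=$6; r=$7
  z0=$(( w0 ^ (x0 & y0) ^ ((1 ^ x0) & r) ^ (x0 & (y1 ^ r)) ))
  z1=$(( w1 ^ (x1 & y1) ^ ((1 ^ x1) & r) ^ (x1 & (y0 ^ r)) ))
  echo "$z0 $z1"
}

# check_all: recombine the shares for all 2^7 input combinations and
# compare against the unmasked reference W ^ (X & Y).
# Prints "ok" on success, or the first mismatching input index.
check_all() {
  v=0
  while [ "$v" -lt 128 ]; do
    a0=$(( v & 1 ));        a1=$(( (v >> 1) & 1 ))
    b0=$(( (v >> 2) & 1 )); b1=$(( (v >> 3) & 1 ))
    c0=$(( (v >> 4) & 1 )); c1=$(( (v >> 5) & 1 ))
    rnd=$(( (v >> 6) & 1 ))
    set -- $(masked_toffoli "$a0" "$a1" "$b0" "$b1" "$c0" "$c1" "$rnd")
    got=$(( $1 ^ $2 ))
    want=$(( (c0 ^ c1) ^ ((a0 ^ a1) & (b0 ^ b1)) ))
    if [ "$got" -ne "$want" ]; then
      echo "mismatch at v=$v"
      return 1
    fi
    v=$(( v + 1 ))
  done
  echo "ok"
}
```

Passing w0 = w1 = 0 to `masked_toffoli` reduces the model to the plain Z = X & Y version. A passing `check_all` only confirms that the shares recombine to the right value; the security of the real gadgets depends on the register placement, which is what the Alma and PROLEAD flows evaluate.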
@h-filali h-filali requested a review from a team as a code owner March 31, 2026 16:08
@h-filali h-filali requested review from andrea-caforio, etterli, nasahlpa, pamaury and vogelpi and removed request for a team and pamaury March 31, 2026 16:08
This commit adds a wrapper file per HPC gadget and mode.
These wrappers are needed for leakage analysis.

Signed-off-by: Hakim Filali <hfilali@lowrisc.org>
This commit adds the synthesis scripts for the HPC gadgets.
These scripts are needed for leakage analysis.

Signed-off-by: Hakim Filali <hfilali@lowrisc.org>
This commit adds all the necessary scripts to run leakage analysis
of the HPC gadgets on Coco Alma.

Signed-off-by: Hakim Filali <hfilali@lowrisc.org>
Signed-off-by: Hakim Filali <hfilali@lowrisc.org>

@etterli etterli left a comment

Would it be possible to add some ASCII diagrams showing what is computed where? I have a hard time mapping the RTL to the gadget algorithms.

// halt data propagation.
//
// For details, see the following paper:
// Cassiers, Gaëtan, et al. "Compress: Generate small and fast masked pipelined circuits." available at

NIT: Line length

Comment on lines +32 to +33
parameter bit EnW = 1'b0,
localparam int NumShares = 2

NIT: alignment of =

Comment on lines +83 to +92
prim_flop_en #(
.Width (2),
.ResetValue ('0)
) u_prim_flop_en_y_masked (
.clk_i(clk_i),
.rst_ni(rst_ni),
.en_i(en1_i),
.d_i('{y_masked_d[0], y_masked_d[1]}),
.q_o('{y_masked_q[0], y_masked_q[1]})
);

Shouldn't the width here be defined using NumShares, to avoid a mismatch between the localparam and the actual width? The same goes for the inputs and outputs.

NIT: Alignment of (). It should be the other way around: parameters have no space, and signals have aligned ().

There are several other flops which I think should be NumShares wide. I did not comment on all of them.

Comment on lines +94 to +103
prim_flop_en #(
.Width (1),
.ResetValue ('0)
) u_prim_flop_en_r (
.clk_i(clk_i),
.rst_ni(rst_ni),
.en_i(en1_i),
.d_i(r_i),
.q_o(r_q)
);

NIT: alignment

This also applies to the other module instantiations.

// HPC2o (EnW=1): XOR w and x_masked into the inner term before registering.
// HPC2 (EnW=0): register the inner term and x_masked separately, XOR after.
if (EnW) begin : gen_xor_w
logic xyw_masked[NumShares]; // x[i] & y_q[i] ^ w[i]

Is xy_masked = x[i] & y_q[i]? If so, (x[i] & y_q[i]) ^ w[i] would be helpful

Comment on lines +236 to +239
always_comb begin
unused_w = 1'b0;
for (int i = 0; i < NumShares; i++) unused_w ^= w_i[i];
end

Replace with the XOR reduction operator, which combines all bits of w_i:

assign unused_w = ^w_i;

// - Symmetric Latency: All inputs (x_i, y_i, w_i, r_i, rp_i) share a uniform 1-cycle latency.
// Unlike HPC2, no input needs to be presented early.
// - Stall Support: To support stallable pipelines, the `en_i` flip-flop
// enable signals can be safely deasserted to freeze the internal registers and

NIT: "enable signalS" -> "enable signal"

Comment on lines +74 to +83
prim_flop_en #(
.Width (2),
.ResetValue ('0)
) u_prim_flop_en_y_masked (
.clk_i(clk_i),
.rst_ni(rst_ni),
.en_i(en_i),
.d_i('{y_masked_d[0], y_masked_d[1]}),
.q_o('{y_masked_q[0], y_masked_q[1]})
);

Same remark here as in prim_hpc2.sv about using NumShares instead of 2. This applies to all flops in HPC3.

// HPC3o (EnW=1): XOR w into the inner term before registering.
// HPC3 (EnW=0): inner term is just (x&y_masked) ^ rp.
if (EnW) begin : gen_xor_w
logic xyw_masked[NumShares]; // x[i] & (y[i] ^ r) ^ w[i]

Add () to clearly state the operation order?

Comment on lines +142 to +145
always_comb begin
unused_w = 1'b0;
for (int i = 0; i < NumShares; i++) unused_w ^= w_i[i];
end

Use assign unused_w = ^w_i;
