| 1 | +# DFT / Scan in ORFS (OpenROAD-flow-scripts) — What We Changed + How To Reproduce |
| 2 | + |
| 3 | +Quickstart: `doc-DFT-howto.md` |
| 4 | + |
| 5 | +This document summarizes the DFT/scan-chain work done in this repo. **OpenROAD commit `7bc521f36a` is treated as the baseline**, and the final solution is required to work with **vanilla OpenSTA** (no `src/sta` parser changes needed). |
| 6 | + |
| 7 | +## Goal / Scope |
| 8 | + |
| 9 | +- Make OpenROAD’s DFT scan insertion usable in ORFS: |
| 10 | + - `scan_replace` converts functional flops → scan flops. |
| 11 | + - `execute_dft_plan` stitches scan chains using placement (wirelength-aware). |
| 12 | +- Ensure it works with **vanilla OpenSTA** (no OpenSTA parser patches required). |
| 13 | +- Provide a practical way to compare: |
| 14 | + - **7bc521 “baseline DFT”** (broken / mostly no-op) vs |
| 15 | + - **fixed DFT** (actually produces scan flops + stitched chains), |
| 16 | + - using QoR proxies and a scan-chain “TSP-like” cost metric. |
| 17 | + |
| 18 | +## Baselines, Branches, and Key Commits |
| 19 | + |
| 20 | +### OpenROAD submodule (`tools/OpenROAD`) |
| 21 | + |
| 22 | +- Baseline reference branch: `orfs-baseline-7bc521` |
| 23 | + - pinned at `7bc521f36a` |
| 24 | +- Fixed DFT/scan branch (vanilla OpenSTA): `orfs-dft-scan` |
| 25 | + - `5649f22868` |
| 26 | +- Older variant (kept for history): `orfs-dft-scan-with-opensta` |
| 27 | + - `5d3e1e243c` |
| 28 | + |
| 29 | +### OpenSTA submodule (`tools/OpenROAD/src/sta`) |
| 30 | + |
| 31 | +- Vanilla OpenSTA used by baseline and final solution: |
| 32 | + - `d7cb9be1` |
| 33 | +- A prior OpenSTA parser patch was made (not required for the final approach) and preserved: |
| 34 | + - branch `orfs-sta-scan-nextstate-6d62008a` at commit `6d62008a` |
| 35 | + |
| 36 | +### ORFS top-level (this repo) |
| 37 | + |
| 38 | +- ORFS commit `84cc6b71d`: |
| 39 | + - bumps `tools/OpenROAD` gitlink to `5649f22868` |
| 40 | + - adds DFT hook scripts under `flow/scripts/` |
| 41 | + |
| 42 | +## What Was Fixed in OpenROAD DFT |
| 43 | + |
| 44 | +### 1) Make scan-cell recognition work with vanilla OpenSTA |
| 45 | + |
| 46 | +Problem: |
| 47 | +- Libraries (e.g., Nangate45) tag scan pins via Liberty `nextstate_type` like `scan_in`, `scan_enable`. |
| 48 | +- Vanilla OpenSTA does **not** reliably surface those pins as `test_scan_*` scan-signal types. |
| 49 | +- Result: OpenROAD DFT often can’t identify scan pins → scan cells not recognized → chains not built. |
| 50 | + |
| 51 | +Fix (final approach): |
| 52 | +- In `tools/OpenROAD/src/dbSta/src/dbSta.cc`, added **fallback inference** by common pin names: |
| 53 | + - enable: `SE`, `SCE`, `SCAN_EN`, `SCAN_ENABLE`, `SCANENABLE` |
| 54 | + - in: `SI`, `SCD`, `SCAN_IN`, `SCANIN` |
| 55 | + - out: `SO`, `SCO`, `SCAN_OUT`, `SCANOUT` |
| 56 | +- This allows DFT to identify scan pins without requiring any `tools/OpenROAD/src/sta` changes. |
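The name-based fallback above can be illustrated with a small Python sketch. This is not the C++ code in `dbSta.cc`; the function name `infer_scan_role`, the role labels, and the case-insensitive match are assumptions made for illustration. Only the pin-name lists come from this document.

```python
# Illustrative Python mirror of a name-based scan-pin fallback.
# The real logic is C++ in dbSta.cc and may differ in detail.
SCAN_PIN_NAMES = {
    "scan_enable": {"SE", "SCE", "SCAN_EN", "SCAN_ENABLE", "SCANENABLE"},
    "scan_in": {"SI", "SCD", "SCAN_IN", "SCANIN"},
    "scan_out": {"SO", "SCO", "SCAN_OUT", "SCANOUT"},
}


def infer_scan_role(pin_name):
    """Return 'scan_enable', 'scan_in', 'scan_out', or None for a pin name."""
    key = pin_name.upper()  # assumption: matching ignores case
    for role, names in SCAN_PIN_NAMES.items():
        if key in names:
            return role
    return None  # not a recognized scan pin name


roles = {pin: infer_scan_role(pin) for pin in ["D", "SI", "SE", "CK", "Q"]}
```

A fallback like this would only apply when the Liberty metadata does not already identify the pin, so libraries that tag scan pins properly are unaffected.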
| 57 | + |
| 58 | +### 2) Fix scan stitching correctness |
| 59 | + |
| 60 | +Problem: |
| 61 | +- Baseline stitching logic had an iteration/pop bug in `ScanStitch.cpp` that could skip/omit links. |
| 62 | + |
| 63 | +Fix: |
| 64 | +- Rewrote the scan-cell linking loop to deterministically connect: |
| 65 | + - `scan_in_driver -> first.SI` |
| 66 | + - `cell[i-1].SO -> cell[i].SI` for all i |
| 67 | + - `last.SO -> scan_out_load` |
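The three connection rules above can be sketched in Python as follows. This is a minimal model, not the code in `ScanStitch.cpp`: the function name `stitch_links`, the string form of the pins, and the choice to return no links for an empty chain are assumptions.

```python
def stitch_links(scan_in_driver, cells, scan_out_load):
    """Return the (driver, load) pairs for one chain; cells is the ordered instance list."""
    if not cells:
        return []  # assumption: an empty chain produces no links
    links = [(scan_in_driver, f"{cells[0]}.SI")]      # scan_in_driver -> first.SI
    for prev, cur in zip(cells, cells[1:]):
        links.append((f"{prev}.SO", f"{cur}.SI"))     # cell[i-1].SO -> cell[i].SI
    links.append((f"{cells[-1]}.SO", scan_out_load))  # last.SO -> scan_out_load
    return links


links = stitch_links("scan_in_0", ["ff_a", "ff_b", "ff_c"], "scan_out_0")
```

A chain of `n` cells always yields `n + 1` links, which is a cheap invariant to check when a skipped link is suspected.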
| 68 | + |
| 69 | +### 3) Remove reliance on `sta::TestCell` and improve scan-out handling |
| 70 | + |
| 71 | +Changes in the earlier “with-opensta” variant that were retained/improved: |
| 72 | +- Stop depending on `sta::TestCell` objects. |
| 73 | +- Use `getLibertyScanIn/Enable/Out()` helpers instead. |
| 74 | +- Add scan-out fallback to `Q` if scan-out pin metadata isn’t tagged. |
| 75 | + |
| 76 | +### 4) Add/enable regression coverage |
| 77 | + |
| 78 | +- Added a DFT regression `scan_architect_no_mix_nangate45` and wired it into OpenROAD’s CMake test setup. |
| 79 | +- Verified DFT tests pass in the fixed OpenROAD build (`ctest -R '^dft\.'`). |
| 80 | + |
| 81 | +## ORFS Integration (How DFT Is Hooked Into the Flow) |
| 82 | + |
| 83 | +Two ORFS hook scripts were added: |
| 84 | + |
| 85 | +- `flow/scripts/dft_scan_post_floorplan.tcl` |
| 86 | + - Intended to be set via `POST_FLOORPLAN_TCL=...` |
| 87 | + - Runs: |
| 88 | + - `set_dft_config -max_chains 1 -clock_mixing clock_mix` |
| 89 | + - `scan_replace` |
| 90 | + - creates ports: `scan_enable_0`, `scan_in_0`, `scan_out_0` |
| 91 | + - `set_case_analysis 0 [get_ports scan_enable_0]` (functional-mode assumption) |
| 92 | + |
| 93 | +- `flow/scripts/dft_scan_pre_global_route.tcl` |
| 94 | + - Intended to be set via `PRE_GLOBAL_ROUTE_TCL=...` |
| 95 | + - Runs: |
| 96 | + - `set_dft_config ...` (must match) |
| 97 | + - `set_case_analysis 0 ...` |
| 98 | + - `execute_dft_plan` (stitch chains) |
| 99 | + |
| 100 | +Notes: |
| 101 | +- This wiring is **opt-in**: you enable it by setting `POST_FLOORPLAN_TCL` and `PRE_GLOBAL_ROUTE_TCL` when you run `make -C flow ...`. |
| 102 | +- The scripts currently hardcode `-max_chains 1` to keep scan-port count stable for comparisons (this is intentionally worst-case for scan wiring impact). |
| 103 | + |
| 104 | +## Reproduction: Baseline vs Fixed DFT (QoR Proxy Comparison) |
| 105 | + |
| 106 | +### Design used |
| 107 | + |
| 108 | +- `nangate45/ibex` (`flow/designs/nangate45/ibex/config.mk`) |
| 109 | + |
| 110 | +### OpenROAD executables used |
| 111 | + |
| 112 | +- Fixed OpenROAD (DFT works): `tools/OpenROAD/build_gate7bc521/bin/openroad` (reports `v2.0-26262-g5649f22868`) |
| 113 | +- Baseline OpenROAD 7bc521 (DFT mostly broken): `tools/OpenROAD_7bc521/build_gate7bc521/bin/openroad` |
| 114 | + - built from a detached worktree at `7bc521f36a` (version string prints `HEAD-HASH-NOTFOUND` due to git-describe failure in that worktree) |
| 115 | + |
| 116 | +### Flow commands |
| 117 | + |
| 118 | +- No-DFT reference (run with the fixed OpenROAD executable, no DFT hooks set): |
| 119 | + - `make -C flow DESIGN_CONFIG=./designs/nangate45/ibex/config.mk FLOW_VARIANT=qor_scan_base_20260104 OPENROAD_EXE=$(pwd)/tools/OpenROAD/build_gate7bc521/bin/openroad finish` |
| 120 | +- Fixed DFT enabled: |
| 121 | + - `make -C flow DESIGN_CONFIG=./designs/nangate45/ibex/config.mk FLOW_VARIANT=qor_scan_dft_20260104 OPENROAD_EXE=$(pwd)/tools/OpenROAD/build_gate7bc521/bin/openroad POST_FLOORPLAN_TCL=$(pwd)/flow/scripts/dft_scan_post_floorplan.tcl PRE_GLOBAL_ROUTE_TCL=$(pwd)/flow/scripts/dft_scan_pre_global_route.tcl finish` |
| 122 | +- Baseline OpenROAD 7bc521 “DFT enabled” (shows it’s broken/no-op): |
| 123 | + - `make -C flow DESIGN_CONFIG=./designs/nangate45/ibex/config.mk FLOW_VARIANT=qor_scan_dft_or7bc521_20260104 OPENROAD_EXE=$(pwd)/tools/OpenROAD_7bc521/build_gate7bc521/bin/openroad POST_FLOORPLAN_TCL=$(pwd)/flow/scripts/dft_scan_post_floorplan.tcl PRE_GLOBAL_ROUTE_TCL=$(pwd)/flow/scripts/dft_scan_pre_global_route.tcl finish` |
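The three invocations differ only in `FLOW_VARIANT`, `OPENROAD_EXE`, and whether the two hook variables are set. A hypothetical Python helper (not part of this repo; `orfs_make_args` is an illustrative name) that assembles the same argument lists might look like this:

```python
import os


def orfs_make_args(variant, openroad_exe, dft=False, repo=None):
    """Build the argv for one ORFS run; repo stands in for $(pwd)."""
    repo = repo or os.getcwd()
    args = [
        "make", "-C", "flow",
        "DESIGN_CONFIG=./designs/nangate45/ibex/config.mk",
        f"FLOW_VARIANT={variant}",
        f"OPENROAD_EXE={repo}/{openroad_exe}",
    ]
    if dft:
        # Opt-in DFT hooks: scan_replace after floorplan, stitching before global route.
        args += [
            f"POST_FLOORPLAN_TCL={repo}/flow/scripts/dft_scan_post_floorplan.tcl",
            f"PRE_GLOBAL_ROUTE_TCL={repo}/flow/scripts/dft_scan_pre_global_route.tcl",
        ]
    return args + ["finish"]


fixed_exe = "tools/OpenROAD/build_gate7bc521/bin/openroad"
base_args = orfs_make_args("qor_scan_base_20260104", fixed_exe, repo="/repo")
dft_args = orfs_make_args("qor_scan_dft_20260104", fixed_exe, dft=True, repo="/repo")
```

The resulting list can be passed to `subprocess.run(args, check=True)` from the repo root.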
| 124 | + |
| 125 | +### QoR snapshot (finish metrics) |
| 126 | + |
| 127 | +From `flow/logs/nangate45/ibex/<variant>/6_report.json` and `5_2_route.json`: |
| 128 | + |
| 129 | +- no DFT (`qor_scan_base_20260104`): |
| 130 | + - instance area `29091.6` |
| 131 | + - sequential area `10065.2` |
| 132 | + - total power `0.0960477` |
| 133 | + - setup WS `-0.0211463` |
| 134 | + - detailed-route WL `256015` |
| 135 | +- “DFT enabled” but OpenROAD 7bc521 broken (`qor_scan_dft_or7bc521_20260104`): |
| 136 | + - instance area `29106` |
| 137 | + - sequential area `10065.2` |
| 138 | + - total power `0.0961684` |
| 139 | + - setup WS `-0.0240458` |
| 140 | + - detailed-route WL `256612` |
| 141 | + - `report_dft_plan` shows **0 chains** (scan cells not recognized) |
| 142 | +- fixed DFT (`qor_scan_dft_20260104`): |
| 143 | + - instance area `31790.5` (+~9.3%) |
| 144 | + - sequential area `12702.3` (+~26.2%) |
| 145 | + - total power `0.0995975` (+~3.7%) |
| 146 | + - setup WS `-0.029858` |
| 147 | + - detailed-route WL `278902` (+~8.9%) |
| 148 | + - `report_dft_plan` shows **1 chain / 1931 scan cells** |
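The percentage overheads quoted for the fixed-DFT run follow directly from the listed values. The short check below recomputes them; the dictionary keys are illustrative labels, not the actual JSON field names in the report files.

```python
# Values copied from the QoR snapshot above (no-DFT run vs fixed-DFT run).
no_dft = {
    "instance_area": 29091.6,
    "sequential_area": 10065.2,
    "total_power": 0.0960477,
    "detailed_route_wl": 256015,
}
fixed_dft = {
    "instance_area": 31790.5,
    "sequential_area": 12702.3,
    "total_power": 0.0995975,
    "detailed_route_wl": 278902,
}


def pct_delta(base, new):
    """Relative change of new versus base, in percent."""
    return 100.0 * (new - base) / base


deltas = {k: round(pct_delta(no_dft[k], fixed_dft[k]), 1) for k in no_dft}
for name, pct in deltas.items():
    print(f"{name}: +{pct}%")
```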
| 149 | + |
| 150 | +Interpretation: |
| 151 | +- DFT vs no-DFT: PPA degrades as expected, because scan flops are larger than the functional flops they replace and the new scan nets add routing. |
| 152 | +- The meaningful “DFT QoR” comparison is therefore DFT-vs-DFT (reducing overhead relative to naive chain ordering or too few chains), not DFT vs no-DFT. |
| 153 | + |
| 154 | +## “Traveling Salesman”-Style Metric (Scan Chain Cost) |
| 155 | + |
| 156 | +To quantify “how good” a scan chain ordering is, we added: |
| 157 | + |
| 158 | +- `flow/util/scan_chain_cost.py` |
| 159 | + |
| 160 | +What it computes: |
| 161 | +- Parses OpenROAD `report_dft_plan -verbose` to get the scan-cell order per chain. |
| 162 | +- Extracts placed instance locations (DEF `PLACED` coordinates) by writing a DEF (`write_def`) and parsing the `COMPONENTS` section. |
| 163 | +- Computes: |
| 164 | + - chain path length = sum Manhattan distance between consecutive scan cells in that order |
| 165 | + - a naive baseline = same cost for lexicographic instance order (`sorted(inst_names)`) |
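A minimal sketch of the two quantities, on made-up coordinates. The helper names and the four toy instances are illustrative; the real script obtains order and locations from `report_dft_plan -verbose` and the DEF.

```python
def manhattan(a, b):
    """Manhattan distance between two (x, y) points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def chain_length(order, loc):
    """Sum of Manhattan distances between consecutive cells in the given order."""
    return sum(manhattan(loc[p], loc[q]) for p, q in zip(order, order[1:]))


# Toy placement in microns (hypothetical instance names and coordinates).
loc = {"ff_a": (0, 0), "ff_b": (10, 0), "ff_c": (0, 5), "ff_d": (10, 5)}

planned = ["ff_a", "ff_c", "ff_d", "ff_b"]    # order as a DFT plan might report it
planned_cost = chain_length(planned, loc)     # 5 + 10 + 5
naive_cost = chain_length(sorted(loc), loc)   # lexicographic order: 10 + 15 + 10
ratio = naive_cost / planned_cost
```

The ratio of naive to planned cost is the figure reported as, for example, `7.532x` in the observed results.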
| 166 | + |
| 167 | +Note: |
| 168 | +- The metric intentionally uses DEF `PLACED` coordinates (OpenDB `dbInst::getLocation()`), since `dbInst::getOrigin()` is orientation-dependent (e.g. MX/MY) and can skew comparisons/optimization. |
| 169 | + |
| 170 | +### Example usage |
| 171 | + |
| 172 | +- On a DFT-run placed DB: |
| 173 | + - `python3 flow/util/scan_chain_cost.py --openroad tools/OpenROAD/build_gate7bc521/bin/openroad --liberty flow/platforms/nangate45/lib/NangateOpenCellLibrary_typical.lib --odb flow/results/nangate45/ibex/qor_scan_dft_20260104/3_5_place_dp.odb --sdc flow/results/nangate45/ibex/qor_scan_dft_20260104/3_place.sdc` |
| 174 | +- With a simple “TSP-ish” nearest-neighbor baseline: |
| 175 | + - `python3 flow/util/scan_chain_cost.py --nearest-neighbor --openroad tools/OpenROAD/build_gate7bc521/bin/openroad --liberty flow/platforms/nangate45/lib/NangateOpenCellLibrary_typical.lib --odb flow/results/nangate45/ibex/qor_scan_dft_20260104/3_5_place_dp.odb --sdc flow/results/nangate45/ibex/qor_scan_dft_20260104/3_place.sdc` |
| 176 | +- On a no-DFT placed DB (compute hypothetical scan cost by doing `scan_replace` in-memory, without re-placement): |
| 177 | + - `python3 flow/util/scan_chain_cost.py --openroad tools/OpenROAD/build_gate7bc521/bin/openroad --liberty flow/platforms/nangate45/lib/NangateOpenCellLibrary_typical.lib --odb flow/results/nangate45/ibex/qor_scan_base_20260104/3_5_place_dp.odb --sdc flow/results/nangate45/ibex/qor_scan_base_20260104/3_place.sdc --scan-replace` |
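For reference, a greedy nearest-neighbor ordering of the kind the `--nearest-neighbor` option refers to can be sketched as below. How `scan_chain_cost.py` actually picks the start cell and breaks ties is not described here, so both choices in this sketch (explicit start, name-based tie-break) are assumptions.

```python
def nearest_neighbor_order(loc, start):
    """Greedy ordering: always move to the closest unvisited cell (Manhattan distance)."""
    remaining = set(loc) - {start}
    order = [start]
    while remaining:
        cx, cy = loc[order[-1]]
        # Tie-break on the instance name so the result is deterministic.
        nxt = min(
            remaining,
            key=lambda n: (abs(loc[n][0] - cx) + abs(loc[n][1] - cy), n),
        )
        order.append(nxt)
        remaining.remove(nxt)
    return order


# Hypothetical placement: two clusters along a row.
loc = {"a": (0, 0), "b": (10, 0), "c": (1, 0), "d": (9, 0)}
nn_order = nearest_neighbor_order(loc, "a")
```

This version is quadratic in the number of cells, which is acceptable for chains of a few thousand flops.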
| 178 | + |
| 179 | +### Observed results (ibex, 1 chain) |
| 180 | + |
| 181 | +- DFT placed: `manhattan_um=11990.000`, naive lexicographic `90306.230` (ratio `7.532x`) |
| 182 | +- no-DFT placed + hypothetical scan: `manhattan_um=11982.700`, naive `93778.840` (ratio `7.826x`) |
| 183 | + |
| 184 | +Why DFT vs no-DFT chain cost is similar here: |
| 185 | +- The scan chain cost is dominated by **where flops are placed** in the design. |
| 186 | +- For `ibex` at this utilization, scan insertion didn’t significantly perturb placement, so the chain path length barely changes. |
| 187 | + |
| 188 | +What *does* show up clearly: |
| 189 | +- The new scan/control nets. Example (fixed DFT, routed DB): |
| 190 | + - `report_wire_length -net {scan_enable_0} -detailed_route` → `8033.45um` |
| 191 | + |
| 192 | +### Optimizer benchmark (OpenROAD opt vs nearest-neighbor) |
| 193 | + |
| 194 | +On the 9-design suite (`aes/ibex/jpeg × nangate45/asap7/sky130hd`), using `flow/util/scan_chain_cost.py --scan-replace --nearest-neighbor` on placed ODBs (so scan flops are inserted in-memory, then the chain is planned/ordered from the placement database): |
| 195 | + |
| 196 | +| platform | design | cells | OpenROAD opt (um) | NN (um) | opt/NN | |
| 197 | +| --- | --- | ---: | ---: | ---: | ---: | |
| 198 | +| nangate45 | aes | 562 | 3571.680 | 4178.080 | 0.855 | |
| 199 | +| nangate45 | ibex | 1931 | 9197.880 | 10545.640 | 0.872 | |
| 200 | +| nangate45 | jpeg | 4390 | 17903.670 | 20815.750 | 0.860 | |
| 201 | +| asap7 | aes | 562 | 1053.810 | 1222.344 | 0.862 | |
| 202 | +| asap7 | ibex | 273 | 428.652 | 514.404 | 0.833 | |
| 203 | +| asap7 | jpeg | 4325 | 5045.058 | 5709.204 | 0.884 | |
| 204 | +| sky130hd | aes | 562 | 11050.640 | 13137.940 | 0.841 | |
| 205 | +| sky130hd | ibex | 1931 | 21754.680 | 24411.360 | 0.891 | |
| 206 | +| sky130hd | jpeg | 4390 | 50973.380 | 57692.340 | 0.884 | |
| 207 | + |
| 208 | +Avg `opt/NN` = `0.865` (~`13.5%` shorter than NN). |
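The per-row ratios and the average can be recomputed from the table values with a few lines of Python:

```python
# (platform, design, OpenROAD opt um, NN um) copied from the table above.
rows = [
    ("nangate45", "aes", 3571.680, 4178.080),
    ("nangate45", "ibex", 9197.880, 10545.640),
    ("nangate45", "jpeg", 17903.670, 20815.750),
    ("asap7", "aes", 1053.810, 1222.344),
    ("asap7", "ibex", 428.652, 514.404),
    ("asap7", "jpeg", 5045.058, 5709.204),
    ("sky130hd", "aes", 11050.640, 13137.940),
    ("sky130hd", "ibex", 21754.680, 24411.360),
    ("sky130hd", "jpeg", 50973.380, 57692.340),
]

ratios = [opt / nn for _, _, opt, nn in rows]
avg_ratio = sum(ratios) / len(ratios)            # unweighted mean of opt/NN
pct_shorter = 100.0 * (1.0 - avg_ratio)          # how much shorter than NN on average

print(f"avg opt/NN = {avg_ratio:.3f}, about {pct_shorter:.1f}% shorter than NN")
```

The average is an unweighted mean over designs, so small designs count as much as large ones.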
| 209 | + |
| 210 | +Reproduce (single design): |
| 211 | +- `python3 flow/util/scan_chain_cost.py --scan-replace --nearest-neighbor --openroad tools/OpenROAD/build_gate7bc521/bin/openroad --liberty flow/platforms/nangate45/lib/NangateOpenCellLibrary_typical.lib --odb flow/results/nangate45/ibex/cmp9_or0db856_rp100_20251229_022425/3_5_place_dp.odb --sdc flow/results/nangate45/ibex/cmp9_or0db856_rp100_20251229_022425/3_place.sdc` |
| 212 | + |
| 213 | +Notes: |
| 214 | +- ASAP7 needs multiple libs; pass them all, e.g. `--liberty flow/platforms/asap7/lib/NLDM/*_TT_*`. |
| 215 | + |
| 216 | +## Current Limitations / Known Gaps |
| 217 | + |
| 218 | +- `scan_opt` is implemented in OpenROAD DFT and re-stitches scan chains using the latest placement |
| 219 | + (without re-running `scan_replace`). The scan-chain optimizer uses NN + farthest-insertion + bounded 2-opt (with an rtree fallback for huge chains). |
| 220 | +- The ORFS hook scripts currently hardcode `-max_chains 1`. A next step is to sweep `-max_chains` and quantify overhead reduction (DFT-vs-DFT). |
| 221 | +- Clock-domain correctness constraints (lockups, strict no-mix, etc.) are not yet wired through ORFS configuration beyond `-clock_mixing`. |
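For readers unfamiliar with the 2-opt step named in the first bullet, the sketch below shows the general technique on an open path. It is not OpenROAD's implementation: reading “bounded” as a cap on improvement passes (`max_passes`), keeping the first cell fixed, and recomputing the full path length for each candidate are all simplifying assumptions.

```python
def path_len(order, loc):
    """Manhattan length of an open path visiting cells in the given order."""
    return sum(
        abs(loc[a][0] - loc[b][0]) + abs(loc[a][1] - loc[b][1])
        for a, b in zip(order, order[1:])
    )


def two_opt(order, loc, max_passes=10):
    """Reverse sub-paths while that shortens the chain; stop after max_passes sweeps."""
    best = list(order)
    for _ in range(max_passes):
        improved = False
        for i in range(1, len(best) - 1):        # index 0 (chain start) stays fixed
            for j in range(i + 1, len(best)):
                cand = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if path_len(cand, loc) < path_len(best, loc):
                    best, improved = cand, True
        if not improved:
            break
    return best


# Hypothetical placement where the initial order zig-zags between two clusters.
loc = {"a": (0, 0), "b": (10, 0), "c": (1, 0), "d": (11, 0)}
initial = ["a", "b", "c", "d"]
refined = two_opt(initial, loc)
```

A production implementation would evaluate only the two edges changed by each reversal instead of recomputing the whole path.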
| 222 | + |
| 223 | +## Scan-Chain Integrity Validation (Does it Actually Shift?) |
| 224 | + |
| 225 | +QoR deltas and plan reports are necessary but not sufficient; we also want a basic structural check that the scan path is one continuous chain from `scan_in_0` to `scan_out_0`. |
| 226 | + |
| 227 | +- `flow/util/scan_chain_validate.py` validates scan stitching from a gate-level netlist (or from an ODB by writing a temporary netlist via OpenROAD). |
| 228 | +- It treats `assign` + inserted `BUF*/CLKBUF*` as transparent, so post-P&R buffering doesn’t cause false failures. |
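The idea of walking the chain while skipping buffers can be sketched on a toy netlist model. This is not the code in `scan_chain_validate.py`: the data layout (one scan-path load per net), the function name `trace_chain`, and the example cell names are assumptions, and `assign` aliasing is not modeled.

```python
import re

# Cell types treated as transparent wires on the scan path.
TRANSPARENT = re.compile(r"^(BUF|CLKBUF)")


def trace_chain(scan_in_net, scan_out_port, scan_load, cells):
    """Follow the scan path; return the ordered scan cells, or None if it is broken.

    scan_load: net -> instance name (its scan/buffer input) or the scan-out port name
    cells:     instance name -> (cell type, net driven by its scan/buffer output)
    """
    net, chain = scan_in_net, []
    for _ in range(len(cells) + 1):      # step bound also catches combinational loops
        load = scan_load.get(net)
        if load is None:
            return None                  # dangling net: chain is broken
        if load == scan_out_port:
            return chain                 # reached scan-out: chain is continuous
        if load not in cells:
            return None
        cell_type, out_net = cells[load]
        if not TRANSPARENT.match(cell_type):
            chain.append(load)           # count only real scan cells
        net = out_net
    return None


cells = {"ff1": ("SDFF_X1", "n1"), "buf1": ("BUF_X2", "n2"), "ff2": ("SDFF_X1", "n3")}
scan_load = {"scan_in_0": "ff1", "n1": "buf1", "n2": "ff2", "n3": "scan_out_0"}
chain = trace_chain("scan_in_0", "scan_out_0", scan_load, cells)
```

Comparing the length of the returned chain with the scan-cell count from `report_dft_plan` gives a quick consistency check.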
| 229 | + |
| 230 | +Example usage: |
| 231 | + |
| 232 | +- Validate a finished netlist: |
| 233 | + - `python3 flow/util/scan_chain_validate.py --verilog flow/results/nangate45/ibex/qor_scan_dft_20260104/6_final.v` |
| 234 | +- Validate from an ODB (writes a temp netlist first): |
| 235 | + - `python3 flow/util/scan_chain_validate.py --odb flow/results/nangate45/ibex/qor_scan_dft_20260104/6_final.odb --openroad tools/OpenROAD/build_gate7bc521/bin/openroad --liberty flow/platforms/nangate45/lib/NangateOpenCellLibrary_typical.lib --sdc flow/results/nangate45/ibex/qor_scan_dft_20260104/6_final.sdc --ensure-ports` |