Skip to content

cts: Deleted dangling dbModNet for top-level clock port.#8241

Merged
maliberty merged 6 commits intoThe-OpenROAD-Project:masterfrom
The-OpenROAD-Project-staging:secure-fix-cts-del-modnet
Sep 10, 2025
Merged

cts: Deleted dangling dbModNet for top-level clock port.#8241
maliberty merged 6 commits intoThe-OpenROAD-Project:masterfrom
The-OpenROAD-Project-staging:secure-fix-cts-del-modnet

Conversation

@openroad-ci
Copy link
Collaborator

Problem

In hierarchical flow, a dangling dbModNet w/o any load can hang on a clock port during CTS.
In the post-CTS routing phase, STA must be performed after loading 4_cts.odb.
However, the dangling dbModNet blocks clock propagation on the clock port during STA, corrupting the report_checks results.

Why did the dangling dbModNet occur?

CTS steps:

  1. Disconnect all the load pins of the given clock port --> Dangling dbModNet is made on the clock port
  2. Create new buffers
  3. Create new nets
  4. Connect the new buffers and nets to build a clock tree

Solution

  1. Disconnect all the load pins of the given clock port
  2. (FIX) Remove the dangling dbModNet on the clock port because it is not needed (after CTS, the port drives just a few root clock buffers. No dbModIterm load pin.)
  3. Create new buffers
  4. Create new nets
  5. Connect the new buffers and nets to build a clock tree

- Fixes The-OpenROAD-Project#8166

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@github-actions
Copy link
Contributor

github-actions bot commented Sep 5, 2025

clang-tidy review says "All clean, LGTM! 👍"

@jhkim-pii
Copy link
Contributor

jhkim-pii commented Sep 5, 2025

@oharboe bazel-orfs/mock-array test fails after the fix as follows.

[2025-09-04T02:03:17.698Z]      1020 |       0 |     828 |            0 |    +0.1% | -56.489 | -8191.072 | ces_2_1/io_ins_left[26]
[2025-09-04T02:03:17.698Z]      1030 |       0 |     860 |            0 |    +0.1% | -56.489 | -7866.236 | ces_2_1/io_ins_left[26]
[2025-09-04T02:03:17.698Z]      1040 |       0 |     880 |            0 |    +0.1% | -56.489 | -7665.313 | ces_2_1/io_ins_left[26]
[2025-09-04T02:03:17.698Z]     final |       0 |     896 |            0 |    +0.1% | -56.489 | -7505.704 | ces_2_1/io_ins_left[26]
[2025-09-04T02:03:17.698Z] --------------------------------------------------------------------------------------
[2025-09-04T02:03:17.698Z] [WARNING RSZ-0066] Unable to repair all hold violations.
[2025-09-04T02:03:17.698Z] [INFO RSZ-0032] Inserted 896 hold buffers.
[2025-09-04T02:03:17.698Z] [ERROR RSZ-0060] Max buffer count reached.
[2025-09-04T02:03:17.698Z] Error: cts.tcl, 74 RSZ-0060
[2025-09-04T02:03:17.698Z] Command exited with non-zero status 1
[2025-09-04T02:03:17.698Z] Elapsed time: 0:22.64[h:]min:sec. CPU time: user 24.32 sys 32.48 (250%). Peak memory: 877936KB.
[2025-09-04T02:03:17.698Z] (02:03:15) [12,492 / 12,504] 1004 / 1006 tests; checking cached actions
[2025-09-04T02:03:17.698Z] (02:03:16) ERROR: /tmp/workspace/rivate_secure-fix-cts-del-modnet/test/orfs/mock-array/BUILD:106:10 Middleman _middlemen/test_Sorfs_Smock-array_Smake_UMockArray_Utest_Ubase_Utest-runfiles failed: (Exit 2): bash failed: error executing Action command (from target //test/orfs/mock-array:MockArray_cts) /bin/bash -c ... (remaining 5 arguments skipped)

Many hold buffers are inserted, possibly due to clock skew.
Before the fix, # of hold violation was 0 due to the bug.

I think the bazel-orfs/mock-array test case should be rebased. What do you think about this?

  • Note that ORFS - asap7/mock-array does not have this issue (slightly different flow).

@oharboe
Copy link
Collaborator

oharboe commented Sep 5, 2025

@maliberty @jhkim-pii How can there be hold violations when there are no input pin timing constraints?

@oharboe
Copy link
Collaborator

oharboe commented Sep 5, 2025

This incantation(it is documented in the openroad bazel docs, just repeating here for convenience) will create a standalone issue:

bazelisk run test/orfs/mock-array:MockArray_cts_deps /tmp/cts
/tmp/cts/make do-cts cts_issue

Untar and run:

cts_MockArray_asap7_base_2025-09-05_08-35.tar.gz

@oharboe
Copy link
Collaborator

oharboe commented Sep 5, 2025

Yes... the skew with this PR is a problem... Normally this is more or less flat...

image

@jhkim-pii
Copy link
Contributor

@oharboe I reproduced the bazel-orfs/mock-array issue locally, thanks to your documentation https://github.com/The-OpenROAD-Project/bazel-orfs?tab=readme-ov-file#make-issue-floorplan-example.

How can there be hold violations when there are no input pin timing constraints?

Hmm, I don't get it.

Let me explain more about the issue.

[ Before fix ]

(Run CTS...)

repair_timing -verbose -setup_margin 0 -repair_tns 100
[INFO RSZ-0100] Repair move sequence: UnbufferMove SizeUpMove SwapPinsMove BufferMove CloneMove SplitLoadMove 
[INFO RSZ-0098] No setup violations found                      // No setup violation
[INFO RSZ-0033] No hold violations found.                       // No hold violation.
Placement Analysis
---------------------------------
total displacement          0.0 u
average displacement        0.0 u
max displacement            0.0 u
original HPWL           64106.1 u
legalized HPWL          64106.1 u
delta HPWL                    0 %

Report metrics stage 4, cts final...

==========================================================================
cts final report_design_area
--------------------------------------------------------------------------
Design area 119790 u^2 88% utilization.
  • No setup and hold violation. I believe this is wrong because..
report_checks -to [get_pins REG_56\$_DFF_P_/D]
No paths found.
  • It does not report any timing path to the flip-flop D pin.

[ After fix ]

(Run CTS...)

// This time, there are a lot of setup & hold violations.

[2025-09-04T02:03:17.698Z]     final |       0 |     896 |            0 |    +0.1% | -56.489 | -7505.704 | ces_2_1/io_ins_left[26]
[2025-09-04T02:03:17.698Z] --------------------------------------------------------------------------------------
[2025-09-04T02:03:17.698Z] [WARNING RSZ-0066] Unable to repair all hold violations.
[2025-09-04T02:03:17.698Z] [INFO RSZ-0032] Inserted 896 hold buffers.
[2025-09-04T02:03:17.698Z] [ERROR RSZ-0060] Max buffer count reached.
[2025-09-04T02:03:17.698Z] Error: cts.tcl, 74 RSZ-0060
[2025-09-04T02:03:17.698Z] Command exited with non-zero status

It inserted many hold buffers, and it issued RSZ-0060 error (max buffer count reached).

openroad> report_checks -to [get_pins REG_56\$_DFF_P_/D]
Startpoint: ces_7_4 (rising edge-triggered flip-flop clocked by clock)
Endpoint: REG_56$_DFF_P_ (rising edge-triggered flip-flop clocked by clock)
Path Group: reg2reg
Path Type: max

Delay    Time   Description
---------------------------------------------------------
0.00    0.00   clock clock (rise edge)
311.39  311.39   clock network delay (propagated)
0.00  311.39 ^ ces_7_4/clock (Element)
161.96  473.35 ^ ces_7_4/io_lsbOuts_3 (Element)
36.69  510.04 ^ ces_7_5/io_lsbOuts_2 (Element)
36.47  546.51 ^ ces_7_6/io_lsbOuts_1 (Element)
32.34  578.86 ^ ces_7_7/io_lsbOuts_0 (Element)
0.04  578.90 ^ REG_56$_DFF_P_/D (DFFHQNx1_ASAP7_75t_R)
578.90   data arrival time

250.00  250.00   clock clock (rise edge)
425.19  675.19   clock network delay (propagated)
4.20  679.39   clock reconvergence pessimism
679.39 ^ REG_56$_DFF_P_/CLK (DFFHQNx1_ASAP7_75t_R)
-3.24  676.15   library setup time
676.15   data required time
---------------------------------------------------------
676.15   data required time
-578.90   data arrival time
---------------------------------------------------------
97.26   slack (MET)
  • It can report a timing path to the flip-flop D pin.

Thus, I believe no setup/hold violation result is wrong in the "before fix" case.
After the fix, it can see the setup & hold timing violations.
But there are too many hold violations and OR inserted max number of buffers. So it issued RSZ-0060 error.

Do you think there should be no hold violations?

@oharboe
Copy link
Collaborator

oharboe commented Sep 5, 2025

Yes: I expect no hold cells to be necessary for this design and indeed the test case is working as intended, if you insert more than a few hold cells, it should fail.

I'm software engineer dabbling in EDA/hardware, but by the looks of it, the problem is the enormous amount of skew that this design has now. Try the same for master and you'll see a nearly horizontal line of the leaves of the clock tree.

This design is intended to check that the clock tree can get little skew with macros and flipflops combined.

@jhkim-pii
Copy link
Contributor

Yes. The clock skew is bad.
So how about to merge this fix w/ the test case rebase, and handle the clock skew issue later?

  • But I don't have a good idea about how to rebase.

@oharboe
Copy link
Collaborator

oharboe commented Sep 5, 2025

Good place to start tomorrow :-)

@maliberty Thoughts?

@maliberty
Copy link
Member

@jhkim-pii please upload an image of the clock tree viewer so we can get a sense of the skew visually.

@jhkim-pii
Copy link
Contributor

jhkim-pii commented Sep 5, 2025

Before fix, Hierarchical flow

image image
  • No clock path for hard macros. This is an obvious problem. So the fix is needed for hierarchical flow (the fix does not affect the flat flow).

After fix, Hierarchical flow

image image

Before fix, Flat flow

image image
  • The clock skew of After fix, Hierarchical flow matches that of Before fix, Flat flow.
  • Interestingly, Flat flow does not issue any error after inserting ~1700 hold buffers while After fix, Hierarchical flow fired RSZ-0060 (too many buffer) error after inserting ~900 hold buffers.
  • Let me find the reason.

[ Flat flow CTS log ]

     2520 |       0 |    1776 |            0 |    +0.1% | -56.485 | -5875.319 | ces_2_1/io_ins_left[18]
     2530 |       0 |    1776 |            0 |    +0.1% | -56.485 | -5875.319 | ces_2_1/io_ins_left[18]
    final |       0 |    1776 |            0 |    +0.1% | -56.485 | -5875.319 | ces_2_1/io_ins_left[18]
--------------------------------------------------------------------------------------
[WARNING RSZ-0066] Unable to repair all hold violations.
[INFO RSZ-0032] Inserted 1776 hold buffers.                           // Many buffers are inserted, but no RSZ-0060 error.
Placement Analysis
---------------------------------
total displacement        453.2 u
average displacement        0.0 u
max displacement            1.2 u
original HPWL           64496.9 u
legalized HPWL          64563.3 u
delta HPWL                    0 %

Report metrics stage 4, cts final...

==========================================================================
cts final report_design_area
--------------------------------------------------------------------------
Design area 119938 u^2 88% utilization.

@oharboe
Copy link
Collaborator

oharboe commented Sep 5, 2025

Would it be helpful to have both OPENROAD_HIERARCHICAL=0/1 tests in OpenROAD?

$ rg HIER
orfs/mock-array/BUILD
138:        "OPENROAD_HIERARCHICAL": "1",
185:        "OPENROAD_HIERARCHICAL": "1",

@jhkim-pii
Copy link
Contributor

@oharboe Yes. It would be helpful.

@oharboe
Copy link
Collaborator

oharboe commented Sep 5, 2025

@oharboe Yes. It would be helpful.

#8247

@maliberty
Copy link
Member

@arthurjolo please help @jhkim-pii with what is going wrong with cts. I'd like to merge a pr with reasonable results if it isn't a lot more effort

@jhkim-pii
Copy link
Contributor

jhkim-pii commented Sep 5, 2025

Instance count is different. It is because there is no TAPCELL in the hierarchical flow.

                              Flat flow        Hier flow
---------------------------------------------------------------
Instance count (pre-CTS)      10861            4475
TAPCELL count                 6382                0
Buffer percent limit          20%               20%
Max buffer limit              2172              895                    = instance count * buffer percent limit
Hold buffer count            ~1700              896
                                                ERROR (exceed the max buffer count limit)

[ flat.v ]
image

  • TAPCELLs exist in flat flow. But no such TAPCELL in hier flow.

It looks like TAPCELLs are inserted in floorplan stage.
Need to find the reason of no TAPCELL insertion in hierarchical flow.

@maliberty @arthurjolo So this is not related to CTS.

@jhkim-pii
Copy link
Contributor

jhkim-pii commented Sep 5, 2025

@arthurjolo However, this design shows poor CTS quality. It requires further investigation.

@jhkim-pii
Copy link
Contributor

@arthurjolo I created a new issue #8255 for the large clock skew issue.

…ate/OpenROAD into secure-fix-cts-del-modnet

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
…id the RSZ-0060 (max buffer count reached) error

- Due to the large clock skew, there are many hold violations. It raises RSZ-0060 (max buffer count reached) error.
- This issue should eventually be resolved by improving the CTS engine.

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@jhkim-pii
Copy link
Contributor

The flat flow avoided the RSZ-0060 (max buffer count reached) error because many physical-only cells (TAPCELL) are counted wrongly.
So the flat flow should be enhanced not to dump physical-only cells in the verilog output.
For more information, refer to the issue #8254

@jhkim-pii
Copy link
Contributor

jhkim-pii commented Sep 7, 2025

I set HOLD_SLACK_MARGIN -30 for the problematic bazel/mock-array test to avoid the RSZ-0060 error temporarily.
We can remove the HOLD_SLACK_MARGIN once the poor clock skew is improved by CTS engine enhancement.

Reverted this because it is not necessary due to the code change (In hierarchical flow, dbModule includes physical-only cells to use the consistent behavior w/ flat flow).

@github-actions
Copy link
Contributor

github-actions bot commented Sep 7, 2025

clang-tidy review says "All clean, LGTM! 👍"

@jhkim-pii jhkim-pii requested a review from maliberty September 7, 2025 11:54
…behavior b/w flat and hier flows.

2. Removed the temporary HOLD_SLACK_MARGIN as it is no longer needed.

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@github-actions
Copy link
Contributor

github-actions bot commented Sep 9, 2025

clang-tidy review says "All clean, LGTM! 👍"

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@github-actions
Copy link
Contributor

github-actions bot commented Sep 9, 2025

clang-tidy review says "All clean, LGTM! 👍"

…ate/OpenROAD into secure-fix-cts-del-modnet

Signed-off-by: Jaehyun Kim <jhkim@precisioninno.com>
@github-actions
Copy link
Contributor

github-actions bot commented Sep 9, 2025

clang-tidy review says "All clean, LGTM! 👍"

Copy link
Member

@maliberty maliberty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks fine. I need

@maliberty maliberty merged commit d4a4962 into The-OpenROAD-Project:master Sep 10, 2025
11 checks passed
@maliberty maliberty deleted the secure-fix-cts-del-modnet branch September 10, 2025 00:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

reg2reg path group is missing with read_db -hier

4 participants