Skip to content

Cts: create new balance latencies method#7823

Merged
maliberty merged 77 commits intoThe-OpenROAD-Project:masterfrom
arthurjolo:cts_balance_latencies
Sep 22, 2025
Merged

Cts: create new balance latencies method#7823
maliberty merged 77 commits intoThe-OpenROAD-Project:masterfrom
arthurjolo:cts_balance_latencies

Conversation

@arthurjolo
Copy link
Contributor

Needs #7821

This PR creates a new clock latencies balancer. Currently CTS only balance latencies between macros and registers trees, but when we have multiple sub-trees (eg.: multiple clock gaters) the sub-trees are not balanced between them eg.:

unity test gated_clock3 with master
gated_clock3_master

This PR proposes a new balance latencies method, that takes into account all sub-trees connected to a same clock source, the LatenciesBalancer. This class builds a graph that connects all sub-trees of the clock and travels throw it to insert delay buffers where its needed. The delay buffers are inserted as high up in the tree as possible, and try to share the mod amount of buffers as possible as well. A more detailed explanation of the algorithm can be found here.

This changes help improve skew specially for designs with multiple clock gaters, where the problem mention tends to appear, eg.:

unity test gated_clock3 with this PR
gated_clock3_LatenciesBalancer

arthurjolo added 30 commits May 23, 2025 11:44
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
…avg arrival time

Signed-off-by: arthurjolo <arthurjolo@gmail.com>
…e the lantancies

Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
…tree to insert delay buffers

Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
…avg arrival time

Signed-off-by: arthurjolo <arthurjolo@gmail.com>
…e the lantancies

Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
…tree to insert delay buffers

Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
Signed-off-by: arthurjolo <arthurjolo@gmail.com>
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@maliberty
Copy link
Member

Is there a metrics update PR?

Comment on lines +29 to +33
gated_clock1
gated_clock2
gated_clock3
gated_clock4
gated_clock5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add to the BUILD file.

techChar_->getLengthUnit(),
capPerDBU);
totalDelayBuff += balancer.run();
parasitics_guard.update();
Copy link
Member

@maliberty maliberty Sep 15, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move the guard to just inside the if-statement and the end of the scope will handle the update. The goal of the guard is to avoid needing to manually call update and to be exception safe.

Comment on lines 514 to 523
std::string newNetName
= "delaynet_" + std::to_string(delayBufIndex_) + "_" + clockName;
odb::dbNet* newNet = odb::dbNet::create(block, newNetName.c_str());
newNet->setSigType(odb::dbSigType::CLOCK);

// create a new delay buffer
std::string newBufName
= "delaybuf_" + std::to_string(delayBufIndex_++) + "_" + clockName;
odb::dbMaster* master = db_->findMaster(options_->getRootBuffer().c_str());
odb::dbInst* newBuf = odb::dbInst::create(block, master, newBufName.c_str());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhkim-pii any concerns about hierarchy?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code should be the same as what we already have in CTS for inserting delay buffers (TritonCTS::insertDelayBuffer), I see that this code has changed, probably to deal with hierarchy. So probably what we have on this PR should be changed as well.

Signed-off-by: arthurjolo <arthurjl@precisioninno.com>
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"


// Create a new delay buffer and connect output pin of driver to input pin of
// new buffer. Output pin of new buffer will be connected later.
odb::dbInst* LatencyBalancer::createDelayBuffer(odb::dbNet* driverNet,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jhkim-pii can you give your review on this function to insure hierarchy is fine.

I tried to maintain the idea you used on TritonCTS::insertDelayBuffer, so the delay buffers are on the same module as their drivers, this way we avoid creating any extra ports on the modules for the worst cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good. As you mentioned, this makes a flat connection b/w the driver and the new buffer. So it does not need any hierarchy handling.

Signed-off-by: arthurjolo <arthurjl@precisioninno.com>
Signed-off-by: arthurjolo <arthurjl@precisioninno.com>
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"


// If it is not a leaf node compute the amount of buffers needed for its
// children
std::vector<odb::dbITerm*> sinksInput;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In a hierarchical flow, sinksInput can be located at different hierarchies, right?
Then, a hierarchy handling is required when the sink pin is disconnected and reconnected to the output pin of the new buffer because the hierarchy of the driver and the sink instance are different.

A simple solution is to use dbNetwork::hierarchicalConnect(dbITerm* source_pin, dbITerm* dest_pin, const char* connection_name) for the connection b/w the buffer output and the sink. hierarchicalConnect() automatically handles the hierarchy.
But its drawback is that it does not reuse the existing hierarchical ports and punches many new hierarchical ports.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arthurjolo Could you please add a test case with hierarchical flow? I think we can find issues from the hierarchical test case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a testcase hier_insertion_delay but maybe it is too simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to do anything else at the disconnect? If i disconnect the pins and create a new hierarchical port than the previous connection port is dangling right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. hierarchicalConnect() will create a new port and make the previous dangling. It is the drawback of the current hierarchicalConnect().
I will enhance hierarchicalConnect() to fully reuse the existing nets and ports to minimize the port punching.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified testcase gated_clock4 to have a simple hierarchy and added hierarchicalConnect() for the delay buffers and its sinks. On the verilog output of gated_clock4 it looks like there the dangling ports are deleted but @jhkim-pii if you could take a look to confirm it is ok.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. It is the expected behavior. hierarchicalConnect() removes unused ports and pins in the end. I forgot to mention it yesterday.

void dbEditHierarchy::hierarchicalConnect(dbITerm* source_pin,
                                          dbITerm* dest_pin,
                                          const char* connection_name)
...                                         
    // 3.6. During the addition of new ports and new wiring we may
    // leave orphaned pins, clean them up.
    std::set<dbModInst*> cleaned_up;
    for (auto module_to_clean_up : source_parent_tree) {
      dbModInst* mi = module_to_clean_up->getModInst();
      if (mi) {
        dlogHierConnCleaningUpSrc(mi);
        mi->removeUnusedPortsAndPins();
        cleaned_up.insert(mi);
      }
    }
    for (auto module_to_clean_up : dest_parent_tree) {
      dbModInst* mi = module_to_clean_up->getModInst();
      if (mi) {
        if (cleaned_up.find(mi) == cleaned_up.end()) {
          dlogHierConnCleaningUpDst(mi);
          mi->removeUnusedPortsAndPins();
          cleaned_up.insert(mi);
        }
      }
    }
  }

  dlogHierConnDone();    
}

Signed-off-by: arthurjolo <arthurjl@precisioninno.com>
Signed-off-by: arthurjolo <arthurjl@precisioninno.com>
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

Signed-off-by: arthurjolo <arthurjl@precisioninno.com>
Signed-off-by: arthurjolo <arthurjl@precisioninno.com>
@github-actions
Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@arthurjolo
Copy link
Contributor Author

Started a new CI to update the metrics. I was having an error in DPL after grt for one design, but it looks like it was fixed.

@maliberty
Copy link
Member

If the metrics are fine we can merge

@arthurjolo
Copy link
Contributor Author

Metrics are fine, there is one PR to update some small failures. However, I have one timeout in the private CI.

@maliberty maliberty merged commit 0974240 into The-OpenROAD-Project:master Sep 22, 2025
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants