@PrecisEDAnon PrecisEDAnon commented Dec 29, 2025

This PR adds a power-saving resizer, gated behind a toggle, that cuts power by 8.5% on average across various designs (range: 2-20%).

The following content was AI generated.

Optional “More Recover Power” engine for repair_timing -recover_power

Summary

This PR adds an optional alternative implementation of recover_power in the resizer. The new engine (RecoverPowerMore) is designed to recover additional power by systematically trading slack headroom for power via controlled cell swaps (size and VT) and (non-clock) buffer removal, while enforcing timing and DRV guardrails.

Default behavior is unchanged. The new engine is gated behind a runtime toggle so existing flows match the legacy recover_power unless explicitly enabled.

Motivation / When It Helps

repair_timing -recover_power is commonly used to reduce post-route power once timing is reasonably stable. In practice, additional power recovery is often available in non-critical regions (and sometimes within clock networks) but requires:

  • Better prioritization of “high power × high slack headroom” opportunities.
  • Iteration (recompute timing, spend slack again) rather than a single pass.
  • Guardrails to avoid regressions in setup/hold and DRV (slew/cap/fanout).

The new engine targets these gaps while remaining opt-in.

What This PR Changes

  • New engine: RecoverPowerMore (C++) as an alternate recover_power implementation.
  • Runtime selection (no rebuild needed): legacy vs. new engine is chosen at runtime via a toggle.
  • Tcl integration: adds a new flag to repair_timing:
    • repair_timing ... -recover_power <pct> -more_recover_power
  • Environment fallback: MORE_RECOVER_POWER=1 enables the new engine when the Tcl flag is not set.
    • Accepted “true” values: 1, true, yes, on
    • Accepted “false” values: 0, false, no, off, empty
    • Unrecognized values warn and default to false.
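
The toggle resolution described above can be sketched in a few lines. This is a minimal illustration and not the PR's code: `parseToggle` and `useMoreRecoverPower` are hypothetical names, matching is assumed to be exact (case-sensitive), and the warning for unrecognized values is left to the caller.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not taken from the PR: map the documented
// MORE_RECOVER_POWER spellings to a boolean. Returns std::nullopt for an
// unrecognized value so the caller can warn and fall back to false.
// Assumption: matching is exact; the PR may or may not ignore case.
std::optional<bool> parseToggle(const std::string& value)
{
  if (value == "1" || value == "true" || value == "yes" || value == "on") {
    return true;
  }
  if (value.empty() || value == "0" || value == "false" || value == "no"
      || value == "off") {
    return false;
  }
  return std::nullopt;
}

// The Tcl flag wins; the environment value is consulted only when the flag
// is not set. env_value is nullptr when the variable is unset.
bool useMoreRecoverPower(bool tcl_flag, const char* env_value)
{
  if (tcl_flag) {
    return true;
  }
  if (env_value == nullptr) {
    return false;
  }
  // Unrecognized spellings resolve to false (the caller would also warn).
  return parseToggle(env_value).value_or(false);
}
```

Returning `std::optional<bool>` keeps "explicitly false" and "unrecognized" distinguishable, so a typo can be reported rather than silently treated as either value.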

Usage / Toggle Behavior

This PR is intentionally modular:

  • If your flow does not call repair_timing -recover_power, nothing changes.
  • If -recover_power is used and the toggle is off, behavior matches the legacy engine.
  • If -recover_power is used and the toggle is on, the new RecoverPowerMore engine runs.

Enable / Disable

Enable per call:

repair_timing -recover_power 100 -more_recover_power

Enable via environment (useful for CI/harness A/B):

set ::env(MORE_RECOVER_POWER) 1
repair_timing -recover_power 100

Disable (omit the flag and leave MORE_RECOVER_POWER unset or set to a false value):

repair_timing -recover_power 100

Technical Details (High Level)

The new engine is a bounded, multi-pass greedy optimizer with rollback:

  1. Baseline capture (guardrails):
    • Records initial setup/hold WNS.
    • Records initial DRV violation counts (slew/cap/fanout).
  2. Timing floors:
    • If setup is closed (WNS ≥ 0): enforce WNS ≥ 0.
    • If setup is already failing: allow a small, bounded additional WNS degradation budget proportional to clock period and the requested -recover_power percent.
    • Enforce hold floor similarly (do not introduce hold failures when closed).
  3. Candidate selection and ranking:
    • Considers leaf logic standard cells (excluding dont_touch) and includes clock-driving instances for swap opportunities.
    • Ranks candidates by (instance power × available slack headroom), prioritizing high-impact, non-critical instances.
  4. Per-instance optimization (bounded effort):
    • Optionally tries buffer removal for non-clock buffers (clock buffers are not removed).
    • Attempts a limited number of swaps:
      • Downsize / area reduction first (dynamic power).
      • Lower leakage VT swaps next (leakage power).
  5. Acceptance criteria and rollback:
    • Each tentative move runs under the Resizer journal, updates parasitics, and re-runs the STA required-time calculation.
    • A move is accepted only if:
      • Setup/hold remain above their floors, and
      • Estimated global DRV violation counts do not exceed the recorded baseline.
    • Rejected moves are cleanly rolled back via journal restore.
  6. Progress logging:
    • Prints a periodic progress table (area delta, swap/remove counts, WNS, total power, DRV counts).
    • -verbose emits per-move accept/reject debug prints.
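
To make the flow above concrete, here is a toy sketch of one greedy pass with accept-or-rollback. It is not the RecoverPowerMore implementation: `Move`, `State`, `setupFloor`, `recoverPowerPass`, and the `budget_fraction` knob are all illustrative assumptions, timing is reduced to a single WNS number, and a plain copy of the state stands in for the Resizer journal.

```cpp
#include <algorithm>
#include <vector>

// Toy stand-ins for real design data; every name here is illustrative.
struct Move {
  double power;         // current power of the instance
  double slack;         // worst slack through the instance before the move
  double power_saving;  // power recovered if the move is kept
  double slack_cost;    // slack lost on that path if the move is kept
  int new_drv;          // DRV violations the move would introduce
};

struct State {
  double wns;
  double total_power;
  int drv_count;
};

// Setup floor: hold WNS at 0 when timing is met; otherwise allow a bounded
// extra degradation proportional to clock period and the recover percent.
// budget_fraction is an assumed tuning knob, not a value from the PR.
double setupFloor(double initial_wns, double clock_period, double recover_pct,
                  double budget_fraction)
{
  if (initial_wns >= 0.0) {
    return 0.0;
  }
  return initial_wns - budget_fraction * clock_period * recover_pct / 100.0;
}

// One greedy pass over the candidates. Returns the number of accepted moves.
int recoverPowerPass(State& state, std::vector<Move> moves, double wns_floor)
{
  const int baseline_drv = state.drv_count;
  // Rank by power x slack headroom above the floor, largest first.
  auto score = [wns_floor](const Move& m) {
    return m.power * std::max(0.0, m.slack - wns_floor);
  };
  std::sort(moves.begin(), moves.end(),
            [&](const Move& a, const Move& b) { return score(a) > score(b); });

  int accepted = 0;
  for (const Move& m : moves) {
    const State saved = state;  // stands in for opening a journal
    state.wns = std::min(state.wns, m.slack - m.slack_cost);
    state.total_power -= m.power_saving;
    state.drv_count += m.new_drv;
    if (state.wns >= wns_floor && state.drv_count <= baseline_drv) {
      ++accepted;  // guardrails hold: keep the move
    } else {
      state = saved;  // guardrail violated: roll back
    }
  }
  return accepted;
}
```

A multi-pass version would call `recoverPowerPass` repeatedly with refreshed slack values until a pass accepts nothing or a pass limit is reached.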

Validation Recommendation

  1. Run a baseline with legacy repair_timing -recover_power <pct>.
  2. Re-run with -more_recover_power enabled under identical conditions.
  3. Compare at minimum: total power, WNS (setup/hold), DRV counts, and runtime.
  4. Start with smaller -recover_power values (e.g. 25/50) if runtime is a concern, then scale up.

CI Note

Default is off to preserve legacy behavior. It is recommended to exercise the new path in some CI configuration (e.g., nightly) by enabling MORE_RECOVER_POWER=1 so the alternate code path is continuously built and tested.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a new power-saving resizer, RecoverPowerMore, which is gated behind a command-line flag and an environment variable. The changes are well-structured and consistent across the build system, C++ source code, and Tcl scripts.

My review focuses on the new implementation in RecoverPowerMore.cc and the Tcl integration. I've identified a few areas for improvement:

  • A potential semantic issue in instanceWorstSlack's return value for unconstrained paths.
  • An opportunity to simplify resistance value clamping.
  • A suggestion to improve the safety of the new Tcl environment variable handling.
  • A note on the use of const_cast, which appears to be a pattern to work with a legacy API but is worth discussing.

Overall, this is a solid contribution that adds a significant new capability. The code is complex but generally clear. Addressing the suggested points will enhance its correctness and maintainability.

Comment on lines 363 to 364
count += sta_->checkSlewLimits(const_cast<Net*>(net), true, nullptr, max_)
.size();

medium

Using const_cast can be risky and might indicate an issue with const correctness in the API. The checkSlewLimits function should ideally take a const Net*. If changing the sta API is not feasible, consider adding a comment explaining why const_cast is safe to use here (i.e., the function does not actually modify the Net object). This applies to other const_cast usages in this file as well.
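
One way to act on this, sketched under assumptions: confine the cast to a single const-correct wrapper that carries the justification. `Net` and `legacyCheckLimits` below are stand-ins for illustration only, not the real sta types or signatures.

```cpp
#include <cstddef>
#include <vector>

// Stand-in type for illustration; not the real sta Net.
struct Net {
  int id;
};

// Stand-in for a legacy API that takes a non-const pointer but only reads
// through it. Here it reports one "violation" for even ids.
std::vector<int> legacyCheckLimits(Net* net)
{
  if (net->id % 2 == 0) {
    return std::vector<int>{net->id};
  }
  return std::vector<int>{};
}

// Const-correct wrapper: the only const_cast lives here, next to the reason
// it is safe, so call sites can keep passing const Net*.
std::size_t limitViolationCount(const Net* net)
{
  // Safe only because legacyCheckLimits never writes through the pointer.
  return legacyCheckLimits(const_cast<Net*>(net)).size();
}
```

If the underlying function ever starts modifying the object, there is then exactly one place to revisit.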

found = true;
}

return found ? worst_slack : -std::numeric_limits<Slack>::infinity();

medium

The function returns -std::numeric_limits<Slack>::infinity() when no driver pin is found. Semantically, an instance with no driver pins has no output paths to analyze, so its slack should be considered infinitely good (positive infinity), not infinitely bad. Returning std::numeric_limits<Slack>::infinity() would be more correct and align with the behavior for unconstrained paths.

Suggested change
- return found ? worst_slack : -std::numeric_limits<Slack>::infinity();
+ return found ? worst_slack : std::numeric_limits<Slack>::infinity();
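
The reasoning behind the suggestion can be illustrated with a small sketch: positive infinity is the identity for a minimum, so an empty set of driver pins reduces to "no constraint" and never pulls down a later `std::min`. The `Slack` alias and the `worstSlack` helper are assumptions for illustration, not the PR's `instanceWorstSlack`. Whether positive infinity is the right sentinel for candidate ranking still depends on how callers interpret it.

```cpp
#include <algorithm>
#include <limits>
#include <vector>

using Slack = float;  // assumed to be a floating-point type

// Worst (minimum) slack over a set of driver-pin slacks. Starting from
// +infinity means an empty input yields +infinity, i.e. unconstrained.
Slack worstSlack(const std::vector<Slack>& driver_slacks)
{
  Slack worst = std::numeric_limits<Slack>::infinity();
  for (Slack s : driver_slacks) {
    worst = std::min(worst, s);
  }
  return worst;
}
```

With a negative-infinity sentinel, combining an instance that has no drivers with any real slack via `std::min` would report negative infinity, which reads as the worst possible timing.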

Comment on lines 641 to 648
float ra = resizer_->cellDriveResistance(a);
float rb = resizer_->cellDriveResistance(b);
if (ra <= 0.0f) {
ra = 0.0f;
}
if (rb <= 0.0f) {
rb = 0.0f;
}

medium

The logic to ensure drive resistances ra and rb are non-negative can be simplified using std::max. This also makes the intent clearer.

Suggested change
- float ra = resizer_->cellDriveResistance(a);
- float rb = resizer_->cellDriveResistance(b);
- if (ra <= 0.0f) {
- ra = 0.0f;
- }
- if (rb <= 0.0f) {
- rb = 0.0f;
- }
+ float ra = std::max(0.0f, resizer_->cellDriveResistance(a));
+ float rb = std::max(0.0f, resizer_->cellDriveResistance(b));

Comment on lines 369 to 371
default {
set more_recover_power 1
}

medium

The default case for the environment variable MORE_RECOVER_POWER sets more_recover_power to 1. This could be surprising if the user makes a typo when setting the environment variable (e.g., MORE_RECOVER_POWER=ture). It might be safer to default to 0 and issue a warning for unrecognized values.

      default {
        utl::warn RSZ 0 "Unrecognized value for MORE_RECOVER_POWER: '$env_more_recover_power'. Defaulting to false."
        set more_recover_power 0
      }
