Skip to content

Conversation

@AcKoucher
Copy link
Contributor

@AcKoucher AcKoucher commented Apr 7, 2025

Context and Investigation Details

When running a Secure-CI for #6967 some results got much worse which didn't seem to make sense as we're literally making a change that should only have a positive impact w.r.t. the annealing result.

As an example, here's what happened to ngt45/ariane133:

base branch
image image

The run-time went through the roof suggesting that the new result we got was from the usage of an extreme value of utilization (this is the last resource MPL uses in order to attempt convergence - see HierRTLMP::placeChildrenUsingMinimumTargetUtil). I.e., none of the SA runs succeeded in finding a valid result.

After some painful debugging, I realized that there were situations in which the best valid result we kept was being evaluated as an invalid result. More investigation showed that the result being kept during SA was correct. However, when the time came to actually use the best valid result, we couldn't get the exact state that we wanted. That is because the results' shapes are stored as widths in a map to avoid an additional cache of the SoftMacros and the way to set the SoftMacro's shape is to call SoftMacro::setWidth() (that automatically sets the height) which is failing to set the width to the value we want.

Problems Addressed

1) The order of the bounds of the std cell clusters' width curves is inverted.

Std Cell and Mixed clusters have shapes curves:

The width curves are std::pairs in which first is the lower bound and second is the upper bound.
The height curves are std::pairs in which second is the lower bound and first is the upper bound.

The problem is that the order of the bounds for std cell clusters width curves are being set inverted and this messes up completely how we set the shapes for these clusters later on.

2) The best valid result approach requires the shapes curves of mixed clusters to be sorted during coarse shaping.

The simulated annealing performed during Coarse Shaping stage relies on the macro tilings that are generated for the macro clusters. Currently, in this stage, we generated the shape curves (that are discrete at this point) that will be used in SA for any kind of cluster without caring about sorting. However, for the best valid result to be also used in this stage, it requires the curves list to be sorted.

3) The shrink & restart mechanism should also update the SoftMacro shape curves.

Even fixing the previously cited problems, some best valid results were still being evaluated as invalid, because the mechanism to shrink the clusters when restarting SA was not updating the width/height curves along with the current width/height/area. So, if we were already at the lowest width and we shrank the cluster, we might end up generating a better result and keeping a width which we cannot set the cluster back to.

4) We should shrink & restart only if no valid result was generated at all.

If we have already found a valid solution, we should not shrink the clusters and anneal more. This is specially important when comparing final results in a batch of annealings. If a certain SA (e.g., id = 0) restarted twice and a different SA (e.g., id = 1) didn't have to rely on the shrink and restart mechanism, we'll be comparing costs between SAs with very different std cell area. As each shrinking reduces all the std cell clusters by 20% and we can restart twice, SA 0 will have only 64% of the originally computed std cell area and, if the the solution of SA 0 is the chosen one, GPL will have to deal with a solution made for a much smaller utilization!

Changes

  1. Fix lower/upper bounds' order of Std Cell SoftMacro's width curve.
  2. Ensure width curves of mixed/std cell clusters are sorted during coarse shaping.
  3. Ensure that the width/height lists are updated when shrinking clusters during a restart process inside annealing.
  4. Only shrink & restart SA if no valid result was generated.

    1) Fix lower/upper bounds' order of Std Cell SoftMacro's width curve.
    2) Ensure that the width/height lists are updated when shrinking
       clusters during a restart process inside annealing.

Signed-off-by: Arthur Koucher <[email protected]>
@AcKoucher
Copy link
Contributor Author

I'll start a Secure-CI and I really hope nothing blows up.

@github-actions
Copy link
Contributor

github-actions bot commented Apr 7, 2025

clang-tidy review says "All clean, LGTM! 👍"

@maliberty
Copy link
Member

@QuantamHD more progress towards your PR

@AcKoucher
Copy link
Contributor Author

I'm converting this to a draft as there's more that needs to be addressed here. Secure-CI showed that apparently SoftMacro::setWidth() might still not work correctly for coarse shaping annealing, because at that point we don't sort the width/height lists.

@AcKoucher AcKoucher marked this pull request as draft April 8, 2025 17:01
   1) Ensure width curves of mixed/std cell clusters are sorted
      during coarse shaping.
   2) Only shrink & restart SA if no valid result was generated.

Signed-off-by: Arthur Koucher <[email protected]>
Signed-off-by: Arthur Koucher <[email protected]>
@AcKoucher AcKoucher changed the title mpl: fix shape curves and ensure they're updated when shriking clusters mpl: fix shape curves and annealing behavior Apr 9, 2025
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

@github-actions
Copy link
Contributor

github-actions bot commented Apr 9, 2025

clang-tidy review says "All clean, LGTM! 👍"

1 similar comment
@github-actions
Copy link
Contributor

github-actions bot commented Apr 9, 2025

clang-tidy review says "All clean, LGTM! 👍"

@AcKoucher AcKoucher marked this pull request as ready for review April 10, 2025 02:10
@AcKoucher
Copy link
Contributor Author

I pushed new changes and updated the description accordingly.
Running another Secure-CI.

@AcKoucher AcKoucher requested a review from maliberty April 10, 2025 02:12
@AcKoucher
Copy link
Contributor Author

Secure-CI had only some metrics degradation which are being addressed in the mentioned ORFS PR.

@maliberty maliberty merged commit 715cf2d into The-OpenROAD-Project:master Apr 10, 2025
12 checks passed
@AcKoucher AcKoucher deleted the mpl-fix-shape-curves branch April 10, 2025 21:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants