Utage improve perf#781

Open
CJ362ff wants to merge 10 commits into xs-dev from utage-improve-perf

CJ362ff (Collaborator) commented Mar 11, 2026

Summary by CodeRabbit

  • Configuration Updates

    • Increased MicroTAGE history lengths for improved branch prediction accuracy.
    • Removed MicroTAGE base table size parameter and adjusted MBTB delay from 2 to 0.
  • Bug Fix / Behavior

    • Adjusted predictor component registration which may affect internal predictor composition and runtime behavior.
  • Metrics

    • Added per-table allocation/commit/misprediction statistics for finer-grained prediction insights.

Cao Jiaming added 2 commits March 10, 2026 17:13
Change-Id: I04dee1db8d1dab65dca9ff775d7128e7cee33ed8
Change-Id: I608fe975cf564882034081f632cbf72ef1d78181

coderabbitai bot commented Mar 11, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

MBTB numDelay set 2→0; MicroTAGE histLengths changed to [5,29,57,107] and baseTableSize removed in src/cpu/pred/BranchPredictor.py. decoupled_bpred.cc no longer registers ubtb and abtb. MicroTAGE prediction/commit paths now track provider-table and per-table allocation/commit stats.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Branch predictor config: src/cpu/pred/BranchPredictor.py | MBTB numDelay 2→0; MicroTAGE histLengths updated to [5, 29, 57, 107]; removed MicroTAGE baseTableSize. |
| Decoupled BPU components: src/cpu/pred/btb/decoupled_bpred.cc | Removed registration of ubtb and abtb from the components list; component-add ordering adjusted. |
| MicroTAGE prediction plumbing: src/cpu/pred/btb/microtage.cc, src/cpu/pred/btb/microtage.hh | Propagate provider_table in TagePrediction; add per-table stats (updateTableAllocSuccess/Failure, commitTableCorrect/Wrong) and per-table mispredict accounting; initialise new stats and increment counters at prediction/update/commit points. |
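
To make the per-table accounting in the last row concrete, here is a minimal Python sketch. It is a toy model, not the code in microtage.cc: the class `PerTableStats`, its methods, and the use of `None` for "no tagged provider" are all assumptions made for illustration. Only the history lengths and the general idea of counting allocations and commits per provider table come from the summary above.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# History lengths taken from this PR's config change; one tagged table each.
HIST_LENGTHS = [5, 29, 57, 107]


@dataclass
class PerTableStats:
    """Hypothetical per-table counters (names are illustrative only)."""
    num_tables: int
    alloc_success: List[int] = field(init=False)
    alloc_failure: List[int] = field(init=False)
    commit_correct: List[int] = field(init=False)
    commit_wrong: List[int] = field(init=False)

    def __post_init__(self) -> None:
        # One counter slot per tagged table.
        for name in ("alloc_success", "alloc_failure",
                     "commit_correct", "commit_wrong"):
            setattr(self, name, [0] * self.num_tables)

    def on_alloc(self, table: int, success: bool) -> None:
        # Record whether an allocation attempt in `table` found a free entry.
        if success:
            self.alloc_success[table] += 1
        else:
            self.alloc_failure[table] += 1

    def on_commit(self, provider_table: Optional[int], correct: bool) -> None:
        # Assumption: None means no tagged table provided the prediction,
        # so nothing is charged to any table.
        if provider_table is None:
            return
        if correct:
            self.commit_correct[provider_table] += 1
        else:
            self.commit_wrong[provider_table] += 1

    def mispredict_rate(self, table: int) -> float:
        # Fraction of committed predictions from `table` that were wrong.
        total = self.commit_correct[table] + self.commit_wrong[table]
        return self.commit_wrong[table] / total if total else 0.0
```

Carrying the provider table index along with the prediction is what lets the commit path charge a misprediction to the table that produced it, which is presumably why provider_table is propagated in TagePrediction.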

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

perf

Suggested reviewers

  • jensen-yan
  • Yakkhini

Poem

🐇 I hopped through predictor fields at night,
Tuned histories long and delays to light.
Tables whisper which bit gave way,
Stats now count each provider's play.
A tiny tweak — the rabbit's delight.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00%, which is below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title is too vague and generic. 'Utage improve perf' uses non-descriptive terms that don't clearly convey what specific changes were made to improve performance. | Use a more descriptive title that explains the key optimization. For example: 'MicroTAGE: Enhance with provider table tracking and per-table statistics' or 'Optimize MicroTAGE branch predictor configuration and metrics'. |

✅ Passed checks (1 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

Change-Id: Ica748d2d78c9414b4c5f36e0a8ccf5379b2cfabb
@github-actions

🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
| --- | --- | --- |
| Base (xs-dev) | 2.2607 | - |
| This PR | 2.2658 | 📈 +0.0051 (+0.23%) |
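
For reference, the Change column can be reproduced from the two IPC values with a few lines of Python. The helper name `ipc_change` is made up for this note, and the rounding (4 decimals for the delta, 2 for the percentage) is an assumption chosen to match how the smoke test prints its numbers.

```python
def ipc_change(base_ipc: float, pr_ipc: float) -> tuple:
    """Return (absolute IPC delta, relative change in percent)."""
    delta = pr_ipc - base_ipc
    # Relative change is measured against the base branch.
    return round(delta, 4), round(100.0 * delta / base_ipc, 2)


# IPC values copied from the Coremark smoke-test table above.
print(ipc_change(2.2607, 2.2658))  # (0.0051, 0.23)
```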

✅ Difftest smoke test passed!

@XiangShanRobot

[Generated by GEM5 Performance Robot]
commit: 9670cfd
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

| | PR | Master | Diff(%) |
| --- | --- | --- | --- |
| Score | 20.74 | 20.68 | +0.26 🟢 |

[Generated by GEM5 Performance Robot]
commit: 9670cfd
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

| | PR | Previous Commit | Diff(%) |
| --- | --- | --- | --- |
| Score | 20.74 | 20.79 | -0.25 🔴 |

Cao Jiaming added 2 commits March 11, 2026 15:40
Change-Id: Ie1ddc306b3dec2b6d5f261fecd6bba7be9e8794d
Change-Id: Ifb85845ae323d6214e858c54193894ef30152509

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
src/cpu/pred/BranchPredictor.py (1)

1079-1079: Please re-validate this MicroTAGE retune with a broader perf sample.

The config is internally consistent, but the PR’s robot reports are mixed for the same Ideal BTB workflow (+0.26% vs master, -0.25% vs the previous commit). For a tuning-only change, that is too close to noise to establish a stable gain; please attach multi-run or multi-workload data for these new history lengths before merging.
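
One way to summarise such multi-run data is sketched below in Python. This is only an illustration of the mean / standard deviation / confidence-interval check, not part of the robot's tooling: the function name `summarize` and the hard-coded Student-t table are assumptions, and the input is expected to be repeated Diff(%) measurements of a single commit against the same baseline.

```python
import math
import statistics

# Two-sided 95% Student-t critical values, keyed by degrees of freedom (n - 1).
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262}


def summarize(diffs_pct):
    """Return (mean, sample stddev, 95% confidence interval) for the runs."""
    n = len(diffs_pct)
    if n < 2 or (n - 1) not in T_CRIT_95:
        raise ValueError("this sketch supports between 2 and 10 runs")
    mean = statistics.mean(diffs_pct)
    stddev = statistics.stdev(diffs_pct)  # sample standard deviation
    half_width = T_CRIT_95[n - 1] * stddev / math.sqrt(n)
    return mean, stddev, (mean - half_width, mean + half_width)


def is_stable_gain(diffs_pct):
    """True only if the whole 95% interval lies above zero."""
    _, _, (low, _high) = summarize(diffs_pct)
    return low > 0.0
```

If the interval returned by `summarize` straddles zero, the measured change cannot be distinguished from run-to-run noise at that confidence level, which is the situation this comment is worried about.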

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/BranchPredictor.py` at line 1079, The MicroTAGE history-length
change (histLengths = VectorParam.Unsigned([5, 29, 57, 107], ...) in
BranchPredictor.py) needs a broader validation: run multiple independent
measurement trials and/or across multiple representative workloads for the Ideal
BTB workflow to reduce noise and confirm the effect is stable; collect and
attach aggregated statistics (mean, stddev, and confidence intervals) or per-run
results showing consistency versus both master and the previous commit before
merging so the tuning change is substantiated.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e1c9a639-2479-4d00-99ce-d576c8bdc98d

📥 Commits

Reviewing files that changed from the base of the PR and between 9670cfd and 444f903.

📒 Files selected for processing (1)
  • src/cpu/pred/BranchPredictor.py

@github-actions

🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
| --- | --- | --- |
| Base (xs-dev) | 2.2607 | - |
| This PR | 2.2645 | 📈 +0.0038 (+0.17%) |

✅ Difftest smoke test passed!

@XiangShanRobot

[Generated by GEM5 Performance Robot]
commit: 444f903
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

| | PR | Master | Diff(%) |
| --- | --- | --- | --- |
| Score | 20.65 | 20.68 | -0.17 🔴 |

[Generated by GEM5 Performance Robot]
commit: 444f903
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

| | PR | Previous Commit | Diff(%) |
| --- | --- | --- | --- |
| Score | 20.65 | 20.74 | -0.43 🔴 |

Change-Id: I65f24ad3248d174598d0f09d46757527fc58041d

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cpu/pred/btb/decoupled_bpred.cc`:
- Around line 58-61: The ABTB instance was removed from the components vector
causing its recoverHist() not to be invoked during squash recovery, which leaves
aheadReadBtbEntries uncleared and leads to stale state; to fix it either add
ABTB (abtb) back into the components list where mbtb and microtage are pushed,
or, if you intentionally omit abtb from components, call abtb->recoverHist()
explicitly during the same recovery path (the squash recovery routine that
currently iterates components and calls recoverHist), ensuring
aheadReadBtbEntries is cleared; updateUsingS3Pred() should not be relied on for
this, so reference the abtb object and its recoverHist() method (or re-add abtb
to the components vector) to restore correct recovery behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5fc5c7a2-2d88-4838-b7ac-c39e758a9515

📥 Commits

Reviewing files that changed from the base of the PR and between 444f903 and a3d686b.

📒 Files selected for processing (2)
  • src/cpu/pred/BranchPredictor.py
  • src/cpu/pred/btb/decoupled_bpred.cc

Comment on lines +58 to +61
// if (ubtb->isEnabled()) components.push_back(ubtb);
// if (abtb->isEnabled()) components.push_back(abtb);
if (mbtb->isEnabled()) components.push_back(mbtb);
if (microtage->isEnabled()) components.push_back(microtage);

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check if ubtb/abtb maintain any speculative history state
# that would need recovery during squashes

# Search for history-related state in UBTB and AheadBTB implementations
rg -n -C5 'specUpdateHist|recoverHist|history' src/cpu/pred/btb/btb_ubtb.cc src/cpu/pred/btb/btb_ubtb.hh src/cpu/pred/btb/abtb.cc src/cpu/pred/btb/abtb.hh 2>/dev/null || echo "Files not found, searching with fd..."

fd -e cc -e hh 'ubtb|abtb' src/cpu/pred/btb --exec rg -l 'history|specUpdate|recover' {} \;

Repository: OpenXiangShan/GEM5

Length of output: 6523


🏁 Script executed:

# Verify aheadReadBtbEntries is used/declared in abtb
rg -n 'aheadReadBtbEntries' src/cpu/pred/btb/abtb.hh src/cpu/pred/btb/abtb.cc

Repository: OpenXiangShan/GEM5

Length of output: 869


🏁 Script executed:

# Check if updateUsingS3Pred in ABTB does anything with the queue
rg -n -A 20 'AheadBTB::updateUsingS3Pred' src/cpu/pred/btb/abtb.cc

# Also check if there's any other recovery mechanism for ABTB
rg -n 'updateUsingS3Pred' src/cpu/pred/btb/decoupled_bpred.cc

Repository: OpenXiangShan/GEM5

Length of output: 1247


🏁 Script executed:

# Check if any code path still calls recoverHist for abtb directly (outside of components iteration)
rg -n 'abtb.*recoverHist|\.recoverHist.*abtb' src/cpu/pred/btb/decoupled_bpred.cc

# Check the actual components iteration to confirm abtb won't have recoverHist called
rg -n -B 5 -A 10 'for.*components' src/cpu/pred/btb/decoupled_bpred.cc | grep -A 10 'recoverHist'

Repository: OpenXiangShan/GEM5

Length of output: 44


🏁 Script executed:

# Check what recoverHist calls look like in decoupled_bpred
rg -n 'recoverHist' src/cpu/pred/btb/decoupled_bpred.cc

# Also check the exact component loop that calls history recovery
rg -n -B 3 -A 3 'for.*components' src/cpu/pred/btb/decoupled_bpred.cc | head -40

Repository: OpenXiangShan/GEM5

Length of output: 316


🏁 Script executed:

# Check the recoverHistoryForSquash function and the components loop
sed -n '977,1020p' src/cpu/pred/btb/decoupled_bpred.cc | cat -n

# Also verify that abtb is no longer in components by checking lines 58-62
sed -n '55,65p' src/cpu/pred/btb/decoupled_bpred.cc | cat -n

Repository: OpenXiangShan/GEM5

Length of output: 2712


Removing ABTB from components breaks the history recovery mechanism.

UBTB can be safely excluded from the components list because its recoverHist is empty and explicitly marked "not used". ABTB, however, has an active recoverHist() implementation (abtb.cc:330-335) that clears the aheadReadBtbEntries queue.

With the registrations at lines 58-59 commented out, ABTB's recoverHist() is no longer called during squash recovery (line 1007 in decoupled_bpred.cc). The aheadReadBtbEntries queue then keeps stale entries across branch mispredictions, leaving ABTB in an inconsistent state. The updateUsingS3Pred() calls do not manage this queue.

Either restore ABTB to the components list or clear the queue explicitly outside the component iteration.
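
The failure mode and both suggested fixes can be shown with a small toy model. The sketch below is Python, not the C++ in decoupled_bpred.cc, and every class and method in it (`ToyAheadBTB`, `ToyPredictor`, `recover_history_for_squash`, the `explicit_recover` flag) is a hypothetical stand-in. It also assumes that ABTB keeps pushing entries into its queue while unregistered, which should be checked against the real prediction path.

```python
from collections import deque


class Component:
    """Stand-in for a predictor component with a no-op recovery hook."""

    def recover_hist(self) -> None:
        pass


class ToyAheadBTB(Component):
    """Toy ABTB: recovery clears the ahead-read queue."""

    def __init__(self) -> None:
        self.ahead_read_btb_entries = deque()

    def predict(self, entry: str) -> None:
        self.ahead_read_btb_entries.append(entry)

    def recover_hist(self) -> None:
        self.ahead_read_btb_entries.clear()


class ToyPredictor:
    def __init__(self, register_abtb: bool, explicit_recover: bool = False):
        self.abtb = ToyAheadBTB()
        self.mbtb = Component()
        # Only registered components are visited during squash recovery.
        self.components = [self.mbtb]
        if register_abtb:
            self.components.insert(0, self.abtb)
        self.explicit_recover = explicit_recover

    def predict(self, entry: str) -> None:
        # Assumption: ABTB still records ahead-read entries when unregistered.
        self.abtb.predict(entry)

    def recover_history_for_squash(self) -> None:
        for component in self.components:
            component.recover_hist()
        # Second fix: recover ABTB by hand when it is not in the list.
        if self.explicit_recover and self.abtb not in self.components:
            self.abtb.recover_hist()


def stale_entries_after_squash(predictor: ToyPredictor) -> int:
    predictor.predict("entry-before-squash")
    predictor.recover_history_for_squash()
    return len(predictor.abtb.ahead_read_btb_entries)
```

In this model, `stale_entries_after_squash(ToyPredictor(register_abtb=False))` leaves one entry in the queue, while re-registering ABTB or passing `explicit_recover=True` leaves none.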


Cao Jiaming added 3 commits March 11, 2026 17:48
Change-Id: Ibb8f3a35b17c62d2ee69707b0c0f9602830feb36
Change-Id: Ib50725992e93e3e72aae63f44ddd817371837e38
@XiangShanRobot

[Generated by GEM5 Performance Robot]
commit: d3a66fd
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

| | PR | Master | Diff(%) |
| --- | --- | --- | --- |
| Score | 21.07 | 20.68 | +1.86 🟢 |

[Generated by GEM5 Performance Robot]
commit: d3a66fd
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

| | PR | Previous Commit | Diff(%) |
| --- | --- | --- | --- |
| Score | 21.07 | 20.85 | +1.05 🟢 |

Change-Id: I07769c0d6cc61d3720feb3ea2a23fece6f29c1dc
@XiangShanRobot

[Generated by GEM5 Performance Robot]
commit: 5142616
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

| | PR | Master | Diff(%) |
| --- | --- | --- | --- |
| Score | 20.76 | 20.68 | +0.38 🟢 |

[Generated by GEM5 Performance Robot]
commit: 5142616
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

| | PR | Previous Commit | Diff(%) |
| --- | --- | --- | --- |
| Score | 20.76 | 20.79 | -0.13 🔴 |
