
cpu: Align BTB TAGE allocation with RTL #800

Open
jensen-yan wants to merge 1 commit into xs-dev from btb-tage-rtl-alloc-align

jensen-yan (Collaborator) commented Mar 18, 2026

Background

This PR aligns the BTB TAGE allocation policy in gem5 with the RTL fix from XiangShan PR #5677.

During performance comparison against RTL, gem5 showed significantly more allocation failures and a noticeably worse MPKI. The main mismatch was in which TAGE entries were eligible for replacement during allocation.

Root Cause

Before this patch, gem5 only allowed allocation into:

  • an invalid entry, or
  • an entry with useful == 0 and a weak prediction counter.

This is too restrictive.

After a global useful reset, many entries may have useful == 0 but still hold a strong counter. Such entries should become replaceable again. However, the old gem5 logic still refused to replace them, so a useful reset could not effectively free space for new allocations.
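To make the restriction concrete, here is a minimal sketch of the pre-patch eligibility rule as described above. The `OldEntry` struct, its field names, the signed counter encoding, and the definition of "weak" (counter equal to 0 or -1) are assumptions for illustration only, not the actual gem5 data structures.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified TAGE entry (not the gem5 type).
struct OldEntry {
    bool valid;
    bool useful;
    std::int8_t counter;  // assumed signed saturating counter, e.g. [-4, 3]
};

// Assumption: "weak" means the counter sits next to the taken/not-taken boundary.
inline bool isWeak(const OldEntry &e) {
    return e.counter == 0 || e.counter == -1;
}

// Pre-patch rule as described: invalid, or (not useful AND weak).
inline bool oldReplaceable(const OldEntry &e) {
    return !e.valid || (!e.useful && isWeak(e));
}

// Global useful reset: clears the useful bits, leaves the counters untouched.
template <std::size_t N>
void resetUseful(std::array<OldEntry, N> &set) {
    for (auto &e : set) {
        e.useful = false;
    }
}

// Number of ways in a set that the pre-patch rule would allow replacing.
template <std::size_t N>
int countOldReplaceable(const std::array<OldEntry, N> &set) {
    int n = 0;
    for (const auto &e : set) {
        if (oldReplaceable(e)) {
            ++n;
        }
    }
    return n;
}
```

Under these assumptions, a set whose ways all hold strong counters still reports zero replaceable ways after `resetUseful`, which is the behavior the paragraph above describes.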

This leads to two visible problems:

  • updateAllocFailureNoValidTable stays unnecessarily high
  • longer-history entries are harder to allocate, which can hurt prediction accuracy and increase MPKI

gem5 also had an extra "age penalty" path that slowly weakened strong-but-not-useful entries instead of allowing immediate replacement. That behavior does not match the RTL after the fix.

Fix

This patch changes the allocation priority to match RTL:

  1. prefer an invalid entry
  2. otherwise prefer an entry with useful == 0 and a weak counter
  3. otherwise allow replacing any entry with useful == 0

In other words, strong && !useful entries are now directly replaceable, which is the key behavior change.
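The three-step priority above can be sketched roughly as follows. This is not the code from this patch: `TageWay`, `selectAllocWay`, `kNoWay`, the weak-counter definition, and the lowest-index tie-break are all hypothetical choices made for illustration.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of one way in a TAGE set (not the gem5 type).
struct TageWay {
    bool valid;
    bool useful;
    std::int8_t counter;  // assumed signed saturating counter
};

constexpr int kNoWay = -1;  // sentinel: no replaceable way (allocation failure)

// Assumption: "weak" means the counter is 0 or -1.
inline bool isWeakCounter(std::int8_t c) {
    return c == 0 || c == -1;
}

// Returns the index of the way to allocate into, or kNoWay.
// Ties within a tier are broken by lowest index (an assumption).
int selectAllocWay(const std::vector<TageWay> &ways) {
    int weakNotUseful = kNoWay;
    int anyNotUseful = kNoWay;
    for (int i = 0; i < static_cast<int>(ways.size()); ++i) {
        const TageWay &w = ways[i];
        if (!w.valid) {
            return i;  // tier 1: an invalid way always wins
        }
        if (w.useful) {
            continue;  // useful entries are never replaced
        }
        if (anyNotUseful == kNoWay) {
            anyNotUseful = i;  // tier 3 candidate: any not-useful way
        }
        if (weakNotUseful == kNoWay && isWeakCounter(w.counter)) {
            weakNotUseful = i;  // tier 2 candidate: weak and not useful
        }
    }
    return weakNotUseful != kNoWay ? weakNotUseful : anyNotUseful;
}
```

With this sketch, a set containing only strong, not-useful ways yields a valid victim through the third tier, whereas the pre-patch rule would have recorded an allocation failure.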

The same allocation rule is applied to both:

  • BTBTAGE
  • MicroTAGE

The old age-penalty path is removed because it is no longer needed and does not match RTL behavior.

Validation

Added/updated unit tests to cover the new behavior, including the case where a strong-but-not-useful entry must be replaceable.

Local checks performed:

  • scons build/RISCV/cpu/pred/btb/test/tage.test.debug --unit-test -j100
  • ./build/RISCV/cpu/pred/btb/test/tage.test.debug --gtest_filter=BTBTAGETest.AllocationBehaviorWithMultipleWays:BTBTAGETest.AllocationReplacesStrongNotUsefulEntry

Expected Impact

This patch is expected to:

  • reduce allocation failure caused by stale strong-but-not-useful entries
  • make useful reset effective again
  • bring gem5 BTB TAGE behavior closer to RTL
  • potentially improve MPKI on perf workloads

Change-Id: Id1b9d9e656ce068f27a04ec885344c5f325019d6

coderabbitai bot commented Mar 18, 2026

Walkthrough

This PR refactors the allocation policy in BTB TAGE and MicroTAGE predictors, replacing the previous logic with a three-tier RTL-aligned priority: first prefer invalid ways, then weak and not-useful ways, finally any not-useful way. The changes introduce a selected_way mechanism for single candidate selection and consolidate the allocation path. Tests are updated to reflect the new policy.

Changes

| Cohort / File(s) | Summary |
|---|---|
| TAGE Allocation Logic Refactor: src/cpu/pred/btb/btb_tage.cc, src/cpu/pred/btb/microtage.cc | Replaced per-table allocation logic with three-tier RTL-aligned priority (invalid ways → weak not-useful ways → any not-useful way). Introduced selected_way mechanism for single candidate selection. Consolidated allocation path: identify suitable candidate, log/allocate if found, otherwise record failure. Removed prior age-penalty and per-way counter-based gating. |
| Test Updates and Documentation: src/cpu/pred/btb/test/btb_tage.test.cc | Updated test documentation to reflect new allocation policy. Added test case AllocationReplacesStrongNotUsefulEntry verifying allocation can replace strong but not-useful entries (appears duplicated in diff). Test initializes 2-way, 1-table setup and asserts replacement occurs with correct counter updates. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Possibly related PRs

  • PR #775: Directly modifies MicroTAGE/TAGE allocation logic in handleNewEntryAllocation and per-way allocation policy in the same source files.
  • PR #627: Also modifies TAGE allocation code in src/cpu/pred/btb/btb_tage.cc and microtage.cc with related allocation logic changes.

Suggested labels

align-kmhv3

Suggested reviewers

  • CJ362ff
  • Yakkhini
  • tastynoob

Poem

🐰✨ Three tiers of wisdom, a priority dance,
Invalid first, then weak—give each entry a chance,
The TAGE predictor now picks with grace,
No more confusion, just one chosen place! 🎯

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'cpu: Align BTB TAGE allocation with RTL' accurately summarizes the main change: refactoring the BTB TAGE allocation logic to match RTL behavior with a three-tier priority system. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 83.33%, which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |


coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cpu/pred/btb/test/btb_tage.test.cc`:
- Around line 876-880: The test leaks the existing fixture predictor by
overwriting the pointer `tage` with a new BTBTAGE in the test; before
reassigning `tage = new BTBTAGE(1, 2, 10)` delete or reset the existing instance
created in SetUp() (or replace the raw pointer with a smart pointer) so you
don't leak memory—e.g., call delete on the current `tage` (or use tage.reset())
before creating the new BTBTAGE; ensure this change references the same `tage`
pointer used in SetUp() and preserves subsequent uses of `tage`, and update any
teardown if needed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 9893ea15-eb17-4469-8707-648d7d1a2d92

📥 Commits

Reviewing files that changed from the base of the PR and between 6b87b99 and c7e1e3b.

📒 Files selected for processing (3)
  • src/cpu/pred/btb/btb_tage.cc
  • src/cpu/pred/btb/microtage.cc
  • src/cpu/pred/btb/test/btb_tage.test.cc

Comment on lines +876 to +880
tage = new BTBTAGE(1, 2, 10); // only 1 predictor table, 2 ways
memset(&tage->tageStats, 0, sizeof(BTBTAGE::TageStats));
history.resize(64, false);
stagePreds.resize(2);

⚠️ Potential issue | 🟡 Minor

Avoid leaking the fixture predictor when reinitializing tage.

On Line 876, tage is overwritten without deleting the instance created in SetUp(), which leaks one object per test run.

💡 Minimal fix
 TEST_F(BTBTAGETest, AllocationReplacesStrongNotUsefulEntry) {
-    tage = new BTBTAGE(1, 2, 10); // only 1 predictor table, 2 ways
+    delete tage;
+    tage = new BTBTAGE(1, 2, 10); // only 1 predictor table, 2 ways
     memset(&tage->tageStats, 0, sizeof(BTBTAGE::TageStats));
     history.resize(64, false);
     stagePreds.resize(2);
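
As an alternative to the explicit `delete`, the smart-pointer option mentioned in the review could look roughly like the sketch below. `FakePredictor` and `TestFixture` are hypothetical stand-ins (the real BTBTAGE constructor and the GoogleTest fixture are not reproduced here); the instance counter exists only to make the ownership behavior observable.

```cpp
#include <memory>

// Hypothetical stand-in for the predictor; counts live instances so that
// a leak would show up as a non-zero count after the fixture is destroyed.
struct FakePredictor {
    static int &liveCount() {
        static int n = 0;
        return n;
    }

    FakePredictor(int numTables, int numWays)
        : tables(numTables), ways(numWays) {
        ++liveCount();
    }
    ~FakePredictor() { --liveCount(); }

    FakePredictor(const FakePredictor &) = delete;
    FakePredictor &operator=(const FakePredictor &) = delete;

    int tables;
    int ways;
};

// Hypothetical fixture that owns the predictor through std::unique_ptr.
struct TestFixture {
    std::unique_ptr<FakePredictor> tage;

    // Mirrors a SetUp() that creates the default predictor
    // (the sizes here are arbitrary placeholders).
    void setUp() { tage = std::make_unique<FakePredictor>(4, 2); }

    // Reinitializing destroys the previous instance automatically.
    void reinit(int numTables, int numWays) {
        tage = std::make_unique<FakePredictor>(numTables, numWays);
    }
};
```

Assigning a new `std::unique_ptr` releases the old object, so a test that reinitializes the predictor no longer needs a matching `delete`, and the fixture's teardown needs no manual cleanup either.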
@github-actions

🚀 Coremark Smoke Test Results

| Branch | IPC | Change |
|---|---|---|
| Base (xs-dev) | 2.2665 | - |
| This PR | 2.2752 | 📈 +0.0086 (+0.38%) |

✅ Difftest smoke test passed!

@XiangShanRobot

[Generated by GEM5 Performance Robot]
commit: c7e1e3b
workflow: gem5 Align BTB Performance Test(0.3c)

Align BTB Performance

Overall Score

| | PR | Master | Diff(%) |
|---|---|---|---|
| Score | 18.21 | 18.27 | -0.31 🔴 |
