Skip to content

Bpu add percep perf#774

Open
ProgramGP wants to merge 26 commits intoxs-devfrom
bpu-add-percep-perf
Open

Bpu add percep perf#774
ProgramGP wants to merge 26 commits intoxs-devfrom
bpu-add-percep-perf

Conversation

@ProgramGP
Copy link

@ProgramGP ProgramGP commented Mar 6, 2026

Try to incorporate a perceptron-based branch prediction module into the existing decoupled btb to enhance the prediction accuracy for hard-to-predict branches.

Summary by CodeRabbit

  • New Features

    • Perception-based prediction added to MGSC to improve branch prediction accuracy.
    • Global Backward History (GBHR) support integrated across BTB predictors.
  • Configuration

    • New tunables for perception: table size, weight width, GBHR length, and confidence threshold.
    • Example configuration updated to enable MGSC perception settings.
  • Chores

    • Updated repository ignore patterns.
  • Notes

    • Perceptron predictor scaffold added as a commented placeholder.

Change-Id: I59a2e380d18892bb1c1d2fcf76370cf95fec49f8
Change-Id: Ic545b267d0ee527800aa5c591bbfa4f25ba786f3
@coderabbitai
Copy link

coderabbitai bot commented Mar 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds GBHR hooks and perception-based prediction to MGSC, exposes needGBHR across BTB predictors and timed base interface, wires GBHR/perception calls into decoupled BPU, enables MGSC perception in example config, and adds commented perception skeleton files plus .gitignore entries.

Changes

Cohort / File(s) Summary
Build/config
\.gitignore
Added ignore patterns: NEMU, workrecord.txt.
Example config
configs/example/kmhv3.py
Enabled MGSC and added perception params (enablePerceptionPred, percepTableEntryNum, gbhrLen, percepTableWidth, percepThres).
Predictor param defs
src/cpu/pred/BranchPredictor.py
Exposed needGBHR for BTBTAGE/BTBITTAGE/BTBMGSC and added MGSC perception config parameters.
Predictor interface
src/cpu/pred/btb/timed_base_pred.hh
Added virtual hooks specUpdateGBHR and recoverGBHR and needGBHR flag to timed base predictor interface.
TAGE/ITTAGE variants
src/cpu/pred/btb/btb_tage.cc, src/cpu/pred/btb/btb_ittage.cc
Initialize needGBHR from Params; minor formatting updates.
Decoupled BPU wiring
src/cpu/pred/btb/decoupled_bpred.cc
Call specUpdateGBHR and recoverGBHR when components report needGBHR.
MGSC predictor (core)
src/cpu/pred/btb/btb_mgsc.hh, src/cpu/pred/btb/btb_mgsc.cc
Integrated perception: GBHR storage, percep weight table, calculatePercepSum, updatePercepTable, specUpdateGBHR, recoverGBHR; extended MgscPrediction/MgscMeta/MgscStats and propagated percep fields through predict/update/recover flows; added test accessors and config init.
Perception placeholders
src/cpu/pred/btb/btb_perceprion.hh, src/cpu/pred/btb/btb_perception.cc
Added skeleton/commented perceptron predictor header and cc files (no active code).

Sequence Diagram(s)

sequenceDiagram
    rect rgba(200,200,255,0.5)
    participant Client as Client
    end
    rect rgba(200,255,200,0.5)
    participant MGSC as BTB MGSC
    participant GBHR as GBHR Storage
    participant Percep as Perception Table
    participant Stats as MgscStats
    end

    Client->>MGSC: predict(pc)
    MGSC->>GBHR: read gbhr
    MGSC->>Percep: calculatePercepSum(pc, gbhr)
    Percep-->>MGSC: percep_sum
    MGSC->>MGSC: combine percep_sum with MGSC score
    alt use perception
        MGSC->>Stats: increment usePercepPred
    else fallback
        MGSC->>Stats: record normal MGSC decision
    end
    MGSC-->>Client: return prediction

    Client->>MGSC: update(actual_taken)
    MGSC->>GBHR: specUpdateGBHR(pred)
    MGSC->>Percep: updatePercepTable(percep_sum, actual_taken)
    Percep->>Percep: adjust weights
    MGSC->>Stats: record percepPredCorrect/Wrong

    Client->>MGSC: recover(mispred)
    MGSC->>GBHR: recoverGBHR(entry, shamt, taken)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

  • Mgsc2 align #686: Overlaps MGSC predictor changes and perception/SC-related fields and statistics.
  • Sc ut #710: Related MGSC internals, constructors, and spec/recover signature updates.
  • Mgsc5 align #768: Related BTBMGSC internals and GBHR/needGBHR initialization across predictors.

Suggested labels

align-kmhv3

Suggested reviewers

  • jensen-yan
  • Yakkhini

Poem

🐇 I hopped through weights and a GBHR trail,
Tuned tiny neurons where branches prevail,
MGSC listens now, weighing each leap,
Bytes and burrows where predictions sleep,
A rabbit’s cheer for branches grown hale.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 25.37% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Bpu add percep perf' is vague and uses non-descriptive abbreviations; it is unclear what 'percep' and 'perf' mean without examining the PR details. Use a more descriptive title such as 'Add perceptron-based perception predictor to BTB' or 'Incorporate perception module into decoupled branch predictor' to clearly communicate the main change.
✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bpu-add-percep-perf
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Tip

You can disable sequence diagrams in the walkthrough.

Disable the reviews.sequence_diagrams setting to disable sequence diagrams in the walkthrough.

@ProgramGP ProgramGP added the perf label Mar 6, 2026
@github-actions
Copy link

github-actions bot commented Mar 6, 2026

🚀 Performance test triggered: spec06-0.8c

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (5)
src/cpu/pred/btb/btb_perception.cc (1)

1-31: Consider removing entirely commented-out placeholder file.

This file contains only commented-out skeleton code with no functional logic. Committing commented-out code adds maintenance burden without providing value. Either:

  1. Remove this file until the implementation is ready
  2. Uncomment and implement a minimal stub that compiles
  3. Add a clear TODO comment at the top indicating when this will be activated
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_perception.cc` around lines 1 - 31, This file
(btb_perception.cc) only contains commented-out skeleton code (namespaces
gem5::branch_prediction::btb_pred and the UNIT_TEST blocks); remove the file
entirely if not needed, or replace the contents with a minimal compilable stub
(e.g., provide empty definitions for the intended symbols/namespaces or a simple
no-op class/function) or add a clear top-line TODO comment stating when it will
be activated; ensure any stub compiles with the build by matching expected
symbol names (gem5, branch_prediction, btb_pred, UNIT_TEST) so the rest of the
codebase does not reference missing symbols.
src/cpu/pred/btb/btb_mgsc.cc (4)

306-318: Unnecessary vector copy in calculatePercepSum.

Line 310 copies the entire weight vector. Use a const reference to avoid allocation overhead on every prediction lookup.

♻️ Proposed fix
 int
 BTBMGSC::calculatePercepSum(Addr pc)
 {
     int index = pc % percepTableEntryNum;
-    auto weight = percepWeightTable[index];
+    const auto& weight = percepWeightTable[index];
 
     int bias = 1;
     int percep_sum = bias * weight[0];
     for (int i=0;i<gbhrLen;i++){
         percep_sum += ((gbhr[i]?1:-1)*weight[i+1]);
     }
     return percep_sum;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_mgsc.cc` around lines 306 - 318, The calculatePercepSum
function is currently copying the weight vector from percepWeightTable (auto
weight = percepWeightTable[index]) causing unnecessary allocations; change this
to bind a const reference (e.g., const auto& weight = percepWeightTable[index])
so the weights are accessed by reference without copying, keeping the rest of
BTBMGSC::calculatePercepSum logic (bias, gbhr loop) unchanged.

1488-1490: Redundant ternary operator.

Line 1489 uses an unnecessary ternary. The comparison already produces a bool.

♻️ Simplify
-        percep_taken = (percep_sum >= 0) ? true : false;
+        percep_taken = percep_sum >= 0;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_mgsc.cc` around lines 1488 - 1490, The assignment to
percep_taken uses a redundant ternary; replace "percep_taken = (percep_sum >= 0)
? true : false;" with a direct boolean expression "percep_taken = (percep_sum >=
0);" in the code around pred_it->second.percep_sum handling (where percep_sum,
percep_taken and pred_hit are set) so the comparison itself yields the bool
value instead of using ? true : false.

1241-1252: Inefficient GBHR shift operation.

Using pop_back() followed by insert(gbhr.begin(), ...) on a std::vector<bool> is O(n) for both operations. Consider using std::rotate or changing the data structure to std::deque<bool> for O(1) front insertion, especially if gbhrLen is large.

Also, unlike other specUpdate* methods, this doesn't check if shamt == 0 before updating.

♻️ Alternative using rotate
 void
 BTBMGSC::specUpdateGBHR(FullBTBPrediction &pred)
 {
     int shamt;
     bool cond_taken;
     std::tie(shamt, cond_taken) = pred.getHistInfo();
+    if (shamt == 0) {
+        return;
+    }
     if (!gbhr.empty()) {
-        gbhr.pop_back();
-        gbhr.insert(gbhr.begin(), cond_taken);
+        std::rotate(gbhr.begin(), gbhr.end() - 1, gbhr.end());
+        gbhr[0] = cond_taken;
     }
-
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_mgsc.cc` around lines 1241 - 1252, The specUpdateGBHR
implementation in BTBMGSC::specUpdateGBHR should first skip updates when shamt
== 0 (use shamt from FullBTBPrediction::getHistInfo) to match other specUpdate*
methods, and replace the pop_back/insert pattern on gbhr with an O(n)‑avoiding
approach: either use std::rotate to shift the vector<bool> one position and
assign gbhr[0] = cond_taken, or change gbhr to std::deque<bool> and do
gbhr.push_front(cond_taken) followed by pop_back() for O(1) operations; update
any uses of gbhrLen accordingly.

1390-1402: Simplify GBHR recovery with direct assignment.

The element-by-element loop at lines 1396-1397 can be replaced with direct vector assignment. Same inefficiency note about pop_back()/insert() applies here.

♻️ Proposed simplification
 void
 BTBMGSC::recoverGBHR(const FetchTarget &entry, int shamt, bool cond_taken){
     if (!isEnabled()) {
         return;  // No recover when disabled
     }
     std::shared_ptr<MgscMeta> predMeta = std::static_pointer_cast<MgscMeta>(entry.predMetas[getComponentIdx()]);
-    for (int i=0;i<gbhrLen;i++)
-        gbhr[i] = predMeta->gbhr[i];
+    gbhr = predMeta->gbhr;
     if (!gbhr.empty()) {
         gbhr.pop_back();
         gbhr.insert(gbhr.begin(), cond_taken);
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_mgsc.cc` around lines 1390 - 1402, Replace the
element-wise copy in BTBMGSC::recoverGBHR by assigning the stored vector
directly (gbhr = predMeta->gbhr) and avoid the pop_back()/insert() pair by
shifting the vector in-place (e.g., use std::rotate on gbhr to move elements one
position toward the end and then set gbhr[0] = cond_taken); keep the existing
guards (isEnabled(), FetchTarget/predMetas[getComponentIdx()], and empty check)
and ensure <algorithm> is available for rotate.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cpu/pred/btb/btb_mgsc.cc`:
- Around line 448-451: The percep_index calculation uses an undefined identifier
`pc`; change it to use the BTB entry's PC to match the preceding line—i.e.,
replace the use of `pc` in the `percep_index` computation with `btb_entry.pc` so
it parallels the call to `calculatePercepSum(btb_entry.pc)` and correctly
computes `percep_index` modulo `percepTableEntryNum`.
- Around line 669-689: The bias weight at weightTable[index][0] in
BTBMGSC::updatePercepTable is updated without saturation while other weights use
updateCounter; change the bias update to use the same saturation logic (either
call updateCounter with the same percepTableWidth or apply equivalent clamping
to int16_t bounds derived from percepTableWidth) so weightTable[index][0] cannot
overflow/underflow when incrementing/decrementing the bias.

In `@src/cpu/pred/btb/btb_perceprion.hh`:
- Around line 1-133: The file has a filename typo (btb_perceprion.hh →
btb_perceptron.hh) and contains only commented-out skeleton code; either delete
both the misspelled header and its .cc counterpart or rename the header to
btb_perceptron.hh and replace the commented content with a minimal compilable
stub that declares the PerceptronBP class (include the PerceptronBP symbol,
WeightEntry, PerceptronPrediction, and required public virtual overrides used by
TimedBaseBTBPredictor) with empty method bodies/inline defaults so the build can
succeed until the full implementation is added. Ensure the header guard and
namespace (gem5::branch_prediction::btb_pred) are correct and match any existing
references.

In `@workrecord`:
- Around line 1-99: The file workrecord contains sensitive/internal notes and
must be removed from the commit: delete it from the index (e.g., git rm --cached
workrecord) so it is not tracked, add a line "workrecord" to .gitignore to
prevent re-adding, and create a follow-up commit (or amend the current commit if
not yet pushed) that removes the file from the repository; reference the
filenames workrecord and .gitignore when making these changes.

---

Nitpick comments:
In `@src/cpu/pred/btb/btb_mgsc.cc`:
- Around line 306-318: The calculatePercepSum function is currently copying the
weight vector from percepWeightTable (auto weight = percepWeightTable[index])
causing unnecessary allocations; change this to bind a const reference (e.g.,
const auto& weight = percepWeightTable[index]) so the weights are accessed by
reference without copying, keeping the rest of BTBMGSC::calculatePercepSum logic
(bias, gbhr loop) unchanged.
- Around line 1488-1490: The assignment to percep_taken uses a redundant
ternary; replace "percep_taken = (percep_sum >= 0) ? true : false;" with a
direct boolean expression "percep_taken = (percep_sum >= 0);" in the code around
pred_it->second.percep_sum handling (where percep_sum, percep_taken and pred_hit
are set) so the comparison itself yields the bool value instead of using ? true
: false.
- Around line 1241-1252: The specUpdateGBHR implementation in
BTBMGSC::specUpdateGBHR should first skip updates when shamt == 0 (use shamt
from FullBTBPrediction::getHistInfo) to match other specUpdate* methods, and
replace the pop_back/insert pattern on gbhr with an O(n)‑avoiding approach:
either use std::rotate to shift the vector<bool> one position and assign gbhr[0]
= cond_taken, or change gbhr to std::deque<bool> and do
gbhr.push_front(cond_taken) followed by pop_back() for O(1) operations; update
any uses of gbhrLen accordingly.
- Around line 1390-1402: Replace the element-wise copy in BTBMGSC::recoverGBHR
by assigning the stored vector directly (gbhr = predMeta->gbhr) and avoid the
pop_back()/insert() pair by shifting the vector in-place (e.g., use std::rotate
on gbhr to move elements one position toward the end and then set gbhr[0] =
cond_taken); keep the existing guards (isEnabled(),
FetchTarget/predMetas[getComponentIdx()], and empty check) and ensure
<algorithm> is available for rotate.

In `@src/cpu/pred/btb/btb_perception.cc`:
- Around line 1-31: This file (btb_perception.cc) only contains commented-out
skeleton code (namespaces gem5::branch_prediction::btb_pred and the UNIT_TEST
blocks); remove the file entirely if not needed, or replace the contents with a
minimal compilable stub (e.g., provide empty definitions for the intended
symbols/namespaces or a simple no-op class/function) or add a clear top-line
TODO comment stating when it will be activated; ensure any stub compiles with
the build by matching expected symbol names (gem5, branch_prediction, btb_pred,
UNIT_TEST) so the rest of the codebase does not reference missing symbols.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4a0c661c-10a4-483e-b4cd-227017a2b3d7

📥 Commits

Reviewing files that changed from the base of the PR and between 318c57c and 5164f71.

📒 Files selected for processing (12)
  • .gitignore
  • configs/example/kmhv3.py
  • src/cpu/pred/BranchPredictor.py
  • src/cpu/pred/btb/btb_ittage.cc
  • src/cpu/pred/btb/btb_mgsc.cc
  • src/cpu/pred/btb/btb_mgsc.hh
  • src/cpu/pred/btb/btb_perceprion.hh
  • src/cpu/pred/btb/btb_perception.cc
  • src/cpu/pred/btb/btb_tage.cc
  • src/cpu/pred/btb/decoupled_bpred.cc
  • src/cpu/pred/btb/timed_base_pred.hh
  • workrecord

Comment on lines +1 to +133
// #ifndef __CPU_PRED_BTB_PERCEPTRON_BP_HH__
// #define __CPU_PRED_BTB_PERCEPTRON_BP_HH__

// #include <cstdint>
// #include <deque>
// #include <map>
// #include <utility>
// #include <vector>

// #include <unordered_map>
// #include "base/types.hh"
// #include "cpu/inst_seq.hh"
// #include "cpu/pred/btb/common.hh"
// #include "cpu/pred/btb/timed_base_pred.hh"

// namespace gem5 {
// namespace branch_prediction {
// namespace btb_pred {

// // 感知机分支预测器类
// class PerceptronBP : public TimedBaseBTBPredictor {
// public:
// // 定义权重表中单个权重的结构(可以是整数,但通常用饱和计数器)
// struct WeightEntry {
// int weight; // 权重值,可以是8位有符号整数
// WeightEntry() : weight(0) {}
// WeightEntry(int w) : weight(w) {}
// };

// // 定义感知机预测结果
// struct PerceptronPrediction {
// bool taken; // 最终预测结果
// int sum; // 点积和值,用于置信度判断
// unsigned index; // 权重表索引
// std::vector<int> inputs; // 输入历史向量

// PerceptronPrediction() : taken(false), sum(0), index(0) {}
// PerceptronPrediction(bool taken, int sum, unsigned index,
// const std::vector<int>& inputs)
// : taken(taken), sum(sum), index(index), inputs(inputs) {}
// };

// // 构造函数(生产环境)
// PerceptronBP(const Params& p);

// // 析构函数
// ~PerceptronBP();

// // 重写TimedBaseBTBPredictor的接口函数
// void tickStart() override;
// void tick() override;
// void dryRunCycle(Addr startPC) override;
// void putPCHistory(Addr startPC, const boost::dynamic_bitset<> &history,
// std::vector<FullBTBPrediction> &stagePreds) override;
// std::shared_ptr<void> getPredictionMeta() override;
// void specUpdatePHist(const boost::dynamic_bitset<> &history,
// FullBTBPrediction &pred) override;
// void recoverPHist(const boost::dynamic_bitset<> &history,
// const FetchTarget &entry, int shamt, bool cond_taken) override;
// void update(const FetchTarget &stream) override;
// bool canResolveUpdate(const FetchTarget &stream) override;
// void doResolveUpdate(const FetchTarget &stream) override;
// void commitBranch(const FetchTarget &stream, const DynInstPtr &inst) override;

// // 设置跟踪信息
// void setTrace() override;

// // 检查折叠历史(用于调试)
// void checkFoldedHist(const boost::dynamic_bitset<> &history, const char *when);

// private:
// // 辅助函数:根据PC获取权重表索引
// unsigned getWeightIndex(Addr pc) const;

// // 辅助函数:获取分支历史向量(双极性表示)
// std::vector<int> getBranchHistoryVector() const;

// // 辅助函数:计算点积和
// int calculateSum(const std::vector<int>& inputs,
// const std::vector<WeightEntry>& weights) const;

// // 辅助函数:更新全局分支历史寄存器
// void updateGBHR(bool taken);

// // 辅助函数:更新权重表(用于学习)
// void updateWeightTable(unsigned index, const std::vector<int>& inputs,
// int actual_outcome);

// // 辅助函数:饱和算术(防止权重溢出)
// int saturateWeight(int value) const;

// // 预测单个BTB条目
// PerceptronPrediction generateSinglePrediction(const BTBEntry &btb_entry,
// const Addr &startPC);

// // 准备更新的BTB条目
// std::vector<BTBEntry> prepareUpdateEntries(const FetchTarget &stream);

// // 主要的预测逻辑
// void lookupHelper(const Addr &startPC, const std::vector<BTBEntry> &btbEntries,
// std::unordered_map<Addr, TageInfoForMGSC> &tageInfoForMgscs,
// CondTakens& results);

// // 权重表:二维向量,[分支PC索引][历史位]
// std::vector<std::vector<WeightEntry>> weightTable;

// // 全局分支历史寄存器 (GBHR)
// std::vector<int> gbhr; // 双极性表示:1 = Taken, -1 = Not Taken

// // 全局分支历史长度
// unsigned historyLength;

// // 每个权重的位宽
// unsigned weightBits;

// // 预测置信度阈值
// int threshold;

// // 权重表大小
// unsigned tableSize;

// // 统计信息
// struct Stats {
// // 添加必要的统计项
// // ...
// } stats;
// };

// } // namespace btb_pred
// } // namespace branch_prediction
// } // namespace gem5

// #endif // __CPU_PRED_BTB_PERCEPTRON_BP_HH__ No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix filename typo and remove commented-out code.

Two issues with this file:

  1. Filename typo: btb_perceprion.hh should be btb_perceptron.hh
  2. Entirely commented-out code: Like btb_perception.cc, this file contains only commented skeleton code

Either:

  • Remove both files until the implementation is ready
  • Rename to correct spelling and provide a minimal compilable stub
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_perceprion.hh` around lines 1 - 133, The file has a
filename typo (btb_perceprion.hh → btb_perceptron.hh) and contains only
commented-out skeleton code; either delete both the misspelled header and its
.cc counterpart or rename the header to btb_perceptron.hh and replace the
commented content with a minimal compilable stub that declares the PerceptronBP
class (include the PerceptronBP symbol, WeightEntry, PerceptronPrediction, and
required public virtual overrides used by TimedBaseBTBPredictor) with empty
method bodies/inline defaults so the build can succeed until the full
implementation is added. Ensure the header guard and namespace
(gem5::branch_prediction::btb_pred) are correct and match any existing
references.

workrecord Outdated
Comment on lines +1 to +99
5.提示缺失libpython3.8.so.1.0时,执行export LD_LIBRARY_PATH=$CONDA_PREFIX/lib:$LD_LIBRARY_PATH
6.需要连接github时需要代理:export https_proxy=http://172.38.11.182:10808
export http_proxy=http://172.38.11.182:10808

工作备忘-2.13:
1.根据老师的指导,感知机预测器的工作应该在sc模块之前完成,最好可以与tage模块一同完成
2.预测器整体预期工作流程:对于给定pc,先ubtb进行预测,然后tage和感知机进行预测,s2仍然输出tage预测结果,
然后两者预测结果共同传给sc,由sc根据置信度判断是否抛弃tage预测结果转而接受感知机预测结果。
3.下一步方向:阅读官方文档,确认目前关于tage-sc的工作原理理解没有问题;继续查找文献,看看有没有把感知机预测器跟tage对比的论文或实验结果;
查找文献,看看有没有感知机预测器与sc协作的案例;阅读感知机文献,尝试找对应代码,先进行编写感知机代码
4.编写代码前置任务:阅读tage代码,为什么会写那么多???按照老师的指导,目前tage应该也会输出置信度相关数据,找到sc根据置信度判断的位置,
留下修改标记
5.注意:预测器同样是流水线工作的,每个周期都会有s1,s2,s3结果输出,但都针对的是不同的分支指令
6.分支预测器源码目录:src/cpu/pred/btb
7.pred目录下,有multiperspective_perception.cc,目测为gem5提供的感知机预测器代码,可以先尝试把这部分接入原本处理器中
或者在编写自己的代码时可以参考这部分代码

2.15:
1.样例采用的配置文件在example/kmhv3.py
2.在xiangshan.py中显式限制了,所用的分支预测器必须为解耦型的,在考虑修改配置文件前,或者详细了解配置文件前,
可以先放弃直接用感知机预测器取代解耦预测器的实验方案了。
3.根据数据分析,当前的预测准确率仅有50%多?MGSC似乎没有被使用,需要详细阅读解耦分支预测器部分内容。

2.23:
1.一些需要关注的参数:直观参数condNum,condMiss
2.根据千问提供的内容,目前的mgsc中自带机器学习的内容,以及在配置中设置mgsc=true之后,ipc反而下降了
目前需要加强理解感知机预测器的工作原理,以及mgsc预测器的工作原理,弄清楚两者差别。

2.27:
1.mgsc就是需要综合的sc预测器,把感知机当做其中的一个预测表项进行使用,预测和更新都由sc进行驱动,可以仿照某一表项进行编写
2.mgsc的配置变量就是enableBwTable等,目前尚未确定是在哪里进行具体的修改,在BranchPredictor.py中有赋值,但似乎没有影响。
3.每次修改源码之后,必须重新构建一下gem5才能使源码的修改生效。
4.加入mgsc后,ipc略微减少,预测准确率略微上升

3.1:
1.记录需要配置的变量:
/** perception pred param*/
unsigned percepTableEntryNum; // 权重组的总数
unsigned gbhrLen;
unsigned percepTableWidth;
unsigned PercepThres;

bool enablePerceptionPred;

2.新的结构:权重表,全局分支历史寄存器

3.2:
1.recordPredictionStats函数中更新了mgscStats,涉及计数器相关内容,完成主体部分代码后,可以增加相关计数器
2.代码中有关recover的部分暂未了解清楚,meta中存储的内容比我认为的要少很多。(并非,meta存储了整个pred,实际上
把预测所用到的大部分数据都进行了存储)
3.某个地方设置了MgscTrace,目前还没有更改这部分内容。

3.3:
1.gbhr理应跟随foldHist更新、恢复
2.为了保证GBHR更新只会在mgsc更新时更新,声明了新变量needGBHR,在其他结构中也加入了相关的赋值逻辑,声明在timed_base_pred.hh中
3.代码基本完成,下面需要学习如何调试gem5以及提高效率。
4.在启用UNIT_TEST的情况下,可能还需要对test脚本进行更改。
5.代码TODO:计数器,mgscTrace
6.因为目前服务器上未配置user.email和name,无法进行commit

3.4:
1.进入tmux之后,要先进行source ~/.bashrc
2.使用/nfs/home/share/gem5_ci/checkpoints/coremark-riscv64-xs.bin跑修改前的结构可以保证ipc在2.218566,应该是正常情况。此时condmiss/condnum 约为4.194%
3.加入之后预测准确率狂掉,不太对啊朋友
4.目前加入的计数器:
Scalar percepPredCorrect{};
Scalar percepPredWrong{};
Scalar usePercepPred{};
Scalar percepHighConf{};
5.目前何时使用感知机预测器结果部分的逻辑似乎有问题,目前完全覆盖掉了其他表项的逻辑。此外,有关如何计数部分的逻辑,仍然有待修改。

3.5:
1.提高update的门槛,即当预测错误或者正确但把握很低(低于阈值的一半)时更新,可以提高正确率,但会降低percep被使用的次数。
2.可能需要自己找时间阅读原版论文,ai阅读得出的结果存在一定问题,且不全面。另外,需要找一些相关优化的论文。
3.目前优化方向:增加gbhr长度,但这样必定会增加硬件负荷;尝试用pc hash取代单纯地pc低d位进行索引;尝试动态阈值
4.针对mcf,启用percep《4096 8 23 200》
system.cpu.ipc 0.421617 # IPC: Instructions Per Cycle ((Count/Cycle))
system.cpu.branchPred.mgsc.percepPredCorrect 3333593 # number of perception prediction correct (Count)
system.cpu.branchPred.mgsc.percepPredWrong 3444180 # number of perception prediction wrong (Count)
system.cpu.branchPred.mgsc.usePercepPred 79845
启用percep前
system.cpu.ipc 0.418220
system.cpu.branchPred.mgsc.scPredCorrect 6291103 # number of sc prediction correct (Count)
system.cpu.branchPred.mgsc.scPredWrong 486670 # number of sc prediction wrong (Count)
5.在xs-gem5对应github 跑pr的脚本,可以找到对应路径,找到对应ci脚本。
应该是github线上跑的内容,用到了服务器上的某个脚本,根据对应脚本位置可以直接跑
6.异或问题?文献过于古早,可能要找更新的文献探索相关的逻辑
7.ai 工具 cli,copilet
8.从源头理解感知机预测器,本质是一个学习率为1(?)的单层感知机,对于异或这种无法线性分割的问题难以处理。而无法断言(或者基本可以否定)
分支预测是一个无法线性分割的问题,单一的进行线性分割无法完全适应整体的预测结果。单从论文来看,并没有说明异或问题,或者学习率相关内容
优化猜想:
修改学习率?把weight变成浮点数据?感觉很荒谬,考虑硬件更荒谬
可变阈值?可能会导致perception更容易参与最终决策,但如果能把perception的正确率提高,感觉也有戏。
pc hash?类似采用foldhist逻辑,为了避免不同pc索引到相同权重表项,交叉感染。感觉有戏
多层感知机?找的论文中有一篇描述多层感知机预测器,
或许想办法搞到内部调试数据,分析对应的分支模式
能不能单一统计不同pc的分支预测准确率?看看在哪些分支预测上存在优势?

3.6 No newline at end of file
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Remove this file from the commit - contains sensitive internal information.

This work diary should not be committed to the repository:

  1. Security concern: Contains internal proxy IP address (172.38.11.182:10808) that exposes internal infrastructure
  2. Inappropriate content: Personal work notes, debugging steps, and experimental observations belong in personal documentation, not version control
  3. Internal paths: References internal server paths (/nfs/home/share/gem5_ci/checkpoints/...)

Add workrecord to .gitignore and remove it from this commit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@workrecord` around lines 1 - 99, The file workrecord contains
sensitive/internal notes and must be removed from the commit: delete it from the
index (e.g., git rm --cached workrecord) so it is not tracked, add a line
"workrecord" to .gitignore to prevent re-adding, and create a follow-up commit
(or amend the current commit if not yet pushed) that removes the file from the
repository; reference the filenames workrecord and .gitignore when making these
changes.

@XiangShanRobot
Copy link

[Generated by GEM5 Performance Robot]
commit: 5164f71
workflow: gem5 Ideal BTB Performance Test

Ideal BTB Performance

Overall Score

PR Master Diff(%)
Score 20.37 20.36 +0.04 🟢

@XiangShanRobot
Copy link

[Generated by GEM5 Performance Robot]
commit: 5164f71
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Master Diff(%)
Score 20.37 20.36 +0.05 🟢

Change-Id: I52a7f8a72616da1e6efa189dccdb911cce007b2f
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
.gitignore (1)

73-75: Reasonable gitignore additions.

The new patterns for NEMU and workrecord.txt are sensible—NEMU is likely a test/emulator artifact and workrecord.txt appears to be a personal tracking file. Both follow existing repository-wide pattern conventions.

Note: These changes appear unrelated to the perceptron prediction feature. While not problematic, consider grouping infrastructure/tooling updates separately from functional changes in future PRs for clearer change tracking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.gitignore around lines 73 - 75, The .gitignore additions for the patterns
"NEMU" and "workrecord.txt" are fine but belong to infrastructure/tooling
changes rather than the perceptron prediction feature; please move these two
entries into a separate commit or PR (or at minimum a separate logical commit in
this branch) so feature changes stay focused—identify the entries "NEMU" and
"workrecord.txt" in the .gitignore and extract them into their own commit/PR
with a descriptive message like "chore: add gitignore entries for NEMU and
workrecord.txt".
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.gitignore:
- Around line 73-75: The .gitignore additions for the patterns "NEMU" and
"workrecord.txt" are fine but belong to infrastructure/tooling changes rather
than the perceptron prediction feature; please move these two entries into a
separate commit or PR (or at minimum a separate logical commit in this branch)
so feature changes stay focused—identify the entries "NEMU" and "workrecord.txt"
in the .gitignore and extract them into their own commit/PR with a descriptive
message like "chore: add gitignore entries for NEMU and workrecord.txt".

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b2cfdc7b-5cb5-4f84-b405-a0404787e035

📥 Commits

Reviewing files that changed from the base of the PR and between 5164f71 and 7e1c74d.

📒 Files selected for processing (2)
  • .gitignore
  • configs/example/kmhv3.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • configs/example/kmhv3.py

@XiangShanRobot
Copy link

[Generated by GEM5 Performance Robot]
commit: 7e1c74d
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Master Diff(%)
Score 20.37 20.42 -0.28 🔴

[Generated by GEM5 Performance Robot]
commit: 7e1c74d
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Previous Commit Diff(%)
Score 20.37 20.37 -0.02 🔴

Change-Id: I5dd678ee569f7a8fc0b7cf0a5c410a8383b3e11c
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cpu/pred/btb/btb_mgsc.cc (1)

131-183: ⚠️ Potential issue | 🔴 Critical

Initialize the new perceptron parameters in the UNIT_TEST constructor.

initStorage() now consumes percepTableEntryNum, gbhrLen, percepTableWidth, and percepThres, but this constructor never initializes them before calling initStorage(). That leaves UNIT_TEST builds using indeterminate values in allocation and resize paths.

🐛 Suggested change
       biasTableNum(1),
       biasTableIdxWidth(5),
+      percepTableEntryNum(8),
+      gbhrLen(8),
+      percepTableWidth(8),
+      percepThres(8),
       scCountersWidth(6),
       thresholdTablelogSize(4),
       updateThresholdWidth(12),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_mgsc.cc` around lines 131 - 183, The UNIT_TEST BTBMGSC
constructor calls initStorage() before initializing the new perceptron-related
fields, causing indeterminate values; set sensible defaults for
percepTableEntryNum, gbhrLen, percepTableWidth, and percepThres in the
BTBMGSC::BTBMGSC initializer list (alongside existing fields like pTableNum,
pTableIdxWidth, etc.) before calling initStorage(), so initStorage() receives
valid values for allocation/resizing.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/cpu/pred/btb/btb_mgsc.cc`:
- Around line 742-757: The perceptron-related stats are being updated even when
the perceptron feature is disabled or no perceptron prediction exists; wrap the
perceptron logic so it only runs when the global flag (enablePerceptionPred) is
true and the current prediction produced/used perceptron state (e.g.,
pred.use_percep or a percep_valid field). Specifically, only compute
percep_pred_taken and update mgscStats.percepHighConf,
mgscStats.percepHighConfCorrect, percep_conf_high_correct (and later
percepPredCorrect/percepPredWrong and mgscStats.usePercepPred) when
enablePerceptionPred && pred.use_percep (or the appropriate per-pred flag) is
true; leave non-perceptron paths unchanged. Ensure the same gating is applied in
the other occurrences you noted (the other blocks around the perceptron-stat
updates).
- Around line 672-689: updatePercepTable currently reads the live member gbhr
when applying perceptron updates, which can differ from the GBHR used at
prediction time; change updatePercepTable to accept and use the snapshot GBHR
stored in the MgscMeta (the vector saved by putPCHistory) instead of accessing
the mutable member gbhr so the feature vector used to compute percep_sum matches
the one used to update weights (also apply the same fix for the similar code at
the other occurrence around the 930-931 region); keep references to percep_sum,
percep_taken, actual_taken, weightTable, updateCounter and ensure the function
signature and callers (e.g., where MgscMeta is available) pass the MgscMeta
snapshot GBHR into updatePercepTable.
- Around line 867-869: The update path currently computes sc_pred_taken from
total_sum >= 0 even when pred.use_percep is true; change the logic so that when
pred.use_percep is set the final single-chooser decision uses percep_sum (i.e.,
set sc_pred_taken = percep_taken) rather than total_sum, so that threshold
updates, retraining and stats (where sc_pred_taken is used) reflect the actual
decision produced by generateSinglePrediction(); update any branches that
reference sc_pred_taken to honor pred.use_percep and use percep_taken when true.
- Around line 112-125: initStorage() currently always allocates
percepWeightTable and gbhr even when enablePerceptionPred is false; guard the
allocation by checking enablePerceptionPred before calling allocWeightTable and
resizing gbhr (use enablePerceptionPred to skip
allocWeightTable(percepWeightTable, percepTableEntryNum) and avoid
gbhr.resize(gbhrLen,1)), and ensure percepWeightTable and percepWeightTableSize
default to empty/0 when disabled; reference allocWeightTable, percepWeightTable,
percepWeightTableSize, gbhr, gbhrLen, percepTableEntryNum, and
enablePerceptionPred in the change.

---

Outside diff comments:
In `@src/cpu/pred/btb/btb_mgsc.cc`:
- Around line 131-183: The UNIT_TEST BTBMGSC constructor calls initStorage()
before initializing the new perceptron-related fields, causing indeterminate
values; set sensible defaults for percepTableEntryNum, gbhrLen,
percepTableWidth, and percepThres in the BTBMGSC::BTBMGSC initializer list
(alongside existing fields like pTableNum, pTableIdxWidth, etc.) before calling
initStorage(), so initStorage() receives valid values for allocation/resizing.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bf56c1cf-fa0f-4862-8893-14ac31cb7db6

📥 Commits

Reviewing files that changed from the base of the PR and between 7e1c74d and 03a071e.

📒 Files selected for processing (3)
  • configs/example/kmhv3.py
  • src/cpu/pred/btb/btb_mgsc.cc
  • src/cpu/pred/btb/btb_mgsc.hh

Comment on lines +112 to +125
//perception weightTable init
auto allocWeightTable = [&](std::vector<std::vector<int16_t>> &table,unsigned numEntries) -> uint64_t {
assert(isPowerOf2(numEntries));
// if (!isPowerOf2(numEntries)) {
// panic("numEntries (%d) is not a power of 2!", numEntries);
// }
table.resize(numEntries);
for (unsigned int i = 0; i<numEntries;++i){
table[i].resize(gbhrLen+1,0);
}
return gbhrLen+1;
};
auto percepWeightTableSize = allocWeightTable(percepWeightTable,percepTableEntryNum);
gbhr.resize(gbhrLen,1);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Guard perceptron storage behind enablePerceptionPred.

initStorage() always allocates percepWeightTable and gbhr. In the default example config, configs/example/kmhv3.py Lines 112-116 leave enablePerceptionPred = False but still set percepTableEntryNum = 1024 and gbhrLen = 4096, so every MGSC instance pays the full memory/init cost for a disabled feature.

♻️ Suggested change
-    auto percepWeightTableSize = allocWeightTable(percepWeightTable,percepTableEntryNum);
-    gbhr.resize(gbhrLen,1);
+    if (enablePerceptionPred) {
+        allocWeightTable(percepWeightTable, percepTableEntryNum);
+        gbhr.resize(gbhrLen, true);
+    } else {
+        percepWeightTable.clear();
+        gbhr.clear();
+    }
🧰 Tools
🪛 Cppcheck (2.19.0)

[error] 119-119: Reference to temporary returned.

(returnTempReference)


[error] 125-125: Shifting 64-bit value by 64 bits is undefined behaviour

(shiftTooManyBits)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_mgsc.cc` around lines 112 - 125, initStorage() currently
always allocates percepWeightTable and gbhr even when enablePerceptionPred is
false; guard the allocation by checking enablePerceptionPred before calling
allocWeightTable and resizing gbhr (use enablePerceptionPred to skip
allocWeightTable(percepWeightTable, percepTableEntryNum) and avoid
gbhr.resize(gbhrLen,1)), and ensure percepWeightTable and percepWeightTableSize
default to empty/0 when disabled; reference allocWeightTable, percepWeightTable,
percepWeightTableSize, gbhr, gbhrLen, percepTableEntryNum, and
enablePerceptionPred in the change.

@XiangShanRobot
Copy link

[Generated by GEM5 Performance Robot]
commit: 03a071e
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Master Diff(%)
Score 20.38 20.42 -0.24 🔴

[Generated by GEM5 Performance Robot]
commit: 03a071e
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Previous Commit Diff(%)
Score 20.38 20.37 +0.04 🟢

Change-Id: Ibdf67bb5e7d069e015743a928b4314a1f4ad1fcf
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
configs/example/kmhv3.py (1)

118-122: Consider whether enableGTable should also be set.

The perception prediction code in btb_mgsc.cc (context snippet 3, line ~432) evaluates enableGTable alongside the other table enables being configured here (BwTable, LTable, ITable, PTable, BiasTable). If not explicitly set, it will use the default value.

For configuration completeness and clarity, consider adding cpu.branchPred.mgsc.enableGTable = True/False to make the intent explicit.

Proposed addition for completeness
             cpu.branchPred.mgsc.enableBwTable = True
             cpu.branchPred.mgsc.enableLTable = True
             cpu.branchPred.mgsc.enableITable = True
             cpu.branchPred.mgsc.enablePTable = True
             cpu.branchPred.mgsc.enableBiasTable = True
+            cpu.branchPred.mgsc.enableGTable = False  # or True, depending on intent
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@configs/example/kmhv3.py` around lines 118 - 122, The MGSC branch predictor
config currently sets cpu.branchPred.mgsc.enableBwTable, enableLTable,
enableITable, enablePTable, and enableBiasTable but omits
cpu.branchPred.mgsc.enableGTable; add an explicit
cpu.branchPred.mgsc.enableGTable = True or = False to the same config block to
make intent explicit and match the checks in btb_mgsc.cc (which reads
enableGTable alongside the other enables), ensuring configuration completeness
and avoiding reliance on defaults.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@configs/example/kmhv3.py`:
- Around line 118-122: The MGSC branch predictor config currently sets
cpu.branchPred.mgsc.enableBwTable, enableLTable, enableITable, enablePTable, and
enableBiasTable but omits cpu.branchPred.mgsc.enableGTable; add an explicit
cpu.branchPred.mgsc.enableGTable = True or = False to the same config block to
make intent explicit and match the checks in btb_mgsc.cc (which reads
enableGTable alongside the other enables), ensuring configuration completeness
and avoiding reliance on defaults.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 931c0471-c3eb-4bc6-b1dd-f36aaa3adf65

📥 Commits

Reviewing files that changed from the base of the PR and between 03a071e and 15528be.

📒 Files selected for processing (1)
  • configs/example/kmhv3.py

@XiangShanRobot
Copy link

[Generated by GEM5 Performance Robot]
commit: 15528be
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Master Diff(%)
Score 20.37 20.42 -0.24 🔴

[Generated by GEM5 Performance Robot]
commit: 15528be
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Previous Commit Diff(%)
Score 20.37 20.38 -0.01 🔴

Change-Id: Ibb636a66fa2b889b6ba1f1678363e62cb80ac13c
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/cpu/pred/btb/btb_mgsc.cc (1)

1549-1565: ⚠️ Potential issue | 🟡 Minor

Reconsider perceptron stat updates when prediction missed.

When pred_hit is false (lines 1549-1566), the code updates percepPredWrong or percepPredCorrect based on actual_taken even though no perceptron prediction was made. This conflates "prediction miss" with "perceptron accuracy" and may skew metrics.

Consider only updating perceptron accuracy stats when a valid perceptron prediction existed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_mgsc.cc` around lines 1549 - 1565, The perceptron
accuracy counters (mgscStats.percepPredWrong and mgscStats.percepPredCorrect)
are being updated even when no perceptron prediction was made (the pred_hit ==
false branch); change the logic so those increments only happen when a
perceptron prediction actually existed—i.e., wrap the updates to
percepPredWrong/percepPredCorrect in a guard that checks the
perceptron-prediction flag (use pred_hit if that denotes perceptron prediction
or replace with the actual percep-valid flag if present), leaving the sc and
tage counters unchanged.
🧹 Nitpick comments (2)
src/cpu/pred/btb/btb_mgsc.cc (2)

1265-1276: Consider using a more efficient data structure for GBHR.

Using pop_back() + insert(begin()) on a std::vector is O(n) per call. With gbhrLen defaulting to 1024, this could become a performance bottleneck since it's called on every speculative update.

Consider using std::deque (O(1) push_front/pop_back) or a circular buffer with head/tail pointers.

♻️ Alternative: circular buffer approach
// In header: add member
// int gbhrHead = 0;  // points to oldest entry

void BTBMGSC::specUpdateGBHR(FullBTBPrediction &pred)
{
    int shamt;
    bool cond_taken;
    std::tie(shamt, cond_taken) = pred.getHistInfo();
    if (!gbhr.empty()) {
        // Move head back (wrapping) and overwrite oldest with newest
        gbhrHead = (gbhrHead - 1 + gbhrLen) % gbhrLen;
        gbhr[gbhrHead] = cond_taken;
    }
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_mgsc.cc` around lines 1265 - 1276, The current
BTBMGSC::specUpdateGBHR uses gbhr.pop_back() + gbhr.insert(gbhr.begin(), ...) on
a std::vector which is O(n) per call; change gbhr to a std::deque (use
push_front/pop_back) or implement a circular buffer using an int gbhrHead (e.g.,
gbhrHead points at oldest entry, decrement/wrap and overwrite gbhr[gbhrHead]
with cond_taken) to make updates O(1); update the BTBMGSC::specUpdateGBHR
implementation to use the chosen structure and adjust any other code that
indexes gbhr (reads/writes or gbhrLen usage) to account for the deque or
head-based circular indexing.

1414-1426: Simplify GBHR recovery and address O(n) shift.

The element-by-element copy can be simplified to a direct assignment, and the same O(n) shift concern applies here.

♻️ Proposed simplification
 void
 BTBMGSC::recoverGBHR(const FetchTarget &entry, int shamt, bool cond_taken){
     if (!isEnabled()) {
         return;  // No recover when disabled
     }
     std::shared_ptr<MgscMeta> predMeta = std::static_pointer_cast<MgscMeta>(entry.predMetas[getComponentIdx()]);
-    for (int i=0;i<gbhrLen;i++)
-        gbhr[i] = predMeta->gbhr[i];
+    gbhr = predMeta->gbhr;  // Direct vector assignment
     if (!gbhr.empty()) {
         gbhr.pop_back();
         gbhr.insert(gbhr.begin(), cond_taken);
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/cpu/pred/btb/btb_mgsc.cc` around lines 1414 - 1426, Replace the manual
element-wise copy and the pop_back/insert front (which causes a potential
reallocation and extra work) in BTBMGSC::recoverGBHR: assign the vector directly
with gbhr = predMeta->gbhr (instead of the for loop), then shift the contents
right in-place with std::copy_backward (e.g. std::copy_backward(gbhr.begin(),
gbhr.end()-1, gbhr.end()); gbhr[0] = cond_taken;) to avoid extra allocations
from insert and make the rotation intent explicit; keep the early isEnabled()
return and the use of getComponentIdx(), gbhrLen, predMeta->gbhr and cond_taken
as-is.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/cpu/pred/btb/btb_mgsc.cc`:
- Around line 1549-1565: The perceptron accuracy counters
(mgscStats.percepPredWrong and mgscStats.percepPredCorrect) are being updated
even when no perceptron prediction was made (the pred_hit == false branch);
change the logic so those increments only happen when a perceptron prediction
actually existed—i.e., wrap the updates to percepPredWrong/percepPredCorrect in
a guard that checks the perceptron-prediction flag (use pred_hit if that denotes
perceptron prediction or replace with the actual percep-valid flag if present),
leaving the sc and tage counters unchanged.

---

Nitpick comments:
In `@src/cpu/pred/btb/btb_mgsc.cc`:
- Around line 1265-1276: The current BTBMGSC::specUpdateGBHR uses
gbhr.pop_back() + gbhr.insert(gbhr.begin(), ...) on a std::vector which is O(n)
per call; change gbhr to a std::deque (use push_front/pop_back) or implement a
circular buffer using an int gbhrHead (e.g., gbhrHead points at oldest entry,
decrement/wrap and overwrite gbhr[gbhrHead] with cond_taken) to make updates
O(1); update the BTBMGSC::specUpdateGBHR implementation to use the chosen
structure and adjust any other code that indexes gbhr (reads/writes or gbhrLen
usage) to account for the deque or head-based circular indexing.
- Around line 1414-1426: Replace the manual element-wise copy and the
pop_back/insert front (which causes a potential reallocation and extra work) in
BTBMGSC::recoverGBHR: assign the vector directly with gbhr = predMeta->gbhr
(instead of the for loop), then shift the contents right in-place with
std::copy_backward (e.g. std::copy_backward(gbhr.begin(), gbhr.end()-1,
gbhr.end()); gbhr[0] = cond_taken;) to avoid extra allocations from insert and
make the rotation intent explicit; keep the early isEnabled() return and the use
of getComponentIdx(), gbhrLen, predMeta->gbhr and cond_taken as-is.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 35064d9f-ca90-4605-a7df-7c47657e1f3e

📥 Commits

Reviewing files that changed from the base of the PR and between 15528be and 2e4b8cf.

📒 Files selected for processing (2)
  • src/cpu/pred/BranchPredictor.py
  • src/cpu/pred/btb/btb_mgsc.cc

Change-Id: Ic7a05b96e56fa90ad5730915480776b0e6c7f171
@XiangShanRobot
Copy link

[Generated by GEM5 Performance Robot]
commit: ea4f22b
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Master Diff(%)
Score 20.37 20.63 -1.27 🔴

[Generated by GEM5 Performance Robot]
commit: ea4f22b
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Previous Commit Diff(%)
Score 20.37 20.35 +0.08 🟢

Change-Id: I4dff39690de5f4c97ddc478cc6a2c4ba78f73c1c
Change-Id: I656083f01a7c76daaa8ff319e8f6f19ab404177b
@XiangShanRobot
Copy link

[Generated by GEM5 Performance Robot]
commit: 6d36aef
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Master Diff(%)
Score 20.38 20.63 -1.23 🔴

[Generated by GEM5 Performance Robot]
commit: 6d36aef
workflow: On-Demand SPEC Test (Tier 1.5)

Ideal BTB Performance

Overall Score

PR Previous Commit Diff(%)
Score 20.38 20.37 +0.01 🟢

Change-Id: Iabf4f6f465f20abe8009dd275a6c95b63f7f4860
Change-Id: I9ee2cf86648c3a250b42bfec3413e50a9220fdf9
Change-Id: I20793693aa748df7850573c7849b372b8abe6f98
Change-Id: Ib067dedd32b1201f9009fb94020b7ee38360b7ec
Change-Id: Id7a3ebe55794c56f8a0d8f6c67d6be3325375d93
Change-Id: I95130ddfe9ff742537b6ec5e0a4633bda5d49dd1
Change-Id: I2d7b30df5ed818451f018e07e725d1969c421690
Change-Id: I86c3f7b617cd2c2c03dcc018e86b473e2f30b20b
Change-Id: I2abde69bb0b553eb8fd6ac063f96dee0e0af7849
Change-Id: I214aaa529b774cc2665d7541a7e57f8ee6b1ba28
Change-Id: I7d7221f94a52534aee237a4e619e12eeef561647
Change-Id: I59014dfab1f9abcc50bc068b545189a8cd6d4203
Change-Id: I41e8279bfda51b020057471777ccdd854824a350
Change-Id: Ic425e9c82989ec40db83f44a80dea0ecd809edc4
Change-Id: I8169dea27f9df81084650085a88b77dba70a69fc
Change-Id: I63e3e2935247d9528e487a811bea72fc0413c5cc
@ProgramGP ProgramGP force-pushed the bpu-add-percep-perf branch from 482fa2c to bd047bb Compare March 18, 2026 08:22
Change-Id: If7b1bd460d58bb6ad56c287c026039ff0d3df797
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants