Skip to content

sjo99-kr/CVA6-branch-predictor-extension

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

7,482 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🚀 CVA6 Branch Predictor Extension (GShare & TAGE)

This repository implements an enhanced Branch Prediction Unit for the CVA6 (formerly Ariane) open-source RISC-V core.

The project extends the CVA6 frontend architecture by integrating high-performance GShare and TAGE (Tagged Geometric History Length) branch predictors.

The predictors are evaluated using both microbenchmarks and application benchmarks to analyze their impact on branch prediction accuracy and overall CPU performance.

The goal is to improve processor performance by increasing IPC (Instructions Per Cycle) and reducing branch misprediction penalties.


🧩 Overview

This project extends the branch prediction unit of the CVA6 open-source RISC-V processor by implementing GShare and TAGE branch predictors.

The implementation focuses primarily on modifications to the frontend stage of the processor pipeline.

For detailed architectural specifications of the TAGE branch predictor, refer to the README.md in the core directory.

For details about the original architecture, refer to the official CVA6 repository.

Original CVA6 repository:
https://github.com/openhwgroup/cva6


⚙️ Modified Components

The branch predictor implementation required modifications to several frontend and core pipeline components:

  • core/cva6.sv
  • core/branch_unit.sv
  • core/include/ariane_pkg.sv
  • core/include/build_config_pkg.sv
  • core/include/config_pkg.sv
  • core/frontend/frontend.sv
  • core/frontend/instr_queue.sv
  • core/frontend/GShareTable.sv
  • core/frontend/TageBaseTable.sv
  • core/frontend/TageTable.sv

🔧 Branch Predictor Configuration

Branch predictor parameters can be configured by modifying the following files:

  • core/include/cv32a60x_config_pkg.sv
  • core/include/cv32a65x_config_pkg.sv

Currently, the implementation supports the cv32a60x and cv32a65x configurations.

Support for additional CVA6 configurations will be extended in future work.


📊 Evaluation

🧮 Hardware Cost Comparison

Branch Predictor Num. of Tables Num. of Entries Bits per Entry Total Cost (Bytes)
Baseline (BHT) 1 8192 2-bit counter + 1-bit valid 3,072 B
GShare 1 8192 2-bit counter + 1-bit valid 3,072 B
TAGE 4 64 / 256 / 512 / 1024 2-bit counter/useful + 1-bit valid + tag (0/4/6/8 bit) 2,696 B

🧪 Benchmark Methodology

Two types of benchmarks are used for evaluation.

Microbenchmarks

Synthetic workloads designed to stress specific branch behavior patterns.

  • Aliasing
    Evaluates how different predictors handle destructive aliasing caused by multiple branches mapping to the same prediction entries.

  • Alternating Pattern
    Tests the ability of predictors to learn regular alternating branch patterns (taken/not-taken).

  • Correlated Branch
    Measures how well predictors capture inter-branch correlations using global history information.

  • Correlated Periodic
    Evaluates the effectiveness of predictors in recognizing long periodic branch patterns generated from multiple correlated conditions.


Application Benchmarks

Real workloads with control-flow intensive behavior.

  • Binary Search
    Evaluates prediction accuracy for nested conditional branches commonly found in search algorithms.

  • N-Queens
    Generates complex branch behavior due to recursive backtracking and deep control-flow exploration.

  • QuickSort
    Tests predictor performance on recursive divide-and-conquer algorithms with data-dependent branching.


📈 Performance Results

⚡Normalized IPC

image

The figure shows the normalized IPC improvement of different branch predictors across microbenchmarks and application benchmarks.

GShare and TAGE significantly improve performance compared to the baseline BHT, particularly for structured branch patterns such as alternating and correlated branches.

The effectiveness of TAGE can vary depending on the hashing scheme and the number of tag/index bits, which influence the level of aliasing in prediction tables.


📉Branch Miss Rate

image

The figure shows the branch miss rate across different predictors.

GShare and TAGE significantly outperform the baseline BHT for structured branch patterns, demonstrating the benefits of history-based branch prediction mechanisms.

The performance gap varies across applications due to differences in branch patterns and aliasing behavior within the predictor tables.


💾 Hardware-Cost-Normalized MPKI

To evaluate prediction efficiency relative to hardware overhead, we introduce a hardware-cost-normalized MPKI metric.

Cost-Normalized MPKI = (Branch Misses / Total Instructions) × 1000 / Hardware Cost (Bytes)

This metric highlights the performance of branch predictors relative to its hardware cost.

Branch Predictor Aliasing Alternating Correlated Branch Correlated Periodic Binary Search N-Queens QuickSort
Baseline (BHT) 15.93 14.14 11.41 10.60 5.79 3.43 0.58
GShare 3.15 0.0028 0.0081 2.57 5.03 3.59 0.24
TAGE 11.48 0.0032 0.00074 0.5106 6.35 4.37 0.41

▶️ How to Run Benchmarks

To run the branch prediction benchmarks, execute the following commands:

export DV_SIMULATORS=veri-testharness
cd verif/sim
./TageBenchmarks.sh

About

Implementation of TAGE and GShare branch predictors for the CVA6 (Ariane) open-source RISC-V core.

Topics

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • Assembly 55.0%
  • SystemVerilog 26.1%
  • Tcl 7.6%
  • C 3.8%
  • Python 2.8%
  • Shell 1.2%
  • Other 3.5%