Commit 56f1078

Wonicon authored and Ivyfeather committed
Bump coupledL2 upstream at Dec 19, 2025
Merge coupledL2 commits up to 95e8de (2025-12-01), which is tracked by XiangShan commit c06233 (2025-12-08).

Squashed commit of the following:

commit 95e8de9
Author: lwd <liuweiding10@outlook.com>
Date:   Mon Dec 1 11:44:38 2025 +0800

    fix(AsyncBridge): revert (OpenXiangShan#433) (OpenXiangShan#446)

    Revert "fix(AsyncBridge): add l-credit manager in AsyncBridge to fix performance (OpenXiangShan#433)"

    This reverts commit 4fc8217.

commit cfc9235
Author: zhanglinjuan <zhanglinjuan16@mails.ucas.ac.cn>
Date:   Tue Nov 25 16:54:40 2025 +0800

    Timing(TL2CHICoupledL2): remove ICG of 'pCrdGrantType' and 'pCrdGrantSrcID' (OpenXiangShan#444)

    Co-authored-by: Zhu Yu <yulightenyu@gmail.com>

commit 7ea02e3
Author: Yanqin Li <maxpicca@qq.com>
Date:   Tue Nov 25 15:03:42 2025 +0800

    time(bop): fix long path due to clock gating (OpenXiangShan#442)

commit a9990de
Author: Ma-YX <71326427+Ma-YX@users.noreply.github.com>
Date:   Mon Nov 24 15:11:48 2025 +0800

    fix(MainPipe, MSHR, RXDAT): not report DataCheck error to BEU (OpenXiangShan#443)

    When a Data Check error is detected in RXDAT, coupledL2 reports corrupt = 1 (TL) to L1 $,
    but does not report an L2 Error to BEU.
    Prioritize triggering synchronous exceptions over asynchronous ones.

commit d00d32f
Author: yanyiming <139243183+Yan-Yiming@users.noreply.github.com>
Date:   Mon Nov 10 22:21:19 2025 +0800

    perf: add perfevent (OpenXiangShan#441)

    add perfevent for last level cache event:
      l2_cache_prefetch_{access, miss}
      l2_cache_{access, miss}_{rd, wr}
    add perfevent for cache hit/miss in L2:
      l2_cache_hit, l2_cache_miss

commit 4fc8217
Author: yulightenyu <145419941+yulightenyu@users.noreply.github.com>
Date:   Fri Nov 7 17:23:32 2025 +0800

    fix(AsyncBridge): add l-credit manager in AsyncBridge to fix performance (OpenXiangShan#433)

    - For rx channel, add l-credit manager module to generate 'lcrdv' right after rx flit received
    - For tx channel, add l-credit manager module to generate 'ready' to block tx flit, using
      l-credit number more than Maximum tx l-credit in CoupledL2 to cover lcrdv sync delay

commit f4b7db3
Author: Ding Haonan <kumonda@kucro3.org>
Date:   Fri Nov 7 17:21:51 2025 +0800

    fix(MainPipe, MSHR): decode SnpPreferUnique as SnpUnique (OpenXiangShan#438)

commit c1022d9
Author: Ding Haonan <kumonda@kucro3.org>
Date:   Fri Nov 7 17:18:16 2025 +0800

    timing(LinkLayer, RXSNP): simplified CMO on RXSNP and pipelined RXRSP, RXDAT (OpenXiangShan#436)

    * timing(RXSNP): CMO blocks RXSNP regardless of PA for better timing

    ### 1. **In the past**:
    - RXSNP (with same address) was blocked for CMO when:
      - In progress of Probing L1
      - Ready to send ```WriteBackFull```/```WriteCleanFull```/```Evict```, but not sent yet (more specifically, went through S3 of MainPipe)
      - In progress of compensational meta write (more specifically, changing ```TRUNK``` to ```TIP```)
    - **```blockRefill```** means: ready to send ```WriteBackFull```/```WriteCleanFull```/```Evict```, but not sent yet (more specifically, went through S3 of MainPipe)
    - **```w_releaseack```** means: ready to receive ```CompDBIDResp``` from downstream
    - **```s_cmometaw```** means compensational meta write, changing ```TRUNK``` to ```TIP``` in L2 and not initiating any ```WriteBackFull```/```WriteCleanFull```/```Evict``` (CoupledL2 now is not sending any ```Probe toT``` to L1, so no ```TRUNK``` state could be kept in L2 after any CMO)
    - By **```reqBlockSnpMask```**, blocking RXSNP (with same address) for CMO when:
      - Before ```WriteBackFull```/```WriteCleanFull```/```Evict``` sent, or after ```CompDBIDResp``` received, in progress of Probing L1
      - Before ```WriteBackFull```/```WriteCleanFull```/```Evict``` sent, or after ```CompDBIDResp``` received, in progress of compensational meta write (more specifically, changing ```TRUNK``` to ```TIP```)
    - By **```cmoBlockSnpMask```**, blocking RXSNP (with same address) for CMO when:
      - Ready to send ```WriteBackFull```/```WriteCleanFull```/```Evict```, but not sent yet (more specifically, went through S3 of MainPipe)
      - In progress of Probing L1
    - In **```reqBlockSnpMask```** and **```cmoBlockSnpMask```**:
      - Blockage by progress of Probing L1 was duplicated
      - Blockage by compensational meta write was duplicated
      - Blockage constraint (Before ```WriteBackFull```/```WriteCleanFull```/```Evict``` sent, After ```CompDBIDResp``` received) was unnecessary, since:
        - The first constraint (Before ```WriteBackFull```/```WriteCleanFull```/```Evict``` sent) was unnecessary since:
          - RXSNP must always be blocked in progress of Probing L1
          - The compensational meta write ```cmometaw``` activity is equal to the attempt to send ```WriteBackFull```/```WriteCleanFull```/```Evict``` in MSHR sequence (```cmometaw``` would only and immediately be activated after MSHR decided not to send any ```WriteBackFull```/```WriteCleanFull```/```Evict```)
        - The second constraint (After ```CompDBIDResp``` received) was kept by HN / L3

    ### 2. **After this PR**:
    - All duplicated blocking logic would be removed, and all blocking logic would be merged into **```cmoBlockSnpMask```**, blocking RXSNP for CMO when:
      - In progress of Probing L1
      - Ready to send ```WriteBackFull```/```WriteCleanFull```/```Evict```, but not sent yet (more specifically, went through S3 of MainPipe)
      - In progress of compensational meta write (more specifically, changing ```TRUNK``` to ```TIP```)
    - PA would be no longer taken into consideration for timing reason, but would still be dead-lock free, since:
      - Progress of Probing L1 is not depending on any downstream sequence
      - Progress of sending TXREQ is not depending on any downstream sequence

    * timing(TL2CHICoupledL2): add pipeline(depth=1, pipe=1) on rxdat/rxrsp channel

    ---------

    Co-authored-by: Zhu Yu <yulightenyu@gmail.com>

commit 826415a
Author: yanyiming <139243183+Yan-Yiming@users.noreply.github.com>
Date:   Fri Nov 7 17:09:38 2025 +0800

    perf: add perfevent (OpenXiangShan#437)

    add perfevent for last level cache event:
      l2_cache_prefetch_{access, miss}
      l2_cache_{access, miss}_{rd, wr}
    add perfevent for cache hit/miss in L2:
      l2_cache_hit, l2_cache_miss
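
The merged blocking condition described for commit c1022d9 can be sketched in plain Scala (no Chisel). This is only an illustrative model under stated assumptions: `CmoMshrState`, its field names, and `rxSnpBlocked` are hypothetical and do not appear in the upstream code; the three conditions are the ones listed under "After this PR".

```scala
// Hypothetical per-MSHR view of a CMO in flight; the field names are illustrative only.
final case class CmoMshrState(
  probingL1: Boolean,        // a Probe to L1 is still in progress
  writePending: Boolean,     // WriteBackFull/WriteCleanFull/Evict is ready but not yet sent
  metaWritePending: Boolean  // compensational meta write (TRUNK -> TIP) is in progress
)

object CmoSnoopBlock {
  // One merged condition; the physical address is deliberately not compared.
  def blocksRxSnp(s: CmoMshrState): Boolean =
    s.probingL1 || s.writePending || s.metaWritePending

  // RXSNP stalls while any CMO-handling MSHR reports a blocking condition.
  def rxSnpBlocked(cmoMshrs: Seq[CmoMshrState]): Boolean =
    cmoMshrs.exists(blocksRxSnp)
}
```

Dropping the address comparison means a snoop to an unrelated line can also be stalled. The commit message argues this stays deadlock-free because neither probing L1 nor sending TXREQ depends on any downstream sequence.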
1 parent 047334b commit 56f1078

30 files changed: +252 -261 lines changed

Makefile

Lines changed: 4 additions & 0 deletions
@@ -5,6 +5,8 @@ NUM_SLICE ?= 4
 WITH_CHISELDB ?= 1
 WITH_TLLOG ?= 1
 WITH_CHILOG ?= 1
+BY_ETIME ?= 1
+BY_VTIME ?= 0
 FPGA ?= 0
 
 init:
@@ -16,11 +18,13 @@ compile:
 
 CHI_PASS_ARGS = ISSUE=$(ISSUE) NUM_CORE=$(NUM_CORE) NUM_TL_UL=$(NUM_TL_UL) NUM_SLICE=$(NUM_SLICE) \
                 WITH_CHISELDB=$(WITH_CHISELDB) WITH_TLLOG=$(WITH_TLLOG) WITH_CHILOG=$(WITH_CHILOG) \
+                BY_ETIME=$(BY_ETIME) BY_VTIME=$(BY_VTIME) \
                 FPGA=$(FPGA)
 
 TOP = TestTop
 CHI_TOP_ARGS = --issue $(ISSUE) --core $(NUM_CORE) --tl-ul $(NUM_TL_UL) --bank $(NUM_SLICE) \
                --chiseldb $(WITH_CHISELDB) --tllog $(WITH_TLLOG) --chilog $(WITH_CHILOG) \
+               --etime $(BY_ETIME) --vtime $(BY_VTIME) \
                --fpga $(FPGA)
 BUILD_DIR = ./build
 TOP_V = $(BUILD_DIR)/$(TOP).sv

build.sc

Lines changed: 8 additions & 9 deletions
@@ -12,9 +12,8 @@ import $file.`rocket-chip`.hardfloat.build
 val defaultScalaVersion = "2.13.15"
 
 def defaultVersions = Map(
-  "chisel" -> ivy"org.chipsalliance::chisel:6.6.0",
-  "chisel-plugin" -> ivy"org.chipsalliance:::chisel-plugin:6.6.0",
-  "chiseltest" -> ivy"edu.berkeley.cs::chiseltest:6.0.0"
+  "chisel" -> ivy"org.chipsalliance::chisel:7.0.0",
+  "chisel-plugin" -> ivy"org.chipsalliance:::chisel-plugin:7.0.0"
 )
 
 trait HasChisel extends ScalaModule {
@@ -69,7 +68,11 @@ object utility extends SbtModule with HasChisel {
   override def millSourcePath = os.pwd / "utility"
 
   override def moduleDeps = super.moduleDeps ++ Seq(rocketchip)
-}
+
+  override def ivyDeps = super.ivyDeps() ++ Agg(
+    ivy"com.lihaoyi::sourcecode:0.4.4",
+  )
+}
 
 object huancun extends SbtModule with HasChisel {
   override def millSourcePath = os.pwd / "HuanCun"
@@ -88,11 +91,7 @@ object CoupledL2 extends SbtModule with HasChisel with millbuild.common.CoupledL
 
   def huancunModule: ScalaModule = huancun
 
-  object test extends SbtModuleTests with TestModule.ScalaTest {
-    override def ivyDeps = super.ivyDeps() ++ Agg(
-      defaultVersions("chiseltest"),
-    )
-  }
+  object test extends SbtModuleTests with TestModule.ScalaTest
 
   override def scalacOptions = super.scalacOptions() ++ Agg("-deprecation", "-feature")

src/main/scala/coupledL2/BaseSlice.scala

Lines changed: 2 additions & 2 deletions
@@ -22,7 +22,7 @@ import chisel3.util._
 import org.chipsalliance.cde.config.Parameters
 import freechips.rocketchip.tilelink.TLBundle
 import utility._
-import coupledL2.prefetch.PrefetchIO
+import coupledL2.prefetch._
 
 trait BaseOuterBundle
 
@@ -36,7 +36,7 @@ abstract class BaseSliceIO[T_OUT <: BaseOuterBundle](implicit p: Parameters) ext
   val prefetch = prefetchOpt.map(_ => Flipped(new PrefetchIO))
   // val msStatus = topDownOpt.map(_ => Vec(mshrsAll, ValidIO(new MSHRStatus)))
   val dirResult = topDownOpt.map(_ => ValidIO(new DirResult))
-  val latePF = topDownOpt.map(_ => Output(Bool()))
+  val latePF = topDownOpt.map(_ => ValidIO(UInt(PfSource.pfSourceBits.W)))
   val error = DecoupledIO(new L2CacheErrorInfo())
   val l2Miss = Output(Bool())
   val l2Flush = Option.when(cacheParams.enableL2Flush) (Input(Bool()))

src/main/scala/coupledL2/Common.scala

Lines changed: 2 additions & 3 deletions
@@ -185,11 +185,10 @@ class TaskBundle(implicit p: Parameters) extends L2Bundle
   val allowRetry = chiOpt.map(_ => Bool())
   val memAttr = chiOpt.map(_ => new MemAttr)
   val traceTag = chiOpt.map(_ => Bool())
-  val dataCheckErr = chiOpt.map(_ => Bool())
 
   def toCHIREQBundle(): CHIREQ = {
     val req = WireInit(0.U.asTypeOf(new CHIREQ()))
-    req.qos := Fill(QOS_WIDTH, 1.U(1.W)) // TODO
+    req.qos := Fill(QOS_WIDTH, 1.U(1.W)) - 1.U // TODO
     req.tgtID := tgtID.getOrElse(0.U)
     req.srcID := srcID.getOrElse(0.U)
     req.txnID := txnID.getOrElse(0.U)
@@ -294,7 +293,6 @@ class RespInfoBundle(implicit p: Parameters) extends L2Bundle
   val pCrdType = chiOpt.map(_ => UInt(PCRDTYPE_WIDTH.W))
   val respErr = chiOpt.map(_ => UInt(RESPERR_WIDTH.W))
   val traceTag = chiOpt.map(_ => Bool())
-  val dataCheckErr = chiOpt.map(_ => Bool())
 }
 
 class RespBundle(implicit p: Parameters) extends L2Bundle {
@@ -387,6 +385,7 @@ class PrefetchCtrlFromCore extends Bundle {
   val l2_pbop_en = Bool()
   val l2_vbop_en = Bool()
   val l2_tp_en = Bool()
+  val l2_pf_delay_latency = UInt(10.W)
 }
 
 class PrefetchRecv extends Bundle {
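
The `req.qos` change in `toCHIREQBundle` is easier to read as plain arithmetic. The sketch below is not Chisel and is not taken from the repository; it assumes `QOS_WIDTH` is 4, the usual width of the CHI QoS field, which this diff does not show.

```scala
object QosValue {
  // Assumption: QOS_WIDTH = 4; the constant's definition is not part of this diff.
  val QosWidth = 4

  // Fill(QOS_WIDTH, 1.U(1.W)) builds an all-ones value that is QOS_WIDTH bits wide.
  def allOnes(width: Int): Int = (1 << width) - 1

  val before: Int = allOnes(QosWidth)     // old assignment: the maximum QoS value
  val after: Int = allOnes(QosWidth) - 1  // new assignment: one below the maximum
}
```

Under that assumption, requests leave with QoS 14 instead of 15. The `// TODO` kept on the line suggests the value is still provisional.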

src/main/scala/coupledL2/CoupledL2.scala

Lines changed: 4 additions & 3 deletions
@@ -40,6 +40,7 @@ trait HasCoupledL2Parameters {
   // val tl2tlParams: HasTLL2Parameters = p(L2ParamKey)
   def enableCHI = p(EnableCHI)
   def cacheParams = p(L2ParamKey)
+  def EnablePrivateClint = cacheParams.EnablePrivateClint
 
   def XLEN = 64
   def blocks = cacheParams.sets * cacheParams.ways
@@ -90,12 +91,12 @@ trait HasCoupledL2Parameters {
   def eccTagBankBits = encTagBankBits - tagBankBits
   def enableDataECC = cacheParams.enableDataECC
   def dataBankSplit = 4
-  def dataSRAMSplit = 8
+  def dataSRAMSplit = 4
   def wordBits = 64
   def bankWords = blockBits / wordBits / dataBankSplit
   def dataBankBits = wordBits * bankWords
   def encBankBits = cacheParams.dataCode.width(dataBankBits)
-  def encDataPadBits = 4 // recaculate if any split changes
+  def encDataPadBits = 0 // recaculate if any split changes
 
   // Prefetch
   def prefetchers = cacheParams.prefetch
@@ -145,7 +146,7 @@ trait HasCoupledL2Parameters {
   def sam = cacheParams.sam
 
   // Hardware Performance Monitor
-  def numPCntHc: Int = 12
+  def numPCntHc: Int = 17
 
   def getClientBitOH(sourceId: UInt): UInt = {
     if (clientBits == 0) {
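
The comment on `encDataPadBits` asks for a recalculation whenever a split changes. The plain-Scala sketch below shows one way the two values in this hunk (4 and 0) could arise. It rests on assumptions that the diff does not confirm: a 512-bit cache block, a SECDED data code, and padding defined as the bits needed to make the encoded block divide evenly across `dataSRAMSplit` SRAMs. `secdedWidth` and `padBits` are hypothetical helpers, not functions from the repository.

```scala
object DataPadBits {
  // SECDED codeword width for k data bits: Hamming check bits plus one overall parity bit.
  def secdedWidth(k: Int): Int = {
    val m = Iterator.from(1).find(c => (1 << c) >= k + c + 1).get
    k + m + 1
  }

  // Bits needed to round `total` up to the next multiple of `split`.
  def padBits(total: Int, split: Int): Int = (split - total % split) % split

  val blockBits = 512                                   // assumption: 64-byte cache line
  val wordBits = 64                                     // from the diff
  val dataBankSplit = 4                                 // from the diff
  val bankWords = blockBits / wordBits / dataBankSplit  // 2
  val dataBankBits = wordBits * bankWords               // 128
  val encBankBits = secdedWidth(dataBankBits)           // 137 under the SECDED assumption
  val encDataBits = encBankBits * dataBankSplit         // 548
}
```

With these assumptions, `DataPadBits.padBits(DataPadBits.encDataBits, 8)` is 4 and `DataPadBits.padBits(DataPadBits.encDataBits, 4)` is 0, which lines up with the old and new `encDataPadBits` values when `dataSRAMSplit` goes from 8 to 4.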

src/main/scala/coupledL2/Directory.scala

Lines changed: 2 additions & 1 deletion
@@ -21,7 +21,8 @@ import chisel3._
 import chisel3.util._
 import utility.mbist.MbistPipeline
 import coupledL2.utils._
-import utility.{ParallelPriorityMux, RegNextN, XSPerfAccumulate, Code, SRAMTemplate}
+import utility.{ParallelPriorityMux, RegNextN, XSPerfAccumulate, Code}
+import utility.sram.SRAMTemplate
 import org.chipsalliance.cde.config.Parameters
 import coupledL2.prefetch.PfSource
 import freechips.rocketchip.tilelink.TLMessages._

src/main/scala/coupledL2/L2Param.scala

Lines changed: 3 additions & 0 deletions
@@ -146,6 +146,9 @@ case class L2Param(
   // Enable sram test support
   hasMbist: Boolean = false,
   hasSramCtl: Boolean = false,
+
+  // Enable new clint
+  EnablePrivateClint: Boolean = false
 ) {
   def toCacheParams: CacheParameters = CacheParameters(
     name = name,

src/main/scala/coupledL2/RequestArb.scala

Lines changed: 10 additions & 1 deletion
@@ -96,9 +96,18 @@ class RequestArb(implicit p: Parameters) extends L2Module
     mshr_task_s1.bits.opcode === Grant ||
     mshr_task_s1.bits.opcode === GrantData ||
     mshr_task_s1.bits.opcode === AccessAckData ||
-    mshr_task_s1.bits.opcode === HintAck && mshr_task_s1.bits.dsWen
+    mshr_task_s1.bits.opcode === HintAck
   )
 
+  assert(!s1_needs_replRead || mshr_task_s1.bits.opcode =/= GrantData || mshr_task_s1.bits.dsWen,
+    "replTask of GrantData with no DataStorage write was not expected")
+
+  assert(!s1_needs_replRead || mshr_task_s1.bits.opcode =/= AccessAckData || mshr_task_s1.bits.dsWen,
+    "replTask of AccessAckData with no DataStorage write was not expected")
+
+  assert(!s1_needs_replRead || mshr_task_s1.bits.opcode =/= HintAck || mshr_task_s1.bits.dsWen,
+    "replTask of HintAck with no DataStorage write was not expected")
+
   /* ======== Stage 0 ======== */
   // if mshr_task_s1 is replRead, it might stall and wait for dirRead.ready, so we block new mshrTask from entering
   // TODO: will cause msTask path vacant for one-cycle after replRead, since not use Flow so as to avoid ready propagation
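
Each of the three new assertions in `RequestArb` has the shape `!a || b =/= X || c`, which is the implication "if the task needs a replRead and its opcode is X, then `dsWen` must be set". The plain-Scala model below illustrates that reading only. `Opcode`, `MshrTask`, `assertHolds`, and `allHold` are hypothetical names rather than repository code, and the full definition of `s1_needs_replRead` is not visible in this hunk.

```scala
object ReplReadCheck {
  sealed trait Opcode
  case object Grant extends Opcode
  case object GrantData extends Opcode
  case object AccessAckData extends Opcode
  case object HintAck extends Opcode

  // Hypothetical stand-in for the fields of mshr_task_s1.bits used by the assertions.
  final case class MshrTask(opcode: Opcode, dsWen: Boolean)

  // Mirrors assert(!needsReplRead || opcode =/= checked || dsWen):
  // false only for a replRead task with the checked opcode and no DataStorage write.
  def assertHolds(needsReplRead: Boolean, task: MshrTask, checked: Opcode): Boolean =
    !needsReplRead || task.opcode != checked || task.dsWen

  // The diff adds one assertion per opcode in this list; Grant is not checked.
  val checkedOpcodes: Seq[Opcode] = Seq(GrantData, AccessAckData, HintAck)

  def allHold(needsReplRead: Boolean, task: MshrTask): Boolean =
    checkedOpcodes.forall(assertHolds(needsReplRead, task, _))
}
```

In this model, `allHold(true, MshrTask(HintAck, dsWen = false))` is false. That is the case the old `HintAck && dsWen` term used to filter out of the selection and that the new code flags as unexpected instead.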
