Description
Environment:
GEM5: xs-dev f231aa5
NEMU: master cf24515c85f5be898687959ab299ea276dbd7c56
DRAMsim3: master 29817593b3389f1337235d63cac515024ab8fd6e
LibCheckpointAlpha: main c5c2fef74133fb2b8ef8642633f60e0996493f29
Toolchain: 15.0.0 (/nfs/home/hebo/TOOL/gnu-riscv64-toolchain/riscv-toolchain-gcc15-240613-noFFnoSeg2)
Problem: Difftest fails when gem5.opt runs a checkpoint.
What I want: to run my checkpoint (gcb) with the latest GEM5.
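The revisions listed above can be read directly from the checkouts instead of being copied by hand. This is a minimal sketch, assuming each component is a git working tree; `repo_rev` is a hypothetical helper name and is not part of GEM5 or NEMU.

```shell
# Print "<directory name>: <branch> <commit hash>" for one git checkout.
# repo_rev is a hypothetical helper, not a GEM5 or NEMU tool.
repo_rev() {
    dir=$1
    # Current branch name ("HEAD" if the checkout is detached).
    branch=$(git -C "$dir" rev-parse --abbrev-ref HEAD)
    # Full hash of the checked-out commit.
    commit=$(git -C "$dir" rev-parse HEAD)
    echo "$(basename "$dir"): $branch $commit"
}
```

For example, `repo_rev GEM5` and `repo_rev NEMU/resource/gcpt_restore`, run from the directory that contains the clones, would produce lines in the same shape as the list above.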
Symptoms
When gem5.opt runs the checkpoint, Difftest fails. Part of the log is shown below:
**** REAL SIMULATION ****
build/RISCV/sim/simulate.cc:194: info: Entering event queue @ 0. Starting simulation...
build/RISCV/cpu/base.cc:1412: warn: Start memcpy to NEMU from 0x7fd92f9b8000, size=8589934592
build/RISCV/cpu/base.cc:1415: warn: Start regcpy to NEMU
build/RISCV/cpu/base.cc:1440: panic: Difftest failed!
Memory Usage: 17138044 KBytes
Program aborted at tick 61938
Aborted (core dumped)
It looks like difftest failed to start. I suspected that my riscv64-nemu-interpreter-so or gcpt.bin was at fault, so I tried the prebuilt files from GEM5/releases:
riscv64-nemu-interpreter-4332a525-so + my own gcpt.bin: same failure
nemu-gcbv-ref.so + gcb-restorer.bin: same failure
riscv64-nemu-interpreter-so + gcpt.bin: same failure
nemu-gcbv-ref.so + gcb-restorer.bin: same failure
riscv64-nemu-interpreter-231008.so + gcpt-restorer-231016.bin: a different failure, shown below, but it still did not run:
build/RISCV/mem/physical.cc:707: warn: Overriding Gcpt restorer
build/RISCV/mem/physical.cc:708: warn: gCptRestorerPath: /nfs/home/haokangda/hebo/TOOL/GEM5/GEM5-reuslt/test/gcpt-restorer-231016.bin
build/RISCV/mem/physical.cc:723: warn: Gcpt restorer file size 4352 is larger than limit 1792, is partially loaded
build/RISCV/mem/physical.cc:731: warn: gcpt restore size: 1792
build/RISCV/sim/system.cc:561: info: Restored from Xiangshan RISC-V Checkpoint
**** REAL SIMULATION ****
build/RISCV/sim/simulate.cc:194: info: Entering event queue @ 0. Starting simulation...
build/RISCV/cpu/base.cc:1412: warn: Start memcpy to NEMU from 0x7fec8e83e000, size=8589934592
build/RISCV/cpu/base.cc:1415: warn: Start regcpy to NEMU
address (0x0000000010000000) is out of bound {???} [0x0000000000000000, 0x0000000000000000] at pc = 0x0000000010000000
$0: 0x0000000000000000 ra: 0x0000000000000000 sp: 0x0000000000000000 gp: 0x0000000000000000
tp: 0x0000000000000000 t0: 0x0000000000000000 t1: 0x0000000000000000 t2: 0x0000000000000000
s0: 0x0000000000080000 s1: 0x0000000000000000 a0: 0x0000000000000000 a1: 0x0000000000000000
a2: 0x0000000000000000 a3: 0x0000000000000000 a4: 0x0000000000000000 a5: 0x0000000000000000
a6: 0x0000000000000000 a7: 0x0000000000000000 s2: 0x0000000000000000 s3: 0x0000000000000000
s4: 0x0000000000000000 s5: 0x0000000000000000 s6: 0x0000000000000000 s7: 0x0000000000000000
s8: 0x0000000000000000 s9: 0x0000000000000000 s10: 0x0000000000000000 s11: 0x0000000000000000
t3: 0x0000000000000000 t4: 0x0000000000000000 t5: 0x0000000000000000 t6: 0x0000000000000000
ft0: 0x0000000000000000 ft1: 0x0000000000000000 ft2: 0x0000000000000000 ft3: 0x0000000000000000
ft4: 0x0000000000000000 ft5: 0x0000000000000000 ft6: 0x0000000000000000 ft7: 0x0000000000000000
fs0: 0x0000000000000000 fs1: 0x0000000000000000 fa0: 0x0000000000000000 fa1: 0x0000000000000000
fa2: 0x0000000000000000 fa3: 0x0000000000000000 fa4: 0x0000000000000000 fa5: 0x0000000000000000
fa6: 0x0000000000000000 fa7: 0x0000000000000000 fs2: 0x0000000000000000 fs3: 0x0000000000000000
fs4: 0x0000000000000000 fs5: 0x0000000000000000 fs6: 0x0000000000000000 fs7: 0x0000000000000000
fs8: 0x0000000000000000 fs9: 0x0000000000000000 fs10: 0x0000000000000000 fs11: 0x0000000000000000
ft8: 0x0000000000000000 ft9: 0x0000000000000000 ft10: 0x0000000000000000 ft11: 0x0000000000000000
pc: 0x0000000010000000 mstatus: 0x0000000a00000000 mcause: 0x0000000000000000 mepc: 0x0000000000000000
sstatus: 0x0000000200000000 scause: 0x0000000000000000 sepc: 0x0000000000000000
satp: 0x0000000000000000
mip: 0x0000000000000000 mie: 0x0000000000000000 mscratch: 0x0000000000000000 sscratch: 0x0000000000000000
mideleg: 0x0000000000000000 medeleg: 0x0000000000000000
mtval: 0x0000000000000000 stval: 0x0000000000000000 mtvec: 0x0000000000000000 stvec: 0x0000000000000000
privilege mode:2147483648 pmp: below
[src/cpu/cpu-exec.c,62,monitor_statistic] host time spent = 0 us
[src/cpu/cpu-exec.c,64,monitor_statistic] total guest instructions = 1
[src/cpu/cpu-exec.c,66,monitor_statistic] Finish running in less than 1 us and can not calculate the simulation frequency
gem5.opt: src/device/io/map.c:21: check_bound: Assertion `map != ((void *)0) && addr <= map->high && addr >= map->low' failed.
Program aborted at tick 61938
Aborted (core dumped)
At this point I started to suspect that my checkpoint was the problem, so I ran the same checkpoint file with a GEM5 that I set up a long time ago, and it runs fine.
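The log above warns that the restorer image (4352 bytes) exceeds the 1792-byte limit and is only partially loaded. Whether that truncation is related to the Difftest failure is not established here, but the size can be checked before launching gem5. This is a minimal sketch; `restorer_fits` is a hypothetical helper, and the default limit of 1792 is taken from the warning text, so it is an assumption that other GEM5 revisions use the same value.

```shell
# Report whether a GCPT restorer image fits within the load limit.
# restorer_fits is a hypothetical helper; the default limit (1792 bytes)
# is copied from the gem5 warning and may differ between revisions.
restorer_fits() {
    file=$1
    limit=${2:-1792}
    # wc -c gives the byte count; the arithmetic expansion strips padding.
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -le "$limit" ]; then
        echo "fits: $size <= $limit bytes"
    else
        echo "truncated: $size > $limit bytes"
        return 1
    fi
}
```

Running `restorer_fits gcpt-restorer-231016.bin` on a 4352-byte file prints `truncated: 4352 > 1792 bytes` and returns a non-zero status, so it can also gate a launch script.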
How GEM5 was built
GEM5 build
git clone https://github.com/OpenXiangShan/GEM5.git
git checkout xs-dev
git pull
cd GEM5/ext/dramsim3
git clone https://github.com/umd-memsys/DRAMsim3.git DRAMsim3
cd DRAMsim3 && mkdir build
cd build
cmake ..
make
cd GEM5
scons build/RISCV/gem5.opt --gold-linker -j8
NEMU build
# build gcpt.bin
git clone https://github.com/OpenXiangShan/NEMU.git
cd NEMU/resource
git clone https://github.com/OpenXiangShan/LibCheckpointAlpha.git gcpt_restore
cd gcpt_restore
make
# build riscv64-nemu-interpreter-so
cd NEMU
make riscv64-gem5-ref_defconfig
make -j16
Full command
/nfs/home/haokangda/hebo/TOOL/GEM5/GEM5/build/RISCV/gem5.opt /nfs/home/haokangda/hebo/TOOL/GEM5/GEM5/configs/example/fs.py \
--xiangshan-system --cpu-type=DerivO3CPU --mem-size=8GB --caches \
--cacheline_size=64 --l1i_size=64kB --l1i_assoc=8 --l1d_size=64kB \
--l1d_assoc=8 --l1d-hwp-type=XSCompositePrefetcher --short-stride-thres=0 \
--l2cache --l2_size=1MB --l2_assoc=8 --l3cache --l3_size=16MB --l3_assoc=16 \
--l1-to-l2-pf-hint --l2-hwp-type=WorkerPrefetcher --l2-to-l3-pf-hint \
--l3-hwp-type=WorkerPrefetcher --mem-type=DRAMsim3 \
--dramsim3-ini=/nfs/home/haokangda/hebo/TOOL/GEM5/GEM5/ext/dramsim3/xiangshan_configs/xiangshan_DDR4_8Gb_x8_3200_2ch.ini \
--bp-type=DecoupledBPUWithFTB --enable-loop-predictor \
--difftest-ref-so /nfs/home/haokangda/hebo/TOOL/GEM5/NEMU/build/riscv64-nemu-interpreter-so \
--enable-difftest \
--generic-rv-cpt=/nfs/home/haokangda/hebo/BOSC/Simpoint_Checkpoint/auto_checkpoint/archive/archive/969cfff4cce9ca1ca51d45cbf8254e8d/checkpoint-0-0-0/astar_biglakes/57/_57_0.016584_.gz \
--gcpt-restorer=/nfs/home/haokangda/hebo/TOOL/GEM5/NEMU/resource/gcpt_restore/build/gcpt.bin \
--warmup-insts-no-switch=20000000 --maxinsts=40000000
Full log
Global frequency set at 1000000000000 ticks per second
warn: No dot file generated. Please install pydot to generate the dot file and pdf.
WARNING: Output directory ext/dramsim3/DRAMsim3/ not exists! Using current directory for output!
build/RISCV/arch/riscv/bare_metal/fs_workload.cc:60: info: No bootload provided, because using XS GCPT, reset to 0x80000000
build/RISCV/cpu/base.cc:214: warn: cpu_id set to 0
Using /nfs/home/haokangda/hebo/TOOL/GEM5/NEMU/build/riscv64-nemu-interpreter-so for difftest
build/RISCV/cpu/base.cc:228: warn: Difftest is enabled with ref so: /nfs/home/haokangda/hebo/TOOL/GEM5/NEMU/build/riscv64-nemu-interpreter-so.
build/RISCV/cpu/o3/cpu.cc:233: warn: Setting isa ptr of cpu to 0x5600b7e3cc60
build/RISCV/base/statistics.hh:281: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
0: system.remote_gdb: listening for remote gdb on port 7000
build/RISCV/mem/physical.cc:507: warn: Unserializing physical memory from file /nfs/home/haokangda/hebo/BOSC/Simpoint_Checkpoint/auto_checkpoint/archive/archive/969cfff4cce9ca1ca51d45cbf8254e8d/checkpoint-0-0-0/astar_biglakes/57/_57_0.016584_.gz
build/RISCV/mem/physical.cc:707: warn: Overriding Gcpt restorer
build/RISCV/mem/physical.cc:708: warn: gCptRestorerPath: /nfs/home/haokangda/hebo/TOOL/GEM5/NEMU/resource/gcpt_restore/build/gcpt.bin
build/RISCV/mem/physical.cc:723: warn: Gcpt restorer file size 1048600 is larger than limit 1792, is partially loaded
build/RISCV/mem/physical.cc:731: warn: gcpt restore size: 1792
build/RISCV/sim/system.cc:561: info: Restored from Xiangshan RISC-V Checkpoint
gem5 Simulator System. https://www.gem5.org
gem5 is copyrighted software; use the --copyright option for details.
gem5 version [DEVELOP-FOR-22.1]
gem5 compiled Oct 12 2024 10:30:19
gem5 started Oct 12 2024 16:55:45
gem5 executing on node042.bosccluster.com, pid 1358518
command line: /nfs/home/haokangda/hebo/TOOL/GEM5/GEM5/build/RISCV/gem5.opt /nfs/home/haokangda/hebo/TOOL/GEM5/GEM5/configs/example/fs.py --xiangshan-system --cpu-type=DerivO3CPU --mem-size=8GB --caches --cacheline_size=64 --l1i_size=64kB --l1i_assoc=8 --l1d_size=64kB --l1d_assoc=8 --l1d-hwp-type=XSCompositePrefetcher --short-stride-thres=0 --l2cache --l2_size=1MB --l2_assoc=8 --l3cache --l3_size=16MB --l3_assoc=16 --l1-to-l2-pf-hint --l2-hwp-type=WorkerPrefetcher --l2-to-l3-pf-hint --l3-hwp-type=WorkerPrefetcher --mem-type=DRAMsim3 --dramsim3-ini=/nfs/home/haokangda/hebo/TOOL/GEM5/GEM5/ext/dramsim3/xiangshan_configs/xiangshan_DDR4_8Gb_x8_3200_2ch.ini --bp-type=DecoupledBPUWithFTB --enable-loop-predictor --difftest-ref-so /nfs/home/haokangda/hebo/TOOL/GEM5/NEMU/build/riscv64-nemu-interpreter-so --enable-difftest --generic-rv-cpt=/nfs/home/haokangda/hebo/BOSC/Simpoint_Checkpoint/auto_checkpoint/archive/archive/969cfff4cce9ca1ca51d45cbf8254e8d/checkpoint-0-0-0/astar_biglakes/57/_57_0.016584_.gz --gcpt-restorer=/nfs/home/haokangda/hebo/TOOL/GEM5/NEMU/resource/gcpt_restore/build/gcpt.bin --warmup-insts-no-switch=20000000 --maxinsts=40000000
[<m5.params.AddrRange object at 0x7fdb2fb9ee60>]
['basic']
db_switches: []
Attach 1 decoders to thread with addr: <orphan System>.cpu.decoder
Create threads for test sys cpu (RiscvO3CPU)
Add dtb for L2 prefetcher
Finish memory system configuration
No cpu_class provided
Registering probe listeners for BaseO3CPU system.cpu
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher
system.cpu.dcache.prefetcher addTLB system.cpu.mmu.dtb
system.cpu.dcache.prefetcher addHintDownStream system.l2_caches.prefetcher
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.berti
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.bop_large
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.bop_learned
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.bop_small
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.cmc
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.ipcp
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.opt
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.spp
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.sstride
Registering probe listeners for Prefetcher system.cpu.dcache.prefetcher.xsstream
Registering probe listeners for Prefetcher system.l2_caches.prefetcher
system.l2_caches.prefetcher addTLB system.cpu.mmu.dtb
system.l2_caches.prefetcher addHintDownStream system.l3.prefetcher
Registering probe listeners for Prefetcher system.l3.prefetcher
**** REAL SIMULATION ****
build/RISCV/sim/simulate.cc:194: info: Entering event queue @ 0. Starting simulation...
build/RISCV/cpu/base.cc:1412: warn: Start memcpy to NEMU from 0x7fd92f9b8000, size=8589934592
build/RISCV/cpu/base.cc:1415: warn: Start regcpy to NEMU
build/RISCV/cpu/base.cc:1440: panic: Difftest failed!
Memory Usage: 17138044 KBytes
Program aborted at tick 61938
Aborted (core dumped)
Additional information
This is the GEM5 that I set up a long time ago, running the same checkpoint file; it runs successfully.
Command:
/nfs/home/hebo/TOOL/GEM5/GEM5-internal/build/RISCV/gem5.opt /nfs/home/hebo/TOOL/GEM5/GEM5-internal/configs/example/fs.py \
--xiangshan-system --cpu-type=DerivO3CPU --mem-size=8GB --caches \
--cacheline_size=64 --l1i_size=64kB --l1i_assoc=8 --l1d_size=64kB \
--l1d_assoc=8 --l1d-hwp-type=XSCompositePrefetcher --short-stride-thres=0 \
--l2cache --l2_size=1MB --l2_assoc=8 --l3cache --l3_size=16MB --l3_assoc=16 \
--l1-to-l2-pf-hint --l2-hwp-type=WorkerPrefetcher --l2-to-l3-pf-hint \
--l3-hwp-type=WorkerPrefetcher --mem-type=DRAMsim3 \
--dramsim3-ini=/nfs/home/hebo/TOOL/GEM5/GEM5-internal/ext/dramsim3/xiangshan_configs/xiangshan_DDR4_8Gb_x8_3200_2ch.ini \
--bp-type=DecoupledBPUWithFTB --enable-loop-predictor \
--difftest-ref-so /nfs/home/hebo/TOOL/GEM5/NEMU-test/732e4ccd/NEMU/build/riscv64-nemu-interpreter-so \
--enable-difftest \
--generic-rv-cpt=/nfs/home/haokangda/hebo/BOSC/Simpoint_Checkpoint/auto_checkpoint/archive/archive/969cfff4cce9ca1ca51d45cbf8254e8d/checkpoint-0-0-0/astar_biglakes/57/_57_0.016584_.gz \
--gcpt-restorer=/nfs/home/hebo/TOOL/GEM5/NEMU-test/732e4ccd/NEMU/resource/gcpt_restore/build/gcpt.bin \
--warmup-insts-no-switch=20000000 --maxinsts=40000000
Log file path: /nfs/home/hebo/TOOL/GEM5/test/log.txt
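The working and failing commands take the same checkpoint but different copies of riscv64-nemu-interpreter-so and gcpt.bin, so a byte-level comparison of those files is one way to narrow down what changed. This is a minimal sketch; `compare_artifacts` is a hypothetical helper name, and it only shows whether two files differ, not why.

```shell
# Compare two build artifacts byte for byte and report their sizes.
# compare_artifacts is a hypothetical helper, not a GEM5 or NEMU tool.
compare_artifacts() {
    # Byte counts of both files, with any wc padding stripped.
    size_a=$(( $(wc -c < "$1") ))
    size_b=$(( $(wc -c < "$2") ))
    # cmp -s is silent and exits 0 only when the files are identical.
    if cmp -s "$1" "$2"; then
        echo "identical ($size_a bytes)"
    else
        echo "different ($size_a vs $size_b bytes)"
    fi
}
```

For instance, `compare_artifacts /nfs/home/haokangda/hebo/TOOL/GEM5/NEMU/resource/gcpt_restore/build/gcpt.bin /nfs/home/hebo/TOOL/GEM5/NEMU-test/732e4ccd/NEMU/resource/gcpt_restore/build/gcpt.bin` compares the restorer from the failing run with the one from the working run.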