Description
In the Vanadis Tick() function, the temporary registers are reset N^2 times for N threads.
The register reset function is called inside a loop that iterates over all threads:
sst-elements/src/sst/elements/vanadis/vanadis.cc
Line 1472 in fd22b77
`resetRegisterUseTemps(thread_decoders[i]->countISAIntReg(), thread_decoders[i]->countISAFPReg());`
However, the reset function itself also contains a loop over all threads:
`VANADIS_COMPONENT::resetRegisterUseTemps(const uint16_t int_reg_count, const uint16_t fp_reg_count)`
Profiling the simulation shows that the simulator spends 10% of its time on this function.
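To make the cost concrete, here is a minimal toy model of the pattern described above. It is not the Vanadis source: `ToyCore`, `resetOneThread`, `resetAllThreads`, and the three `tick*` methods are hypothetical names, and the per-thread scratch arrays are an assumption about what the temp registers look like. It only counts how many slots get cleared per tick under the nested pattern and under two possible restructurings. Whether either restructuring is valid in Vanadis depends on whether the temps must be clean between threads and on each thread's own register counts, which this sketch does not model.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model only: each hardware thread owns one scratch array of temp flags.
struct ToyCore {
    std::vector<std::vector<bool>> temps;  // temps[thread][register]
    std::uint64_t slots_cleared = 0;       // counts work done by resets

    ToyCore(std::size_t hw_threads, std::size_t regs_per_thread)
        : temps(hw_threads, std::vector<bool>(regs_per_thread, false)) {}

    // Hypothetical per-thread reset: clears one thread's scratch array.
    void resetOneThread(std::size_t thread) {
        for (std::size_t r = 0; r < temps[thread].size(); ++r) {
            temps[thread][r] = false;
            ++slots_cleared;
        }
    }

    // Same shape as the reset described in the issue: it walks every thread.
    void resetAllThreads() {
        for (std::size_t t = 0; t < temps.size(); ++t) {
            resetOneThread(t);
        }
    }

    // Pattern from the issue: an all-thread reset inside a per-thread loop,
    // so N threads trigger N * N per-thread resets.
    void tickNested() {
        for (std::size_t i = 0; i < temps.size(); ++i) {
            resetAllThreads();
        }
    }

    // Option A (assumption): hoist the all-thread reset out of the loop.
    void tickHoisted() {
        resetAllThreads();
        for (std::size_t i = 0; i < temps.size(); ++i) {
            // per-thread work would go here
        }
    }

    // Option B (assumption): reset only the current thread inside the loop.
    void tickPerThread() {
        for (std::size_t i = 0; i < temps.size(); ++i) {
            resetOneThread(i);
        }
    }
};
```

With 16 hardware threads and an arbitrary 32 registers per thread, `tickNested()` clears 8192 slots per tick, while `tickHoisted()` and `tickPerThread()` each clear 512.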
Details of the profiling experiment:
System: Ubuntu 20.04
SST: Version 14.1
Architecture: riscv64
Cross compiler: MUSL 10 riscv64-linux-musl (https://more.musl.cc/10/x86_64-linux-musl/)
Test: Matrix multiplication using pthreads (worker threads: 2, matrix multiplications: 2, matrix size: 64x64)
Config file: basic_vanadis.py
Environment variables:
VANADIS_VERBOSE=0
VANADIS_OS_VERBOSE=0
VANADIS_NUM_HW_THREADS=16
VANADIS_EXE=/path/to/my/custom/test
