Commit 9afbcbc
update README file for cache analysis
1 parent f099741 commit 9afbcbc

File tree

2 files changed: +61 -1 lines changed


benchmark/basic_performance/README.md

Lines changed: 60 additions & 0 deletions
@@ -63,6 +63,66 @@ Configure the bandwidth vs latency test script:
nano 200_{your test script}.yaml # or reuse 200_cache_heatmap.yaml
```

**Fields**

1. `repeat`: specify the number of pointer-chasing rounds.
2. `test_type`:
   - `0`: measure load/store operation latency
   - `1`: measure flush operation latency if `use_flush` is set to `1`
3. `use_flush`:
   - `0`: no flushing
   - `1`: flush data after each round of pointer-chasing
4. `flush_type`: select the type of flush instruction:
   - `0`: `clflush`
   - `1`: `clflushopt`
   - `2`: `clwb`
5. `ldst_type`: select the type of load/store instruction:
   - `0`: regular
   - `1`: non-temporal (support to be added soon)
   - `2`: atomic (support to be added soon)
6. `core_id`: specify the core on which to run the benchmark.
7. `node_id`: specify the memory node to access.
8. `access_order`:
   - `0`: random pointer-chasing
   - `1`: sequential pointer-chasing
9. `stride_size_array`: specify the stride size between two sequential memory blocks.
10. `block_num_array`: specify the number of accessed memory blocks.
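Put together, a test script might look like the following illustrative fragment. The field names come from the list above; the values and the exact YAML layout are assumptions, so consult `200_cache_heatmap.yaml` for the authoritative format:

```yaml
# illustrative values only
repeat: 5            # pointer-chasing rounds
test_type: 0         # 0 = load/store latency
use_flush: 0         # no flushing
flush_type: 0        # clflush (ignored when use_flush = 0)
ldst_type: 0         # regular loads/stores
core_id: 0           # core to run the benchmark on
node_id: 0           # memory node to access
access_order: 0      # random pointer-chasing
stride_size_array: [64, 128]    # bytes between sequential blocks
block_num_array: [1024, 2048]   # number of accessed blocks
```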
### Cache Analysis Setup Instructions
To ensure accurate and repeatable cache-analysis results, we use a pointer-chasing benchmark executed over a contiguous physical address range. Since physical addresses are accessible only in kernel mode, a kernel module performs the cache analysis.
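The core idea of pointer-chasing can be sketched in user space as follows. This is a hypothetical illustration, not the kernel-module implementation: each slot stores the index of the next slot to visit, so every access is a dependent load and the hardware prefetcher cannot run ahead of the chain.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Build a chain over block_num slots: next[i] holds the index of the slot
// visited after slot i. The visit order forms one closed cycle, either
// sequential or shuffled (cf. the access_order field).
std::vector<std::size_t> build_chain(std::size_t block_num, bool random_order) {
    std::vector<std::size_t> visit(block_num);
    std::iota(visit.begin(), visit.end(), 0);        // sequential visit order
    if (random_order && block_num > 2) {
        std::mt19937_64 rng(42);                     // fixed seed: repeatable runs
        std::shuffle(visit.begin() + 1, visit.end(), rng);
    }
    std::vector<std::size_t> next(block_num);
    for (std::size_t i = 0; i < block_num; ++i)
        next[visit[i]] = visit[(i + 1) % block_num]; // close the cycle
    return next;
}

// One `repeat` round: follow the chain through every block exactly once.
std::size_t chase(const std::vector<std::size_t>& next, std::size_t start) {
    std::size_t idx = start;
    for (std::size_t i = 0; i < next.size(); ++i)
        idx = next[idx];                             // dependent load
    return idx;                                      // a full cycle ends at start
}
```

Timing the `chase` loop, optionally flushing after each round, is the essence of the `test_type` and `use_flush` options; `access_order` selects between the shuffled and sequential chains.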
#### Configuration Steps

1. **Specify Physical Address Ranges**

   Before running the cache analysis, define the physical address ranges for both DIMM and CXL memory in the `$(hostname).env` file. Example configuration:

   ```
   dimm_physical_start_addr=0x800000000  # 32GB
   cxl_physical_start_addr=0x4080000000  # 258GB
   test_size=0x840000000                 # 33GB (32GB test buffer + 1GB cindex buffer)
   ```

   - For DIMM memory testing, the physical address range runs from 32GB to 65GB.
   - For CXL memory testing, the CXL node starts at physical address 258GB.
   - To obtain the physical address range of each NUMA node, navigate to the `benchmark/basic_performance/build/cache_test/misc/numa_info` directory and run `make`.
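As a quick sanity check on these constants (a standalone sketch, not part of the benchmark), each hex address is exactly the GiB offset its comment claims; a mismatch would fail the build:

```cpp
#include <cstdint>

// Each value from the .env example, checked against its GiB comment at
// compile time.
constexpr std::uint64_t GiB = 1ull << 30;
static_assert(0x800000000ull  == 32 * GiB,  "DIMM start is 32GB");
static_assert(0x4080000000ull == 258 * GiB, "CXL start is 258GB");
static_assert(0x840000000ull  == 33 * GiB,  "test size is 33GB");
```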
2. **Reserve the Physical Address Range**

   To prevent system crashes when directly accessing physical addresses, reserve the specified range with the `memmap` kernel boot parameter. For example, add the following to the Linux boot command line:

   ```
   memmap=33G!32G
   ```

   This reserves a 33GB range starting at 32GB for safe testing.
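If the machine boots via GRUB (an assumption about your bootloader; adapt as needed), the parameter is typically appended to `GRUB_CMDLINE_LINUX` in `/etc/default/grub`, after which the config is regenerated and the machine rebooted:

```
# /etc/default/grub (illustrative)
GRUB_CMDLINE_LINUX="... memmap=33G!32G"

# then regenerate the GRUB config and reboot, e.g.:
#   sudo update-grub                               # Debian/Ubuntu
#   sudo grub2-mkconfig -o /boot/grub2/grub.cfg    # RHEL/Fedora
```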
## 3. Start the test
For the bandwidth vs latency test

benchmark/basic_performance/src/main_cache.cpp

Lines changed: 1 addition & 1 deletion
@@ -79,7 +79,7 @@ int prepare(pchasing_args_t &args, int argc, char *argv[]) {
     }

     if (args.in_ldst_type == 0) {
-        ldst_type = "temporal";
+        ldst_type = "regular";
     } else if (args.in_ldst_type == 1) {
         ldst_type = "non-temporal";
     } else if (args.in_ldst_type == 2) {
