benchmark/basic_performance/README.md (60 additions, 0 deletions)
@@ -63,6 +63,66 @@ Configure the bandwidth vs latency test script:
nano 200_{your test script}.yaml # or reuse 200_cache_heatmap.yaml
```

**Fields**

1. `repeat` configuration:
   - specify the number of pointer-chasing rounds
2. `test_type` configuration:
   - `0`: measure load/store operation latency
   - `1`: measure flush operation latency (requires `use_flush` set to `1`)
3. `use_flush` configuration:
   - `0`: no flushing
   - `1`: flush data after each round of pointer-chasing
4. `flush_type` configuration:
   - select the type of flush instruction:
     - `0`: `clflush`
     - `1`: `clflushopt`
     - `2`: `clwb`
5. `ldst_type` configuration:
   - select the type of load/store instruction:
     - `0`: regular
     - `1`: non-temporal (not yet supported; support is planned)
     - `2`: atomic (not yet supported; support is planned)
6. `core_id` configuration:
   - specify the CPU core on which the benchmark runs
7. `node_id` configuration:
   - specify the NUMA node whose memory is accessed
8. `access_order` configuration:
   - `0`: random pointer-chasing
   - `1`: sequential pointer-chasing
9. `stride_size_array` configuration:
   - specify the stride size between two consecutive memory blocks
10. `block_num_array` configuration:
    - specify the number of accessed memory blocks

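
The constraints in the field list can be checked before a run. The following is a minimal sketch, not part of the repository: `validate_config` is a hypothetical helper, and it assumes the YAML file has already been loaded into a flat dictionary keyed by the field names above. The real file layout is not shown here and may differ.

```python
# Hypothetical validator. The key names and allowed values come from the
# field list; the flat-dictionary layout is an assumption.
ALLOWED = {
    "test_type": {0, 1},
    "use_flush": {0, 1},
    "flush_type": {0, 1, 2},
    "ldst_type": {0, 1, 2},
    "access_order": {0, 1},
}
UNSUPPORTED_LDST = {1, 2}  # listed as not supported yet


def validate_config(cfg):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key in cfg and cfg[key] not in allowed:
            problems.append(f"{key}={cfg[key]} not in {sorted(allowed)}")
    # Flush latency can only be measured when flushing is enabled.
    if cfg.get("test_type") == 1 and cfg.get("use_flush") != 1:
        problems.append("test_type=1 requires use_flush=1")
    if cfg.get("ldst_type") in UNSUPPORTED_LDST:
        problems.append(f"ldst_type={cfg['ldst_type']} is not supported yet")
    return problems


example = {"repeat": 32, "test_type": 1, "use_flush": 0, "flush_type": 2,
           "ldst_type": 0, "core_id": 0, "node_id": 1, "access_order": 0}
print(validate_config(example))
```

The example dictionary is made up for illustration. It requests flush latency without enabling flushing, so the check reports that combination.
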
### Cache Analysis Setup Instructions

To ensure accurate and repeatable cache analysis results, we run a pointer-chasing benchmark over a contiguous physical address range. Because physical addresses can only be accessed directly in kernel mode, the cache analysis is implemented as a kernel module.
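
For readers unfamiliar with pointer-chasing, the idea can be sketched in user-space Python. This is only an illustration of the access pattern, not the kernel module's implementation: `build_chain`, `chase`, and the `seed` parameter are hypothetical names, and the `0`/`1` meaning of `access_order` is taken from the field list.

```python
import random


def build_chain(block_num, access_order, seed=0):
    """Return next_idx, where next_idx[i] is the block visited after block i.

    access_order: 0 = random, 1 = sequential.
    The chain is a single cycle, so one lap touches every block exactly once.
    """
    order = list(range(block_num))
    if access_order == 0:
        random.Random(seed).shuffle(order)
    next_idx = [0] * block_num
    for pos, block in enumerate(order):
        next_idx[block] = order[(pos + 1) % block_num]
    return next_idx


def chase(next_idx, start=0):
    """Follow the chain for one full lap and return the visited blocks in order."""
    visited = []
    cur = start
    for _ in range(len(next_idx)):
        visited.append(cur)
        cur = next_idx[cur]  # the next address depends on the previous load
    return visited


print(chase(build_chain(8, access_order=1)))
print(chase(build_chain(8, access_order=0, seed=1)))
```

Because each address is only known after the previous load completes, the accesses are serialized, which is what makes the pattern suitable for latency measurement.
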

#### Configuration Steps

1. **Specify Physical Address Ranges**

   Before running the cache analysis, define the physical address ranges for both DIMM and CXL memory in the `$(hostname).env` file. Example configuration:

   ```
   dimm_physical_start_addr=0x800000000 # 32GB
   cxl_physical_start_addr=0x4080000000 # 258GB
   test_size=0x840000000 # 33GB (32GB test buffer + 1GB cindex buffer)
   ```

   - For DIMM memory testing, the physical address range runs from 32GB to 65GB (the start address plus `test_size`).
   - For CXL memory testing, the CXL node starts at physical address 258GB.
   - To obtain the physical address range of each NUMA node, navigate to the `benchmark/basic_performance/build/cache_test/misc/numa_info` directory and run `make`.

2. **Reserve Physical Address Range**

   To prevent system crashes when accessing physical addresses directly, reserve the specified physical address range with the `memmap` kernel boot parameter. For example, add the following to the Linux kernel boot command line:

   ```
   memmap=33G!32G
   ```

   This reserves a 33GB range starting at 32GB, matching the DIMM test range from step 1, so that the kernel does not use it as ordinary system RAM.

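
The arithmetic behind these values can be double-checked with a short script. This is a sketch under stated assumptions: the hexadecimal values are copied from the example `.env` configuration, `to_gib` and `memmap_param` are hypothetical helpers, and "GB" is interpreted as 2^30 bytes, which is how the kernel reads the `G` suffix.

```python
GIB = 1 << 30  # "GB" in this README is treated as 2**30 bytes

# Values copied from the example .env configuration.
dimm_physical_start_addr = 0x800000000
cxl_physical_start_addr = 0x4080000000
test_size = 0x840000000


def to_gib(value):
    """Convert a byte count or address to whole GiB, rejecting unaligned values."""
    if value % GIB:
        raise ValueError(f"{value:#x} is not GiB-aligned")
    return value // GIB


def memmap_param(start, size):
    """Format a memmap=<size>G!<start>G string for a GiB-aligned range."""
    return f"memmap={to_gib(size)}G!{to_gib(start)}G"


dimm_end = dimm_physical_start_addr + test_size

print(to_gib(dimm_physical_start_addr), to_gib(dimm_end))  # 32 65
print(to_gib(cxl_physical_start_addr))                     # 258
print(memmap_param(dimm_physical_start_addr, test_size))   # memmap=33G!32G
```

If the start address or test size changes on another machine, the same helper produces the matching boot parameter, provided both values stay GiB-aligned.
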