50 commits
7b50b64
Update configs
i7mist Jan 5, 2016
c158203
bugfix: finalization not called in ctrl and DRAM
i7mist Jan 8, 2016
0f00202
add nozero support when prints statistics
i7mist Jan 8, 2016
0af3563
fix: only print non zero refresh related stat
i7mist Jan 8, 2016
983fb72
fix stat print: only print non zero cache stat
i7mist Jan 8, 2016
689e483
move in_queue_req update before ctrl->tick
i7mist Jan 8, 2016
51ab2c4
fix configs
i7mist Jan 16, 2016
76d362a
remove `nozero` flag in cache stats
i7mist Jan 18, 2016
415d8f7
change for gem5 integration
i7mist Jan 23, 2016
313a46b
fix several issues related with gem5:
i7mist Jan 23, 2016
0c110ee
update README: remind user to specify cpu type
i7mist Jan 23, 2016
6ada583
fixes runtime error in Ramulator+Gem5
i7mist Mar 9, 2016
07419dd
Update README.md
SaugataGhose Jun 16, 2016
d3028da
Changes to integrate Ramulator with the Structural Simulation Toolkit…
afrodri Oct 19, 2016
fcac189
Merge pull request #37 from afrodri/master
SaugataGhose Oct 29, 2016
7ce65d0
Fix a bug on tracking request hits in the scheduler. Current way of s…
Dec 11, 2016
cd96ed6
added warmup instruction support; updated all sample configuration fi…
Dec 12, 2017
ac7188a
[bugfix] fixed the issue that caused Ramulator run forever due to the…
Dec 21, 2017
da005e3
Added a row policy (ClosedAP) that makes use of the autoprecharge com…
Jan 15, 2018
7d2e723
Disabling starting over the input trace file when expected_num_insts …
Jan 25, 2018
c927d4e
fixing the bug in the cache hierarchy, issue #52;
May 18, 2018
c7a4f37
enabling rewinding an unfiltered trace when expected_num_insts is not…
May 18, 2018
7e727de
DDR4: Correct XS table entry for DDR4-3200
RSpliet Jul 12, 2018
557a41b
DDR4: Remove superfluous timing entries
RSpliet Jul 12, 2018
4df3ac0
DDR4: Replace tCCDS with tBL in RD->WR issue latency
RSpliet Jul 13, 2018
95a5115
DDR4: Honour RDA/WRA->REF distance.
RSpliet Jul 12, 2018
d4f1473
Merge pull request #57 from RSpliet/DDR4-fix
hasanh91 Jul 13, 2018
a28830b
Adding STT-RAM model (#59)
nisabostanci Sep 15, 2018
0b75f64
updated README to include pointers to VAMPIRE
SaugataGhose Oct 11, 2018
fe43b5e
updating README to reference VAMPIRE
SaugataGhose Oct 11, 2018
6dca513
Flexible mapping (#61)
agyaglikci Nov 5, 2018
0403530
Adding PCM model (#62)
nisabostanci Nov 14, 2018
845ca6a
DDR4: Honour ACT->REF distance. (#69)
RSpliet May 29, 2019
db0eb57
fix deadlock on write queue; add write callback; fix namespace (#72)
mattvilim Jul 15, 2019
36f9043
Update README.md
SaugataGhose Aug 5, 2019
791df71
Fixed a bug which made FCFS Scheduler fail an assert in DDR3 and DDR4…
metafly Aug 19, 2019
3582c44
Shivani updates (#76)
metafly Sep 12, 2019
dd32612
Ddr3 timing fixes (#83)
RSpliet Apr 13, 2020
9658765
fixing a bug that causes unnecessary delay between two ACT commands i…
Jul 30, 2020
833c0a5
initializing 'depart' in Request.h
hasanh91 Nov 23, 2020
216e512
Update README.md
SaugataGhose Dec 13, 2020
2465c75
disabing request served callback for write requests
May 11, 2021
68732e8
cleaning test_ddr3.py and updating it to print the results better
May 11, 2021
898e071
removing warmup_insts from DDR3/DDR4 default config files
May 11, 2021
4edcb0d
adding a new test: test_ramulator.py
May 11, 2021
a960ff0
adding the trace generator tool
hasanh91 Mar 1, 2022
80eafeb
updating the readme
hasanh91 Mar 1, 2022
743b940
Update README.md
omutlu May 3, 2022
f1dfd87
Update README.md
RichardLuo79 Aug 27, 2023
214f635
Update README.md
omutlu Aug 29, 2023
3 changes: 3 additions & 0 deletions Makefile
@@ -35,6 +35,9 @@ endif
ramulator: $(MAIN) $(OBJS) $(SRCDIR)/*.h | depend
	$(CXX) $(CXXFLAGS) -DRAMULATOR -o $@ $(MAIN) $(OBJS)

libramulator.a: $(OBJS) $(OBJDIR)/Gem5Wrapper.o
	libtool -static -o $@ $(OBJS) $(OBJDIR)/Gem5Wrapper.o

$(OBJS): | $(OBJDIR)

$(OBJDIR):
75 changes: 53 additions & 22 deletions README.md
@@ -1,29 +1,43 @@
We have released an updated version of Ramulator, called [Ramulator 2.0](https://github.com/CMU-SAFARI/ramulator2), in August 2023. Ramulator 2.0 is easier to use, extend, and modify. It also has support for the latest DRAM standards at the time (e.g., DDR5, LPDDR5, HBM3, GDDR6). We suggest that you use Ramulator 2.0 and welcome your feedback and bug/issue reports.

# Ramulator: A DRAM Simulator

Ramulator is a fast and cycle-accurate DRAM simulator \[1\] that supports a
Ramulator is a fast and cycle-accurate DRAM simulator \[1, 2\] that supports a
wide array of commercial, as well as academic, DRAM standards:

- DDR3 (2007), DDR4 (2012)
- LPDDR3 (2012), LPDDR4 (2014)
- GDDR5 (2009)
- WIO (2011), WIO2 (2014)
- HBM (2013)
- SALP \[2\]
- TL-DRAM \[3\]
- RowClone \[4\]
- DSARP \[5\]
- SALP \[3\]
- TL-DRAM \[4\]
- RowClone \[5\]
- DSARP \[6\]

The initial release of Ramulator is described in the following paper:
>Y. Kim, W. Yang, O. Mutlu.
>"[**Ramulator: A Fast and Extensible DRAM Simulator**](https://people.inf.ethz.ch/omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf)".
>In _IEEE Computer Architecture Letters_, March 2015.

For information on new features, along with an extensive memory characterization using Ramulator, please read:
>S. Ghose, T. Li, N. Hajinazar, D. Senol Cali, O. Mutlu.
>"[**Demystifying Complex Workload–DRAM Interactions: An Experimental Study**](https://people.inf.ethz.ch/omutlu/pub/Workload-DRAM-Interaction-Analysis_sigmetrics19_pomacs19.pdf)".
>In _Proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS)_, June 2019 ([slides](https://people.inf.ethz.ch/omutlu/pub/Workload-DRAM-Interaction-Analysis_sigmetrics19-talk.pdf)).
>In _Proceedings of the ACM on Measurement and Analysis of Computing Systems (POMACS)_, 2019.

[\[1\] Kim et al. *Ramulator: A Fast and Extensible DRAM Simulator.* IEEE CAL
2015.](http://dx.doi.org/10.1109/LCA.2015.2414456)
[\[2\] Kim et al. *A Case for Exploiting Subarray-Level Parallelism (SALP) in
DRAM.* ISCA 2012.](http://dx.doi.org/10.1109/ISCA.2012.6237032)
[\[3\] Lee et al. *Tiered-Latency DRAM: A Low Latency and Low Cost DRAM
Architecture.* HPCA 2013.](http://dx.doi.org/10.1109/HPCA.2013.6522354)
[\[4\] Seshadri et al. *RowClone: Fast and Energy-Efficient In-DRAM Bulk Data
2015.](https://people.inf.ethz.ch/omutlu/pub/ramulator_dram_simulator-ieee-cal15.pdf)
[\[2\] Ghose et al. *Demystifying Complex Workload–DRAM Interactions: An Experimental Study.* SIGMETRICS 2019.](https://people.inf.ethz.ch/omutlu/pub/Workload-DRAM-Interaction-Analysis_sigmetrics19_pomacs19.pdf)
[\[3\] Kim et al. *A Case for Exploiting Subarray-Level Parallelism (SALP) in
DRAM.* ISCA 2012.](https://users.ece.cmu.edu/~omutlu/pub/salp-dram_isca12.pdf)
[\[4\] Lee et al. *Tiered-Latency DRAM: A Low Latency and Low Cost DRAM
Architecture.* HPCA 2013.](https://users.ece.cmu.edu/~omutlu/pub/tldram_hpca13.pdf)
[\[5\] Seshadri et al. *RowClone: Fast and Energy-Efficient In-DRAM Bulk Data
Copy and Initialization.* MICRO
2013.](http://dx.doi.org/10.1145/2540708.2540725)
[\[5\] Chang et al. *Improving DRAM Performance by Parallelizing Refreshes with
Accesses.* HPCA 2014.](http://dx.doi.org/10.1109/HPCA.2014.6835946)
2013.](https://users.ece.cmu.edu/~omutlu/pub/rowclone_micro13.pdf)
[\[6\] Chang et al. *Improving DRAM Performance by Parallelizing Refreshes with
Accesses.* HPCA 2014.](https://users.ece.cmu.edu/~omutlu/pub/dram-access-refresh-parallelization_hpca14.pdf)


## Usage
@@ -56,13 +70,16 @@ Ramulator supports three different usage modes.
before it.

3. **gem5 Driven:** Ramulator runs as part of a full-system simulator (gem5
\[6\]), from which it receives memory requests as they are generated.
\[7\]), from which it receives memory requests as they are generated.

For some of the DRAM standards, Ramulator is also capable of reporting
power consumption by relying on DRAMPower \[7\] as the backend.
power consumption by relying on either VAMPIRE \[8\] or DRAMPower \[9\]
as the backend.

[\[6\] The gem5 Simulator System.](http://www.gem5.org)
[\[7\] Chandrasekar et al. *DRAMPower: Open-Source DRAM Power & Energy
[\[7\] The gem5 Simulator System.](http://www.gem5.org)
[\[8\] Ghose et al. *What Your DRAM Power Models Are Not Telling You:
Lessons from a Detailed Experimental Study.* SIGMETRICS 2018.](https://github.com/CMU-SAFARI/VAMPIRE)
[\[9\] Chandrasekar et al. *DRAMPower: Open-Source DRAM Power & Energy
Estimation Tool.* IEEE CAL 2015.](http://www.drampower.info)


@@ -106,6 +123,7 @@ Ramulator requires a C++11 compiler (e.g., `clang++`, `g++-5`).
# Compile gem5
# Run gem5 with `--mem-type=ramulator` and `--ramulator-config=configs/DDR3-config.cfg`

By default, gem5 uses the atomic CPU with atomic memory accesses, so a detailed memory model such as Ramulator is not actually exercised. To run gem5 in timing mode, specify a CPU type with the command-line parameter `--cpu-type`, e.g., `--cpu-type=timing`.

## Simulation Output

@@ -148,7 +166,7 @@ designated lines in the script's source code:

* Ramulator
* DRAMSim2 (https://wiki.umd.edu/DRAMSim2): `test_ddr3.py` lines 39-40
* USIMM, (http://www.cs.utah.edu/~rajeev/jwac12): `test_ddr3.py` lines 54-55
* USIMM (http://www.cs.utah.edu/~rajeev/jwac12): `test_ddr3.py` lines 54-55
* DrSim (http://lph.ece.utexas.edu/public/Main/DrSim): `test_ddr3.py` lines 66-67
* NVMain (http://wiki.nvmain.org): `test_ddr3.py` lines 78-79

@@ -180,11 +198,14 @@ CPU trace driven simulations.
### Power Estimation

For estimating power consumption, Ramulator can record the trace of every DRAM
command it issues to a file in DRAMPower \[7\] format. To do so, please turn
command it issues to a file in DRAMPower \[9\] format. To do so, please turn
on the `record_cmd_trace` variable in the configuration file. The resulting
DRAM command trace (e.g., `cmd-trace-chan-N-rank-M.cmdtrace`) should be fed
into DRAMPower with the correct configuration (standard/speed/organization)
to estimate energy/power usage for a single rank (a limitation of DRAMPower).
into a compatible DRAM energy simulator such as
[VAMPIRE](https://github.com/CMU-SAFARI/VAMPIRE) \[8\] or
[DRAMPower](http://www.drampower.info) \[9\] with the correct configuration
(standard/speed/organization) to estimate energy/power usage for a single rank
(a current limitation of both VAMPIRE and DRAMPower).


### Contributors
@@ -197,3 +218,13 @@ to estimate energy/power usage for a single rank (a limitation of DRAMPower).
- Saugata Ghose (Carnegie Mellon University)
- Tianshi Li (Carnegie Mellon University)
- @henryzh

### Acknowledgments

We thank the SAFARI group members who have contributed to
the initial development of Ramulator, including Kevin Chang, Saugata
Ghose, Donghyuk Lee, Tianshi Li, and Vivek Seshadri. We also
thank the anonymous reviewers for feedback. This work was
supported by NSF, SRC, and gifts from our industrial partners,
including Google, Intel, Microsoft, Nvidia, Samsung, Seagate
and VMware.
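
Note on the Power Estimation text above: the README names the per-rank trace files with the pattern `cmd-trace-chan-N-rank-M.cmdtrace` and says each one is fed to the power tool for a single rank. A minimal Python sketch of which files one might expect for a given configuration follows. It assumes N and M are zero-based channel and rank indices and that one file is written per (channel, rank) pair; neither is stated in this diff, and `expected_cmd_traces` is a hypothetical helper, not part of Ramulator.

```python
def expected_cmd_traces(channels, ranks, prefix="cmd-trace"):
    """Return the trace file names expected for a channels x ranks system.

    Assumption: indices are zero-based and one file exists per rank.
    """
    return [
        f"{prefix}-chan-{chan}-rank-{rank}.cmdtrace"
        for chan in range(channels)
        for rank in range(ranks)
    ]


if __name__ == "__main__":
    # LPDDR4-config.cfg in this PR uses channels = 2, ranks = 1.
    for name in expected_cmd_traces(channels=2, ranks=1):
        print(name)
```

Because the README describes the energy estimate as per-rank, each listed file would be processed separately.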
11 changes: 7 additions & 4 deletions configs/ALDRAM-config.cfg
@@ -15,13 +15,16 @@
### Below are parameters only for CPU trace
cpu_tick = 4
mem_tick = 1
cache = no
### Below are parameters only for multicore mode
# When early_exit is on, all cores will be terminated when the earliest one finishes.
early_exit = on #, off (default value is on)
early_exit = on
# early_exit = on, off (default value is on)
# If expected_limit_insts is set, some per-core statistics will be recorded when this limit (or the end of the whole trace if it's shorter than specified limit) is reached. The simulation won't stop and will roll back automatically until the last one reaches the limit.
expected_limit_insts = 200000000
cache = no #, L1L2, L3, all (default value is no)
translation = None #, Random (default value is None)
warmup_insts = 100000000
cache = no
# cache = no, L1L2, L3, all (default value is no)
translation = None
# translation = None, Random (default value is None)
#
########################
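
Note on this hunk: it moves the option lists out of trailing comments (`early_exit = on #, off ...`) onto their own comment lines. A minimal sketch of why that can matter: a parser that splits on `=` without stripping trailing `#` text would read the value as `on #, off (default value is on)`. The two parsers below are hypothetical Python illustrations, not Ramulator's actual C++ config reader, whose comment handling this diff does not show.

```python
def parse_naive(text):
    """Split each non-comment line on '=' and keep everything after it."""
    cfg = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip()
    return cfg


def parse_strict(text):
    """Same as parse_naive, but drop anything after the first '#'."""
    cfg = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        key, _, value = line.partition("=")
        cfg[key.strip()] = value.strip()
    return cfg


# Old and new spellings of the same setting, copied from the hunk above.
OLD = "early_exit = on #, off (default value is on)\n"
NEW = "early_exit = on\n# early_exit = on, off (default value is on)\n"

if __name__ == "__main__":
    print(parse_naive(OLD))  # value still carries the trailing comment text
    print(parse_naive(NEW))  # value is just the setting
```

With the new layout, both parsers agree on the value, so the config no longer depends on how trailing comments are handled.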
12 changes: 8 additions & 4 deletions configs/DDR3-config.cfg
@@ -15,13 +15,17 @@
### Below are parameters only for CPU trace
cpu_tick = 4
mem_tick = 1
cache = no
### Below are parameters only for multicore mode
# When early_exit is on, all cores will be terminated when the earliest one finishes.
early_exit = on #, off (default value is on)
early_exit = on
# early_exit = on, off (default value is on)
# If expected_limit_insts is set, some per-core statistics will be recorded when this limit (or the end of the whole trace if it's shorter than specified limit) is reached. The simulation won't stop and will roll back automatically until the last one reaches the limit.
expected_limit_insts = 200000000
cache = no #, L1L2, L3, all (default value is no)
translation = None #, Random (default value is None)
# warmup_insts = 100000000
warmup_insts = 0
cache = no
# cache = no, L1L2, L3, all (default value is no)
translation = None
# translation = None, Random (default value is None)
#
########################
14 changes: 9 additions & 5 deletions configs/DDR4-config.cfg
@@ -6,7 +6,7 @@
channels = 1
ranks = 1
speed = DDR4_2400R
org = DDR3_4Gb_x8
org = DDR4_4Gb_x8
# record_cmd_trace: (default is off): on, off
record_cmd_trace = off
# print_cmd_trace: (default is off): on, off
@@ -15,13 +15,17 @@
### Below are parameters only for CPU trace
cpu_tick = 8
mem_tick = 3
cache = no
### Below are parameters only for multicore mode
# When early_exit is on, all cores will be terminated when the earliest one finishes.
early_exit = on #, off (default value is on)
early_exit = on
# early_exit = on, off (default value is on)
# If expected_limit_insts is set, some per-core statistics will be recorded when this limit (or the end of the whole trace if it's shorter than specified limit) is reached. The simulation won't stop and will roll back automatically until the last one reaches the limit.
expected_limit_insts = 200000000
cache = no #, L1L2, L3, all (default value is no)
translation = None #, Random (default value is None)
# warmup_insts = 100000000
warmup_insts = 0
cache = no
# cache = no, L1L2, L3, all (default value is no)
translation = None
# translation = None, Random (default value is None)
#
########################
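
Note on `cpu_tick` and `mem_tick`: the pairs in these configs (4/1 for DDR3, 8/3 for DDR4, 32/5 for HBM) read naturally as a reduced CPU-to-memory clock ratio. The sketch below is an illustration under assumptions that this diff does not state: a 3.2 GHz CPU clock, and a memory clock of 1200 MHz for `DDR4_2400R` (half the 2400 MT/s data rate). `tick_ratio` is a hypothetical helper.

```python
from fractions import Fraction


def tick_ratio(cpu_mhz, mem_mhz):
    """Reduce cpu_mhz:mem_mhz to the smallest integer (cpu_tick, mem_tick)."""
    ratio = Fraction(cpu_mhz, mem_mhz)
    return ratio.numerator, ratio.denominator


ASSUMED_CPU_MHZ = 3200  # assumption, not stated in the config files

if __name__ == "__main__":
    # DDR4-2400: 2400 MT/s data rate, so a 1200 MHz clock.
    print(tick_ratio(ASSUMED_CPU_MHZ, 1200))
```

Under the same assumed CPU clock, an 800 MHz memory clock (e.g., DDR3-1600) reduces to 4/1, which is consistent with the DDR3 config values.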
11 changes: 7 additions & 4 deletions configs/DSARP-config.cfg
@@ -16,13 +16,16 @@
### Below are parameters only for CPU trace
cpu_tick = 4
mem_tick = 1
cache = no
### Below are parameters only for multicore mode
# When early_exit is on, all cores will be terminated when the earliest one finishes.
early_exit = on #, off (default value is on)
early_exit = on
# early_exit = on, off (default value is on)
# If expected_limit_insts is set, some per-core statistics will be recorded when this limit (or the end of the whole trace if it's shorter than specified limit) is reached. The simulation won't stop and will roll back automatically until the last one reaches the limit.
expected_limit_insts = 200000000
cache = no #, L1L2, L3, all (default value is no)
translation = None #, Random (default value is None)
warmup_insts = 100000000
cache = no
# cache = no, L1L2, L3, all (default value is no)
translation = None
# translation = None, Random (default value is None)
#
########################
11 changes: 7 additions & 4 deletions configs/GDDR5-config.cfg
@@ -15,13 +15,16 @@
### Below are parameters only for CPU trace
cpu_tick = 2
mem_tick = 1
cache = no
### Below are parameters only for multicore mode
# When early_exit is on, all cores will be terminated when the earliest one finishes.
early_exit = on #, off (default value is on)
early_exit = on
# early_exit = on, off (default value is on)
# If expected_limit_insts is set, some per-core statistics will be recorded when this limit (or the end of the whole trace if it's shorter than specified limit) is reached. The simulation won't stop and will roll back automatically until the last one reaches the limit.
expected_limit_insts = 200000000
cache = no #, L1L2, L3, all (default value is no)
translation = None #, Random (default value is None)
warmup_insts = 100000000
cache = no
# cache = no, L1L2, L3, all (default value is no)
translation = None
# translation = None, Random (default value is None)
#
########################
11 changes: 7 additions & 4 deletions configs/HBM-config.cfg
@@ -15,13 +15,16 @@
### Below are parameters only for CPU trace
cpu_tick = 32
mem_tick = 5
cache = no
### Below are parameters only for multicore mode
# When early_exit is on, all cores will be terminated when the earliest one finishes.
early_exit = on #, off (default value is on)
early_exit = on
# early_exit = on, off (default value is on)
# If expected_limit_insts is set, some per-core statistics will be recorded when this limit (or the end of the whole trace if it's shorter than specified limit) is reached. The simulation won't stop and will roll back automatically until the last one reaches the limit.
expected_limit_insts = 200000000
cache = no #, L1L2, L3, all (default value is no)
translation = None #, Random (default value is None)
warmup_insts = 100000000
cache = no
# cache = no, L1L2, L3, all (default value is no)
translation = None
# translation = None, Random (default value is None)
#
########################
11 changes: 7 additions & 4 deletions configs/LPDDR3-config.cfg
@@ -15,13 +15,16 @@
### Below are parameters only for CPU trace
cpu_tick = 4
mem_tick = 1
cache = no
### Below are parameters only for multicore mode
# When early_exit is on, all cores will be terminated when the earliest one finishes.
early_exit = on #, off (default value is on)
early_exit = on
# early_exit = on, off (default value is on)
# If expected_limit_insts is set, some per-core statistics will be recorded when this limit (or the end of the whole trace if it's shorter than specified limit) is reached. The simulation won't stop and will roll back automatically until the last one reaches the limit.
expected_limit_insts = 200000000
cache = no #, L1L2, L3, all (default value is no)
translation = None #, Random (default value is None)
warmup_insts = 100000000
cache = no
# cache = no, L1L2, L3, all (default value is no)
translation = None
# translation = None, Random (default value is None)
#
########################
13 changes: 8 additions & 5 deletions configs/LPDDR4-config.cfg
@@ -6,7 +6,7 @@
channels = 2
ranks = 1
speed = LPDDR4_2400
org = LPDDR3_8Gb_x16
org = LPDDR4_8Gb_x16
# record_cmd_trace: (default is off): on, off
record_cmd_trace = off
# print_cmd_trace: (default is off): on, off
@@ -15,13 +15,16 @@
### Below are parameters only for CPU trace
cpu_tick = 8
mem_tick = 3
cache = no
### Below are parameters only for multicore mode
# When early_exit is on, all cores will be terminated when the earliest one finishes.
early_exit = on #, off (default value is on)
early_exit = on
# early_exit = on, off (default value is on)
# If expected_limit_insts is set, some per-core statistics will be recorded when this limit (or the end of the whole trace if it's shorter than specified limit) is reached. The simulation won't stop and will roll back automatically until the last one reaches the limit.
expected_limit_insts = 200000000
cache = no #, L1L2, L3, all (default value is no)
translation = None #, Random (default value is None)
warmup_insts = 100000000
cache = no
# cache = no, L1L2, L3, all (default value is no)
translation = None
# translation = None, Random (default value is None)
#
########################
30 changes: 30 additions & 0 deletions configs/PCM-config.cfg
@@ -0,0 +1,30 @@
########################
# Example config file
# Comments start with #
# There are restrictions for valid channel/rank numbers
standard = PCM
channels = 1
ranks = 1
speed = PCM_800D
org = PCM_2Gb_x8
# record_cmd_trace: (default is off): on, off
record_cmd_trace = off
# print_cmd_trace: (default is off): on, off
print_cmd_trace = off

### Below are parameters only for CPU trace
cpu_tick = 4
mem_tick = 1
### Below are parameters only for multicore mode
# When early_exit is on, all cores will be terminated when the earliest one finishes.
early_exit = on
# early_exit = on, off (default value is on)
# If expected_limit_insts is set, some per-core statistics will be recorded when this limit (or the end of the whole trace if it's shorter than specified limit) is reached. The simulation won't stop and will roll back automatically until the last one reaches the limit.
expected_limit_insts = 200000000
warmup_insts = 100000000
cache = no
# cache = no, L1L2, L3, all (default value is no)
translation = None
# translation = None, Random (default value is None)
#
########################
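
Note on the `org` fixes: two hunks in this PR correct `org` values that did not match the configured speed grade (`DDR3_4Gb_x8` under `speed = DDR4_2400R`, and `LPDDR3_8Gb_x16` under `speed = LPDDR4_2400`). A small lint could catch this class of typo. The sketch below is a hypothetical check, assuming that `speed` and `org` share the prefix before the first underscore. That holds for the values visible in this diff but may not hold for every standard Ramulator supports.

```python
def read_cfg(text):
    """Parse 'key = value' lines, ignoring blank lines and '#' comments."""
    cfg = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" in line:
            key, _, value = line.partition("=")
            cfg[key.strip()] = value.strip()
    return cfg


def prefix_mismatch(cfg):
    """Return (speed_prefix, org_prefix) if they differ, else None."""
    speed = cfg.get("speed", "").split("_", 1)[0]
    org = cfg.get("org", "").split("_", 1)[0]
    return (speed, org) if speed != org else None


# Values copied from the DDR4 and PCM hunks in this PR.
BEFORE = "speed = DDR4_2400R\norg = DDR3_4Gb_x8\n"
AFTER = "speed = DDR4_2400R\norg = DDR4_4Gb_x8\n"
PCM = "standard = PCM\nspeed = PCM_800D\norg = PCM_2Gb_x8\n"

if __name__ == "__main__":
    for label, text in (("before", BEFORE), ("after", AFTER), ("pcm", PCM)):
        print(label, prefix_mismatch(read_cfg(text)))
```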
11 changes: 7 additions & 4 deletions configs/SALP-config.cfg
@@ -16,13 +16,16 @@
### Below are parameters only for CPU trace
cpu_tick = 4
mem_tick = 1
cache = no
### Below are parameters only for multicore mode
# When early_exit is on, all cores will be terminated when the earliest one finishes.
early_exit = on #, off (default value is on)
early_exit = on
# early_exit = on, off (default value is on)
# If expected_limit_insts is set, some per-core statistics will be recorded when this limit (or the end of the whole trace if it's shorter than specified limit) is reached. The simulation won't stop and will roll back automatically until the last one reaches the limit.
expected_limit_insts = 200000000
cache = no #, L1L2, L3, all (default value is no)
translation = None #, Random (default value is None)
warmup_insts = 100000000
cache = no
# cache = no, L1L2, L3, all (default value is no)
translation = None
# translation = None, Random (default value is None)
#
########################