Commit 131c8b0

Merge pull request #75 from jangorecki/memory-R
memory benchmarking in R
2 parents bc4682b + 9c575e0 commit 131c8b0

1 file changed: 141 additions, 0 deletions

@@ -0,0 +1,141 @@

# Benchmarking memory usage in R

Profiling memory in R has never been a trivial task.
In this post, I would like to emphasize that the currently popular methods are quite inaccurate and should therefore be used with caution. More importantly, they should not be used to draw conclusions about the actual memory usage of R functions.

The root cause of the inaccuracy in many R memory profiling tools is that they measure only memory allocated through R's own allocator (including allocations made from C code via R's API, such as `R_alloc`). They do not take into account memory allocated directly in C, for example with `malloc`.

## Memory allocation in R

The following example should make this clear.

The R chunk below is the content of the `memtest.R` file.
```r
# C body of the function: copies x into a temporary buffer y,
# allocated either by R (R_alloc) or by C (malloc)
code = "
int nx = LENGTH(x);
double *y = (double*)(
  LOGICAL(r_alloc)[0] ?
    R_alloc(nx, sizeof(*y)) : // allocated by R's C
    malloc(nx * sizeof(*y))   // allocated by C
);
double *xp = REAL(x);
// populate y
for (int i=0; i<nx; i++)
  y[i] = xp[i];
// do something with y
for (int i=1; i<nx; i++)
  y[i] = y[i-1]+y[i];
// sum double array to ensure compiler won't optimize it away
double sum = 0.0;
for (int i=0; i<nx; i++)
  sum += y[i];
SEXP res = PROTECT(Rf_allocVector(REALSXP, 1));
REAL(res)[0] = sum;
if (!LOGICAL(r_alloc)[0])
  free(y);
UNPROTECT(1);
return res;
"
# compile the C body into an R-callable function
funx = inline::cfunction(signature(x="numeric", r_alloc="logical"), code, language="C")
set.seed(108)
x = rnorm(1e8)
```
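
For readers less familiar with R's C API, the computation inside `funx` amounts to a running (cumulative) sum over `x` followed by a sum of the result. A rough pure-R equivalent is sketched below; `funx_ref` is a hypothetical name used only for illustration and is not part of `memtest.R`. On large inputs the result may differ from `funx` in the last digits, because R's `sum` may accumulate in extended precision.

```r
# Hypothetical pure-R counterpart of funx(); not part of memtest.R.
# cumsum() plays the role of the in-place running sum over y,
# and sum() collapses the result to a single number.
funx_ref = function(x) sum(cumsum(x))

funx_ref(c(1, 2, 3))  # running sums 1, 3, 6 add up to 10
```

Unlike the `malloc` branch of `funx`, the intermediate vector created by `cumsum` here is an ordinary R object, so it is allocated through R's own allocator.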

## Check equal

First, we will ensure that the results are the same, regardless of whether we allocate temporary working memory using R or C:

```sh
Rscript -e 'source("memtest.R"); funx(x, r_alloc=TRUE)'
#[1] 1.160649e+12
Rscript -e 'source("memtest.R"); funx(x, r_alloc=FALSE)'
#[1] 1.160649e+12
```

## Memory benchmark using `bench`

Next, we will use `bench`, currently the most popular package for profiling memory in R:

```sh
Rscript -e 'source("memtest.R"); bench::mark(funx(x, r_alloc=TRUE))'
## A tibble: 1 × 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time
# <bch:expr> <bch> <bch:> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm>
#1 funx(x, r_al… 577ms 577ms 1.73 763MB 1.73 1 1 577ms
Rscript -e 'source("memtest.R"); bench::mark(funx(x, r_alloc=FALSE))'
## A tibble: 1 × 13
# expression min median `itr/sec` mem_alloc `gc/sec` n_itr n_gc total_time
# <bch:expr> <bch> <bch:> <dbl> <bch:byt> <dbl> <int> <dbl> <bch:tm>
#1 funx(x, r_al… 589ms 589ms 1.70 0B 0 1 0 589ms
```

As we can see in the output of the `mark` function, `mem_alloc` is reported as 0B when we use `malloc`, while for `R_alloc` it is 763MB. This difference should serve as a warning. It arises because `bench::mark` tracks only allocations made through R's memory allocator and does not account for memory allocated directly through C functions such as `malloc` or `calloc`. If one intends to use `mark` to draw conclusions about memory usage, it is crucial to also examine the source code of the function being benchmarked.
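
As a sanity check, the 763MB reported for the `R_alloc` run is about what we would expect for the temporary buffer `y` alone: `nx` doubles at 8 bytes each. The short calculation below is only a sketch, assuming that `bench` prints sizes in units of 1024^2 bytes; the variable names are illustrative.

```r
# Expected size of the temporary buffer `y` allocated inside funx().
nx = 1e8                  # length of x in memtest.R
bytes_per_double = 8      # size of a C double on common platforms
buffer_bytes = nx * bytes_per_double
buffer_mib = buffer_bytes / 1024^2   # bytes -> MiB
print(round(buffer_mib, 1))
```

This comes out at roughly 763. The `malloc` branch requests the same number of bytes, even though `mem_alloc` shows 0B for it.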

It is worth noting that `?mark` documents this limitation:

> `mem_alloc` - `bench_bytes` Total amount of memory allocated by R while running the expression. Memory allocated outside the R heap, e.g. by `malloc()` or `new` directly is not tracked, take care to avoid misinterpreting the results if running code that may do this.

Unfortunately, many people are not aware of this and publish memory usage benchmarks believing them to be accurate.

## Memory benchmark using `cgmemtime`

Lastly, we will measure memory from an external process using [cgmemtime](https://github.com/gsauthof/cgmemtime), proposed by Matt Dowle in 2014 during his work on the [2B rows data.frame grouping benchmark](https://github.com/Rdatatable/data.table/wiki/Benchmarks-:-Grouping).

> `cgmemtime` measures the high-water RSS+CACHE memory usage of a process and its descendant processes.

```sh
./cgmemtime Rscript -e 'source("memtest.R"); funx(x, r_alloc=TRUE)'
#child_RSS_high: 1641808 KiB
#group_mem_high: 1626264 KiB
./cgmemtime Rscript -e 'source("memtest.R"); funx(x, r_alloc=FALSE)'
#child_RSS_high: 1641096 KiB
#group_mem_high: 1625820 KiB
```

While `cgmemtime` reports very accurate memory usage statistics, it cannot measure the memory usage of an individual function call in isolation, because it tracks the memory footprint of the entire process (and its child processes).
To estimate the memory usage of the `funx()` call in this simple example, we can first measure the R process without calling `funx()`.

```sh
./cgmemtime Rscript -e 'source("memtest.R");'
#child_RSS_high: 860884 KiB
#group_mem_high: 843844 KiB
```

We then subtract this baseline from the memory usage measured when `funx()` is executed (here the `child_RSS_high` of the `malloc` run), converting KiB to MiB:

```r
(1641096-860884)/1024
#[1] 761.9258
```
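
The same subtraction can be repeated for both runs and both metrics. The sketch below copies the numbers reported above into R and parses them; `parse_kib` and `delta_mib` are hypothetical helpers written for this illustration and are not part of `cgmemtime`.

```r
# Hypothetical helpers; the input lines are copied from the cgmemtime output above.
parse_kib = function(lines) {
  # turn "name: 12345 KiB" lines into a named numeric vector (values in KiB)
  parts = strsplit(lines, ":", fixed = TRUE)
  values = as.numeric(sub(" KiB$", "", trimws(sapply(parts, `[`, 2L))))
  names(values) = trimws(sapply(parts, `[`, 1L))
  values
}

baseline = parse_kib(c("child_RSS_high: 860884 KiB", "group_mem_high: 843844 KiB"))
r_alloc  = parse_kib(c("child_RSS_high: 1641808 KiB", "group_mem_high: 1626264 KiB"))
c_malloc = parse_kib(c("child_RSS_high: 1641096 KiB", "group_mem_high: 1625820 KiB"))

# memory attributable to the funx() call, in MiB
delta_mib = function(run, base) (run - base) / 1024

print(round(rbind(R_alloc = delta_mib(r_alloc, baseline),
                  malloc  = delta_mib(c_malloc, baseline)), 1))
```

All four differences come out at roughly 762 to 764 MiB, in line with the 8e8 bytes needed for `y`, regardless of which allocator was used.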

## Thank you

I hope this post will help people to be a bit more skeptical when reading memory benchmarks of R code.

```
R version 4.5.0 (2025-04-11)
Platform: x86_64-redhat-linux-gnu
Running under: Fedora Linux 42 (Workstation Edition)

Matrix products: default
BLAS/LAPACK: FlexiBLAS OPENBLAS-OPENMP; LAPACK version 3.12.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] bench_1.1.4 inline_0.3.21

loaded via a namespace (and not attached):
[1] compiler_4.5.0 cli_3.6.4 pillar_1.10.2 glue_1.8.0
[5] vctrs_0.6.5 lifecycle_1.0.4 rlang_1.1.6
```
