Commit de0d3f1

update
1 parent bc6a727 commit de0d3f1

File tree

1 file changed

+99
-0
lines changed

test/perf/notes.md

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
# Benchmark of xarray versus the groupby of CommonDataModel

The test case is a 3D array of size 360x180x10959 (a global dataset at one-degree resolution representing 30 years of daily data).
The data are normally distributed random values (mean = 100, variance = 1) stored as single-precision floats (`Float32`).
We compute the mean and standard deviation (std) of the data grouped by month.

Accuracy is assessed as the root-mean-square difference from the built-in functions (Statistics.jl or NumPy) evaluated in double precision (`Float64`).
Note that Julia’s `mean`/`std` give exactly the same results as NumPy’s equivalents.
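
The grouping and the accuracy metric described above can be sketched as follows. This is only an illustration under stated assumptions: it uses plain NumPy rather than xarray or CommonDataModel, a much smaller array (a 4x3 grid over two years) than the benchmark, and the helper names `grouped_stat` and `rms` are hypothetical and do not come from the benchmark scripts.

```python
import numpy as np

# Small stand-in for the 360x180x10959 array (assumption: 4x3 grid, 2 years).
rng = np.random.default_rng(0)
time = np.arange("2000-01-01", "2002-01-01", dtype="datetime64[D]")
data = rng.normal(loc=100.0, scale=1.0, size=(4, 3, time.size)).astype(np.float32)

# Calendar month (1..12) of every time step.
month = time.astype("datetime64[M]").astype(int) % 12 + 1


def grouped_stat(a, stat, dtype):
    """Apply `stat` along the time axis separately for each calendar month."""
    return np.stack(
        [stat(a[:, :, month == m], axis=2, dtype=dtype) for m in range(1, 13)],
        axis=2,
    )


def rms(x):
    """Root-mean-square of an array, evaluated in double precision."""
    return np.sqrt(np.mean(x.astype(np.float64) ** 2))


# Single-precision results and double-precision references.
# Note: np.std uses ddof=0 by default.
gm = grouped_stat(data, np.mean, np.float32)
gs = grouped_stat(data, np.std, np.float32)
mean_ref = grouped_stat(data, np.mean, np.float64)
std_ref = grouped_stat(data, np.std, np.float64)

print("accuracy of mean", rms(gm - mean_ref))
print("accuracy of std ", rms(gs - std_ref))
```

The two printed numbers play the same role as the `accuracy` lines in the outputs reported further down, although their values will differ because the array here is far smaller.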

Using 1 CPU core and xarray’s default implementations (i.e. no dask…).
30 trials; the minimum time is reported here.
Ubuntu 22.04, Julia 1.11, Python 3.10.12, xarray 2024.10.0.
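
Reporting the minimum over repeated trials can be sketched with the standard-library `timeit` module. This is an assumption about how such a measurement may be set up, not a copy of the benchmark scripts; `min_time` and `work` are hypothetical names, and `work` is only a placeholder for the grouped mean or std computation.

```python
import timeit


def min_time(func, trials=30):
    """Call `func` once per trial and return the fastest time in seconds."""
    return min(timeit.repeat(func, repeat=trials, number=1))


def work():
    # Placeholder workload standing in for the grouped mean/std computation.
    return sum(i * i for i in range(10_000))


print(f"minimum time of {work.__name__} : {min_time(work)}")
```

The minimum is commonly preferred over the average for this kind of comparison because it is the trial least disturbed by other activity on the machine.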

Creation of the data file:

```bash
julia test_perf_init.jl
```

Get root privileges (needed to drop the file cache):

```bash
sudo -s
export HOME=/home/abarth
cd ~/.julia/dev/CommonDataModel/test/perf
```
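
The benchmark scripts themselves are not shown here, so how they drop the file cache is not known. As a hedged sketch, on Linux the page cache can be dropped by writing `3` to `/proc/sys/vm/drop_caches` as root. The function name `drop_file_cache` and its `path` parameter are hypothetical and only serve the illustration.

```python
import os


def drop_file_cache(path="/proc/sys/vm/drop_caches"):
    """Ask the Linux kernel to drop the page cache, dentries and inodes.

    Writing to the real /proc file requires root privileges.
    """
    # Flush dirty pages to disk first so that they can be dropped as well.
    os.sync()
    with open(path, "w") as f:
        f.write("3\n")
```

Calling such a function before every timed trial would make each run read the data file from disk again instead of from memory, which is presumably why the xarray timings below refer to functions named `mean_no_cache` and `std_no_cache`.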

## Laptop with an i5-1135G7 CPU and NVMe SSD WDC WDS100T2B0C

### CommonDataModel

```bash
~/.juliaup/bin/julia test_perf_cdm.jl
```

Output:

```
runtime of mean
2.133 s (1686528 allocations: 2.71 GiB)
runtime of std
2.574 s (1686525 allocations: 2.72 GiB)
accuracy
sqrt(mean((gm - mean_ref) .^ 2)) = 4.643795867042341e-5
sqrt(mean((gs - std_ref) .^ 2)) = 9.251281717683748e-7
```

### xarray

```bash
python3 test_perf_xarray.py
```

Output:

```
python: 3.10.12 (main, Sep 11 2024, 15:47:36) [GCC 11.4.0]
xarray: 2024.10.0
numpy: 1.26.1
runtime
minimum time of <function mean_no_cache at 0x741456563d90> : 4.260775363999983
minimum time of <function std_no_cache at 0x74143df69240> : 5.453749345000006
accuracy
accuracy of mean 4.64379586704234e-05
accuracy of std 2.1211715725139616e-07
```

## Workstation with an i7-7700 CPU and SATA SSD (WD Green 120G)

```bash
~/.juliaup/bin/julia test_perf_init.jl
~/.juliaup/bin/julia test_perf_cdm.jl
python3 test_perf_xarray.py
```

Output:

```
runtime
7.177 s (1686528 allocations: 2.71 GiB)
8.090 s (1686525 allocations: 2.72 GiB)
accuracy
sqrt(mean((gm - mean_ref) .^ 2)) = 4.6300139982730906e-5
sqrt(mean((gs - std_ref) .^ 2)) = 9.268973317814482e-7
python: 3.10.12 (main, Jul 29 2024, 16:56:48) [GCC 11.4.0]
xarray: 2024.10.0
numpy: 1.26.2
runtime
minimum time of <function mean_no_cache at 0x7f54cb64bd90> : 8.740452307043597
minimum time of <function std_no_cache at 0x7f54b31a0a60> : 10.462690721964464
accuracy
accuracy of mean 4.6300139982730906e-05
accuracy of std 2.12226305970758e-07
```
