Commit 250151c

Update the performance of SharedArray for v0.31.0 (#229)
1 parent 945a424 commit 250151c

1 file changed: +19 -27 lines changed

src/data/markdown/docs/02 javascript api/04 k6-data/1-SharedArray.md

Lines changed: 19 additions & 27 deletions
@@ -2,25 +2,18 @@
 title: SharedArray
 ---

-`SharedArray` is an array-like object that shares the underlying memory between VUs. Its constructor
-takes a name for the `SharedArray` and a function which needs to return an array object itself. The
-function will be executed only once and its result will then be saved in memory once and copies
-of the elements will be given when requested. The name is needed as VUs are completely separate JS
-VMs and k6 needs some way to identify the `SharedArray`s that it needs to return.
+`SharedArray` is an array-like object that shares the underlying memory between VUs. Its constructor takes a name for the `SharedArray` and a function which needs to return an array object itself. The function will be executed only once and its result will then be saved in memory once and copies of the elements will be given when requested. The name is needed as VUs are completely separate JS VMs and k6 needs some way to identify the `SharedArray`s that it needs to return.

-This does mean that you can have multiple such ones and even only load some of them for given VUs, although that is
-unlikely to have any performance benefit.
+This does mean that you can have multiple such ones and even only load some of them for given VUs, although that is unlikely to have any performance benefit.

-Everything about `SharedArray` is read-only once it is constructed, so it is not possible to
-communicate between VUs using it.
+Everything about `SharedArray` is read-only once it is constructed, so it is not possible to communicate between VUs using it.

 Supported operations include:
 1. getting the number of elements with `length`
 2. getting an element by its index using the normal syntax `array[index]`
 3. using `for-of` loops

-Which means that for the most part if you currently have an array data structure that you want to
-take less memory you can just wrap it in `SharedArray` and it should work for most cases.
+Which means that for the most part if you currently have an array data structure that you want to take less memory you can just wrap it in `SharedArray` and it should work for most cases.

 ### Examples

@@ -30,7 +23,7 @@ take less memory you can just wrap it in `SharedArray` and it should work for mo
 import { SharedArray } from "k6/data";

 var data = new SharedArray("some name", function() {
-    // here you can open files, and then do additional processing on them or just generate the data dynamically
+    // here you can open files, and then do additional processing on them or just generate the data dynamically
     var f = JSON.parse(open("./somefile.json"));
     return f; // f must be an array
 });
@@ -45,12 +38,9 @@ export default () => {

 ## Performance characteristics

-As the `SharedArray` is keeping the data marshalled as JSON and unmarshals elements when requested it
-does take additional time to unmarshal JSONs and generate objects that then need to be garbage collected.
+As the `SharedArray` is keeping the data marshalled as JSON and unmarshals elements when requested it does take additional time to unmarshal JSONs and generate objects that then need to be garbage collected.

-This in general should be unnoticeable compared to whatever you are doing with the data, but might
-mean that for small sets of data it is better to not use `SharedArray`, although your mileage may
-vary.
+This in general should be unnoticeable compared to whatever you are doing with the data, but might mean that for small sets of data it is better to not use `SharedArray`, although your mileage may vary.

 As an example the following script:

@@ -86,18 +76,20 @@ export default function () {

 </div>

-Which was ran with v0.30.0 and 100 VUs started to even out the CPU usage around 10k elements, but also was using 1/3 of the memory. At 100k `SharedArray` was the clear winner, while for lower numbers it is possible that not using it will help with performance.
+Which was ran with v0.31.0 and 100 VUs. As can be seen from the table below, there isn't much of a difference at lower numbers of data lines - up until around 1000 data lines there is little benefit in memory usage. But also there is little to no difference in CPU usage as well. At 10k and above, the memory savings start to heavily translate to CPU ones.

-| data lines | shared | wall time | CPU % | mem usage | http requests |
+| data lines | shared | wall time | CPU % | MEM usage | http requests |
 | --- | --- | --- | --- | ---- | --- |
-| 100 | true | 2:02:50 | 86-90% | 509-517MB | 90248-92979 |
-| 100 | false | 2:02:50 | 76-80% | 512-533MB | 92534-94666 |
-| 1000 | true | 2:03:00 | 86-92% | 509-519MB | 92007-95234 |
-| 1000 | false | 2:02:60 | 78-80% | 621-630MB | 92814-94526 |
-| 10000 | true | 2:02:70 | 88-95% | 515-523MB | 92936-94997 |
-| 10000 | false | 2:03:80 | 81-85% | 1650-1675MB | 92339-95083 |
-| 100000 | true | 2:04:50 | 89-91% | 528-531MB | 92274-93987 |
-| 100000 | false | 2:15:00 | 115-123% | 8.9-9.5GB | 90416-94817 |
+| 100 | true | 2:01:70 | 70-79% | 213-217MB | 92191-98837 |
+| 100 | false | 2:01:80 | 74-75% | 224-232MB | 96851-98643 |
+| 1000 | true | 2:01:60 | 74-79% | 209-216MB | 98251-98806 |
+| 1000 | false | 2:01:90 | 75-77% | 333-339MB | 98069-98951 |
+| 10000 | true | 2:01:70 | 78-79% | 213-217MB | 97953-98735 |
+| 10000 | false | 2:03:00 | 80-83% | 1364-1400MB | 96816-98852 |
+| 100000 | true | 2:02:20 | 78-79% | 238-275MB | 98103-98540 |
+| 100000 | false | 2:14:00 | 120-124% | 8.3-9.1GB | 96003-97802 |
+
+In v0.30.0 the difference in CPU usage at lower numbers was around 10-15%, but it also started to even out at around 10k data lines and was a clear winner at 100k.

 The CPU/memory data comes from using `/usr/bin/time` and the raw data can be found [here](https://gist.github.com/MStoykov/1181cfa6f00bc56b90915155f885e2bb).
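
The "Performance characteristics" text in this diff says the data is kept marshalled as JSON and that elements are unmarshalled on request, so every read produces a new object that later has to be garbage collected. A minimal sketch of that idea in plain JavaScript follows. It is only a model for illustration, not k6's implementation: the class name `JsonBackedArray` and its `get()` method are hypothetical, and it assumes every element is JSON-serialisable.

```javascript
// Hypothetical model of the behaviour described in the diff above.
// NOT k6's SharedArray: the class name and get() are invented for illustration.
// Assumes every element can be serialised with JSON.stringify.
class JsonBackedArray {
  constructor(items) {
    if (!Array.isArray(items)) {
      throw new TypeError("items must be an array");
    }
    // Marshal each element to JSON once, at construction time.
    this.raw = Object.freeze(items.map((item) => JSON.stringify(item)));
    // Read-only after construction.
    Object.freeze(this);
  }

  // Number of stored elements.
  get length() {
    return this.raw.length;
  }

  // Unmarshal a fresh copy on every access; this is the per-read cost.
  get(index) {
    const json = this.raw[index];
    return json === undefined ? undefined : JSON.parse(json);
  }

  // Support for-of by yielding a fresh copy of each element.
  *[Symbol.iterator]() {
    for (const json of this.raw) {
      yield JSON.parse(json);
    }
  }
}

const shared = new JsonBackedArray([
  { user: "a", n: 1 },
  { user: "b", n: 2 },
]);

const first = shared.get(0);
first.n = 99; // changes only the local copy, not the stored JSON
const again = shared.get(0);

const users = [];
for (const item of shared) {
  users.push(item.user);
}
```

In this model each `get()` call pays for one `JSON.parse` and allocates one short-lived object, which corresponds to the overhead the documentation text says should usually be unnoticeable but may matter for small data sets.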