Commit 5ed2a42

final report: cross-reference and clarify negative IOPS scaling
and its effect on parallel scaling on the affected machines.
1 parent 60ed022 commit 5ed2a42

1 file changed: +16 −7 lines changed

doc/final-report/final-report.md

Lines changed: 16 additions & 7 deletions
@@ -582,7 +582,7 @@ simplicity.
 
 Since the term merge is already part of the LSM-tree terminology, we chose to
 call this operation a table *union* instead. Moreover, union is a more fitting
-name, since the behavior of table union is similar to that of
+name, since the behaviour of table union is similar to that of
 `Data.Map.unionWith`: all logical key–value pairs with unique keys are
 preserved, but pairs that have the same key are combined using the resolve
 function that is also used for upserts (see [functional
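
For readers unfamiliar with `Data.Map.unionWith`, the following minimal Haskell sketch illustrates the combining behaviour the passage refers to. The maps and the resolve function here are made-up stand-ins for illustration, not the `lsm-tree` API.

```haskell
import qualified Data.Map as Map

-- Illustration of the union semantics described above, using Data.Map as a
-- stand-in for tables: keys unique to either side are preserved, and values
-- under shared keys are combined with the resolve function (here simply
-- addition, as an upsert-style resolve function might be).
main :: IO ()
main = do
  let t1      = Map.fromList [("a", 1), ("b", 2)]
      t2      = Map.fromList [("b", 10), ("c", 3)]
      resolve = (+)  -- hypothetical resolve function
  print (Map.unionWith resolve t1 t2)
  -- prints: fromList [("a",1),("b",12),("c",3)]
```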
@@ -902,6 +902,8 @@ stretch target.
 Finally, note that the first three of the machines listed in the table above
 have SSDs that are not capable of 100 k IOPS.
 
+### Micro-benchmarks of the benchmark machines
+
 To help evaluate the `lsm-tree` benchmark results across these different
 machines, it is useful to have a rough sense of their CPU and I/O performance.
 Therefore, we have determined the machines’ scores according to standard
@@ -945,6 +947,8 @@ created as part of the project:
 [^3]: This is the `lsm-tree-bench-bloomfilter` benchmark.
     Use `cabal run lsm-tree-bench-bloomfilter` to run it yourself.
 
+### Micro-benchmark results
+
 The results of all these benchmarks are as follows:
 
 ----------------------------------------------------------------------------------------
@@ -996,7 +1000,11 @@ The IOPS scores scale negatively when adding more cores.
 measurement artefact but shows a real effect, and it is *opposite* to what
 happens with physical hardware. Running `fio` on the i8g.xlarge machine with
 4 cores results in 175 k IOPS (which is near to the rated 150 k IOPS), showing
-that the negative scaling continues beyond two cores.
+that the negative scaling continues beyond two cores. One can but speculate
+as to the reason for this behaviour. It is probably an artefact of the way
+the Nitro hypervisor limits IOPS on the VMs, but it is unclear why it would
+allow exceeding the minimum rated IOPS by a greater proportion when
+submitting I/O from fewer cores.
 
 The IOPS scores of i7i.xlarge and i8g.xlarge are the same.
 
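
For reference, IOPS figures like those quoted above are typically obtained with an `fio` random-read job along the following lines. The device path, job count, and queue depth here are illustrative assumptions, not the exact configuration used for the report's measurements.

```sh
# Measure 4 KiB random-read IOPS; vary --numjobs to probe core scaling.
fio --name=randread-iops --filename=/dev/nvme1n1 \
    --rw=randread --bs=4k --direct=1 \
    --ioengine=io_uring --iodepth=32 \
    --numjobs=4 --group_reporting \
    --time_based --runtime=30
```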
@@ -1284,10 +1292,11 @@ the two-core case.
   100 k target already in this setting. We know it has higher one-core
   performance than i8g.xlarge, with an advantage of approximately 40 % in the
   Bloom filter micro-benchmark. Nevertheless, it is probably limited by CPU,
-  not by SSD, since its one-core IOPS value is so high (350 k). We know its
-  IOPS value scales negatively when going to two cores, down to 210 k
-  aggregated across both cores. This is probably the cause of its poor speedup:
-  the machine goes from being limited by CPU to being limited by SSD.
+  not by SSD, since its one-core IOPS value is so high (350 k). We know
+  from the subsection *[micro-benchmark results]* that this machine's IOPS
+  value scales _negatively_ when going to two cores, from 350 k down to 210 k
+  aggregated across both cores. This is probably the cause of its poor
+  speedup: the machine goes from being limited by CPU to being limited by SSD.
 
 * The i8g.xlarge machine is clearly limited by CPU in the one-core case. Adding
   a second core improves its CPU performance substantially but does not push
@@ -1526,7 +1535,7 @@ the application directly or by employing non-standard server functionality, as
 present for example in recent versions of PostgreSQL.
 
 By contrast, with functional persistence using explicit database handles,
-implementing the desired behavior is straightforward. We generate two
+implementing the desired behaviour is straightforward. We generate two
 independent handles based on the same initial database state and then let one
 thread execute A using one of the handles and another thread execute B using the
 other handle. This is not only simpler but also involves less synchronisation,
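
To make the two-handle pattern concrete, here is a minimal Haskell sketch. It abstracts over the table type and the duplication operation, since the exact `lsm-tree` signatures are not shown in this passage; only `concurrently` from the `async` package is a real API used here.

```haskell
import Control.Concurrent.Async (concurrently)

-- Sketch of the pattern described above: duplicate one initial handle into
-- two independent handles, then run transactions A and B in separate
-- threads, each against its own handle.
runIndependently
  :: (table -> IO table)  -- assumed duplicate operation for handles
  -> (table -> IO a)      -- transaction A
  -> (table -> IO b)      -- transaction B
  -> table                -- shared initial database state
  -> IO (a, b)
runIndependently duplicate runA runB t = do
  tA <- duplicate t       -- independent handle for A
  tB <- duplicate t       -- independent handle for B
  concurrently (runA tA) (runB tB)
```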
