Commit 90fe4be

Revise the part on meeting the memory targets
1 parent 4db6e3b commit 90fe4be

1 file changed, +23 −23 lines


doc/final-report/final-report.md

Lines changed: 23 additions & 23 deletions
@@ -1251,41 +1251,41 @@ can be interleaved with the database operations then using more cores can
 improve performance even more. With the real UTxO workload, we are in this
 situation, of course, because there is transaction validation work to do.
 
-### The memory targets
+### Meeting the memory targets
 
-Performance requirement 7 states:
+Item 7 of the performance requirements states the following:
 
 > A benchmark should demonstrate that the memory use of a table with 10 M
 > entries is within 100 Mb, and a 100 M entry table is within 1 Gb. This should
 > be for key value sizes as in the primary benchmark (34 + 60 bytes).
 
-The table in the [Primary Benchmark Results] section reports the results for a
-table with 100 M entries. The last column is the maximum memory used during the
-run. As noted in the table description, the memory measurement is the peak RSS,
-as reported by the OS.
+The benchmark results shown at the beginning of the section *[Results of the
+primary benchmark]* are for a database table with 100 M entries. They contain
+the amount of memory for each run, which is the maximum RSS as reported by the
+operating system. We can see from the listed values that all the benchmark runs
+use less than 1 GiB (1024 MiB).
 
-We can see from the reported memory use that all the benchmark runs work in
-less than 1 GiB (1024 MiB).
+For the target of a 100 MiB maximum when using a 10 M entry table, we run the
+same benchmark with a slightly different configuration:
 
-For the target of the 10 M entry table operating within 100 Mb, we have to use
-a slightly different benchmark configuration.
+* Of course, we use a smaller table, one with an initial size of 10 M entries.
 
-* Obviously we must use a smaller table, with an initial size of 10 M entries.
-* We also scale down the size of the write buffer correspondingly, from 20k
-  entries to 2k entries.
-* We tell the GHC RTS to limit its heap size to 100 Mb, using: `+RTS -M100m`.
+* We scale down the size of the write buffer correspondingly, from 20 k entries
+  to 2 k entries.
 
-One minor "gotcha" to avoid, when reproducing this result, is that one has to
-run the benchmark executable directly, not via `cabal run`. This is because
-`/usr/bin/time` reports the largest RSS of any sub-process which may turn out
-to be from `cabal` itself, and not the benchmark process.
+* We tell the GHC RTS to limit its heap size to 100 MiB, using `+RTS -M100m`.
 
-With this, the RSS is reported as 85,220 Kb, 83.2 Mb, which is less than the
-target of 100 Mb.
+With this configuration, the maximum RSS is reported as 85,220 KiB (83.2 MiB),
+which is less than the target of 100 MiB.
 
-We also get excellent performance results for smaller tables like this, since
-there is less merging work to do. In this case we get around 150k ops/sec,
-compared to around 86k ops/sec for the 100 M entry table.
+When reproducing this result, one minor trap to avoid is to run the benchmark
+using `cabal run` instead of launching its executable directly. The problem with
+the former is that GNU Time reports the largest RSS of any subprocess, which may
+turn out to be the `cabal` process and not the benchmark process.
+
+By the way, we also get excellent speed with the 10 M entry table: 150 k
+ops/sec, much more than the circa 86 k ops/sec we get with the 100 M entry
+table. The reason is that for smaller tables there is less merging work to do.
 
 ### Reproducing the results
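The reproduction gotcha the revised text describes (run the benchmark binary directly, so that `/usr/bin/time` measures the benchmark process rather than `cabal`) can be sketched as a shell session. The package and executable name `utxo-bench` is an illustrative assumption, not taken from the repository; only the unit-conversion check at the end is actually executed here.

```shell
# Hypothetical reproduction sketch; `utxo-bench` is an assumed executable
# name. Build, then locate the binary so it can be run directly — running
# it via `cabal run` would let GNU Time report cabal's own peak RSS if
# that happens to be the largest of any subprocess:
#
#   cabal build utxo-bench
#   BENCH=$(cabal list-bin utxo-bench)
#
# GNU Time's %M format prints the peak RSS in KiB, and the RTS flag caps
# the GHC heap at 100 MB, as in the commit's configuration:
#
#   /usr/bin/time -f 'peak RSS: %M KiB' "$BENCH" +RTS -M100m -RTS

# Sanity check of the conversion quoted in the text: 85,220 KiB in MiB.
awk 'BEGIN { printf "%.1f MiB\n", 85220 / 1024 }'
```

Note that GNU Time reports `%M` in KiB, so the 100 MiB target corresponds to a reported value below 102,400.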
