Commit f3f1ba8

Avoid using object pinning in native blackhole (#132)
Re-implemented the native blackhole without object pinning, which hurt performance and led to unstable results because the GC had to spend more and more time scanning all pinned values.

Instead, a primitive value is consumed by comparing it for equality with two fields and publishing it if both comparisons succeed. The two fields never hold the same value and one of them is volatile, so the condition is always false, yet it cannot be omitted because of the volatile read. That should prevent both dead code elimination and movement of the code computing the consumed value into an effectively unreachable branch.

For objects, identityHashCode is used to obtain an Int value that is then passed into the regular consumption routine. That function is an intrinsic that simply gets the address of the object, so it has no performance impact, yet it still requires the object.

Fixes #114
1 parent e39a604 commit f3f1ba8
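
The consumption scheme described in the commit message can be sketched in isolation as follows. This is a minimal illustration for Int only, not the code from the commit: the class name `SketchBlackhole` and the field name `sink` are hypothetical, and it assumes Kotlin 1.9 or later, where `kotlin.concurrent.Volatile` is available in the common standard library.

```kotlin
import kotlin.concurrent.Volatile
import kotlin.random.Random

// Hypothetical, simplified stand-in for the Blackhole in this commit (Int only).
class SketchBlackhole {
    // Volatile should force a real load of i0 on every call,
    // so the comparison below cannot be folded away at compile time.
    @Volatile
    var i0: Int = Random.nextInt()
    var i1: Int = i0 + 1 // always differs from i0, even if i0 + 1 wraps around

    // Never assigned; dereferencing it would throw if the branch were ever taken.
    @Volatile
    var sink: SketchBlackhole? = null

    fun consume(i: Int) {
        // i cannot equal both i0 and i1, so the branch is unreachable at run time,
        // but i still has to be computed in order to evaluate the condition.
        if ((i0 == i) && (i1 == i)) {
            sink!!.i0 = i
        }
    }
}

fun main() {
    val bh = SketchBlackhole()
    var acc = 0
    for (k in 0 until 1_000) {
        acc += k * k // stand-in for the benchmarked computation
        bh.consume(acc)
    }
    // Consuming i0 or i1 themselves satisfies only one half of the condition.
    bh.consume(bh.i0)
    bh.consume(bh.i1)
    println("done")
}
```

Even when the consumed value happens to equal `i0` or `i1`, only one of the two comparisons succeeds, so the null `sink` is never dereferenced.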

1 file changed: 75 additions and 20 deletions

@@ -1,35 +1,90 @@
 package kotlinx.benchmark
 
-import kotlinx.cinterop.pin
+import kotlinx.cinterop.toByte
+import kotlin.concurrent.Volatile
+import kotlin.native.identityHashCode
+import kotlin.random.Random
 
+@OptIn(ExperimentalStdlibApi::class)
 actual class Blackhole {
-    actual fun consume(obj: Any?) {
-        obj?.pin()
+    @Volatile
+    var i0: Int = Random.nextInt()
+    var i1 = i0 + 1
+
+    @Volatile
+    var l0 = Random.nextLong()
+    var l1 = l0 + 1L
+
+    @Volatile
+    var f0 = Random.nextFloat()
+    var f1 = f0 + 1.0f
+
+    @Volatile
+    var d0 = Random.nextDouble()
+    var d1 = d0 + 1.0
+
+    @Volatile
+    var bh: Blackhole? = null
+
+    actual inline fun consume(obj: Any?) {
+        // identityHashCode is an intrinsic function
+        // resolved into getting an object address, so there will be no call.
+        consume(obj.identityHashCode())
     }
-    actual fun consume(bool: Boolean) {
-        bool.pin()
+
+    actual inline fun consume(bool: Boolean) {
+        consume(bool.toByte())
     }
-    actual fun consume(c: Char) {
-        c.pin()
+
+    actual inline fun consume(c: Char) {
+        consume(c.code)
     }
-    actual fun consume(b: Byte) {
-        b.pin()
+
+    actual inline fun consume(b: Byte) {
+        consume(b.toInt())
     }
-    actual fun consume(s: Short) {
-        s.pin()
+
+    actual inline fun consume(s: Short) {
+        consume(s.toInt())
     }
-    actual fun consume(i: Int) {
-        i.pin()
+
+    actual inline fun consume(i: Int) {
+        // To ensure that i's value will not be removed by optimizations like dead code elimination,
+        // its value is compared with two values, i0 and i1, such that i1 = i0 + 1.
+        // As long as i0 and i1 are different, the following condition should never be met,
+        // and neither should the branch following it be taken (note that if it is executed, an NPE will happen).
+        // To ensure that at least i0's value will be loaded on every call, it is annotated with Volatile.
+        //
+        // This approach has one drawback: in general, it should be compiled to code with two branch instructions,
+        // and the performance characteristics of a benchmark may not be stable if the consumed value is sometimes equal to i0.
+        // In practice, there is almost no effect on the measured performance and
+        // the difference is within the error margin.
+        // However, if it becomes a problem one day, then the condition should be rewritten to something like:
+        // if (((i0 xor i) and (i1 xor i)) == -1) { ... }
+        // We can't simply compare the xor results, as that could be optimized into a comparison of i0 and i1, and i's evaluation
+        // may be sunk into the unreachable branch.
+        if ((i0 == i) && (i1 == i)) {
+            bh!!.i0 = i
+        }
     }
-    actual fun consume(l: Long) {
-        l.pin()
+
+    actual inline fun consume(l: Long) {
+        if ((l0 == l) && (l1 == l)) {
+            bh!!.l0 = l
+        }
     }
-    actual fun consume(f: Float) {
-        f.pin()
+
+    actual inline fun consume(f: Float) {
+        if ((f0 == f) && (f1 == f)) {
+            bh!!.f0 = f
+        }
     }
-    actual fun consume(d: Double) {
-        d.pin()
+
+    actual inline fun consume(d: Double) {
+        if ((d0 == d) && (d1 == d)) {
+            bh!!.d0 = d
+        }
     }
 }
 
-actual fun Blackhole.flush() = Unit
+actual fun Blackhole.flush() = Unit
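
The comment in `consume(i: Int)` proposes a single-comparison alternative, `((i0 xor i) and (i1 xor i)) == -1`. The following small check, with a hypothetical helper name `xorCondition`, illustrates why that condition should also never hold when `i1 = i0 + 1`: the `and` of the two xor results is `-1` only if both are `-1`, which would require `i` to equal the bitwise complement of both `i0` and `i1` at once.

```kotlin
// Hypothetical helper: the alternative condition mentioned in the diff's comment.
fun xorCondition(i0: Int, i1: Int, i: Int): Boolean =
    ((i0 xor i) and (i1 xor i)) == -1

fun main() {
    val samples = listOf(0, 1, -1, 42, Int.MAX_VALUE, Int.MIN_VALUE)
    for (i0 in samples) {
        val i1 = i0 + 1 // wraps around for Int.MAX_VALUE, but still differs from i0
        // Try the values most likely to trip the condition:
        // the fields themselves and their bitwise complements.
        for (i in samples + listOf(i0, i1, i0.inv(), i1.inv())) {
            check(!xorCondition(i0, i1, i)) { "condition held for i0=$i0, i=$i" }
        }
    }
    println("xor condition never held")
}
```

The condition can only become true when both fields hold the same value (for example `xorCondition(5, 5, 5.inv())`), which the `i1 = i0 + 1` initialization rules out.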