Commit cd33954

[Heap] Performance tweaks (apple#78)
* [Heap] Don't always inline large functions

* [docs] Use a stable link to the Atkinson article

  (The original link was pointing to course materials at a random university.)

* Remove stray import

* Revert "[Heap] Don't always inline large functions"

  This reverts commit 07ffd6591f2a790f6b84b668c06a188325d4eaf2.

* [Heap] Enable code coverage collection in Xcode scheme

* [Heap] Speed up invariant checking

* [Heap] Precalculate levels for each offset

  `_Node` is a new struct that consists of a storage offset (the old index) along with its level in the tree. The level can be incrementally calculated, saving some time vs counting bits whenever it's needed.

* [Heap] Switch to using unsafe buffer pointers

  Introduce `Heap._UnsafeHandle` (a thin wrapper around an unsafe buffer pointer) and rebase most heap algorithms on top of that instead of array operations. This simplifies things by reducing (hopefully) unnecessary index validation, resulting in some measurable performance improvements.

* [Heap] Stop force-inlining bubbleUp; mark it releasenone

  Not inlining such a large function speeds things up by leaving some headroom for the optimizer to make better inlining decisions elsewhere. (Force inlining this resulted in the compiler not inlining the closure passed to `_update` instead, which isn't a great tradeoff.)

  To speed things up, mark `bubbleUp` with `@_effects(releasenone)`. This may be questionable (because it calls `Element.<`), but it results in better codegen, making `insert` match the performance of `std::priority_queue`.

* [Heap] Rework removals

* [Heap] Remove dead code

* [Heap] Switch to using _ContiguousArray as storage

* [Heap] insert<S>(contentsOf:): add fast path for count == 0 case

* [Heap] Perf pass on trickleDown code paths

  This improves popMin/popMax (and the sequence initializer) by reviewing trickleDown and optimizing things:

  - Replace swapAt with a scheme where we keep a hole in the storage buffer
  - Slightly shorten min/max dependency chain in primary sink loop

  In exchange, we get even less readable code.

* [Heap] bubbleUp: remove @_effects attribute

  This reintroduces retain/release operations for the empty array until swiftlang/swift#38898 lands. Mitigate the performance costs of this by refactoring code to reduce the size of the inlined bubbleUpMin/Max invocations.

* [Heap] Fine-tune Heap.insert
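
The unsafe-buffer handle and the "hole" scheme mentioned in the message can be illustrated with a minimal sketch. This is not the code from this commit: it uses a plain binary min-heap rather than a min-max heap, it is specialized to `Int` so that copying an element is trivial, and `UnsafeHandle` and both `siftDown` functions are hypothetical names chosen for illustration.

```swift
/// Hypothetical stand-in for `Heap._UnsafeHandle`: a thin wrapper around an
/// unsafe buffer pointer. Specialized to `Int` to keep the sketch short.
struct UnsafeHandle {
  let buffer: UnsafeMutableBufferPointer<Int>

  /// Sifts the item at `start` down a binary min-heap, moving children up
  /// into a hole instead of calling `swapAt` on every level.
  func siftDown(from start: Int) {
    let count = buffer.count
    guard start < count else { return }
    let item = buffer[start]        // Lift the item out; `start` is now the hole.
    var hole = start
    while true {
      let left = 2 * hole + 1
      guard left < count else { break }
      let right = left + 1
      // Pick the smaller of the (one or two) children.
      let child = (right < count && buffer[right] < buffer[left]) ? right : left
      guard buffer[child] < item else { break }
      buffer[hole] = buffer[child]  // One write per level rather than a swap's two.
      hole = child
    }
    buffer[hole] = item             // Fill the hole once, at its final position.
  }
}

/// Hypothetical convenience wrapper that runs the sift on an array's storage.
func siftDown(_ heap: inout [Int], from start: Int) {
  heap.withUnsafeMutableBufferPointer { UnsafeHandle(buffer: $0).siftDown(from: start) }
}
```

Compared with swapping at each level, the sifted item is written only once, and working through the buffer pointer sidesteps the array's per-access index validation in optimized builds. A generic version would presumably move elements rather than copy them, which this sketch does not attempt to show.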
1 parent 69d4867 commit cd33954

6 files changed, +660 -476 lines changed

.swiftpm/xcode/xcshareddata/xcschemes/PriorityQueueModule.xcscheme

Lines changed: 2 additions & 1 deletion
@@ -26,7 +26,8 @@
       buildConfiguration = "Debug"
       selectedDebuggerIdentifier = "Xcode.DebuggerFoundation.Debugger.LLDB"
       selectedLauncherIdentifier = "Xcode.DebuggerFoundation.Launcher.LLDB"
-      shouldUseLaunchSchemeArgsEnv = "YES">
+      shouldUseLaunchSchemeArgsEnv = "YES"
+      codeCoverageEnabled = "YES">
       <Testables>
          <TestableReference
             skipped = "NO">

Sources/PriorityQueueModule/Heap+Invariants.swift

Lines changed: 28 additions & 65 deletions
@@ -14,78 +14,41 @@ import Foundation

 extension Heap {
   #if COLLECTIONS_INTERNAL_CHECKS
-  /// Iterates through all the levels in the heap, ensuring that items in min
-  /// levels are smaller than all their descendants and items in max levels
-  /// are larger than all their descendants.
-  ///
-  /// The min-max heap indices are structured like this:
-  ///
-  /// ```
-  /// min 0
-  /// max 1 2
-  /// min 3 4 5 6
-  /// max 7 8 9 10 11 12 13 14
-  /// min 15...
-  /// ...
-  /// ```
-  ///
-  /// The iteration happens in depth-first order, so the descendants of the
-  /// element at index 0 are checked to ensure they are >= that element. This is
-  /// repeated for each child of the element at index 0 (inverting the
-  /// comparison at each level).
-  ///
-  /// In the case of 7 elements total (spread across 3 levels), the checking
-  /// happens in the following order:
-  ///
-  /// ```
-  /// compare >= 0: 1, 3, 4, 2, 5, 6
-  /// compare <= 1: 3, 4
-  /// compare >= 3: (no children)
-  /// compare >= 4: (no children)
-  /// compare <= 2: 5, 6
-  /// compare >= 5: (no children)
-  /// compare >= 6: (no children)
-  /// ```
+  /// Visits each item in the heap in depth-first order, verifying that the
+  /// contents satisfy the min-max heap property.
   @inlinable
   @inline(never)
   internal func _checkInvariants() {
     guard count > 1 else { return }
-    var indicesToVisit: [Int] = [0]
-
-    while let elementIdx = indicesToVisit.popLast() {
-      let element = _storage[elementIdx]
-
-      let isMinLevel = _minMaxHeapIsMinLevel(elementIdx)
-
-      var descendantIndicesToVisit = [Int]()
+    _checkInvariants(node: .root, min: nil, max: nil)
+  }

-      // Add the children of this element to the outer loop (as we want to check
-      // that they are >= or <= their descendants as well)
-      if let rightIdx = _rightChildIndex(of: elementIdx) {
-        descendantIndicesToVisit.append(rightIdx)
-        indicesToVisit.append(rightIdx)
+  @inlinable
+  internal func _checkInvariants(node: _Node, min: Element?, max: Element?) {
+    let value = _storage[node.offset]
+    if let min = min {
+      precondition(value >= min,
+                   "Element \(value) at \(node) is less than min \(min)")
+    }
+    if let max = max {
+      precondition(value <= max,
+                   "Element \(value) at \(node) is greater than max \(max)")
+    }
+    let left = node.leftChild()
+    let right = node.rightChild()
+    if node.isMinLevel {
+      if left.offset < count {
+        _checkInvariants(node: left, min: value, max: max)
       }
-      if let leftIdx = _leftChildIndex(of: elementIdx) {
-        descendantIndicesToVisit.append(leftIdx)
-        indicesToVisit.append(leftIdx)
+      if right.offset < count {
+        _checkInvariants(node: right, min: value, max: max)
       }
-
-      // Compare the current element against its descendants
-      while let idx = descendantIndicesToVisit.popLast() {
-        if isMinLevel {
-          precondition(element <= _storage[idx],
-                       "Element '\(_storage[idx])' at index \(idx) should be >= '\(element)'")
-        } else {
-          precondition(element >= _storage[idx],
-                       "Element '\(_storage[idx])' at index \(idx) should be <= '\(element)'")
-        }
-
-        if let rightIdx = _rightChildIndex(of: idx) {
-          descendantIndicesToVisit.append(rightIdx)
-        }
-        if let leftIdx = _leftChildIndex(of: idx) {
-          descendantIndicesToVisit.append(leftIdx)
-        }
+    } else {
+      if left.offset < count {
+        _checkInvariants(node: left, min: min, max: value)
+      }
+      if right.offset < count {
+        _checkInvariants(node: right, min: min, max: value)
       }
     }
   }
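
The bound-passing check in this hunk can be tried outside the package with a self-contained sketch. The `Node` type below is a hypothetical stand-in for the commit's `_Node` (whose definition is not part of the hunk shown here); it assumes the usual implicit binary tree layout, with children at `2 * offset + 1` and `2 * offset + 2` and even levels acting as min levels. `isMinMaxHeap` is likewise an illustrative name, and it returns a `Bool` instead of trapping with `precondition`.

```swift
/// Hypothetical stand-in for `_Node`: a storage offset plus its level in the
/// tree, so the level never has to be recomputed from the offset.
struct Node {
  var offset: Int
  var level: Int

  static let root = Node(offset: 0, level: 0)

  /// Even levels are min levels; odd levels are max levels.
  var isMinLevel: Bool { level & 1 == 0 }

  // Each child's level is derived incrementally from its parent's level.
  func leftChild() -> Node { Node(offset: 2 * offset + 1, level: level + 1) }
  func rightChild() -> Node { Node(offset: 2 * offset + 2, level: level + 1) }
}

/// Returns true if `items` satisfies the min-max heap property.
func isMinMaxHeap(_ items: [Int]) -> Bool {
  func check(_ node: Node, min: Int?, max: Int?) -> Bool {
    guard node.offset < items.count else { return true }
    let value = items[node.offset]
    if let min = min, value < min { return false }
    if let max = max, value > max { return false }
    // A min-level node tightens the lower bound for its subtree;
    // a max-level node tightens the upper bound.
    let newMin = node.isMinLevel ? value : min
    let newMax = node.isMinLevel ? max : value
    return check(node.leftChild(), min: newMin, max: newMax)
      && check(node.rightChild(), min: newMin, max: newMax)
  }
  return check(.root, min: nil, max: nil)
}
```

Because each node is compared only against the tightest bounds inherited from its ancestors, every element is visited once, whereas the removed implementation walked all descendants of every element.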
