`docs/disk_hnsw_multithreaded_architecture.md`
This document describes the multi-threaded architecture of the HNSWDisk index.
### 1. Lightweight Insert Jobs
Each insert job submitted to the thread pool is lightweight and stores only metadata (vectorId, elementMaxLevel). Vector data is looked up from shared storage when the job executes, which minimizes memory usage when many jobs are queued.
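As a rough illustration, a metadata-only job might look like the hypothetical sketch below; the struct, field, and function names are assumptions for illustration, not the actual HNSWDisk types:

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical metadata-only insert job: no vector data is copied into the job.
struct InsertJob {
    uint64_t vectorId;       // key into the shared raw-vector storage
    size_t elementMaxLevel;  // level assigned when the job was queued
};

// Shared storage owned by the index (illustrative type).
using VectorStore = std::unordered_map<uint64_t, std::vector<float>>;

void executeInsertJob(const InsertJob &job, const VectorStore &storage) {
    // The (potentially large) vector is fetched only when the job actually runs.
    const std::vector<float> &vec = storage.at(job.vectorId);
    // ... insert vectorId into the graph at levels [0, job.elementMaxLevel] using vec ...
    (void)vec;  // placeholder so the sketch compiles
}
```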
The segment cache (`cacheSegment.cache`) currently **grows unboundedly** and is **never evicted**. Once a node's neighbor list is loaded into the cache (either from disk or created during insert), it remains in memory indefinitely.

**Why the Cache is the Source of Truth**

The cache cannot simply be cleared, because it serves as the **source of truth** for pending updates that have not yet been flushed to disk. The Swap-and-Flush pattern relies on three invariants (sketched in the code below):

1. The cache always holds the latest neighbor lists.
2. The `dirty` set tracks which nodes still need to be written.
3. The flusher reads the current cache state (not stale data).
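To make these invariants concrete, here is a minimal, hypothetical layout of one cache segment; the member names and types are assumptions based on the identifiers mentioned above, and the real code may differ:

```cpp
#include <cstdint>
#include <shared_mutex>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical per-segment state (names are assumptions based on the text above).
struct CacheSegment {
    std::shared_mutex lock;                                     // guards cache and dirty
    std::unordered_map<uint64_t, std::vector<uint64_t>> cache;  // nodeId -> latest neighbor list
    std::unordered_set<uint64_t> dirty;                         // nodes not yet persisted to disk
};

// Invariant: for every id in dirty, cache.at(id) holds the newest neighbor list,
// so the flusher can build its write batch directly from the cache.
```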
**We need to decide which of the following strategies to implement (if any).**

**Another option is to not use the neighbors cache at all and always read from disk.**
**1. LRU Eviction for Clean Entries**

Evict least-recently-used entries that are **not dirty** (i.e., already persisted to disk):

```cpp
// Pseudocode
if (cacheSize > maxCacheSize) {
    size_t evicted = 0;
    for (auto& entry : lruOrder) {                 // iterate entries in least-recently-used-first order
        if (!dirty.contains(entry.key)) {          // never evict entries that still need flushing
            cache.erase(entry.key);                // (a real implementation must also unlink entry from lruOrder)
            if (++evicted >= targetEviction) break;
        }
    }
}
```
*Pros:* Simple, safe (dirty entries are always kept)
*Cons:* Requires LRU tracking overhead (linked list + map)
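For reference, the "linked list + map" bookkeeping mentioned in the cons could look like the following sketch (illustrative only, not the actual implementation):

```cpp
#include <cstdint>
#include <list>
#include <unordered_map>

// Keys kept in recency order (front = most recently used) plus a map from key to its
// position in the list, so both touch() and evicting from the LRU end are O(1).
std::list<uint64_t> lruOrder;
std::unordered_map<uint64_t, std::list<uint64_t>::iterator> lruPos;

void touch(uint64_t key) {
    auto it = lruPos.find(key);
    if (it != lruPos.end()) {
        lruOrder.erase(it->second);  // unlink the old position
    }
    lruOrder.push_front(key);
    lruPos[key] = lruOrder.begin();
}
```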
**2. Time-Based Eviction**

Evict clean entries that have not been accessed within a configured timeout:

```cpp
// Pseudocode
const auto now = std::chrono::steady_clock::now();
for (auto it = cache.begin(); it != cache.end(); ) {
    const auto& entry = it->second;
    if (!dirty.contains(it->first) &&
        now - entry.lastAccessTime > evictionTimeout) {
        it = cache.erase(it);   // erase() returns the next valid iterator
    } else {
        ++it;
    }
}
```
*Pros:* Predictable memory behavior
*Cons:* Requires timestamp tracking per entry
**3. Write-Through with Immediate Eviction**

After flushing to disk, immediately evict the written entries:

```cpp
// In flushDirtyNodesToDisk(), after successful write:
for (uint64_t key : flushedNodes) {
    cacheSegment.cache.erase(key); // Evict after persist
}
```
*Pros:* Minimal memory usage, no tracking overhead
*Cons:* Increases disk reads on subsequent access
**4. Size-Limited Cache with Eviction Policy**

Configure a maximum cache size and evict when it is exceeded (a sketch of the insert-path check follows below):

```cpp
size_t maxCacheEntries = 100000; // Configurable
// On insert, check size and evict clean entries if needed
```
*Pros:* Bounded memory usage
*Cons:* Need to choose an appropriate eviction policy
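A minimal sketch of how the insert path could enforce such a limit, reusing the hypothetical `CacheSegment` layout sketched earlier; the eviction sweep below is deliberately policy-agnostic, whereas a real implementation would pick victims by LRU or age as discussed above:

```cpp
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

// Hypothetical insert-path hook: keep the segment's cache bounded.
void putNeighbors(CacheSegment &segment, uint64_t nodeId,
                  std::vector<uint64_t> neighbors, size_t maxCacheEntries) {
    std::unique_lock<std::shared_mutex> guard(segment.lock);
    segment.cache[nodeId] = std::move(neighbors);
    segment.dirty.insert(nodeId);  // updated entry must be flushed before it may be evicted
    // Evict clean entries until we are back under the limit.
    for (auto it = segment.cache.begin();
         it != segment.cache.end() && segment.cache.size() > maxCacheEntries; ) {
        if (!segment.dirty.contains(it->first)) {
            it = segment.cache.erase(it);
        } else {
            ++it;
        }
    }
}
```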
### 3. Lock Hierarchy
| Lock | Type | Protects | Notes |
|------|------|----------|-------|
```cpp
std::atomic<size_t> totalDirtyCount_{0};          // Fast threshold check without locking
std::atomic<size_t> pendingSingleInsertJobs_{0};  // Track pending async jobs
```
Note:
We could probably add more atomic variables to further improve performance; only the most important ones are used here.
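For example, `totalDirtyCount_` allows the write path to decide whether a flush is needed without taking any segment lock; a hedged sketch, where the function name and threshold parameter are assumptions:

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical fast-path check: no segment lock is taken here; only the flusher
// that actually performs the flush will later lock segments one at a time.
bool shouldTriggerFlush(const std::atomic<size_t> &totalDirtyCount, size_t flushThreshold) {
    return totalDirtyCount.load(std::memory_order_relaxed) >= flushThreshold;
}
```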
## Concurrency Patterns
Phase 1: Read cache data under SHARED lock (concurrent writers allowed)
- Build RocksDB WriteBatch

Phase 2: Clear dirty flags under EXCLUSIVE lock (brief, per-segment)
- Swap out the dirty-set contents "atomically" (i.e., under the exclusive lock)