
Commit 14ef9f6

Fixed the readme
1 parent b6ed93a commit 14ef9f6


docs/disk_hnsw_multithreaded_architecture.md

Lines changed: 76 additions & 14 deletions
@@ -8,7 +8,8 @@ This document describes the multi-threaded architecture of the HNSWDisk index, f

### 1. Lightweight Insert Jobs

-Each insert job is lightweight and only stores metadata (vectorId, elementMaxLevel). Vector data is looked up from shared storage when the job executes, minimizing memory usage when many jobs are queued.
+Each insert job submitted to the thread pool is lightweight and stores only metadata (vectorId, elementMaxLevel). Vector data is looked up from shared storage when the job executes, minimizing memory usage when many jobs are queued.
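
As a rough illustration, such a job might look like the following sketch; the struct and field names are assumptions for illustration, not the actual implementation:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of a lightweight insert job (illustrative names).
struct HNSWDiskInsertJob {
    uint64_t vectorId;           // ID used to look the vector up in shared storage
    std::size_t elementMaxLevel; // pre-computed top level for this element
    // No vector data is stored here: the worker thread fetches it from shared
    // storage only when the job executes, so queued jobs stay small.
};
```
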

```
┌──────────────────────────────────────────────────────────────┐
@@ -39,13 +40,84 @@ struct alignas(64) CacheSegment {
```

Note:
NUM_CACHE_SEGMENTS can be changed, which gives better separation of the cache,
-but will require more RAM usage.
+but requires more RAM. It can be configured by the user or chosen based on the
+expected number of vectors in the index.

**Key benefits:**
- Threads accessing different segments proceed in parallel.
- Cache-line alignment (`alignas(64)`) prevents false sharing.
- Hash-based segment distribution spreads nodes across segments.

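To make the hash-based distribution concrete, here is a minimal sketch of how a node could be mapped to its segment; the field layout is a simplified stand-in for the `CacheSegment` struct shown above, and `NUM_CACHE_SEGMENTS`, `segmentFor`, and the exact member types are illustrative assumptions:

```cpp
#include <array>
#include <cstdint>
#include <functional>
#include <shared_mutex>
#include <unordered_map>
#include <unordered_set>
#include <vector>

constexpr size_t NUM_CACHE_SEGMENTS = 64;    // configurable: trades RAM for less contention

struct alignas(64) CacheSegment {            // 64-byte alignment avoids false sharing
    std::shared_mutex mutex;                 // shared for reads, exclusive for writes
    std::unordered_map<uint64_t, std::vector<uint64_t>> cache;  // nodeId -> neighbor list
    std::unordered_set<uint64_t> dirty;      // nodes not yet flushed to disk
};

std::array<CacheSegment, NUM_CACHE_SEGMENTS> segments;

// Hash-based distribution: each node maps to exactly one segment, so threads
// working on different nodes usually lock different segments.
inline CacheSegment& segmentFor(uint64_t nodeId) {
    return segments[std::hash<uint64_t>{}(nodeId) % NUM_CACHE_SEGMENTS];
}
```
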
#### Cache Memory Management

**Current Behavior: No Eviction**

The segment cache (`cacheSegment.cache`) currently **grows unboundedly** and is **never evicted**. Once a node's neighbor list is loaded into the cache (either from disk or created during insert), it remains in memory indefinitely.

**Why the Cache Is the Source of Truth**

The cache cannot simply be cleared, because it serves as the **source of truth** for pending updates that have not been flushed to disk yet. The Swap-and-Flush pattern relies on:

1. The cache always holding the latest neighbor lists
2. The `dirty` set tracking which nodes still need to be written
3. The flusher reading the current cache state (not stale data)

**We need to decide which of the following strategies to implement (if any).**
**Another option is to not use the neighbors cache at all and always read from disk.**

**1. LRU Eviction for Clean Entries**

Evict least-recently-used entries that are **not dirty** (already persisted to disk):

```cpp
// Pseudocode: walk the LRU order (oldest first) and evict clean entries only
if (cacheSize > maxCacheSize) {
    size_t evicted = 0;
    for (auto& entry : lruOrder) {          // least recently used first
        if (!dirty.contains(entry.key)) {   // never evict unflushed data
            cache.erase(entry.key);
            if (++evicted >= targetEviction) break;
        }
    }
}
```
*Pros:* Simple, safe (dirty entries always kept)
*Cons:* Requires LRU tracking overhead (linked list + map)

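
The LRU bookkeeping mentioned in the cons could look roughly like this; the `LruTracker` type and its members are illustrative assumptions, not existing code:

```cpp
#include <cstdint>
#include <list>
#include <unordered_map>

// Hypothetical LRU tracker: a doubly-linked list keeps recency order, a map
// gives O(1) access to each node's position in that list.
class LruTracker {
public:
    // Mark a node as most recently used (inserting it if it is new).
    void touch(uint64_t nodeId) {
        auto it = pos_.find(nodeId);
        if (it != pos_.end()) order_.erase(it->second);
        order_.push_front(nodeId);
        pos_[nodeId] = order_.begin();
    }
    // Eviction candidate: the least recently used node, if any.
    bool oldest(uint64_t& nodeId) const {
        if (order_.empty()) return false;
        nodeId = order_.back();
        return true;
    }
    // Remove a node from the tracker (after eviction or deletion).
    void erase(uint64_t nodeId) {
        auto it = pos_.find(nodeId);
        if (it == pos_.end()) return;
        order_.erase(it->second);
        pos_.erase(it);
    }
private:
    std::list<uint64_t> order_;  // front = most recent, back = least recent
    std::unordered_map<uint64_t, std::list<uint64_t>::iterator> pos_;
};
```
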
**2. Time-Based Eviction**

Evict clean entries older than a threshold:

```cpp
// Pseudocode: erase through the iterator so iteration stays valid
for (auto it = cache.begin(); it != cache.end(); ) {
    if (!dirty.contains(it->first) &&
        now - it->second.lastAccessTime > evictionTimeout) {
        it = cache.erase(it);   // erase returns the next valid iterator
    } else {
        ++it;
    }
}
```
*Pros:* Predictable memory behavior
*Cons:* Requires timestamp tracking per entry

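
The per-entry timestamp the cons refers to could be carried on the cache value itself; this struct is an illustrative assumption, not the current layout:

```cpp
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical cache entry with the timestamp that time-based eviction needs.
struct CachedNeighbors {
    std::vector<uint64_t> neighbors;                       // the node's neighbor list
    std::chrono::steady_clock::time_point lastAccessTime;  // refreshed on every cache hit
};

// On every read or write of the entry:
//   entry.lastAccessTime = std::chrono::steady_clock::now();
```
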
**3. Write-Through with Immediate Eviction**

After flushing to disk, immediately evict the written entries:

```cpp
// In flushDirtyNodesToDisk(), after a successful write
// (re-acquire the segment's exclusive lock before touching the cache):
for (uint64_t key : flushedNodes) {
    cacheSegment.cache.erase(key); // evict once persisted
}
```
*Pros:* Minimal memory usage, no tracking overhead
*Cons:* Increases disk reads on subsequent access

**4. Size-Limited Cache with Eviction Policy**

Configure maximum cache size and evict when exceeded:

```cpp
size_t maxCacheEntries = 100000; // Configurable
// On insert, check size and evict clean entries if needed
```
*Pros:* Bounded memory usage
*Cons:* Need to choose appropriate eviction policy
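
A rough sketch of how the size check could be wired into the insert path, reusing the `CacheSegment` sketch from earlier; `maxCacheEntries`, `insertNeighbors`, and `evictCleanEntries` are hypothetical names for illustration:

```cpp
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <vector>

size_t maxCacheEntries = 100000;            // configurable bound
void evictCleanEntries(CacheSegment& seg);  // hypothetical: applies whichever policy is chosen

void insertNeighbors(CacheSegment& seg, uint64_t nodeId, std::vector<uint64_t> neighbors) {
    std::unique_lock<std::shared_mutex> lock(seg.mutex);  // exclusive lock: we mutate the segment
    seg.cache[nodeId] = std::move(neighbors);
    seg.dirty.insert(nodeId);                             // must be flushed before it may be evicted
    if (seg.cache.size() > maxCacheEntries) {
        evictCleanEntries(seg);                           // evicts only entries not in seg.dirty
    }
}
```
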
### 3. Lock Hierarchy

| Lock | Type | Protects | Notes |
@@ -65,7 +137,7 @@ std::atomic<size_t> totalDirtyCount_{0}; // Fast threshold check without lockin
std::atomic<size_t> pendingSingleInsertJobs_{0}; // Track pending async jobs
```
Note:
-We can think of more atomics that can be added to further improve performance.
+More atomic variables could be added to further improve performance; for now, only the most important ones are used.
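
To illustrate the "fast threshold check without locking" that `totalDirtyCount_` enables, here is a minimal sketch; `kFlushThreshold`, `onNodeMarkedDirty`, and `scheduleFlush` are assumed names, not the actual implementation:

```cpp
#include <atomic>
#include <cstddef>

std::atomic<std::size_t> totalDirtyCount_{0};  // fast threshold check without locking
constexpr std::size_t kFlushThreshold = 10000; // illustrative value
void scheduleFlush();                          // hypothetical hook that queues flushDirtyNodesToDisk()

void onNodeMarkedDirty() {
    // Relaxed increment: only an approximate count is needed on the hot path.
    std::size_t dirty = totalDirtyCount_.fetch_add(1, std::memory_order_relaxed) + 1;
    // No segment lock is taken just to decide whether a background flush is due.
    if (dirty >= kFlushThreshold) {
        scheduleFlush();
    }
}
```
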
## Concurrency Patterns
@@ -79,7 +151,7 @@ Phase 1: Read cache data under SHARED lock (concurrent writers allowed)
- Build RocksDB WriteBatch

Phase 2: Clear dirty flags under EXCLUSIVE lock (brief, per-segment)
-- Atomically swap dirty set contents
+- Swap out the dirty set contents (atomic with respect to other threads only because the segment lock is held)
- Release lock immediately

Phase 3: Write to disk (NO locks held)
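
A minimal sketch of the three phases for a single segment, reusing the `CacheSegment` sketch from earlier; `flushSegment` and `writeBatchToDisk` are assumed helper names, not the actual code:

```cpp
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <unordered_set>
#include <utility>
#include <vector>

void writeBatchToDisk(const std::vector<std::pair<uint64_t, std::vector<uint64_t>>>& batch);

void flushSegment(CacheSegment& seg) {
    std::vector<std::pair<uint64_t, std::vector<uint64_t>>> batch;
    std::unordered_set<uint64_t> flushed;

    // Phase 1: snapshot dirty entries under a SHARED lock and build the write batch.
    {
        std::shared_lock<std::shared_mutex> lock(seg.mutex);
        for (uint64_t id : seg.dirty) {
            batch.emplace_back(id, seg.cache.at(id)); // latest neighbor list from cache
        }
    }

    // Phase 2: briefly take the EXCLUSIVE lock and swap out the dirty set.
    {
        std::unique_lock<std::shared_mutex> lock(seg.mutex);
        flushed.swap(seg.dirty);  // "atomic" only because the lock is held
    }                             // lock released immediately

    // Phase 3: write to disk with NO locks held.
    writeBatchToDisk(batch);
}
```
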
@@ -253,13 +325,3 @@ std::shared_ptr<std::string> localRawRef;
// Lock released, but data stays alive via localRawRef
// Use localRawRef->data() for graph insertion and disk write
```
-
-## Thread Safety Summary
-
-| Operation | Thread Safety | Notes |
-|-----------|---------------|-------|
-| `addVector()` | ✅ Safe | Atomic ID allocation, locked metadata access |
-| `topKQuery()` | ✅ Safe | Read-only with lock-free deleted checks |
-| Cache read | ✅ Safe | Shared lock per segment |
-| Cache write | ✅ Safe | Exclusive lock per segment |
-| Disk flush | ✅ Safe | Swap-and-flush pattern |
