docs/disk_hnsw_multithreaded_architecture.md
41 additions & 16 deletions
@@ -2,28 +2,27 @@

 ## Overview

-This document describes the architectural changes introduced in the `dorer-disk-poc-add-delete-mt` branch compared to the original `disk-poc` branch. The focus is on multi-threading, synchronization, concurrency in writing to disk, and performance enhancements.
+This document describes the multi-threaded architecture of the HNSWDisk index, focusing on synchronization, concurrency in writing to disk, and performance enhancements.

-## Key Architectural Changes
+## Key Architectural Components

-### 1. Insertion Mode
+### 1. Lightweight Insert Jobs

-**Previous single threaded approach:** Vectors were accumulated in batches before being written to disk, requiring complex coordination between threads.
-
-
-**Current approach:** Each insert job is self-contained and can write directly to disk upon completion, optimized for workloads where disk writes are cheap but neighbor searching (reads from disk) is the bottleneck.
+Each insert job is lightweight and only stores metadata (vectorId, elementMaxLevel). Vector data is looked up from shared storage when the job executes, minimizing memory usage when many jobs are queued.
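
The lightweight insert job described in the added text could look roughly like the C++ sketch below. Only the field names `vectorId` and `elementMaxLevel` come from the document. `InsertJob`, `SharedVectorStorage`, `executeInsertJob`, the `float` element type, and the mutex-guarded map are hypothetical stand-ins chosen for illustration, not the actual HNSWDisk code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical id type; the real index may use a different one.
using VectorId = std::uint32_t;

// The job carries metadata only, so a queued job costs a few bytes
// no matter how large the vector is.
struct InsertJob {
    VectorId vectorId;
    std::size_t elementMaxLevel;
};

// Assumed stand-in for the "shared storage" the text mentions:
// a map guarded by a mutex so several worker threads can read from it.
class SharedVectorStorage {
public:
    void put(VectorId id, std::vector<float> data) {
        std::lock_guard<std::mutex> lock(mutex_);
        vectors_[id] = std::move(data);
    }

    // Returns a copy so the caller can work without holding the lock.
    // Throws std::out_of_range if the id is unknown.
    std::vector<float> get(VectorId id) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return vectors_.at(id);
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<VectorId, std::vector<float>> vectors_;
};

// Executes one job. The vector data is fetched only at this point,
// not when the job was created or queued.
// Returns the dimension of the fetched vector (placeholder result).
std::size_t executeInsertJob(const InsertJob& job,
                             const SharedVectorStorage& storage) {
    std::vector<float> vec = storage.get(job.vectorId);
    // A real implementation would search for neighbors on levels
    // job.elementMaxLevel down to 0 here and then write to disk.
    (void)job.elementMaxLevel;
    return vec.size();
}
```

In this sketch a caller stores the vector once with `put`, enqueues an `InsertJob{id, level}`, and a worker thread later calls `executeInsertJob`. The memory held by the queue then grows with the number of jobs rather than with the number of jobs times the vector dimension.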