Commit 7b5bf13

Document vector index memory configuration
1 parent 1ca9e6d commit 7b5bf13

4 files changed

+199
-0
lines changed

modules/ROOT/content-nav.adoc

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -213,6 +213,7 @@
213213
214214
* xref:performance/index.adoc[]
215215
** xref:performance/memory-configuration.adoc[]
216+
** xref:performance/vector-index-memory-configuration.adoc[]
216217
** xref:performance/index-configuration.adoc[]
217218
** xref:performance/gc-tuning.adoc[]
218219
** xref:performance/bolt-thread-pool-configuration.adoc[]

modules/ROOT/pages/performance/index.adoc

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ This section describes factors that affect operational performance and how to tu
66
The following topics are covered:
77

88
* xref:performance/memory-configuration.adoc[Memory configuration] -- How to configure memory settings for efficient operations.
9+
* xref:performance/vector-index-memory-configuration.adoc[Vector index memory configuration] -- How to configure memory for vector indexes.
910
* xref:performance/index-configuration.adoc[Index configuration] -- How to configure indexes.
1011
* xref:performance/gc-tuning.adoc[Garbage collector] -- How to configure the Java Virtual Machine's garbage collector.
1112
* xref:performance/bolt-thread-pool-configuration.adoc[Bolt thread pool configuration] -- How to configure the Bolt thread pool.

modules/ROOT/pages/performance/memory-configuration.adoc

Lines changed: 3 additions & 0 deletions
@@ -15,6 +15,9 @@ It is not possible to explicitly configure the amount of RAM that should be rese
1515
1GB is a good starting point for a server that is dedicated to running Neo4j.
1616
However, there are cases where the amount reserved for the OS is significantly larger than 1GB, such as servers with exceptionally large RAM.
1717
+
18+
If you have a vector index, you need to ensure that the OS has sufficient memory set aside for the vector index to perform optimally, because the vector index is loaded in OS memory and not in Neo4j page cache.
19+
For more information, see xref:performance/vector-index-memory-configuration.adoc[Vector index memory configuration].
20+
+
1821
[NOTE]
1922
====
2023
If you do not leave enough space for the OS, it will start to swap memory to disk, which will heavily affect performance.

modules/ROOT/pages/performance/vector-index-memory-configuration.adoc

Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
1+
= Vector index memory configuration
2+
:description: How to configure memory for Neo4j vector indexes to improve search performance.
3+
4+
Vector indexes are based on link:https://lucene.apache.org/[Lucene].
5+
Lucene uses OS memory, not Neo4j page cache memory, as described in the xref:performance/memory-configuration.adoc[Memory configuration] section.
6+
When you have a vector index, you must ensure that the OS has sufficient memory for the JVM heap, Neo4j page cache, and the Lucene vector indexes to perform optimally.
7+
If Lucene has insufficient memory, the OS will perform page swapping and read data from disk, which will dramatically degrade the Neo4j vector index search performance.
8+
Tools such as `iotop` can help you monitor disk I/O usage and spot this condition.
9+
10+
== Warming up the vector index
11+
12+
The Neo4j vector index is only loaded into memory when it is accessed.
13+
Ideally, the Lucene vector index is preloaded into OS-managed memory before querying the index.
14+
However, you can also warm up the index by running a few random queries to help the OS load the index into memory.
15+
The number of queries required to warm up the index depends on the size of the index and the amount of memory available.
16+
For a smaller index (up to 1M entries), five queries should be sufficient to load the index into memory.
17+
For a larger index, you may need to run a hundred queries.
18+
It is also recommended to test the index after warmup, for example by timing a few queries, to confirm that it has been loaded into memory.
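
The warmup described above can be sketched as a small helper that issues a handful of random vector queries.
This is only an illustration under stated assumptions: `warm_up`, `random_vector`, and the `run_query` callable are hypothetical names, the index name and dimension are placeholders, and the Cypher text assumes the `db.index.vector.queryNodes` procedure is available in your Neo4j version.

```python
import random
from typing import Any, Callable, Dict, List

# Cypher used for each warmup query. The procedure takes the index name,
# the number of nearest neighbours, and the query vector.
WARMUP_QUERY = (
    "CALL db.index.vector.queryNodes($index, $k, $vector) "
    "YIELD node, score RETURN count(node) AS hits"
)


def random_vector(dimensions: int, rng: random.Random) -> List[float]:
    """Return a random query vector with the given number of dimensions."""
    return [rng.uniform(-1.0, 1.0) for _ in range(dimensions)]


def warm_up(
    run_query: Callable[[str, Dict[str, Any]], Any],
    index_name: str,
    dimensions: int,
    queries: int = 5,  # the page suggests ~5 for small indexes, ~100 for large
    k: int = 10,       # assumed number of neighbours per query
    seed: int = 0,
) -> int:
    """Run `queries` random vector searches and return how many were issued.

    `run_query` is a hypothetical callable that sends a Cypher string and a
    parameter dict to the database.
    """
    rng = random.Random(seed)
    for _ in range(queries):
        params = {
            "index": index_name,
            "k": k,
            "vector": random_vector(dimensions, rng),
        }
        run_query(WARMUP_QUERY, params)
    return queries
```

With the official Python driver, `run_query` could be a thin wrapper such as `lambda q, p: session.run(q, p).consume()`; passing the callable in keeps the sketch independent of any particular driver or connection setup.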
19+
20+
[NOTE]
21+
====
22+
Because the Neo4j vector index is held in OS-managed memory, restarting the server clears the OS buffer/cache, and the vector index must then be reloaded into memory.
23+
In the event of a rolling restart, patch, or upgrade, you may need a procedure for warming up the index again on each restarted server.
24+
====
25+
26+
== Optimal Neo4j memory configuration for vector indexes
27+
28+
The recommended total memory is _Heap + Neo4j page cache + 0.25 × (vector index size) + additional OS-managed memory_.
29+
30+
31+
=== Considerations and caveats
32+
33+
If you do not plan to return vectors to the user or calling application, you can reduce the Neo4j page cache, because the stored vectors do not need to be loaded into memory.
34+
For example, consider a database with a total size of 459 GB, of which 402 GB is vector storage.
35+
By setting the page cache to 100 GB, the important part of the graph is still in memory, and the server requirements are reduced.
36+
Storage will be large relative to memory, but Neo4j will still be able to maintain its performance.
37+
A memory-to-storage ratio of 1:4 should perform well.
38+
39+
You must increase the memory if you plan to return vectors to the user or calling application, or to use them for a refined search.
40+
If you use the vectors for further searching or refining search results, the page cache memory allocation must also be increased.
41+
42+
=== Example calculations
43+
44+
The following examples show how to calculate the memory requirements when vectors are only used for searching and will not be returned to the user or application.
45+
46+
.Disk storage requirements
47+
[cols="1,1,1"]
48+
|===
49+
| Neo4j DB
50+
| 10M
51+
| ~40GB
52+
53+
| Vector Index (single index)
54+
| (1.1 * (4 * 768 + 8 * 16) * 10M)/1048576000
55+
| 33.5GB
56+
57+
| Total DB Size
58+
|
59+
| 73.5GB
60+
|===
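
The vector index size formula in the table above can be reproduced with a few lines of Python.
This is a sketch only: the function and parameter names are hypothetical, and reading `4` as bytes per vector component, `768` as the vector dimension, and `8 * 16` as a per-vector overhead in bytes is an interpretation that the table itself does not spell out.

```python
def vector_index_size_gb(
    vector_count: int,
    dimensions: int = 768,              # assumed meaning of "768" in the table
    bytes_per_component: int = 4,       # assumed meaning of "4"
    per_vector_overhead: int = 8 * 16,  # assumed meaning of "8 * 16"
    overhead_factor: float = 1.1,       # the leading "1.1" in the table
    bytes_per_gb: int = 1_048_576_000,  # divisor used in the table
) -> float:
    """Estimate the on-disk size of a single vector index, in GB."""
    bytes_per_vector = bytes_per_component * dimensions + per_vector_overhead
    return overhead_factor * bytes_per_vector * vector_count / bytes_per_gb


if __name__ == "__main__":
    # 10M vectors, as in the table; the result is roughly 33.5GB.
    print(f"{vector_index_size_gb(10_000_000):.2f} GB")
```

Changing `vector_count` or `dimensions` gives a quick estimate for other data sets under the same assumptions.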
61+
62+
.Memory requirements
63+
[cols="1,1,1"]
64+
|===
65+
| Heap
66+
| 10-20GB
67+
| 20GB
68+
69+
| Page Cache
70+
| DB Size * 1.2
71+
| 50GB
72+
73+
| OS Memory for Index
74+
| .4 of the Vector Index
75+
| 12GB
76+
77+
| Total
78+
|
79+
| 82GB
80+
|===
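
The memory total can be sketched in the same way.
The helper names below are hypothetical, and the fraction of the vector index kept in OS memory is left as a parameter because this page gives 0.25 in the general formula and 0.4 in the table above.

```python
def page_cache_gb(db_size_gb: float, factor: float = 1.2) -> float:
    """Page cache estimate: database size (without the vector index) * 1.2."""
    return db_size_gb * factor


def total_memory_gb(
    heap_gb: float,
    cache_gb: float,
    vector_index_gb: float,
    index_fraction: float,
    extra_os_gb: float = 0.0,
) -> float:
    """Heap + page cache + fraction of the vector index + extra OS memory."""
    return heap_gb + cache_gb + index_fraction * vector_index_gb + extra_os_gb


if __name__ == "__main__":
    # Table inputs: 20GB heap, ~40GB database, 33.5GB vector index.
    cache = page_cache_gb(40)  # about 48GB; the table rounds this up to 50GB
    for fraction in (0.25, 0.4):
        total = total_memory_gb(20, cache, 33.5, fraction)
        print(f"fraction={fraction}: {total:.1f} GB")
```

Treat the result as a starting point for sizing rather than a precise requirement; the table rounds its intermediate values.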
81+
82+
.Aura vector specified cluster memory configurations
83+
[cols="1,1,1,1,1"]
84+
|===
85+
| Instance Size
86+
| Disk Storage
87+
| Heap
88+
| Page Cache
89+
| Remaining Memory
90+
91+
| 32GB
92+
| 64GB
93+
| 7.58GB
94+
| 9.01GB
95+
| 15.41GB
96+
97+
| 64GB
98+
| 128GB
99+
| 16.17GB
100+
| 17.56GB
101+
| 30.27GB
102+
103+
| 128GB
104+
| 256GB
105+
| 26.90GB
106+
| 49.94GB
107+
| 51.16GB
108+
109+
| 256GB
110+
| 512GB
111+
| 31GB
112+
| 132.34GB
113+
| 92.66GB
114+
115+
| 384GB
116+
| 768GB
117+
| 31GB
118+
| 220.25GB
119+
| 132.75GB
120+
121+
| 512GB
122+
| 1024GB
123+
| 31GB
124+
| 308.55GB
125+
| 172.45GB
126+
|===
127+
128+
.Aura non-vector specified cluster memory configurations
129+
[cols="1,1,1,1,1"]
130+
|===
131+
| Instance Size
132+
| Disk Storage
133+
| Heap
134+
| Page Cache
135+
| Remaining Memory
136+
137+
| 32GB
138+
| 64GB
139+
| 10.39GB
140+
| 11.13GB
141+
| 10.48GB
142+
143+
| 64GB
144+
| 128GB
145+
| 20.57GB
146+
| 23.43GB
147+
| 20GB
148+
149+
| 128GB
150+
| 256GB
151+
| 29.60GB
152+
| 70.40GB
153+
| 28GB
154+
155+
| 256GB
156+
| 512GB
157+
| 31GB
158+
| 180.20GB
159+
| 44.8GB
160+
161+
| 384GB
162+
| 768GB
163+
| 31GB
164+
| 293.20GB
165+
| 59.8GB
166+
167+
| 512GB
168+
| 1024GB
169+
| 31GB
170+
| 410.5GB
171+
| 70.5GB
172+
|===
