Hello,
Long story short: in our RAG app, data ingestion can happen while the user is using the application.
We have an embeddings node type that uses inheritance and a few other things, but that doesn't matter here.
We have a vector index on this type, and we use vectorNeighbors to search for the chunks that most closely match the user's question.
The first time we ask a question, it takes a long time (2-9 minutes); subsequent questions are answered quickly.
If we add more data to the database, the next question again takes around 2-9 minutes, even if the addition is really small.
Logs:
2026-03-18 15:27:51.386 INFO [LSMVectorIndex] <ArcadeDB_0> Graph build validating: 18803/18803 (vector accesses=0, heap=575,4/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:27:51.393 INFO [LSMVectorIndex] <ArcadeDB_0> Building graph with 18803 vectors using property 'vector' (cache enabled: size=100000)
2026-03-18 15:27:51.396 INFO [LSMVectorIndex] <ArcadeDB_0> Building JVector graph index with 18803 vectors for index: CHUNK_EMBEDDING_0_32963829291000
2026-03-18 15:27:56.466 INFO [LSMVectorIndex] Graph build building: 17306/18803 (vector accesses=17322, heap=395,8/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:28:01.515 INFO [LSMVectorIndex] Graph build building: 17333/18803 (vector accesses=17349, heap=400,6/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:28:06.562 INFO [LSMVectorIndex] Graph build building: 17530/18803 (vector accesses=17546, heap=330,2/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:28:11.587 INFO [LSMVectorIndex] Graph build building: 17627/18803 (vector accesses=17643, heap=579,2/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:28:32.464 INFO [LSMVectorIndex] Graph build building: 18363/18803 (vector accesses=18379, heap=634,2/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:28:37.498 INFO [LSMVectorIndex] Graph build building: 18463/18803 (vector accesses=18479, heap=535,7/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:28:42.531 INFO [LSMVectorIndex] Graph build building: 18708/18803 (vector accesses=18723, heap=420,5/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:28:46.354 INFO [LSMVectorIndex] Graph build building: 18803/18803 (vector accesses=18813, heap=525,2/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:28:51.986 INFO [LSMVectorIndex] <ArcadeDB_0> JVector graph index built successfully
2026-03-18 15:28:51.986 INFO [LSMVectorIndex] <ArcadeDB_0> Graph build persisting: 0/18803 (vector accesses=0, heap=472,9/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:28:51.987 INFO [LSMVectorIndexGraphFile] <ArcadeDB_0> Starting graph write (sequential) with chunking: 18803 nodes, 50MB chunk size
2026-03-18 15:28:51.988 INFO [LSMVectorIndexGraphFile] <ArcadeDB_0> Writing graph WITHOUT inline vectors - topology only (vectors fetched from documents on-demand)
2026-03-18 15:28:52.057 INFO [LSMVectorIndexGraphFile] <ArcadeDB_0> Graph written to pages (sequential): 18803 nodes, 4964596 bytes, 18 pages (topology only, vectors in documents)
2026-03-18 15:28:52.057 INFO [LSMVectorIndex] <ArcadeDB_0> Graph build persisting: 18803/18803 (vector accesses=0, heap=524,9/8120,0MB, offheap=0,8MB, files=4,8MB [idx=0,3, graph=4,5, pq=0,0, compacted=0,0])
2026-03-18 15:28:52.058 INFO [LSMVectorIndex] <ArcadeDB_0> Built graph for index: CHUNK_EMBEDDING_0_32963829291000
2026-03-18 15:28:52.068 INFO [LSMVectorIndex] <ArcadeDB_0> GraphSearcher returned 100 nodes, graphSize=18803, vectorsSize=18803, ordinalToVectorIdLength=18803
2026-03-18 15:28:52.068 INFO [LSMVectorIndex] <ArcadeDB_0> Vector search returned 100 results (skipped: 0 out of bounds, 0 deleted/null)
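To put a number on the rebuild in this excerpt, here is a minimal Python sketch that computes the time between the first "Graph build validating" line and the "Built graph for index" line. It assumes every log line starts with a 23-character `YYYY-MM-DD HH:MM:SS.mmm` timestamp, as in the lines above. The names `LOG_LINES`, `parse_timestamp` and `elapsed_seconds` are made up for illustration, and the two sample lines are shortened copies of lines from the excerpt.

```python
from datetime import datetime

# Shortened copies of the first and the "Built graph" lines from the excerpt.
LOG_LINES = [
    "2026-03-18 15:27:51.386 INFO [LSMVectorIndex] <ArcadeDB_0> Graph build validating: 18803/18803",
    "2026-03-18 15:28:52.058 INFO [LSMVectorIndex] <ArcadeDB_0> Built graph for index: CHUNK_EMBEDDING_0_32963829291000",
]


def parse_timestamp(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS.mmm' timestamp of a log line."""
    return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")


def elapsed_seconds(lines: list[str]) -> float:
    """Return the seconds between the earliest and latest timestamps."""
    stamps = [parse_timestamp(line) for line in lines if line.strip()]
    return (max(stamps) - min(stamps)).total_seconds()


rebuild_seconds = elapsed_seconds(LOG_LINES)
print(f"graph rebuild took {rebuild_seconds:.3f} s")
```

For this particular excerpt the result is about one minute. Running the same function over the full server log of a slow query would show how much of the 2-9 minutes is spent in the graph build itself.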
My two coworkers are working on a reproducible example as I write this post.