diff --git a/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md b/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md index 8c802a3dd3..1c6969348f 100644 --- a/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md +++ b/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md @@ -121,14 +121,18 @@ You can check the current value in `KiB` using `lsblk -o NAME,RA,MOUNTPOINT,TYPE `blockdev` expects values in 512 byte sectors whereas `lsblk` reports values in `KiB`. As an example, to temporarily set readahead to `128KiB` for `/dev/nvme0n1`, specify `blockdev --setra 256 /dev/nvme0n1`. :::: - ## Use Direct IO when the vector data does not fit in RAM + ```{applies_to} stack: preview 9.1 serverless: unavailable ``` + If your indices are of type `bbq_hnsw` and your nodes don't have enough off-heap RAM to store all vector data in memory, you may see very high query latencies. Vector data includes the HNSW graph, quantized vectors, and raw float32 vectors. -In these scenarios, direct IO can significantly reduce query latency. Enable it by setting the JVM option `vector.rescoring.directio=true` on all vector search nodes in your cluster. +In these scenarios, direct IO can significantly reduce query latency. +Enable it by setting the JVM option `vector.rescoring.directio=true` on all vector search nodes in your cluster. {applies_to}`stack: preview 9.1.1` + +`vector.rescoring.directio` is enabled by default. {applies_to}`stack: preview 9.1.0` Only use this option if you're experiencing very high query latencies on indices of type `bbq_hnsw`. Otherwise, enabling direct IO may increase your query latencies.