diff --git a/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md b/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md
index 85f9ae8c81..8c802a3dd3 100644
--- a/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md
+++ b/deploy-manage/production-guidance/optimize-performance/approximate-knn-search.md
@@ -122,3 +122,13 @@ You can check the current value in `KiB` using `lsblk -o NAME,RA,MOUNTPOINT,TYPE
 ::::
 
 
+## Use Direct IO when the vector data does not fit in RAM
+```{applies_to}
+stack: preview 9.1
+serverless: unavailable
+```
+If your indices use the `bbq_hnsw` index type and your nodes don't have enough off-heap RAM to hold all vector data, you may see very high query latencies. Vector data includes the HNSW graph, the quantized vectors, and the raw float32 vectors.
+
+In this scenario, direct IO can significantly reduce query latency. To enable it, set the JVM option `vector.rescoring.directio=true` on every vector search node in your cluster.
+
+Only enable this option if you're experiencing very high query latencies on `bbq_hnsw` indices. In all other cases, enabling direct IO may increase query latency.