Commit c1e272a

ctindel and claude committed
Fix technical inaccuracies identified via ChatGPT validation
Corrects three technical issues found through external LLM validation:

1. Line 250: Fix HNSW m parameter description
   - OLD: "Number of bidirectional links per node in the HNSW graph"
   - NEW: "The number of neighbors each node will be connected to in the HNSW graph"
   - REASON: HNSW graphs in Elasticsearch use directional connections, not bidirectional

2. Line 174: Fix bbq_flat description
   - OLD: "Use disk-optimized BBQ for simpler use cases with fewer vectors"
   - NEW: "Use BBQ without HNSW for smaller datasets. This uses brute-force search and requires less compute resources during indexing but more during querying"
   - REASON: bbq_flat is NOT disk-optimized (it keeps data in memory). The term "disk-optimized" only applies to bbq_disk

3. Line 225: Add qualifier for int4 memory reduction
   - OLD: "which provides 8x memory reduction"
   - NEW: "which provides up to 8x memory reduction"
   - REASON: 8x is theoretical maximum; actual reduction varies by dataset

All changes validated via ChatGPT API technical review and approved by documentation owner.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
1 parent ba95752 commit c1e272a

1 file changed

+3
-3
lines changed

solutions/search/semantic-search/semantic-search-semantic-text.md

Lines changed: 3 additions & 3 deletions
@@ -171,7 +171,7 @@ PUT semantic-embeddings-flat
 }
 ```
 
-1. Use disk-optimized BBQ for simpler use cases with fewer vectors. This requires less compute resources during indexing.
+1. Use BBQ without HNSW for smaller datasets. This uses brute-force search and requires less compute resources during indexing but more during querying.
 
 For very large datasets where RAM is constrained, use `bbq_disk` (DiskBBQ) to minimize memory usage:
 
@@ -222,7 +222,7 @@ PUT semantic-embeddings-int8
 }
 ```
 
-1. Use 8-bit integer quantization for 4x memory reduction with high accuracy retention. For 4-bit quantization, use `"type": "int4_hnsw"` instead, which provides 8x memory reduction. For the full list of other available quantization options (including `int4_flat` and others), refer to the [`dense_vector` `index_options` documentation](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-index-options).
+1. Use 8-bit integer quantization for 4x memory reduction with high accuracy retention. For 4-bit quantization, use `"type": "int4_hnsw"` instead, which provides up to 8x memory reduction. For the full list of other available quantization options (including `int4_flat` and others), refer to the [`dense_vector` `index_options` documentation](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-index-options).
 
 For HNSW-specific tuning parameters like `m` and `ef_construction`, you can include them in the `index_options`:
 
@@ -247,7 +247,7 @@ PUT semantic-embeddings-custom
 }
 ```
 
-1. Number of bidirectional links per node in the HNSW graph. Higher values improve recall but increase memory usage. Default is 16.
+1. The number of neighbors each node will be connected to in the HNSW graph. Higher values improve recall but increase memory usage. Default is 16.
 2. Number of candidates considered during graph construction. Higher values improve index quality but slow down indexing. Default is 100.
 
 ::::{note}
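
Reviewer aside on the "up to 8x" wording in the line 225 change: the 4x and 8x figures follow from the bit widths alone, assuming the source vectors are 32-bit floats. The sketch below is back-of-the-envelope arithmetic, not Elasticsearch's actual storage accounting. The function names, the 384-dimension example, and the 4-byte per-vector overhead are hypothetical values chosen for illustration. It only shows why any fixed per-vector overhead pulls the real ratio below the theoretical maximum.

```python
def bytes_per_vector(dims: int, bits_per_dim: int, overhead_bytes: int = 0) -> float:
    """Raw storage for one vector: dims * bits / 8, plus any fixed overhead."""
    return dims * bits_per_dim / 8 + overhead_bytes


def reduction_factor(dims: int, bits_per_dim: int, overhead_bytes: int = 0) -> float:
    """Ratio of float32 storage to quantized storage for one vector."""
    return bytes_per_vector(dims, 32) / bytes_per_vector(dims, bits_per_dim, overhead_bytes)


DIMS = 384  # hypothetical embedding size, for illustration only

# Theoretical ratios with no overhead: 32/8 and 32/4.
int8_ratio = reduction_factor(DIMS, 8)
int4_ratio = reduction_factor(DIMS, 4)

# With an assumed 4 bytes of per-vector metadata, int4 lands below 8x.
int4_with_overhead = reduction_factor(DIMS, 4, overhead_bytes=4)

print(f"int8: {int8_ratio:.2f}x, int4: {int4_ratio:.2f}x, "
      f"int4 with overhead: {int4_with_overhead:.2f}x")
```

Memory used by the HNSW graph itself is outside this calculation and would lower the overall index-level ratio further, which is consistent with the hedged "up to" phrasing.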
