Fix technical inaccuracies identified via ChatGPT validation

ctindel · claude · ctindel · commit c1e272ae99e0 · 2025-11-12T16:49:59.000Z
Corrects three technical issues found through external LLM validation:

1. Line 250: Fix HNSW m parameter description
   - OLD: "Number of bidirectional links per node in the HNSW graph"
   - NEW: "The number of neighbors each node will be connected to in the HNSW graph"
   - REASON: HNSW graphs in Elasticsearch use directional connections, not bidirectional

2. Line 174: Fix bbq_flat description
   - OLD: "Use disk-optimized BBQ for simpler use cases with fewer vectors"
   - NEW: "Use BBQ without HNSW for smaller datasets. This uses brute-force search and requires less compute resources during indexing but more during querying"
   - REASON: bbq_flat is NOT disk-optimized (it keeps data in memory). The term "disk-optimized" only applies to bbq_disk

3. Line 225: Add qualifier for int4 memory reduction
   - OLD: "which provides 8x memory reduction"
   - NEW: "which provides up to 8x memory reduction"
   - REASON: 8x is theoretical maximum; actual reduction varies by dataset

All changes validated via ChatGPT API technical review and approved by documentation owner.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
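
The "up to 8x" qualifier in item 3 can be checked with back-of-envelope arithmetic. The sketch below is not taken from the Elasticsearch codebase: it assumes a float32 baseline of 32 bits per dimension, counts only the raw vector payload, and uses a hypothetical `overhead_bytes` parameter to stand in for any per-vector metadata a real index might store. The dimension count of 1024 is also just an illustrative value.

```python
# Rough per-vector storage for different element encodings.
# Assumption: float32 baseline; graph structures and any retained
# full-precision copies are ignored entirely.
BITS_PER_DIM = {"float32": 32, "int8": 8, "int4": 4}


def payload_bytes(dims: int, encoding: str, overhead_bytes: float = 0.0) -> float:
    """Bytes for one vector: packed elements plus a hypothetical fixed overhead."""
    return dims * BITS_PER_DIM[encoding] / 8 + overhead_bytes


def reduction_vs_float32(dims: int, encoding: str, overhead_bytes: float = 0.0) -> float:
    """Ratio of raw float32 size to the quantized size (higher means smaller)."""
    return payload_bytes(dims, "float32") / payload_bytes(dims, encoding, overhead_bytes)


if __name__ == "__main__":
    dims = 1024  # illustrative dimension count, not from the source
    for encoding in ("int8", "int4"):
        ideal = reduction_vs_float32(dims, encoding)
        with_overhead = reduction_vs_float32(dims, encoding, overhead_bytes=4.0)
        print(f"{encoding}: ideal {ideal:.2f}x, with 4 bytes overhead {with_overhead:.2f}x")
```

With zero overhead the ratios are exactly 4x for `int8` and 8x for `int4`; any nonzero per-vector overhead pulls the measured ratio below those ceilings, which is why the reworded sentence treats 8x as an upper bound.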
diff --git a/solutions/search/semantic-search/semantic-search-semantic-text.md b/solutions/search/semantic-search/semantic-search-semantic-text.md
@@ -171,7 +171,7 @@ PUT semantic-embeddings-flat
 }
 ```
 
-1. Use disk-optimized BBQ for simpler use cases with fewer vectors. This requires less compute resources during indexing.
+1. Use BBQ without HNSW for smaller datasets. This uses brute-force search and requires less compute resources during indexing but more during querying.
 
 For very large datasets where RAM is constrained, use `bbq_disk` (DiskBBQ) to minimize memory usage:
 
@@ -222,7 +222,7 @@ PUT semantic-embeddings-int8
 }
 ```
 
-1. Use 8-bit integer quantization for 4x memory reduction with high accuracy retention. For 4-bit quantization, use `"type": "int4_hnsw"` instead, which provides 8x memory reduction. For the full list of other available quantization options (including `int4_flat` and others), refer to the [`dense_vector` `index_options` documentation](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-index-options).
+1. Use 8-bit integer quantization for 4x memory reduction with high accuracy retention. For 4-bit quantization, use `"type": "int4_hnsw"` instead, which provides up to 8x memory reduction. For the full list of other available quantization options (including `int4_flat` and others), refer to the [`dense_vector` `index_options` documentation](elasticsearch://reference/elasticsearch/mapping-reference/dense-vector.md#dense-vector-index-options).
 
 For HNSW-specific tuning parameters like `m` and `ef_construction`, you can include them in the `index_options`:
 
@@ -247,7 +247,7 @@ PUT semantic-embeddings-custom
 }
 ```
 
-1. Number of bidirectional links per node in the HNSW graph. Higher values improve recall but increase memory usage. Default is 16.
+1. The number of neighbors each node will be connected to in the HNSW graph. Higher values improve recall but increase memory usage. Default is 16.
 2. Number of candidates considered during graph construction. Higher values improve index quality but slow down indexing. Default is 100.
 
 ::::{note}
