Merged
22 changes: 14 additions & 8 deletions docs/reference/how-to/knn-search.asciidoc
@@ -45,6 +45,12 @@ results contains the full document `_source`. When the documents contain
high-dimensional `dense_vector` fields, the `_source` can be quite large and
expensive to load. This could significantly slow down the speed of kNN search.

NOTE: <<docs-reindex, reindex>>, <<docs-update, update>>,

[Review comment, Contributor] Should this "NOTE" be after a paragraph where we suggest disabling source?

and <<docs-update-by-query, update by query>> operations generally
require the `_source` field. Disabling `_source` for a field might result in
expected behavior for these operations. For example, reindex might not actually

[Review comment, Contributor] the train has left the station but typo "might result in expected behavior"

contain the `dense_vector` field in the new index.

You can disable storing `dense_vector` fields in the `_source` through the
<<include-exclude, `excludes`>> mapping parameter. This prevents loading and
returning large vectors during search, and also cuts down on the index size.
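
As a rough illustration of the `excludes` approach described in this hunk, the sketch below builds a create-index request body that keeps a `dense_vector` field out of `_source`. The field name `my_vector`, the dimension count `3`, and the helper `build_mapping` are hypothetical and chosen only for illustration. The body is printed, not sent to a cluster.

```python
import json


def build_mapping(vector_field: str, dims: int) -> dict:
    """Build an index-creation body that excludes a vector field from _source.

    Both arguments are illustrative; substitute your own field name and size.
    """
    return {
        "mappings": {
            # Keep the large vector out of the stored _source.
            "_source": {"excludes": [vector_field]},
            "properties": {
                vector_field: {"type": "dense_vector", "dims": dims},
            },
        }
    }


# Hypothetical field name and dimension count.
body = build_mapping("my_vector", 3)
print(json.dumps(body, indent=2))
```

As the NOTE in this hunk says, reindex, update, and update by query generally rely on `_source`, so a mapping like this trades those operations for smaller and faster search responses.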
@@ -102,14 +108,14 @@ merges smaller segments into larger ones through a background
explicit steps to reduce the number of index segments.

[discrete]
==== Force merge to one segment

The <<indices-forcemerge,force merge>> operation forces an index merge. If you
force merge to one segment, the kNN search only need to check a single,
all-inclusive HNSW graph. Force merging `dense_vector` fields is an expensive
operation that can take significant time to complete.

include::{es-ref-dir}/indices/forcemerge.asciidoc[tag=force-merge-read-only-warn]
==== Increase maximum segment size

{es} provides many tunable settings for controlling the merge process. One
important setting is `index.merge.policy.max_merged_segment`. This controls
the maximum size of the segments that are created during the merge process.
By increasing the value, you can reduce the number of segments in the index.
The default value is `5GB`, but that might be too small for larger dimensional vectors.
Consider increasing this value to `10GB` or `20GB` can help reduce the number of segments.
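
A minimal sketch of what raising `index.merge.policy.max_merged_segment` might look like as an index settings body, using the `10GB` value suggested in the added text (written here as `10gb`). The helper `build_merge_settings` is hypothetical. The sketch assumes the body would be supplied through the index settings API or at index creation, and it only prints the JSON.

```python
import json


def build_merge_settings(max_segment: str) -> dict:
    """Build a settings body that raises the maximum merged segment size.

    The nested keys spell out index.merge.policy.max_merged_segment.
    """
    return {
        "index": {
            "merge": {
                "policy": {"max_merged_segment": max_segment},
            }
        }
    }


# "10gb" mirrors the 10GB value suggested in the text above.
settings = build_merge_settings("10gb")
print(json.dumps(settings, indent=2))
```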

[discrete]
==== Create large segments during bulk indexing