[DOCS][101] Add BYO vectors ingestion tutorial #115112
Nice work!
<titleabbrev>Bring your own vector embeddings</titleabbrev>
++++

This tutorial demonstrates how to index documents that already have dense vector embeddings into {es}.
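As a rough illustration of the workflow this sentence describes, the sketch below builds a `dense_vector` mapping and a bulk NDJSON payload for documents that already carry embeddings. The index name, field names, and vector values are hypothetical and are not taken from the tutorial; the script only constructs request bodies and does not contact a cluster.

```python
import json

# Hypothetical names for illustration; not taken from the tutorial.
INDEX_NAME = "my-vector-index"
VECTOR_FIELD = "embedding"
DIMS = 4  # must equal the length of every vector you index

# Request body for `PUT /my-vector-index`.
index_body = {
    "mappings": {
        "properties": {
            VECTOR_FIELD: {
                "type": "dense_vector",
                "dims": DIMS,
                "index": True,
                "similarity": "cosine",
            },
            "text": {"type": "text"},
        }
    }
}


def build_bulk_body(docs, dims=DIMS):
    """Return an NDJSON payload for `POST /<index>/_bulk`."""
    lines = []
    for doc_id, doc in docs.items():
        vector = doc[VECTOR_FIELD]
        # Catch dimension mismatches before the request is sent.
        if len(vector) != dims:
            raise ValueError(
                f"doc {doc_id!r}: expected {dims} dims, got {len(vector)}"
            )
        lines.append(json.dumps({"index": {"_id": doc_id}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the payload to end with a newline.
    return "\n".join(lines) + "\n"


docs = {
    "1": {"text": "first document", VECTOR_FIELD: [0.1, 0.2, 0.3, 0.4]},
    "2": {"text": "second document", VECTOR_FIELD: [0.4, 0.3, 0.2, 0.1]},
}
bulk_body = build_bulk_body(docs)

print(f"PUT /{INDEX_NAME}")
print(json.dumps(index_body, indent=2))
print(f"POST /{INDEX_NAME}/_bulk")
print(bulk_body, end="")
```

Checking vector length on the client is optional, but it surfaces a mismatch with `dims` before the request reaches the cluster.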
Is it worth adding an example for `sparse_vector` embeddings here as well?
I think it would be best to keep this tightly focused on dense vectors and investigate demand for a sparse vector equivalent going forward.
[[bring-your-own-vectors-search-documents]]
=== Step 3: Search documents with embeddings

Now you can query these document vectors using a <<knn-retriever,`knn` retriever>>.
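A minimal sketch of what such a search request body can look like, built in Python so it can be inspected without a running cluster. The field name, index name, and query vector are hypothetical; only the `retriever.knn` structure and its `field`, `query_vector`, `k`, and `num_candidates` parameters come from the {es} search API.

```python
import json


def build_knn_search(field, query_vector, k, num_candidates):
    """Return a `_search` request body that uses a `knn` retriever."""
    # Elasticsearch rejects requests where num_candidates is less than k.
    if num_candidates < k:
        raise ValueError("num_candidates must be greater than or equal to k")
    return {
        "retriever": {
            "knn": {
                "field": field,
                "query_vector": list(query_vector),
                "k": k,
                "num_candidates": num_candidates,
            }
        }
    }


# Hypothetical field name and vector; the vector length must match the
# `dims` value in the field mapping.
search_body = build_knn_search(
    field="embedding",
    query_vector=[0.1, 0.2, 0.3, 0.4],
    k=2,
    num_candidates=5,
)

print("POST /my-vector-index/_search")
print(json.dumps(search_body, indent=2))
```

Here `k` is the number of nearest neighbors returned, and `num_candidates` is how many candidates each shard considers; raising it generally trades speed for accuracy.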
Nice to see retriever examples! 🎉
Added explanation for dims parameter
Separated single and bulk document indexing examples
Improved explanations and wording throughout
Added tip for beginners about semantic search
Mentioned client-side vector generation as an alternative
[TIP]
====
The `dense_vector` type supports quantization to reduce the memory footprint required when searching float vectors.
Learn more about balancing performance and accuracy in <<dense-vector-quantization,Dense vector quantization>>.
Is it worth mentioning that we auto-quantize with `int8_hnsw` by default for the `dense_vector` field type?
Good idea! 👍
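To make the quantization point in this thread concrete, here is a hedged sketch that sets `index_options.type` to `int8_hnsw` explicitly on a `dense_vector` field and compares rough per-vector storage for float32 and int8 values. The dimension count and corpus size are made-up inputs, and the estimate ignores HNSW graph overhead and any per-vector metadata, so treat the numbers as an order-of-magnitude guide only.

```python
import json

FLOAT32_BYTES = 4  # size of one float32 component
INT8_BYTES = 1     # size of one int8-quantized component


def mapping_with_quantization(dims, index_type="int8_hnsw"):
    """Return a dense_vector field mapping with an explicit index type."""
    return {
        "type": "dense_vector",
        "dims": dims,
        "index": True,
        "similarity": "cosine",
        "index_options": {"type": index_type},
    }


def vector_bytes(num_vectors, dims, bytes_per_dim):
    """Rough size of the vector values alone, ignoring all other overhead."""
    return num_vectors * dims * bytes_per_dim


# Hypothetical corpus: one million 384-dimensional vectors.
field_mapping = mapping_with_quantization(dims=384)
raw_bytes = vector_bytes(1_000_000, 384, FLOAT32_BYTES)
quantized_bytes = vector_bytes(1_000_000, 384, INT8_BYTES)

print(json.dumps(field_mapping, indent=2))
print(f"float32: {raw_bytes / 1e6:.0f} MB, int8: {quantized_bytes / 1e6:.0f} MB")
```

Spelling out `index_options` is not required if the default already suits you; it mainly documents the choice in the mapping.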
Pinging @elastic/es-docs (Team:Docs)
This is awesome, LGTM!
💔 Backport failed
The backport operation could not be completed due to the following error:
You can use sqren/backport to manually backport by running
(cherry picked from commit d500daf)
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation
(cherry picked from commit d500daf)