@leemthompo (Contributor) commented Oct 18, 2024

👁️ URL preview

  • Adds a new bite-sized tutorial to Search your data > Semantic search
  • This is a toy example for learning the syntax of ingesting a set of existing vectors. It tries to add enough links to relevant material for follow-up without causing cognitive overload.
  • I don't want to overload readers with information about the knn search side of things, but I still want to make sure users can get where they need to go next if they wanna drill down.

@leemthompo leemthompo added the >docs (General docs changes) label Oct 18, 2024
@leemthompo leemthompo self-assigned this Oct 18, 2024
@kderusso (Member) left a comment

Nice work!

<titleabbrev>Bring your own vector embeddings</titleabbrev>
++++

This tutorial demonstrates how to index documents that already have dense vector embeddings into {es}.

Member

Is it worth adding an example for sparse_vector embeddings here as well?

Contributor Author

I think it would be best to keep this tightly focused on dense vectors and investigate demand for a sparse vector equivalent going forward.

[[bring-your-own-vectors-search-documents]]
=== Step 3: Search documents with embeddings

Now you can query these document vectors using a <<knn-retriever,`knn` retriever>>.

Member

Nice to see retriever examples! 🎉
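
For readers following the thread, here is a minimal sketch of the kind of search body a `knn` retriever takes. It only builds and prints the JSON and does not contact a cluster. The helper name `build_knn_retriever_query`, the field name `review_vector`, and the three-element query vector are assumptions made for illustration and are not taken from the tutorial under review.

```python
import json


def build_knn_retriever_query(field, query_vector, k=3, num_candidates=10):
    """Build a search request body that uses a knn retriever.

    The helper and its defaults are hypothetical; only the body layout
    (retriever -> knn -> field/query_vector/k/num_candidates) is the point.
    """
    if k > num_candidates:
        # Elasticsearch rejects requests where num_candidates is less than k.
        raise ValueError("k must not exceed num_candidates")
    return {
        "retriever": {
            "knn": {
                "field": field,
                "query_vector": list(query_vector),
                "k": k,
                "num_candidates": num_candidates,
            }
        }
    }


# Hypothetical field name and vector; the vector length must match the
# `dims` value of the dense_vector field in the index mapping.
body = build_knn_retriever_query("review_vector", [0.1, 0.2, 0.3])
print(json.dumps(body, indent=2))
```

The resulting body would be sent to the `_search` endpoint of the index that holds the vectors.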

  • Added explanation for dims parameter
  • Separated single and bulk document indexing examples
  • Improved explanations and wording throughout
  • Added tip for beginners about semantic search
  • Mentioned client-side vector generation as an alternative

[TIP]
====
The `dense_vector` type supports quantization to reduce the memory footprint required when searching float vectors.
Learn more about balancing performance and accuracy in <<dense-vector-quantization,Dense vector quantization>>.

@jeffvestal commented Oct 23, 2024

Is it worth mentioning that we auto-quantize with int8_hnsw by default for the dense_vector field type?

Contributor Author

Good idea! 👍
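
To make the quantization point concrete, the sketch below builds an index mapping that sets `int8_hnsw` explicitly on a `dense_vector` field. Per the comment above, this is what the field type does by default for float vectors in the versions discussed here, so spelling it out mainly documents intent. The helper name `dense_vector_mapping`, the field name `review_vector`, the `dims` value of 8, and the `cosine` similarity are all illustrative assumptions.

```python
import json


def dense_vector_mapping(field, dims, quantized=True):
    """Build a create-index body with one dense_vector field.

    Hypothetical helper: `quantized=True` selects int8_hnsw, and
    `quantized=False` selects plain hnsw (unquantized float vectors).
    """
    index_options = {"type": "int8_hnsw" if quantized else "hnsw"}
    return {
        "mappings": {
            "properties": {
                field: {
                    "type": "dense_vector",
                    "dims": dims,  # must equal the length of every indexed vector
                    "index": True,
                    "similarity": "cosine",
                    "index_options": index_options,
                }
            }
        }
    }


# Field name and dimension count are assumptions for illustration.
mapping = dense_vector_mapping("review_vector", dims=8)
print(json.dumps(mapping, indent=2))
```

Passing `quantized=False` trades a larger memory footprint for unquantized vectors, which is the balance the quoted tip points readers toward.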

@leemthompo leemthompo added the auto-backport (Automatically create backport pull requests when merged), v8.15.0, v8.16.0, and Team:Docs (Meta label for docs team) labels Oct 24, 2024
@leemthompo leemthompo marked this pull request as ready for review October 24, 2024 11:26
@leemthompo leemthompo requested a review from kderusso October 24, 2024 11:26
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-docs (Team:Docs)

@leemthompo leemthompo requested a review from szabosteve October 24, 2024 13:59
@szabosteve (Contributor) left a comment

This is awesome, LGTM!

@leemthompo leemthompo changed the title from "[DOCS] Add BYO vectors ingestion tutorial" to "[DOCS][101] Add BYO vectors ingestion tutorial" Oct 24, 2024
@leemthompo leemthompo merged commit d500daf into elastic:main Oct 24, 2024
5 checks passed
@elasticsearchmachine (Collaborator)

💔 Backport failed

The backport operation could not be completed due to the following error:

An unexpected error occurred when attempting to backport this PR.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 115112

@leemthompo (Contributor Author)

💚 All backports created successfully

Branches: 8.16, 8.15

Questions ?

Please refer to the Backport tool documentation

leemthompo added a commit to leemthompo/elasticsearch that referenced this pull request Oct 24, 2024
@leemthompo (Contributor Author)

💚 All backports created successfully

Branch: 8.x

leemthompo added a commit to leemthompo/elasticsearch that referenced this pull request Oct 24, 2024
elasticsearchmachine pushed a commit that referenced this pull request Oct 24, 2024
elasticsearchmachine pushed a commit that referenced this pull request Oct 24, 2024
elasticsearchmachine pushed a commit that referenced this pull request Oct 24, 2024
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Oct 25, 2024
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Nov 4, 2024

Labels: auto-backport (Automatically create backport pull requests when merged), backport pending, >docs (General docs changes), Team:Docs (Meta label for docs team), v8.15.0, v8.16.0, v8.17.0, v9.0.0

5 participants