kosabogi
Contributor

@kosabogi kosabogi commented Oct 9, 2024

Overview

This PR expands the existing semantic_text tutorial by adding a hybrid search method.

  • Changed the structure of the tutorial.
  • Added detailed instructions for hybrid search, including index mapping creation, reindexing, and performing hybrid search queries.

Related Issue

https://github.com/elastic/search-docs-team/issues/191

Preview

Tutorial: semantic search and hybrid search with semantic_text

@kosabogi kosabogi requested a review from szabosteve October 9, 2024 10:23
Contributor

github-actions bot commented Oct 9, 2024

Documentation preview:

@elasticsearchmachine elasticsearchmachine added v9.0.0 needs:triage Requires assignment of a team area label external-contributor Pull request authored by a developer outside the Elasticsearch team labels Oct 9, 2024
@kosabogi kosabogi added Team:Docs Meta label for docs team and removed needs:triage Requires assignment of a team area label external-contributor Pull request authored by a developer outside the Elasticsearch team v9.0.0 labels Oct 9, 2024
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label and removed Team:Docs Meta label for docs team labels Oct 9, 2024
@kosabogi kosabogi added Team:Docs Meta label for docs team auto-backport Automatically create backport pull requests when merged v8.16.0 v9.0.0 and removed needs:triage Requires assignment of a team area label labels Oct 9, 2024
@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label and removed Team:Docs Meta label for docs team labels Oct 9, 2024
@kosabogi kosabogi added the Team:Docs Meta label for docs team label Oct 9, 2024
@elasticsearchmachine elasticsearchmachine removed the Team:Docs Meta label for docs team label Oct 9, 2024
@szabosteve szabosteve added >docs General docs changes Team:Docs Meta label for docs team labels Oct 9, 2024
@elasticsearchmachine
Collaborator

Pinging @elastic/es-docs (Team:Docs)

@elasticsearchmachine elasticsearchmachine removed the needs:triage Requires assignment of a team area label label Oct 9, 2024
Contributor

@szabosteve szabosteve left a comment

These suggestions will make the CI pass. It's not a comprehensive review.

@leemthompo
Contributor

leemthompo commented Oct 9, 2024

🚗 high-level impressions from me to help clarify this workflow for myself.

I think there should be just one index here, with a combination of semantic_text and standard text mappings. This eliminates a lot of redundancies, and simplifies the workflow. In my head this is how it should work, assuming we can also eliminate the reindexing step?

  • Create an inference endpoint to set up the model for generating embeddings (e.g., ELSER)
  • Set up one index with both standard text fields and semantic_text fields
  • Index your documents, which will automatically generate embeddings for semantic_text fields
    • 🤔 I wonder if the file upload indexing is overkill and we could just have a small inline indexing step to stay in the API flow?
  • Perform a semantic search query (existing content from tutorial)
  • Perform hybrid search query by combining a standard match query (for example) and a semantic query, using RRF to combine the results from both query types
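
To make the single-index workflow above concrete, here is a minimal sketch of the request bodies as Python dicts, plus a plain-Python version of reciprocal rank fusion (RRF) to show how the two result lists get combined. The endpoint ID `my-elser-endpoint` and the field names `content` and `semantic_content` are assumptions for illustration, not names from this PR. The body shapes follow the `semantic_text` mapping, the `semantic` query, and the `rrf` retriever as I understand them; check them against the docs for your version. `k=60` mirrors what I believe is the default rank constant.

```python
from collections import defaultdict

# Hypothetical names, not taken from the PR: endpoint ID and field names.
INFERENCE_ID = "my-elser-endpoint"


def index_mapping() -> dict:
    """Body for `PUT <index>`: one index with a text field and a semantic_text field."""
    return {
        "mappings": {
            "properties": {
                # Lexical field for standard match queries; copy_to feeds the
                # semantic_text field so embeddings are generated at index time.
                "content": {"type": "text", "copy_to": "semantic_content"},
                "semantic_content": {
                    "type": "semantic_text",
                    "inference_id": INFERENCE_ID,
                },
            }
        }
    }


def hybrid_search_body(user_query: str) -> dict:
    """Body for `GET <index>/_search`: match + semantic queries fused with RRF."""
    return {
        "retriever": {
            "rrf": {
                "retrievers": [
                    {"standard": {"query": {"match": {"content": user_query}}}},
                    {
                        "standard": {
                            "query": {
                                "semantic": {
                                    "field": "semantic_content",
                                    "query": user_query,
                                }
                            }
                        }
                    },
                ]
            }
        }
    }


def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Plain-Python reciprocal rank fusion: score(d) = sum of 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

With one index there is no second mapping to maintain, and the same query string goes to both retrievers; `rrf_fuse` only illustrates the ranking arithmetic that Elasticsearch would perform server-side.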

@kosabogi
Contributor Author

kosabogi commented Oct 10, 2024

Thank you for the feedback! 🙌 You made some really good points, and I can see how we can simplify the workflow as you suggested. I’ll start working on the changes based on your suggestions.

Just a quick question about the reindexing step: if we index documents through the API, I assume the embeddings are generated automatically, since the destination index mapping has the semantic_text field. Would that make reindexing unnecessary? Did I get that right?
However (please correct me if I'm wrong), this approach might only work for new documents. For existing indices that users want to search, wouldn't reindexing still be needed to generate the embeddings? What do you think?
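
For the existing-index case, a minimal sketch of what a reindex request body could look like, written as a Python dict. The index names and the batch size are hypothetical; the assumption is that the destination index already has the semantic_text mapping, so embeddings are generated as documents arrive.

```python
def reindex_body(source_index: str, dest_index: str, batch_size: int = 10) -> dict:
    """Body for `POST _reindex`: copy documents into an index with a semantic_text field."""
    return {
        # A small batch size keeps each bulk request, and so each round of
        # inference calls, short; the value here is illustrative only.
        "source": {"index": source_index, "size": batch_size},
        "dest": {"index": dest_index},
    }
```

For a large source index, running the request with `wait_for_completion=false` and polling the returned task is probably the safer choice.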

Contributor

@szabosteve szabosteve left a comment

LGTM!

@kosabogi kosabogi merged commit 7bd6f2c into elastic:main Oct 14, 2024
5 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

The backport operation could not be completed due to the following error:

An unexpected error occurred when attempting to backport this PR.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 114398

davidkyle pushed a commit to davidkyle/elasticsearch that referenced this pull request Oct 14, 2024
* Creates a new page for the hybrid search tutorial

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Adds search response example

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

* Update docs/reference/search/search-your-data/semantic-text-hybrid-search

Co-authored-by: István Zoltán Szabó <[email protected]>

---------

Co-authored-by: István Zoltán Szabó <[email protected]>
kosabogi added a commit to kosabogi/elasticsearch that referenced this pull request Oct 15, 2024
kosabogi added a commit to kosabogi/elasticsearch that referenced this pull request Oct 15, 2024
davidkyle pushed a commit that referenced this pull request Oct 15, 2024
kosabogi added a commit that referenced this pull request Oct 15, 2024
* Creates a new page for the hybrid search tutorial
---------
Co-authored-by: István Zoltán Szabó <[email protected]>
kosabogi added a commit that referenced this pull request Oct 15, 2024
… (#114804)

* Expands semantic_text tutorial with hybrid search (#114398)

Co-authored-by: István Zoltán Szabó <[email protected]>
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Oct 25, 2024
Labels

auto-backport Automatically create backport pull requests when merged backport pending >docs General docs changes Team:Docs Meta label for docs team v8.16.0 v9.0.0

4 participants