[DOCS] Add search and filtering tutorial/quickstart, edit filtering page #114353
Documentation preview:
@elasticmachine update branch
Nice start so far!
[TIP]
====
Full-text search is powered by <<analysis,text analysis>>.
Text analysis normalizes and standardizes text data so it can be efficiently stored in an inverted index and searched in near real-time.
Link to docs defining inverted index here?
We don't have this today :sad_panda:.
I would love to have a page about data structures in the basics section that summarizes how Elasticsearch stores different data types in optimized structures: inverted index for text, BKD trees for geospatial data, HNSW for vectors etc.
A lot of this core stuff is buried in blogs.
Another disappointing discovery from doing this work: at some point the Definitive Guide was deprecated, but not all of the essential basics were ported over into the docs!
See https://github.com/elastic/elasticsearch-definitive-guide/blob/master/052_Mapping_Analysis/35_Inverted_index.asciidoc. Need to restore this to the docs ASAP in the right spot.
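For readers who haven't met the term, here is a toy sketch of what an inverted index does. It is not how Elasticsearch or Lucene implement it; the `analyze` function is a deliberately crude stand-in for text analysis, and the document titles are made up for illustration.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = analyze(query)
    if not terms:
        return set()
    # Start from the postings of the first term, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical documents, used only for this sketch.
docs = {
    1: "Perfect Pancakes: A Fluffy Breakfast Delight",
    2: "Spicy Thai Green Curry",
    3: "Fluffy vegan pancakes",
}
index = build_inverted_index(docs)
print(search(index, "fluffy PANCAKES"))
```

Because the query goes through the same `analyze` step as the documents, differences in case and punctuation do not prevent a match. That is the normalization the TIP above alludes to.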
[source,console]
----
POST /cooking_blog/_bulk
Would this make more sense in a notebook?
You mean the whole tutorial? For sure, the plan is to have notebook versions too once we have a few of these :)
Co-authored-by: Kathleen DeRusso <[email protected]>

* `filter` or `must_not` parameters in <<query-dsl-bool-query,`bool`>> queries
* `filter` parameter in <<query-dsl-constant-score-query,`constant_score`>> queries
* <<search-aggregations-bucket-filter-aggregation,`filter`>> aggregations
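A minimal sketch of the first case in this list: a `bool` query whose `must` clause runs in query context (scored) while its `filter` clauses run in filter context (yes/no, unscored). The field names `title`, `category.keyword`, and `date`, and the sample values, are assumptions for illustration only. The code just builds the request body; it does not talk to a cluster.

```python
import json


def build_filtered_query(text, category, since):
    """Build a search body that scores on `text` and filters on category and date.

    Field names are hypothetical; adjust them to your own mapping.
    """
    return {
        "query": {
            "bool": {
                # Query context: contributes to the relevance score.
                "must": [{"match": {"title": text}}],
                # Filter context: documents either match or they don't.
                "filter": [
                    {"term": {"category.keyword": category}},
                    {"range": {"date": {"gte": since}}},
                ],
            }
        }
    }


body = build_filtered_query("curry", "Main Course", "2023-05-01")
print(json.dumps(body, indent=2))
```

The printed JSON could be sent as the body of `GET /cooking_blog/_search`, assuming the index has fields with these names.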
Should we add `knn` here as an example as well?
It might be a bit too much in this 101 searching and filtering intro! This is the first tentative toe in the search paddling pool. 🦶
Currently I'm thinking the next basics "quickstart" could be a `semantic_text` quickstart (maybe ELSER + hybrid via rrf). That would be a toy example, API-only, with no reindexing or anything, just the happiest path, and then link to Search your data for the full-blown semantic search examples.
I think there might be a little too much in this first draft already, let's see how we feel once I've incorporated the first round of feedback.
@carlosdelest sorry I forgot that I edited this page too, so I glanced over this comment as if it related to the tutorial!
Indeed we should add whatever else is missing from this query_filter_context.asciidoc page too 😄
----
GET /cooking_blog/_search
{
"size": 0,
A comment on why we're using size: 0 would be interesting I think - we don't want to return search results, but just the aggregated information on all of them.
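To make the `size: 0` point concrete, here is a hedged sketch of an aggregation-only request body. The aggregation name `recipes_per_category` and the field `category.keyword` are assumptions, not taken from the tutorial; the code only builds the body and does not contact a cluster.

```python
import json


def build_category_counts_request(field="category.keyword"):
    """Build a request body that returns bucket counts but no search hits.

    The default field name is hypothetical.
    """
    return {
        # size: 0 suppresses the hits array; only aggregation results come back.
        "size": 0,
        "aggs": {
            "recipes_per_category": {
                "terms": {"field": field}
            }
        },
    }


request_body = build_category_counts_request()
print(json.dumps(request_body, indent=2))
```

With `size` left at its default, the response would also carry the top matching documents, which is wasted work when only the summary is wanted.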
Out of scope now that we've decided to trim aggregations from this tutorial, but relevant for follow-up aggregations tutorial
Co-authored-by: Carlos Delgado <[email protected]>
Great start, thank you for doing this! I think the tutorial could benefit from a tighter "story arc", so to speak. Since this is targeted at a brand new user, I say we try to cover less material with more detail, including putting the search responses in the documentation so that the user can see matching docs and we can explain why they match. I think it could also help to create a business scenario for each section to ground the tutorial in the real world.
We can stretch this material across multiple pages too. For example, there's no need to cover aggregations in the "searching and filtering" page, we can make an aggregation-specific tutorial page instead.
[discrete]
=== Perform basic aggregations

Aggregations provide summary statistics and analytics on your search results.
Can we link to a page with more detailed info on aggregations here?
Out of scope now; we'll have a follow-up aggregations-heavy tutorial, methinks.
Added detailed examples of how to configure the `cooking_blog` index with `text` and `keyword` fields for different properties, improving exact matching and filtering capabilities. Expanded explanations and added concrete examples and responses to illustrate how `match` and `multi_match` queries work. Provided detailed examples of how to use `filter` in queries, including exact matches for categories and date ranges. Added a detailed example of a complex `bool` query combining multiple search criteria, tailored to the cooking blog scenario. Added a "Learn more" section.
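As a rough illustration of the difference between the `match` and `multi_match` queries mentioned in this summary, the sketch below builds one request body of each kind. The field names `title`, `description`, and `tags`, and the `^2` boost, are assumptions and are not copied from the tutorial.

```python
import json


def build_match_query(field, text):
    """Full-text search against a single field."""
    return {"query": {"match": {field: {"query": text}}}}


def build_multi_match_query(fields, text):
    """Full-text search across several fields with one query string.

    A field may carry a boost suffix such as "title^2" to weight it more heavily.
    """
    return {"query": {"multi_match": {"query": text, "fields": list(fields)}}}


# Hypothetical field names, used only for this sketch.
single_field = build_match_query("description", "fluffy pancakes")
several_fields = build_multi_match_query(["title^2", "description", "tags"], "vegetarian curry")
print(json.dumps(single_field, indent=2))
print(json.dumps(several_fields, indent=2))
```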
Tried to generally sharpen this tutorial and make it more user-focused, removing non-essential steps and going into more detail, with more annotated responses, in 92e47af.
…-fields explanation.
Looks great, nice progress!
<titleabbrev>Basics: Full-text search and filtering</titleabbrev>
++++

This is a hands-on introduction to the basics of full-text search with {es}, also known as _lexical search_, using the <<search-search,`_search` API>> and <<query-dsl,Query DSL>>.
Philosophical question here, but should we refactor new tutorials to use retrievers now that they will be GA soon?
Maybe in future tutorials. Honestly, I'd like an aligned answer from product, because the query language decision for new users is getting complex between Query DSL, retrievers, and ES|QL.
docs/reference/quickstart/full-text-filtering-tutorial.asciidoc
@kderusso random question: when or where in a new user's journey would it be helpful to introduce the explain API?
@leemthompo TBH, I could see the |
@elasticmachine update branch
Great work! I left a few comments about wording and technical edge-cases.
For the record @kderusso saved the day by finding that the docstests CI checks were failing due to invalid JSON 🥇
💔 Backport failed
The backport operation could not be completed due to the following error:
You can use sqren/backport to manually backport by running
…age (elastic#114353) (cherry picked from commit 0d8d8bd)
💚 All backports created successfully
Questions? Please refer to the Backport tool documentation.