Contributor

@leemthompo leemthompo commented Oct 8, 2024

URL PREVIEW

  • First draft was very WIPpy
  • Based on SME feedback, 92e47af represents a significant tightening and focusing of the tutorial
  • Includes minor update to the filtering overview 🚗

Todo

  • Change filename, ID

@leemthompo leemthompo self-assigned this Oct 8, 2024
Contributor

github-actions bot commented Oct 8, 2024

Documentation preview:

@leemthompo
Contributor Author

@elasticmachine update branch

Member

@kderusso kderusso left a comment

Nice start so far!

[TIP]
====
Full-text search is powered by <<analysis,text analysis>>.
Text analysis normalizes and standardizes text data so it can be efficiently stored in an inverted index and searched in near real-time.
Member

Link to docs defining inverted index here?

Contributor Author

We don't have this today :sad_panda:.

I would love to have a page about data structures in the basics section that summarizes how Elasticsearch stores different data types in optimized structures: inverted index for text, BKD trees for geospatial data, HNSW for vectors etc.

A lot of this core stuff is buried in blogs.

Another disappointing discovery from doing this work: at some point the Definitive Guide was deprecated, but not all of the essential basics were ported over into the docs!

See https://github.com/elastic/elasticsearch-definitive-guide/blob/master/052_Mapping_Analysis/35_Inverted_index.asciidoc. Need to restore this to the docs ASAP in the right spot.
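
For readers following this thread, a toy illustration of the inverted index idea mentioned above may help. This is a minimal Python sketch, not how Elasticsearch or Lucene implement it: the `analyze` and `build_inverted_index` helpers and the sample documents are invented for illustration, and real text analysis does far more than lowercasing and splitting.

```python
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer (assumption): lowercase, then split on non-alphanumerics."""
    return [token for token in re.split(r"[^a-z0-9]+", text.lower()) if token]


def build_inverted_index(docs):
    """Map each term to the sorted list of document IDs that contain it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


# Hypothetical sample documents, loosely in the spirit of a cooking blog.
docs = {
    1: "Quick vegan chili",
    2: "Vegan chocolate cake",
    3: "Quick chocolate mousse",
}

index = build_inverted_index(docs)
print(index["vegan"])      # [1, 2]
print(index["chocolate"])  # [2, 3]
```

Looking up a term is then a single dictionary access, which is the property that makes term-based search cheap once the text has been analyzed at index time.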


[source,console]
----
POST /cooking_blog/_bulk
Member

Would this make more sense in a notebook?

Contributor Author

You mean the whole tutorial? For sure, the plan is to have notebook versions too once we have a few of these :)


* `filter` or `must_not` parameters in <<query-dsl-bool-query,`bool`>> queries
* `filter` parameter in <<query-dsl-constant-score-query,`constant_score`>> queries
* <<search-aggregations-bucket-filter-aggregation,`filter`>> aggregations
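
As a rough illustration of the first bullet, the sketch below builds a `bool` query body in Python in which the `filter` and `must_not` clauses run in filter context and the `must` clause runs in query context. The field names (`description`, `category.keyword`, `date`, `tags`) and the `build_filtered_search` helper are assumptions made up for this example, not taken from the tutorial; the resulting dict would be sent as the JSON body of a `_search` request.

```python
import json


def build_filtered_search(text, category, published_after, excluded_tag):
    """Build a `bool` query body (hypothetical field names throughout)."""
    return {
        "query": {
            "bool": {
                # Query context: contributes to the relevance score.
                "must": [{"match": {"description": text}}],
                # Filter context: yes/no matching, no effect on the score.
                "filter": [
                    {"term": {"category.keyword": category}},
                    {"range": {"date": {"gte": published_after}}},
                ],
                # Also filter context: excludes documents without scoring.
                "must_not": [{"term": {"tags": excluded_tag}}],
            }
        }
    }


body = build_filtered_search("fluffy pancakes", "Breakfast", "2023-05-01", "raw")
print(json.dumps(body, indent=2))
```

Keeping exact-match and range conditions in `filter` rather than `must` leaves the score to be decided by the full-text clause alone.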
Member

Should we add knn here as an example as well?

Contributor Author

@leemthompo leemthompo Oct 14, 2024

It might be a bit too much in this 101 searching and filtering intro! This is the first tentative toe in the search paddling pool. 🦶

Currently I'm thinking the next basics "quickstart" could be a semantic_text quickstart (maybe ELSER + hybrid via rrf). That would be a toy example, API-only, with no reindexing or anything, just the happiest path, and then link to Search your data for the full-blown semantic search examples.

Contributor Author

@leemthompo leemthompo Oct 14, 2024

I think there might be a little too much in this first draft already, let's see how we feel once I've incorporated the first round of feedback.

Contributor Author

@carlosdelest sorry I forgot that I edited this page too, so I glanced over this comment as if it related to the tutorial!

Indeed we should add whatever else is missing from this query_filter_context.asciidoc page too 😄

----
GET /cooking_blog/_search
{
"size": 0,
Member

A comment on why we're using size: 0 would be interesting I think - we don't want to return search results, but just the aggregated information on all of them.
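
To make the point about `size: 0` concrete, here is a minimal Python sketch of an aggregation-only request body. The `build_aggregation_only_search` helper, the `by_category` aggregation name, and the `category.keyword` field are assumptions for illustration; only the `size` and `aggs` keys and the `terms` aggregation type are standard request elements.

```python
import json


def build_aggregation_only_search(field, agg_name="by_category"):
    """Request body that returns bucket counts but no search hits."""
    return {
        # size 0: skip returning documents, keep only the aggregation results.
        "size": 0,
        "aggs": {agg_name: {"terms": {"field": field}}},
    }


body = build_aggregation_only_search("category.keyword")
print(json.dumps(body, indent=2))
```

The response to such a request would still report the total hit count, but its hits array stays empty, so the payload carries only the aggregated information.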

Contributor Author

Out of scope now that we've decided to trim aggregations from this tutorial, but relevant for follow-up aggregations tutorial

leemthompo and others added 2 commits October 14, 2024 15:34
Contributor

@Mikep86 Mikep86 left a comment

Great start, thank you for doing this! I think the tutorial could benefit from a tighter "story arc", so to speak. Since this is targeted at a brand new user, I say we try to cover less material in more detail, including putting the search responses in the documentation so that the user can see the matching docs and we can explain why they match. I think it could also help to create a business scenario for each section to ground the tutorial in the real world.

We can stretch this material across multiple pages too. For example, there's no need to cover aggregations in the "searching and filtering" page, we can make an aggregation-specific tutorial page instead.

[discrete]
=== Perform basic aggregations

Aggregations provide summary statistics and analytics on your search results.
Contributor

Can we link to a page with more detailed info on aggregations here?

Contributor Author

@leemthompo leemthompo Oct 15, 2024

Out of scope now, we'll have a follow-up aggregations-heavy tutorial methinks

Added detailed examples of how to configure the cooking_blog index with text and keyword fields for different properties, improving exact matching and filtering capabilities.
Expanded explanations and added concrete examples and responses to illustrate how match and multi_match queries work.
Provided detailed examples on how to use filter in queries, including exact matches for categories and date ranges.
Added a detailed example of a complex bool query combining multiple search criteria, tailored to the cooking blog scenario.
Added learn more section
@leemthompo
Contributor Author

leemthompo commented Oct 15, 2024

Tried to generally sharpen this tutorial in 92e47af: making it more user-focused, removing non-essential steps, and going into more detail, with more annotated responses

Member

@kderusso kderusso left a comment

Looks great, nice progress!

<titleabbrev>Basics: Full-text search and filtering</titleabbrev>
++++

This is a hands-on introduction to the basics of full-text search with {es}, also known as _lexical search_, using the <<search-search,`_search` API>> and <<query-dsl,Query DSL>>.
Member

Philosophical question here, but should we refactor new tutorials to use retrievers now that they will be GA soon?

Contributor Author

Maybe future tutorials. Honestly, I'd like an aligned answer from product, because the query language decision for new users is getting complex between Query DSL / retrievers / ES|QL

@leemthompo
Contributor Author

@kderusso random question: when or where in a new user's journey would it be helpful to introduce the explain API?

@kderusso
Member

> @kderusso random question: when or where in a new user's journey would it be helpful to introduce the explain API?

@leemthompo TBH, I could see the explain API deserving its own page with an explanation of how to use it?

@leemthompo
Contributor Author

@elasticmachine update branch

Contributor

@Mikep86 Mikep86 left a comment

Great work! I left a few comments about wording and technical edge-cases.

@leemthompo
Contributor Author

For the record @kderusso saved the day by finding that the docstests CI checks were failing due to invalid JSON 🥇

@leemthompo leemthompo enabled auto-merge (squash) October 28, 2024 09:46
@leemthompo leemthompo merged commit 0d8d8bd into elastic:main Oct 28, 2024
4 of 5 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

The backport operation could not be completed due to the following error:

An unexpected error occurred when attempting to backport this PR.

You can use sqren/backport to manually backport by running `backport --upstream elastic/elasticsearch --pr 114353`

@leemthompo
Contributor Author

💚 All backports created successfully

Status Branch Result
8.x
8.16
8.15

Questions ?

Please refer to the Backport tool documentation

leemthompo added a commit to leemthompo/elasticsearch that referenced this pull request Oct 28, 2024
elasticsearchmachine pushed a commit that referenced this pull request Oct 28, 2024
elasticsearchmachine pushed a commit that referenced this pull request Oct 28, 2024
elasticsearchmachine pushed a commit that referenced this pull request Oct 28, 2024
ioanatia pushed a commit to ioanatia/elasticsearch that referenced this pull request Nov 4, 2024
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Nov 4, 2024

Labels

auto-backport (Automatically create backport pull requests when merged), backport pending, >docs (General docs changes), Team:Docs (Meta label for docs team), v8.15.0, v8.16.0, v8.17.0, v9.0.0

7 participants