Merged
Commits
49 commits
81aa97c
Create 101 section
leemthompo Sep 2, 2024
4682400
- **intro.asciidoc**:
leemthompo Sep 2, 2024
dcdcffb
Merge main
leemthompo Sep 2, 2024
ec87abb
Add link
leemthompo Sep 2, 2024
cfbe986
Tweak ql table, mini nav shuffle
leemthompo Sep 3, 2024
c3984ea
Fix typos, links
leemthompo Sep 3, 2024
1c57ecd
Move what's new page
leemthompo Sep 3, 2024
143b0eb
Move some nav items
leemthompo Sep 3, 2024
dadd793
Add couple links
leemthompo Sep 3, 2024
fa818e3
Update landing page getting started link
leemthompo Sep 3, 2024
4e815e4
Fix link
leemthompo Sep 3, 2024
e85565e
Apply suggestions from feedback
leemthompo Sep 5, 2024
3f13270
Rename to Elasticsearch basics
leemthompo Sep 5, 2024
5251740
Refactorings
leemthompo Sep 5, 2024
1c729db
Merge main
leemthompo Sep 10, 2024
755e4fd
Little fixes
leemthompo Sep 10, 2024
03500ef
The term -we you- is not a thing
leemthompo Sep 10, 2024
cdef9f5
Use attributes
leemthompo Sep 10, 2024
b758ff8
Adds links to API bullet
leemthompo Sep 10, 2024
98a9780
Apply suggestions from code review
leemthompo Sep 11, 2024
00daf2b
Apply suggestions from code review
leemthompo Sep 11, 2024
e3fea62
Updates per review
leemthompo Sep 11, 2024
0df8cbe
Refactor search and analyze section, add back QL table
leemthompo Sep 11, 2024
ddfdf1c
Stop saying -our- all the time
leemthompo Sep 11, 2024
8314217
Add semantic search to use cases
leemthompo Sep 16, 2024
a2d8d66
Merge branch 'main' into 101-section
leemthompo Sep 19, 2024
ec62068
Merge branch 'main' into 101-section
leemthompo Sep 19, 2024
9629a55
Delete duplicated include
leemthompo Sep 19, 2024
4686858
Some updates based on Serena feedback
leemthompo Sep 23, 2024
303109f
Rename quickstart section, add vector example
leemthompo Sep 23, 2024
754ddaa
Add snippet test teardown, uniformize quick start term
leemthompo Sep 23, 2024
d30b432
Keep search your data where it is in nav for now
leemthompo Sep 23, 2024
3a5dca6
Make it clear esql has new goodness every release
leemthompo Sep 23, 2024
216bede
API quick starts is better name
leemthompo Sep 23, 2024
3395671
Apply suggestions from code review
leemthompo Sep 24, 2024
dfb5916
Move sample data to tip
leemthompo Sep 24, 2024
085625f
Remove vectors from first quick start, update local dev verbiage
leemthompo Sep 24, 2024
c9fa2f0
Add short definition of timestamped data
leemthompo Sep 24, 2024
6da5180
Mention quick starts use queryDSL unless otherwise noted
leemthompo Sep 24, 2024
90e5ac0
Fix url
leemthompo Sep 24, 2024
e96aac0
fix typo
leemthompo Sep 24, 2024
bcadaa3
Update query languages section
leemthompo Sep 24, 2024
09a4120
An -> a
leemthompo Sep 24, 2024
b1f5144
Tweaky McTweakface
leemthompo Sep 24, 2024
a4d934c
Clarify file uploader use case
leemthompo Sep 25, 2024
f655978
Delegate file format info to uploader docs, pdf support is coming soo…
leemthompo Sep 25, 2024
a272bf7
Local dev-elopment
leemthompo Sep 25, 2024
0a0a074
Apply suggestions
leemthompo Sep 25, 2024
a25b908
Future proof quick start section title
leemthompo Sep 25, 2024
4 changes: 2 additions & 2 deletions docs/reference/index.asciidoc
@@ -12,8 +12,6 @@ include::intro.asciidoc[]

include::quickstart/index.asciidoc[]

include::search/search-your-data/search-your-data.asciidoc[]

include::setup.asciidoc[]

include::upgrade.asciidoc[]
@@ -32,6 +30,8 @@ include::ingest.asciidoc[]

include::alias.asciidoc[]

include::search/search-your-data/search-your-data.asciidoc[]

include::reranking/index.asciidoc[]

include::query-dsl.asciidoc[]
90 changes: 55 additions & 35 deletions docs/reference/intro.asciidoc
@@ -2,12 +2,12 @@
== {es} basics

This guide covers the core concepts you need to understand to get started with {es}.
If you'd prefer to start working with {es} right away, set up a <<run-elasticsearch-locally,local dev environment>> and jump to <<quickstart,hands-on code examples>> .
If you'd prefer to start working with {es} right away, set up a <<run-elasticsearch-locally,local development environment>> and jump to <<quickstart,hands-on code examples>>.

This guide covers the following topics:

* <<elasticsearch-intro-what-is-es>>: Learn about {es} and some of its main use cases.
* <<elasticsearch-intro-deploy>>: Understand your options for deploying {es} in different environments, including a fast local dev setup.
* <<elasticsearch-intro-deploy>>: Understand your options for deploying {es} in different environments, including a fast local development setup.
* <<documents-indices>>: Understand {es}'s most important primitives and how it stores data.
* <<es-ingestion-overview>>: Understand your options for ingesting data into {es}.
* <<search-analyze>>: Understand your options for searching and analyzing data in {es}.
@@ -49,18 +49,18 @@ Combined with https://www.elastic.co/kibana[{kib}], it powers the following Elas
**Observability**

* *Logs, metrics, and traces*: Collect, store, and analyze logs, metrics, and traces from applications, systems, and services.
* *Application performance monitoring (APM)*: Monitor and analyze application performance data.
* *Application performance monitoring (APM)*: Monitor and analyze the performance of business-critical software applications.
* *Real user monitoring (RUM)*: Monitor, quantify, and analyze user interactions with web applications.
* *OpenTelemetry*: Elastic has full native support for OpenTelemetry data.
* *OpenTelemetry*: Reuse your existing instrumentation to send telemetry data to the Elastic Stack using the OpenTelemetry standard.

**Search**

* *Full-text search*: Fast, relevant full-text search using inverted indexes, tokenization, and text analysis.
* *Full-text search*: Build a fast, relevant full-text search solution using inverted indexes, tokenization, and text analysis.
* *Vector database*: Store and search vectorized data, and create vector embeddings with built-in and third-party natural language processing (NLP) models.
* *Semantic search*: Understand the intent and contextual meaning behind search queries using tools like synonyms, dense vector embeddings, and learned sparse query/document expansion.
* *Semantic search*: Understand the intent and contextual meaning behind search queries using tools like synonyms, dense vector embeddings, and learned sparse query-document expansion.
* *Hybrid search*: Combine full-text search with vector search using state-of-the-art ranking algorithms.
* *Search applications*: Add hybrid search capabilities to apps or websites, or build enterprise search engines over your organization's internal data sources.
* *Retrieval augmented generation (RAG)*: Use {es} as a retrieval engine to update and augment Generative AI models.
* *Build search experiences*: Add hybrid search capabilities to apps or websites, or build enterprise search engines over your organization's internal data sources.
* *Retrieval augmented generation (RAG)*: Use {es} as a retrieval engine to supplement generative AI models with more relevant, up-to-date, or proprietary data for a range of use cases.
* *Geospatial search*: Search for locations and calculate spatial relationships using geospatial queries.

**Security**
@@ -80,7 +80,7 @@ You can deploy {es} in various ways.

**Quick start option**

* <<run-elasticsearch-locally,*Local dev*>>: Get started quickly with a minimal local Docker setup for development and testing.
* <<run-elasticsearch-locally,*Local development*>>: Get started quickly with a minimal local Docker setup for development and testing.

**Hosted options**

@@ -152,10 +152,10 @@ A simple {es} document might look like this:

[discrete]
[[elasticsearch-intro-documents-fields-data-metadata]]
==== Data and metadata
==== Metadata fields

An indexed document contains data and metadata.
In {es}, <<mapping-fields,metadata fields>> are prefixed with an underscore.
An indexed document contains data and metadata. <<mapping-fields,Metadata fields>> are system fields that store information about the documents.
In {es}, metadata fields are prefixed with an underscore.
For example, the following fields are metadata fields:

* `_index`: The name of the index where the document is stored.
@@ -170,7 +170,7 @@ A mapping defines the <<mapping-types,data type>> for each field, how the field
and how it should be stored.
When adding documents to {es}, you have two options for mappings:

* <<mapping-dynamic, Dynamic mapping>>: Let {es} automatically detect the data types and create the mappings for you. This is great for getting started quickly, but might yield suboptimal results for your specific use case due to automatic field type inference.
* <<mapping-dynamic, Dynamic mapping>>: Let {es} automatically detect the data types and create the mappings for you. Dynamic mapping helps you get started quickly, but might yield suboptimal results for your specific use case due to automatic field type inference.
* <<mapping-explicit, Explicit mapping>>: Define the mappings up front by specifying data types for each field. Recommended for production use cases, because you have full control over how your data is indexed to suit your specific use case.
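
For example, a minimal sketch of creating an index with an explicit mapping might look like this (the `my-products` index name and its fields are hypothetical):

[source,console]
----
PUT my-products
{
  "mappings": {
    "properties": {
      "name":       { "type": "text" },
      "price":      { "type": "float" },
      "created_at": { "type": "date" }
    }
  }
}
----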

[TIP]
@@ -184,34 +184,42 @@
=== Add data to {es}
Contributor

the changes that you made to this page are so so valuable! I LOVE being able to understand how to get data into ES at a high level. would have been a game changer for March Shaina.

Contributor

The origin of this content has been available for a couple of years, but somewhat hidden: https://www.elastic.co/guide/en/cloud/current/ec-cloud-ingest-data.html

I've been using this starter content as the foundation of my ingest experience work, which means we'll have some duplication to sort out. We knew that there would be overlaps, so no big surprise there. We'll figure out the best way to communicate this info to users, and do the right thing.

Contributor Author

💯 Karen, it's just that the timestamped data decision tree is a little heavy for a brand-new user, who isn't going to be thinking about data pipelines from the get-go, but I made sure to link to that page if that's what a reader is looking for. There's no specific reason I can tell why that original page is part of the cloud docs, but again, a topic for another day :)


There are multiple ways to ingest data into {es}.
The option that you choose depends on whether you're working with timestamped data, non-timestamped data, where the data is coming from, its complexity, and more.
The option that you choose depends on whether you're working with timestamped data or non-timestamped data, where the data is coming from, its complexity, and more.

[TIP]
====
To get started quickly, you can load {kibana-ref}/connect-to-elasticsearch.html#_add_sample_data[sample data] into your {es} cluster using {kib}.
====

[discrete]
[[es-ingestion-overview-general-content]]
==== General content

General content is data that does not have a timestamp.
This could be data like vector embeddings, website content, product catalogs, or more.
This could be data like vector embeddings, website content, product catalogs, and more.
For general content, you have the following options for adding data to {es} indices:

* <<docs,API>>: Use the {es} <<docs,Document APIs>> to index documents directly, using the Dev Tools {kibana-ref}/console-kibana.html[Console], cURL.
** You can use https://www.elastic.co/guide/en/elasticsearch/client/index.html[{es} programming language clients] to index documents in your programming language of choice. For Python devs, check out the `elasticsearch-labs` repo for various https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/search/python-examples[example notebooks].
* {kibana-ref}/connect-to-elasticsearch.html#upload-data-kibana[File upload]: Use the {kib} file uploader to upload and index CSV, JSON, and log files.
* {kibana-ref}/connect-to-elasticsearch.html#_add_sample_data[Sample data]: Load sample data sets into your {es} cluster using {kib}.
* <<docs,API>>: Use the {es} <<docs,Document APIs>> to index documents directly, using the Dev Tools {kibana-ref}/console-kibana.html[Console], or cURL.
+
If you're building a website or app, then you can call Elasticsearch APIs using an https://www.elastic.co/guide/en/elasticsearch/client/index.html[{es} client] in the programming language of your choice. If you use the Python client, then check out the `elasticsearch-labs` repo for various https://github.com/elastic/elasticsearch-labs/tree/main/notebooks/search/python-examples[example notebooks].
* {kibana-ref}/connect-to-elasticsearch.html#upload-data-kibana[File upload]: Use the {kib} file uploader to index single files for one-off testing and exploration. The GUI guides you through setting up your index and field mappings.
* https://github.com/elastic/crawler[Web crawler]: Extract and index web page content into {es} documents.
* {enterprise-search-ref}/connectors.html[Connectors]: Sync data from various third-party data sources to create searchable, read-only replicas in {es}.
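
For the API option above, a minimal sketch of indexing a single document with the Document APIs might look like this (the `my-index` name and the fields are hypothetical):

[source,console]
----
POST my-index/_doc
{
  "title": "A sample document",
  "description": "General, non-timestamped content"
}
----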

[discrete]
[[es-ingestion-overview-timestamped]]
==== Timestamped data

Timestamped data in {es} refers to datasets that include a timestamp field, typically named `@timestamp` when using the https://www.elastic.co/guide/en/ecs/current/ecs-reference.html[Elastic Common Schema (ECS)].
This could be data like logs, metrics, and traces.

For timestamped data, you have the following options for adding data to {es} data streams:

* {fleet-guide}/fleet-overview.html[Elastic Agent and Fleet]: The preferred way to index timestamped data. Each Elastic Agent based integration includes default ingestion rules, dashboards, and visualizations to start analyzing your data right away.
You can use the Fleet UI in {kib} to centrally manage Elastic Agents and their policies.
* {beats-ref}/beats-reference.html[Beats]: If your data source isn't supported by Elastic Agent, use Beats to collect and ship data. You install a separate Beat for each type of data to collect.
* {beats-ref}/beats-reference.html[Beats]: If your data source isn't supported by Elastic Agent, use Beats to collect and ship data to Elasticsearch. You install a separate Beat for each type of data to collect.
* {logstash-ref}/introduction.html[Logstash]: Logstash is an open source data collection engine with real-time pipelining capabilities that supports a wide variety of data sources. You might use this option because neither Elastic Agent nor Beats supports your data source. You can also use Logstash to persist incoming data, or if you need to send the data to multiple destinations.
* {cloud}/ec-ingest-guides.html[Language clients]: The linked tutorials demonstrate how to use {es} programming language clients to ingest data from an application. (In these examples, {es} is running on Elastic Cloud, but the same principles apply to any {es} deployment.)
* {cloud}/ec-ingest-guides.html[Language clients]: The linked tutorials demonstrate how to use {es} programming language clients to ingest data from an application. In these examples, {es} is running on Elastic Cloud, but the same principles apply to any {es} deployment.
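
As a rough sketch, a document added to a data stream carries an `@timestamp` field, for example (the `logs-myapp-default` data stream is hypothetical and assumes a matching index template already exists):

[source,console]
----
POST logs-myapp-default/_doc
{
  "@timestamp": "2099-05-06T16:21:15.000Z",
  "message": "Example log line"
}
----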

[TIP]
====
@@ -226,19 +234,21 @@ You can use {es} as a basic document store to retrieve documents and their
metadata.
However, the real power of {es} comes from its advanced search and analytics capabilities.

You'll use a combination of an API endpoint and a query language to interact with your data.

[discrete]
[[search-analyze-rest-api]]
==== Rest API
==== REST API

Use {es}'s REST API to manage your cluster, and to index
Use REST APIs to manage your {es} cluster, and to index
and search your data.
For testing purposes, you can submit requests
directly from the command line or through the Dev Tools {kibana-ref}/console-kibana.html[Console] in {kib}.
From your applications, you can use an
https://www.elastic.co/guide/en/elasticsearch/client/index.html[{es} client]
From your applications, you can use a
https://www.elastic.co/guide/en/elasticsearch/client/index.html[client]
in your programming language of choice.
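
For example, the following request uses the `_search` endpoint to return the first ten matching documents from all indices:

[source,console]
----
GET /_search
{
  "query": {
    "match_all": {}
  }
}
----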

Refer to <<getting-started,first steps with Elasticsearch>> for a hands-on example of using the REST API, adding data to {es}, and running basic searches.
Refer to <<getting-started,first steps with Elasticsearch>> for a hands-on example of using the `_search` endpoint, adding data to {es}, and running basic searches in Query DSL syntax.

[discrete]
[[search-analyze-query-languages]]
@@ -249,7 +259,9 @@ Refer to <<getting-started,first steps with Elasticsearch>> for a hands-on examp
*Query DSL* is the primary query language for {es} today.

*{esql}* is a new piped query language and compute engine which was first added in version *8.11*.
It does not yet support all the features of Query DSL, like full-text search and semantic search.

{esql} does not yet support all the features of Query DSL, like full-text search and semantic search.
Look forward to new {esql} features and functionalities in each release.

Refer to <<search-analyze-query-languages>> for a full overview of the query languages available in {es}.

@@ -260,6 +272,8 @@ Refer to <<search-analyze-query-languages>> for a full overview of the query lan
<<query-dsl, Query DSL>> is a full-featured JSON-style query language that enables complex searching, filtering, and aggregations.
It is the original and most powerful query language for {es} today.

The <<search-your-data, `_search` endpoint>> accepts queries written in Query DSL syntax.

[discrete]
[[search-analyze-query-dsl-search-filter]]
====== Search and filter with Query DSL
@@ -272,7 +286,7 @@ Query DSL supports a wide range of search techniques, including the following:
* <<knn-search,*Vector search*>>: Search for similar dense vectors using the kNN algorithm for embeddings generated outside of {es}.
* <<geo-queries,*Geospatial search*>>: Search for locations and calculate spatial relationships using geospatial queries.

Learn about the full range of queries supported by the <<query-dsl,Query DSL>>.
Learn about the full range of queries supported by <<query-dsl,Query DSL>>.

You can also filter data using Query DSL.
Filters enable you to include or exclude documents by retrieving documents that match specific field-level criteria.
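
For instance, a sketch of a `bool` query that combines a full-text `match` query with `filter` clauses might look like this (the field names are hypothetical):

[source,console]
----
GET /_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "title": "elasticsearch" } }
      ],
      "filter": [
        { "term":  { "status": "published" } },
        { "range": { "publish_date": { "gte": "2024-01-01" } } }
      ]
    }
  }
}
----
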
Expand All @@ -286,7 +300,7 @@ A query that uses the `filter` parameter indicates <<filter-context,filter conte
Aggregations enable you to build complex summaries of your data and gain
insight into key metrics, patterns, and trends.

Because aggregations leverage the same data structures used for search, they are
Because aggregations leverage the same data structures used for search, they are
also very fast. This enables you to analyze and visualize your data in real time.
You can search documents, filter results, and perform analytics at the same time, on the same
data, in a single request.
Expand All @@ -310,6 +324,9 @@ Learn more in <<run-an-agg,Run an aggregation>>.
<<esql,Elasticsearch Query Language ({esql})>> is a piped query language for filtering, transforming, and analyzing data.
{esql} is built on top of a new compute engine, where search, aggregation, and transformation functions are
directly executed within {es} itself.
{esql} syntax can also be used within various {kib} tools.

The <<esql-rest,`_query` endpoint>> accepts queries written in {esql} syntax.

Today, it supports a subset of the features available in Query DSL, like aggregations, filters, and transformations.
It does not yet support full-text search or semantic search.
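
As a sketch, an {esql} query submitted to the `_query` endpoint might look like this (the `logs-*` index pattern and `host.name` field are hypothetical):

[source,console]
----
POST /_query
{
  "query": """
    FROM logs-*
    | WHERE @timestamp > NOW() - 1 hour
    | STATS count = COUNT(*) BY host.name
    | SORT count DESC
    | LIMIT 10
  """
}
----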
@@ -320,9 +337,7 @@ Learn more in <<esql-getting-started,Getting started with {esql}>>, or try https

[discrete]
[[search-analyze-data-query-languages-table]]
==== Query languages overview

// TODO: do I belong here?
==== List of available query languages

The following table summarizes all available {es} query languages, to help you choose the right one for your use case.

@@ -331,8 +346,8 @@
| Name | Description | Use cases | API endpoint

| <<query-dsl,Query DSL>>
| Primary query language for {es}. Powerful and flexible JSON-style language that enables complex queries.
| Supports full-text search, semantic search, keyword search, filtering, aggregations, and more.
| The primary query language for {es}. A powerful and flexible JSON-style language that enables complex queries.
| Full-text search, semantic search, keyword search, filtering, aggregations, and more.
| <<search-search,`_search`>>


@@ -345,7 +360,7 @@ Does not yet support full-text search.


| <<eql,EQL>>
| Event Query Language (EQL) is a query language for event-based time series data. Data must contain an `@timestamp` field to use EQL.
| Event Query Language (EQL) is a query language for event-based time series data. Data must contain the `@timestamp` field to use EQL.
| Designed for the threat hunting security use case.
| <<eql-apis,`_eql`>>

@@ -354,6 +369,11 @@
| Enables users familiar with SQL to query {es} data using familiar syntax for BI and reporting.
| <<sql-apis,`_sql`>>

| {kibana-ref}/kuery-query.html[Kibana Query Language (KQL)]
| Kibana Query Language (KQL) is a text-based query language for filtering data that is only available in the {kib} UI.
| Use KQL to filter documents where a value for a field exists, matches a given value, or is within a given range.
| N/A

|===

// New html page
20 changes: 14 additions & 6 deletions docs/reference/quickstart/getting-started.asciidoc
@@ -1,13 +1,13 @@
[[getting-started]]
== Quick start: First steps with {es}
== Quick start: Add data using Elasticsearch APIs
++++
<titleabbrev>First steps with {es}</titleabbrev>
<titleabbrev>Basics: Add data using APIs</titleabbrev>
++++

In this quickstart guide you'll learn how to:
In this quick start guide, you'll learn how to do the following tasks:

* Add a small (non-timestamped) dataset to {es} using the REST API and the <<query-dsl,query DSL>>
* Run basic searches
* Add a small, non-timestamped dataset to {es} using Elasticsearch REST APIs.
* Run basic searches.

[discrete]
[[add-data]]
@@ -31,6 +31,13 @@ The request automatically creates the index.
PUT books
----
// TESTSETUP

[source,console]
--------------------------------------------------
DELETE books
--------------------------------------------------
// TEARDOWN

////

[source,console]
@@ -209,10 +216,11 @@ JSON object submitted during indexing.
[[qs-match-query]]
==== `match` query

You can use the `match` query to search for documents that contain a specific value in a specific field.
You can use the <<query-dsl-match-query,`match` query>> to search for documents that contain a specific value in a specific field.
This is the standard query for performing full-text search, including fuzzy matching and phrase searches.

Run the following command to search the `books` index for documents containing `brave` in the `name` field:

[source,console]
----
GET books/_search