[DOCS] Update documents and indices overview (#112394) (#112429)

leemthompo · web-flow · commit bdf68645aff0 · 2024-09-02T19:47:19.000+10:00
diff --git a/docs/reference/intro.asciidoc b/docs/reference/intro.asciidoc
@@ -55,66 +55,107 @@ You can deploy {es} in various ways:
 [[elasticsearch-next-steps]]
 === Learn more
 
-Here are some resources to help you get started:
+Some resources to help you get started:
 
 * <<getting-started, Quickstart>>. A beginner's guide to deploying your first {es} instance, indexing data, and running queries.
 * https://elastic.co/webinars/getting-started-elasticsearch[Webinar: Introduction to {es}]. Register for our live webinars to learn directly from {es} experts.
 * https://www.elastic.co/search-labs[Elastic Search Labs]. Tutorials and blogs that explore AI-powered search using the latest {es} features.
 ** Follow our tutorial https://www.elastic.co/search-labs/tutorials/search-tutorial/welcome[to build a hybrid search solution in Python].
 ** Check out the https://github.com/elastic/elasticsearch-labs?tab=readme-ov-file#elasticsearch-examples--apps[`elasticsearch-labs` repository] for a range of Python notebooks and apps for various use cases.
 
+// New html page
 [[documents-indices]]
-=== Documents and indices
-
-{es} is a distributed document store. Instead of storing information as rows of
-columnar data, {es} stores complex data structures that have been serialized
-as JSON documents. When you have multiple {es} nodes in a cluster, stored
-documents are distributed across the cluster and can be accessed immediately
-from any node.
-
-When a document is stored, it is indexed and fully searchable in <<near-real-time,near real-time>>--within 1 second. {es} uses a data structure called an
-inverted index that supports very fast full-text searches. An inverted index
-lists every unique word that appears in any document and identifies all of the
-documents each word occurs in.
-
-An index can be thought of as an optimized collection of documents and each
-document is a collection of fields, which are the key-value pairs that contain
-your data. By default, {es} indexes all data in every field and each indexed
-field has a dedicated, optimized data structure. For example, text fields are
-stored in inverted indices, and numeric and geo fields are stored in BKD trees.
-The ability to use the per-field data structures to assemble and return search
-results is what makes {es} so fast.
-
-{es} also has the ability to be schema-less, which means that documents can be
-indexed without explicitly specifying how to handle each of the different fields
-that might occur in a document. When dynamic mapping is enabled, {es}
-automatically detects and adds new fields to the index. This default
-behavior makes it easy to index and explore your data--just start
-indexing documents and {es} will detect and map booleans, floating point and
-integer values, dates, and strings to the appropriate {es} data types.
-
-You can define rules to control dynamic mapping and explicitly
-define mappings to take full control of how fields are stored and indexed.
-
-Defining your own mappings enables you to:
-
-* Distinguish between full-text string fields and exact value string fields
-* Perform language-specific text analysis
-* Optimize fields for partial matching
-* Use custom date formats
-* Use data types such as `geo_point` and `geo_shape` that cannot be automatically
-detected
-
-It’s often useful to index the same field in different ways for different
-purposes. For example, you might want to index a string field as both a text
-field for full-text search and as a keyword field for sorting or aggregating
-your data. Or, you might choose to use more than one language analyzer to
-process the contents of a string field that contains user input.
-
-The analysis chain that is applied to a full-text field during indexing is also
-used at search time. When you query a full-text field, the query text undergoes
-the same analysis before the terms are looked up in the index.
+=== Indices, documents, and fields
+++++
+<titleabbrev>Indices and documents</titleabbrev>
+++++
 
+The index is the fundamental unit of storage in {es}: a logical namespace for storing data that share similar characteristics.
+After you have <<elasticsearch-intro-deploy,deployed>> {es}, you get started by creating an index to store your data.
+
+[TIP]
+====
+A closely related concept is a <<data-streams,data stream>>.
+This index abstraction is optimized for append-only time-series data, and is made up of hidden, auto-generated backing indices.
+If you're working with time-series data, we recommend the {observability-guide}[Elastic Observability] solution.
+====
+
+Some key facts about indices:
+
+* An index is a collection of documents
+* An index has a unique name
+* An index can also be referred to by an alias
+* An index has a mapping that defines the schema of its documents
+
+[discrete]
+[[elasticsearch-intro-documents-fields]]
+==== Documents and fields
+
+{es} serializes and stores data in the form of JSON documents.
+A document is a set of fields, which are key-value pairs that contain your data.
+Each document has a unique ID, which you can create or have {es} auto-generate.
+
+A simple {es} document, shown here with its metadata as returned when you retrieve it, might look like this:
+
+[source,js]
+----
+{
+  "_index": "my-first-elasticsearch-index",
+  "_id": "DyFpo5EBxE8fzbb95DOa",
+  "_version": 1,
+  "_seq_no": 0,
+  "_primary_term": 1,
+  "found": true,
+  "_source": {
+    "email": "john@smith.com",
+    "first_name": "John",
+    "last_name": "Smith",
+    "info": {
+      "bio": "Eco-warrior and defender of the weak",
+      "age": 25,
+      "interests": [
+        "dolphins",
+        "whales"
+      ]
+    },
+    "join_date": "2024/05/01"
+  }
+}
+----
+// NOTCONSOLE
+
+[discrete]
+[[elasticsearch-intro-documents-fields-data-metadata]]
+==== Data and metadata
+
+An indexed document contains data and metadata.
+In {es}, metadata fields are prefixed with an underscore.
+
+The most important metadata fields are:
+
+* `_source`. Contains the original JSON document.
+* `_index`. The name of the index where the document is stored.
+* `_id`. The document's ID. IDs must be unique per index.
+
+[discrete]
+[[elasticsearch-intro-documents-fields-mappings]]
+==== Mappings and data types
+
+Each index has a <<mapping,mapping>>, or schema, that describes how the fields in your documents are indexed.
+A mapping defines the <<mapping-types,data type>> for each field, how the field should be indexed,
+and how it should be stored.
+When adding documents to {es}, you have two options for mappings:
+
+* <<mapping-dynamic, Dynamic mapping>>. Let {es} automatically detect the data types and create the mappings for you. This is great for getting started quickly.
+* <<mapping-explicit, Explicit mapping>>. Define the mappings up front by specifying data types for each field. Recommended for production use cases.
+
+[TIP]
+====
+You can use a combination of dynamic and explicit mapping on the same index.
+This is useful when you have a mix of known and unknown fields in your data.
+====
+
+// New html page
 [[search-analyze]]
 === Search and analyze
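
The added "Data and metadata" section says that metadata fields are prefixed with an underscore and that `_source` holds the original JSON document. The following is a minimal sketch of that split, using only the Python standard library and the sample document from the diff above. The helper name `split_document` is hypothetical and not part of any {es} client. The sketch assumes that `found` is a field of the retrieval response rather than document metadata, because it has no underscore prefix.

```python
import json

# Sample copied from the "Documents and fields" section in the diff above.
RESPONSE = """
{
  "_index": "my-first-elasticsearch-index",
  "_id": "DyFpo5EBxE8fzbb95DOa",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "email": "john@smith.com",
    "first_name": "John",
    "last_name": "Smith",
    "info": {
      "bio": "Eco-warrior and defender of the weak",
      "age": 25,
      "interests": ["dolphins", "whales"]
    },
    "join_date": "2024/05/01"
  }
}
"""


def split_document(response):
    """Hypothetical helper: return (metadata, data) for a document-shaped dict.

    Metadata is every top-level key that starts with an underscore,
    except `_source`, which carries the original JSON document.
    """
    metadata = {
        key: value
        for key, value in response.items()
        if key.startswith("_") and key != "_source"
    }
    data = response.get("_source", {})
    return metadata, data


doc = json.loads(RESPONSE)
metadata, data = split_document(doc)

print(sorted(metadata))           # names of the metadata fields
print(data["info"]["interests"])  # nested fields live under _source
```

Only the top-level keys are inspected, so nested fields such as `info.bio` always end up in the data half regardless of their names.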