
Commit 9b0a8da

Merge branch 'main' into mw-es-ts-structure

2 parents e2c9336 + 1087787

22 files changed (+72 lines, -130 lines)
Lines changed: 12 additions & 6 deletions
@@ -1,6 +1,11 @@
 ---
 mapped_pages:
   - https://www.elastic.co/guide/en/reference-architectures/current/reference-architectures-overview.html
+applies:
+  stack: all
+  hosted: all
+  ece: all
+  eck: all
 ---
 
 # Reference architectures [reference-architectures-overview]
@@ -9,19 +14,20 @@ Elasticsearch reference architectures are blueprints for deploying Elasticsearch
 
 These architectures are designed by architects and engineers to provide standardized, proven solutions that help you to follow best practices when deploying {{es}}.
 
-::::{tip}
-These architectures are specific to running your deployment on-premises or on cloud. If you are using Elastic serverless your {{es}} clusters are autoscaled and fully managed by Elastic. For all the deployment options, refer to [Run Elasticsearch](https://www.elastic.co/guide/en/elasticsearch/reference/current/elasticsearch-intro-deploy.html).
+::::{tip}
+These architectures are specific to deploying Elastic on {{ech}}, {{eck}}, {{ece}}, or deploying a self-managed instance. If you are using {{serverless-full}}, your {{es}} clusters are autoscaled and fully managed by Elastic. To learn about all of the deployment options, refer to the [Deploy and manage overview](/deploy-manage/index.md).
 ::::
 
 
-These reference architectures are recommendations and should be adapted to fit your specific environment and needs. Each solution can vary based on the unique requirements and conditions of your deployment. In these architectures we discuss about how to deploy cluster components. For information about designing ingest architectures to feed content into your cluster, refer to [Ingest architectures](https://www.elastic.co/guide/en/ingest/current/use-case-arch.html)
+These reference architectures are recommendations and should be adapted to fit your specific environment and needs. Each solution can vary based on the unique requirements and conditions of your deployment. In these architectures, we discuss how to deploy cluster components. For information about designing ingest architectures to feed content into your cluster, refer to [Ingest architectures](../manage-data/ingest/ingest-reference-architectures/use-case-arch.md).
 
-
-## Architectures [reference-architectures-time-series-2]
+## Architectures [reference-architectures-time-series]
 
 | | |
 | --- | --- |
 | **Architecture** | **When to use** |
-| [*Hot/Frozen - High Availability*](https://www.elastic.co/guide/en/reference-architectures/current/hot-frozen-architecture.html)<br>A high availability architecture that is cost optimized for large time-series datasets. | * Have a requirement for cost effective long term data storage (many months or years).<br>* Provide insights and alerts using logs, metrics, traces, or various event types to ensure optimal performance and quick issue resolution for applications.<br>* Apply Machine Learning and Search AI to assist in dealing with the large amount of data.<br>* Deploy an architecture model that allows for maximum flexibility between storage cost and performance.<br> |
+| [*Hot/Frozen - High Availability*](/deploy-manage/reference-architectures/hotfrozen-high-availability.md)<br>A high availability architecture that is cost optimized for large time-series datasets. | * Have a requirement for cost effective long term data storage (many months or years).<br>* Provide insights and alerts using logs, metrics, traces, or various event types to ensure optimal performance and quick issue resolution for applications.<br>* Apply Machine Learning and Search AI to assist in dealing with the large amount of data.<br>* Deploy an architecture model that allows for maximum flexibility between storage cost and performance.<br> |
 | Additional architectures are on the way.<br>Stay tuned for updates. | |
+
+
deploy-manage/reference-architectures/hotfrozen-high-availability.md

Lines changed: 9 additions & 4 deletions
@@ -1,6 +1,11 @@
 ---
 mapped_pages:
   - https://www.elastic.co/guide/en/reference-architectures/current/hot-frozen-architecture.html
+applies:
+  stack: all
+  hosted: all
+  ece: all
+  eck: all
 ---
 
 # Hot/Frozen - High Availability [hot-frozen-architecture]
@@ -50,15 +55,15 @@ Machine learning nodes are optional but highly recommended for large scale time
 
 ## Recommended hardware specifications [hot-frozen-hardware]
 
-With {{ecloud}} you can deploy clusters in AWS, Azure, and Google Cloud. Available hardware types and configurations vary across all three cloud providers but each provides instance types that meet our recommendations for the node types used in this architecture. For more details on these instance types, see our documentation on {{ecloud}} hardware for [AWS](https://www.elastic.co/guide/en/cloud/current/ec-default-aws-configurations.html), [Azure](https://www.elastic.co/guide/en/cloud/current/ec-default-azure-configurations.html), and [GCP](https://www.elastic.co/guide/en/cloud/current/ec-default-gcp-configurations.html). The **Physical** column below is guidance, based on the cloud node types, when self-deploying {{es}} in your own data center.
+With {{ech}}, you can deploy clusters in AWS, Azure, and Google Cloud. Available hardware types and configurations vary across all three cloud providers, but each provides instance types that meet our recommendations for the node types used in this architecture. For more details on these instance types, see our documentation on {{ech}} hardware for [AWS](https://www.elastic.co/guide/en/cloud/current/ec-default-aws-configurations.html), [Azure](https://www.elastic.co/guide/en/cloud/current/ec-default-azure-configurations.html), and [GCP](https://www.elastic.co/guide/en/cloud/current/ec-default-gcp-configurations.html). The **Physical** column below is guidance, based on the cloud node types, when self-deploying {{es}} in your own data center.
 
-In the links provided above, Elastic has performance tested hardware for each of the cloud providers to find the optimal hardware for each node type. We use ratios to represent the best mix of CPU, RAM, and disk for each type. In some cases the CPU to RAM ratio is key, in others the disk to memory ratio and type of disk is critical. Significantly deviating from these ratios may seem like a way to save on hardware costs, but may result in an {{es}} cluster that does not scale and perform well.
+In the links provided above, Elastic has performance tested hardware for each of the cloud providers to find the optimal hardware for each node type. We use ratios to represent the best mix of CPU, RAM, and disk for each type. In some cases the CPU to RAM ratio is key; in others, the disk to memory ratio and type of disk is critical. Significantly deviating from these ratios may seem like a way to save on hardware costs, but may result in an {{es}} cluster that does not scale and perform well.
 
 This table shows our specific recommendations for nodes in a Hot/Frozen architecture.
 
 | | | | | |
 | --- | --- | --- | --- | --- |
-| **Type** | **AWS*** | ***Azure*** | ***GCP** | **Physical** |
+| **Type** | **AWS** | **Azure** | **GCP** | **Physical** |
 | ![Hot data node](../../images/reference-architectures-hot.png "") | c6gd | f32sv2 | N2 | 16-32 vCPU<br>64 GB RAM<br>2-6 TB NVMe SSD |
 | ![Frozen data node](../../images/reference-architectures-frozen.png "") | i3en | e8dsv4 | N2 | 8 vCPU<br>64 GB RAM<br>6-20+ TB NVMe SSD<br>Depending on days cached |
 | ![Machine learning node](../../images/reference-architectures-machine-learning.png "") | m6gd | f16sv2 | N2 | 16 vCPU<br>64 GB RAM<br>256 GB SSD |
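
For context on how the hot and frozen tiers in this architecture connect: they are normally wired together with an index lifecycle management (ILM) policy that rolls data over on the hot tier and then mounts it on the frozen tier as a searchable snapshot. The sketch below, using the official Python client, is illustrative only; the policy name, repository name, thresholds, and connection URL are hypothetical placeholders, not values from this commit.

```python
# A minimal, illustrative sketch (not part of this commit): an ILM policy
# that keeps data on the hot tier until rollover, then mounts it on the
# frozen tier as a searchable snapshot. All names, thresholds, and the
# connection URL are hypothetical placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

es.ilm.put_lifecycle(
    name="hot-frozen-example",
    policy={
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {"max_primary_shard_size": "50gb", "max_age": "30d"}
                }
            },
            "frozen": {
                "min_age": "90d",
                "actions": {
                    # Requires a registered snapshot repository (placeholder name).
                    "searchable_snapshot": {"snapshot_repository": "my-snapshot-repo"}
                },
            },
            "delete": {"min_age": "365d", "actions": {"delete": {}}},
        }
    },
)
```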
@@ -82,7 +87,7 @@ This table shows our specific recommendations for nodes in a Hot/Frozen architec
 
 **Snapshots:**
 
-* If auditable or business critical events are being logged, a backup is necessary. The choice to back up data will depend on each individual business’s needs and requirements. Refer to our [snapshot repository](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshots-register-repository.html) documentation to learn more.
+* If auditable or business-critical events are being logged, a backup is necessary. The choice to back up data will depend on each individual business’s needs and requirements. Refer to our [snapshot repository](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshots-register-repository.html) documentation to learn more.
 * To automate snapshots and attach them to index lifecycle management policies, refer to [SLM (Snapshot lifecycle management)](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshots-take-snapshot.html#automate-snapshots-slm).
 
 **Kibana:**
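
The SLM policy referenced in the snapshot bullets above can be created through the API as well as through Kibana. A minimal sketch with the Python client follows, assuming a snapshot repository named `my-snapshot-repo` is already registered; every value is an illustrative placeholder rather than anything from this commit.

```python
# A minimal sketch, assuming a snapshot repository named "my-snapshot-repo"
# is already registered; every value here is an illustrative placeholder.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

es.slm.put_lifecycle(
    policy_id="nightly-snapshots",
    name="<nightly-snap-{now/d}>",  # date-math name given to each snapshot
    schedule="0 30 1 * * ?",        # cron: every day at 01:30
    repository="my-snapshot-repo",
    config={"indices": ["*"], "include_global_state": True},
    retention={"expire_after": "30d", "min_count": 5, "max_count": 50},
)
```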

explore-analyze/machine-learning/nlp/ml-nlp-ner-example.md

Lines changed: 1 addition & 1 deletion
@@ -113,7 +113,7 @@ Using the example text "Elastic is headquartered in Mountain View, California.",
 
 ## Add the NER model to an {{infer}} ingest pipeline [ex-ner-ingest]
 
-You can perform bulk {{infer}} on documents as they are ingested by using an [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) in your ingest pipeline. The novel *Les Misérables* by Victor Hugo is used as an example for {{infer}} in the following example. [Download](https://github.com/elastic/stack-docs/blob/8.5/docs/en/stack/ml/nlp/data/les-miserables-nd.json) the novel text split by paragraph as a JSON file, then upload it by using the [Data Visualizer](../../../manage-data/ingest.md#upload-data-kibana). Give the new index the name `les-miserables` when uploading the file.
+You can perform bulk {{infer}} on documents as they are ingested by using an [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) in your ingest pipeline. The novel *Les Misérables* by Victor Hugo is used as the example for {{infer}} below. [Download](https://github.com/elastic/stack-docs/blob/8.5/docs/en/stack/ml/nlp/data/les-miserables-nd.json) the novel text split by paragraph as a JSON file, then upload it by using the [Data Visualizer](../../../manage-data/ingest/tools/upload-data-files.md). Give the new index the name `les-miserables` when uploading the file.
 
 Now create an ingest pipeline either in the [Stack management UI](ml-nlp-inference.md#ml-nlp-inference-processor) or by using the API:
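The hunk above stops just before the pipeline definition it refers to. As a hedged sketch of what such an {{infer}} pipeline looks like when created with the Python client (the pipeline ID and model ID below are placeholders; the docs page supplies its own):

```python
# A minimal sketch of an ingest pipeline that runs documents through a
# deployed NER model at index time. "ner-pipeline" and "my-ner-model" are
# placeholders; the docs page defines its own IDs.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

es.ingest.put_pipeline(
    id="ner-pipeline",
    processors=[
        {"inference": {"model_id": "my-ner-model", "target_field": "ml.ner"}}
    ],
)

# Documents indexed through the pipeline receive NER annotations under "ml.ner".
es.index(
    index="les-miserables",
    pipeline="ner-pipeline",
    document={"text": "Monsieur Myriel was Bishop of Digne."},
)
```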
explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md

Lines changed: 1 addition & 1 deletion
@@ -103,7 +103,7 @@ In this step, you load the data that you later use in an ingest pipeline to get
 
 The data set `msmarco-passagetest2019-top1000` is a subset of the MS MARCO Passage Ranking data set used in the testing stage of the 2019 TREC Deep Learning Track. It contains 200 queries and for each query a list of relevant text passages extracted by a simple information retrieval (IR) system. From that data set, all unique passages with their IDs have been extracted and put into a [tsv file](https://github.com/elastic/stack-docs/blob/8.5/docs/en/stack/ml/nlp/data/msmarco-passagetest2019-unique.tsv), totaling 182469 passages. In the following, this file is used as the example data set.
 
-Upload the file by using the [Data Visualizer](../../../manage-data/ingest.md#upload-data-kibana). Name the first column `id` and the second one `text`. The index name is `collection`. After the upload is done, you can see an index named `collection` with 182469 documents.
+Upload the file by using the [Data Visualizer](../../../manage-data/ingest/tools/upload-data-files.md). Name the first column `id` and the second one `text`. The index name is `collection`. After the upload is done, you can see an index named `collection` with 182469 documents.
 
 :::{image} ../../../images/machine-learning-ml-nlp-text-emb-data.png
 :alt: Importing the data
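
As a hedged alternative to the UI-driven Data Visualizer upload described in this hunk, the same two-column TSV could also be indexed with the Python client's bulk helper; the file path and connection URL below are placeholders, not values from the docs.

```python
# A minimal sketch (not from the docs): index the two-column TSV into the
# "collection" index with the bulk helper instead of the Data Visualizer UI.
# The file path and connection URL are placeholders.
import csv

from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("http://localhost:9200")  # placeholder connection

def actions():
    with open("msmarco-passagetest2019-unique.tsv", newline="") as f:
        for passage_id, text in csv.reader(f, delimiter="\t"):
            yield {"_index": "collection", "_source": {"id": passage_id, "text": text}}

helpers.bulk(es, actions())
```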

manage-data/ingest.md

Lines changed: 29 additions & 17 deletions
@@ -10,29 +10,41 @@ mapped_urls:
 
 # Ingestion
 
-% What needs to be done: Finish draft
+Bring your data! Whether you call it *adding*, *indexing*, or *ingesting* data, you have to get the data into {{es}} before you can search it, visualize it, and use it for insights.
 
-% GitHub issue: docs-projects#326
+Our ingest tools are flexible and support a wide range of scenarios. We can help you with everything from popular and straightforward use cases, all the way to advanced use cases that require additional processing in order to modify or reshape your data before it goes to {{es}}.
 
-% Scope notes: Brief introduction on use cases Importance of data ingestion theory / process how to frame these products as living independently from ES? Link to reference architectures
+You can ingest:
 
-% Use migrated content from existing pages that map to this page:
+* **General content** (data without timestamps), such as HTML pages, catalogs, and files
+* **Time series (timestamped) data**, such as logs, metrics, and traces for the Elastic Security, Observability, or Search solutions, or for your own custom solutions
 
-% - [ ] ./raw-migrated-files/cloud/cloud/ec-cloud-ingest-data.md
-% Notes: Use draft Overview from Karen's PR
-% - [ ] ./raw-migrated-files/kibana/kibana/connect-to-elasticsearch.md
-% Notes: Other existing pages might be used in the "Plan" section
-% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-ingest-your-data.md
-% - [ ] https://www.elastic.co/customer-success/data-ingestion
-% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md
-% - [ ] ./raw-migrated-files/ingest-docs/ingest-overview/ingest-intro.md
 
-% Internal links rely on the following IDs being on this page (e.g. as a heading ID, paragraph ID, etc):
+## Ingesting general content [ingest-general]
 
-$$$upload-data-kibana$$$
+Elastic offers tools designed to ingest specific types of general content. The content type determines the best ingest option.
 
-$$$_add_sample_data$$$
+* To index **documents** directly into {{es}}, use the {{es}} [document APIs](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs.html).
+* To send **application data** directly to {{es}}, use an [{{es}} language client](https://www.elastic.co/guide/en/elasticsearch/client/index.html).
+* To index **web page content**, use the Elastic [web crawler](https://www.elastic.co/web-crawler).
+* To sync **data from third-party sources**, use [connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html). A connector syncs content from an original data source to an {{es}} index. Using connectors, you can create *searchable*, read-only replicas of your data sources.
+* To index **single files** for testing in a non-production environment, use the {{kib}} [file uploader](ingest/tools/upload-data-files.md).
 
-$$$ec-ingest-methods$$$
+If you would like to try things out before you add your own data, try using our [sample data](ingest/sample-data.md).
 
-$$$ec-data-ingest-pipeline$$$
+
+## Ingesting time series data [ingest-time-series]
+
+::::{admonition} What’s the best approach for ingesting time series data?
+The best approach for ingesting data is the *simplest option* that *meets your needs* and *satisfies your use case*.
+
+In most cases, the *simplest option* for ingesting time series data is using {{agent}} paired with an Elastic integration.
+
+* Install [Elastic Agent](https://www.elastic.co/guide/en/fleet/current) on the computer(s) from which you want to collect data.
+* Add the [Elastic integration](https://docs.elastic.co/en/integrations) for the data source to your deployment.
+
+Integrations are available for many popular platforms and services, and are a good place to start for ingesting data into Elastic solutions—Observability, Security, and Search—or your own search application.
+
+Check out the [Integration quick reference](https://docs.elastic.co/en/integrations/all_integrations) to search for available integrations. If you don’t find an integration for your data source, or if you need additional processing to extend the integration, we still have you covered. Refer to [Transform and enrich data](ingest/transform-enrich.md) to learn more.
+
+::::
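
To make the "document APIs" and "language client" bullets in the new page body concrete, here is a minimal, non-authoritative sketch using the official Python client; the index name, document fields, and connection URL are placeholders, not values from this commit.

```python
# A minimal sketch of the simplest general-content path: index one document
# with the Python language client, then search it. The index name, document
# fields, and connection URL are placeholders.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # placeholder connection

# Index a single document; by default Elasticsearch creates the index on first write.
es.index(
    index="my-catalog",
    document={"title": "Example product", "category": "demo", "price": 19.99},
)

# Refresh so the document is immediately visible to search, then query it back.
es.indices.refresh(index="my-catalog")
resp = es.search(index="my-catalog", query={"match": {"title": "example"}})
print(resp["hits"]["total"])
```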

manage-data/ingest/ingest-reference-architectures/use-case-arch.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ You can host {{es}} on your own hardware or send your data to {{es}} on {{ecloud
 
 **Decision tree**
 
-[Data ingestion pipeline with decision tree](https://www.elastic.co/guide/en/cloud/current/ec-cloud-ingest-data.html#ec-data-ingest-pipeline)
+[Data ingestion](../../ingest.md)
 
 | **Ingest architecture** | **Use when** |
 | --- | --- |

manage-data/ingest/ingesting-data-from-applications/ingest-logs-from-nodejs-web-application-using-filebeat.md

Lines changed: 1 addition & 1 deletion
@@ -567,5 +567,5 @@ You can add titles to the visualizations, resize and position them as you like,
 
 2. As your final step, remember to stop Filebeat, the Node.js web server, and the client. Enter *CTRL + C* in the terminal window for each application to stop them.
 
-You now know how to monitor log files from a Node.js web application, deliver the log event data securely into an {{ech}} or {{ece}} deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md#ec-ingest-methods) to learn all about ingesting data.
+You now know how to monitor log files from a Node.js web application, deliver the log event data securely into an {{ech}} or {{ece}} deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md) to learn all about ingesting data.
 

manage-data/ingest/ingesting-data-from-applications/ingest-logs-from-python-application-using-filebeat.md

Lines changed: 1 addition & 1 deletion
@@ -446,5 +446,5 @@ You can add titles to the visualizations, resize and position them as you like,
 
 2. As your final step, remember to stop Filebeat and the Python script. Enter *CTRL + C* in both your Filebeat terminal and in your `elvis.py` terminal.
 
-You now know how to monitor log files from a Python application, deliver the log event data securely into an {{ech}} or {{ece}} deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md#ec-ingest-methods) to learn all about all about ingesting data.
+You now know how to monitor log files from a Python application, deliver the log event data securely into an {{ech}} or {{ece}} deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md) to learn all about ingesting data.
 
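For context on what Filebeat is tailing in this tutorial: the guide has the Python application emit structured JSON log lines that Filebeat ships to the deployment. Below is a hedged sketch of that pattern using the `ecs-logging` formatter; the file name and logger details are placeholders and may differ from the tutorial's actual `elvis.py` script.

```python
# A hedged sketch of the logging pattern these Filebeat tutorials tail:
# Python stdlib logging with the ecs-logging formatter writing JSON lines
# to a file that Filebeat is configured to read. The file name and logger
# details are placeholders and may differ from the tutorial's actual script.
import logging

import ecs_logging  # pip install ecs-logging

logger = logging.getLogger("app")
logger.setLevel(logging.INFO)

handler = logging.FileHandler("elvis.json")  # placeholder log file path
handler.setFormatter(ecs_logging.StdlibFormatter())
logger.addHandler(handler)

logger.info("login attempt", extra={"user.name": "elvis"})
```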
manage-data/ingest/ingesting-timeseries-data.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ mapped_pages:
   - https://www.elastic.co/guide/en/ingest-overview/current/ingest-tools.html
 ---
 
-# Ingesting timeseries data [ingest-tools]
+# Ingesting time series data [ingest-tools]
 
 Elastic and others offer tools to help you get your data from the original data source into {{es}}. Some tools are designed for particular data sources, and others are multi-purpose.
 
manage-data/ingest/tools/upload-data-files.md

Lines changed: 7 additions & 2 deletions
@@ -4,11 +4,16 @@ mapped_urls:
   - https://www.elastic.co/guide/en/kibana/current/connect-to-elasticsearch.html#upload-data-kibana
 ---
 
-# Upload data files
+# Upload data files [upload-data-kibana]
 
 % What needs to be done: Align serverless/stateful
 
 % Use migrated content from existing pages that map to this page:
 
 % - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-ingest-data-file-upload.md
-% - [ ] ./raw-migrated-files/kibana/kibana/connect-to-elasticsearch.md
+% - [ ] ./raw-migrated-files/kibana/kibana/connect-to-elasticsearch.md
+
+
+
+% Note from David: I've removed the ID $$$upload-data-kibana$$$ from manage-data/ingest.md as those links should instead point to this page. So, please ensure that the following ID is included on this page. I've added it beside the title.
+