Commit 1732577

Add main 'Ingestion' page (#407)

* Add main 'Ingestion' page
* Remove Asciidoc change
* Remove raw migrated files
* Fix links to Data Visualizer / Upload Files content
* Fix broken link

1 parent 98c718e commit 1732577

20 files changed: +51 -120 lines

explore-analyze/machine-learning/nlp/ml-nlp-ner-example.md

Lines changed: 1 addition & 1 deletion
@@ -113,7 +113,7 @@ Using the example text "Elastic is headquartered in Mountain View, California.",
 
 ## Add the NER model to an {{infer}} ingest pipeline [ex-ner-ingest]
 
-You can perform bulk {{infer}} on documents as they are ingested by using an [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) in your ingest pipeline. The novel *Les Misérables* by Victor Hugo is used as an example for {{infer}} in the following example. [Download](https://github.com/elastic/stack-docs/blob/8.5/docs/en/stack/ml/nlp/data/les-miserables-nd.json) the novel text split by paragraph as a JSON file, then upload it by using the [Data Visualizer](../../../manage-data/ingest.md#upload-data-kibana). Give the new index the name `les-miserables` when uploading the file.
+You can perform bulk {{infer}} on documents as they are ingested by using an [{{infer}} processor](https://www.elastic.co/guide/en/elasticsearch/reference/current/inference-processor.html) in your ingest pipeline. The novel *Les Misérables* by Victor Hugo is used as an example for {{infer}} in the following example. [Download](https://github.com/elastic/stack-docs/blob/8.5/docs/en/stack/ml/nlp/data/les-miserables-nd.json) the novel text split by paragraph as a JSON file, then upload it by using the [Data Visualizer](../../../manage-data/ingest/tools/upload-data-files.md). Give the new index the name `les-miserables` when uploading the file.
 
 Now create an ingest pipeline either in the [Stack management UI](ml-nlp-inference.md#ml-nlp-inference-processor) or by using the API:
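Since the step above creates an ingest pipeline containing an {{infer}} processor, a minimal sketch of that pipeline body may be useful; the pipeline description, model ID, and field mapping below are illustrative assumptions, not values taken from this diff.

```python
# Sketch of an ingest pipeline body with an inference processor.
# The model ID and field mapping are assumptions for illustration;
# substitute the ID of the NER model you actually deployed.
ner_pipeline = {
    "description": "Run NER on paragraphs at ingest time (illustrative)",
    "processors": [
        {
            "inference": {
                "model_id": "my-ner-model",  # hypothetical model ID
                "field_map": {"paragraph": "text_field"},
            }
        }
    ],
}

# With the official Python client this body would be registered like so
# (commented out because it requires a live cluster):
# from elasticsearch import Elasticsearch
# es = Elasticsearch("http://localhost:9200")
# es.ingest.put_pipeline(id="ner", **ner_pipeline)
```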

explore-analyze/machine-learning/nlp/ml-nlp-text-emb-vector-search-example.md

Lines changed: 1 addition & 1 deletion
@@ -103,7 +103,7 @@ In this step, you load the data that you later use in an ingest pipeline to get
 
 The data set `msmarco-passagetest2019-top1000` is a subset of the MS MARCO Passage Ranking data set used in the testing stage of the 2019 TREC Deep Learning Track. It contains 200 queries and for each query a list of relevant text passages extracted by a simple information retrieval (IR) system. From that data set, all unique passages with their IDs have been extracted and put into a [tsv file](https://github.com/elastic/stack-docs/blob/8.5/docs/en/stack/ml/nlp/data/msmarco-passagetest2019-unique.tsv), totaling 182469 passages. In the following, this file is used as the example data set.
 
-Upload the file by using the [Data Visualizer](../../../manage-data/ingest.md#upload-data-kibana). Name the first column `id` and the second one `text`. The index name is `collection`. After the upload is done, you can see an index named `collection` with 182469 documents.
+Upload the file by using the [Data Visualizer](../../../manage-data/ingest/tools/upload-data-files.md). Name the first column `id` and the second one `text`. The index name is `collection`. After the upload is done, you can see an index named `collection` with 182469 documents.
 
 :::{image} ../../../images/machine-learning-ml-nlp-text-emb-data.png
 :alt: Importing the data

manage-data/ingest.md

Lines changed: 29 additions & 17 deletions
@@ -10,29 +10,41 @@ mapped_urls:
 
 # Ingestion
 
-% What needs to be done: Finish draft
+Bring your data! Whether you call it *adding*, *indexing*, or *ingesting* data, you have to get the data into {{es}} before you can search it, visualize it, and use it for insights.
 
-% GitHub issue: docs-projects#326
+Our ingest tools are flexible, and support a wide range of scenarios. We can help you with everything from popular and straightforward use cases, all the way to advanced use cases that require additional processing to modify or reshape your data before it goes to {{es}}.
 
-% Scope notes: Brief introduction on use cases Importance of data ingestion theory / process how to frame these products as living independently from ES? Link to reference architectures
+You can ingest:
 
-% Use migrated content from existing pages that map to this page:
+* **General content** (data without timestamps), such as HTML pages, catalogs, and files
+* **Time series (timestamped) data**, such as logs, metrics, and traces for Elastic Security, Observability, Search solutions, or for your own custom solutions
 
-% - [ ] ./raw-migrated-files/cloud/cloud/ec-cloud-ingest-data.md
-% Notes: Use draft Overview from Karen's PR
-% - [ ] ./raw-migrated-files/kibana/kibana/connect-to-elasticsearch.md
-% Notes: Other existing pages might be used in the "Plan" section
-% - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-ingest-your-data.md
-% - [ ] https://www.elastic.co/customer-success/data-ingestion
-% - [ ] ./raw-migrated-files/elasticsearch/elasticsearch-reference/es-ingestion-overview.md
-% - [ ] ./raw-migrated-files/ingest-docs/ingest-overview/ingest-intro.md
 
-% Internal links rely on the following IDs being on this page (e.g. as a heading ID, paragraph ID, etc):
+## Ingesting general content [ingest-general]
 
-$$$upload-data-kibana$$$
+Elastic offers tools designed to ingest specific types of general content. The content type determines the best ingest option.
 
-$$$_add_sample_data$$$
+* To index **documents** directly into {{es}}, use the {{es}} [document APIs](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs.html).
+* To send **application data** directly to {{es}}, use an [{{es}} language client](https://www.elastic.co/guide/en/elasticsearch/client/index.html).
+* To index **web page content**, use the Elastic [web crawler](https://www.elastic.co/web-crawler).
+* To sync **data from third-party sources**, use [connectors](https://www.elastic.co/guide/en/elasticsearch/reference/current/es-connectors.html). A connector syncs content from an original data source to an {{es}} index. Using connectors you can create *searchable*, read-only replicas of your data sources.
+* To index **single files** for testing in a non-production environment, use the {{kib}} [file uploader](ingest/tools/upload-data-files.md).
 
-$$$ec-ingest-methods$$$
+If you would like to try things out before you add your own data, try using our [sample data](ingest/sample-data.md).
 
-$$$ec-data-ingest-pipeline$$$
+
+## Ingesting time series data [ingest-time-series]
+
+::::{admonition} What’s the best approach for ingesting time series data?
+The best approach for ingesting data is the *simplest option* that *meets your needs* and *satisfies your use case*.
+
+In most cases, the *simplest option* for ingesting time series data is using {{agent}} paired with an Elastic integration.
+
+* Install [Elastic Agent](https://www.elastic.co/guide/en/fleet/current) on the computer(s) from which you want to collect data.
+* Add the [Elastic integration](https://docs.elastic.co/en/integrations) for the data source to your deployment.
+
+Integrations are available for many popular platforms and services, and are a good place to start for ingesting data into Elastic solutions (Observability, Security, and Search) or your own search application.
+
+Check out the [Integration quick reference](https://docs.elastic.co/en/integrations/all_integrations) to search for available integrations. If you don’t find an integration for your data source or if you need additional processing to extend the integration, we still have you covered. Refer to [Transform and enrich data](ingest/transform-enrich.md) to learn more.
+::::
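The "document APIs" bullet in the page above is the lowest-level ingest option. As a sketch of how it works, documents are sent to the `_bulk` endpoint as newline-delimited JSON, alternating an action line and a source line; the index name and documents below are made up for illustration.

```python
import json

def to_bulk_body(index, docs):
    """Build an NDJSON _bulk request body: one action line followed by
    one source line per document, terminated by a trailing newline."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

body = to_bulk_body("my-index", [{"title": "hello"}, {"title": "world"}])
# POST the body to /_bulk with Content-Type: application/x-ndjson,
# for example via an {{es}} language client or curl.
```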

manage-data/ingest/ingest-reference-architectures/use-case-arch.md

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ You can host {{es}} on your own hardware or send your data to {{es}} on {{ecloud
 
 **Decision tree**
 
-[Data ingestion pipeline with decision tree](https://www.elastic.co/guide/en/cloud/current/ec-cloud-ingest-data.html#ec-data-ingest-pipeline)
+[Data ingestion](../../ingest.md)
 
 | **Ingest architecture** | **Use when** |
 | --- | --- |

manage-data/ingest/ingesting-data-from-applications/ingest-logs-from-nodejs-web-application-using-filebeat.md

Lines changed: 1 addition & 1 deletion
@@ -567,5 +567,5 @@ You can add titles to the visualizations, resize and position them as you like,
 
 2. As your final step, remember to stop Filebeat, the Node.js web server, and the client. Enter *CTRL + C* in the terminal window for each application to stop them.
 
-You now know how to monitor log files from a Node.js web application, deliver the log event data securely into an {{ech}} or {{ece}} deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md#ec-ingest-methods) to learn all about ingesting data.
+You now know how to monitor log files from a Node.js web application, deliver the log event data securely into an {{ech}} or {{ece}} deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md) to learn all about ingesting data.

manage-data/ingest/ingesting-data-from-applications/ingest-logs-from-python-application-using-filebeat.md

Lines changed: 1 addition & 1 deletion
@@ -446,5 +446,5 @@ You can add titles to the visualizations, resize and position them as you like,
 
 2. As your final step, remember to stop Filebeat and the Python script. Enter *CTRL + C* in both your Filebeat terminal and in your `elvis.py` terminal.
 
-You now know how to monitor log files from a Python application, deliver the log event data securely into an {{ech}} or {{ece}} deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md#ec-ingest-methods) to learn all about all about ingesting data.
+You now know how to monitor log files from a Python application, deliver the log event data securely into an {{ech}} or {{ece}} deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md) to learn all about ingesting data.

manage-data/ingest/ingesting-timeseries-data.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ mapped_pages:
 - https://www.elastic.co/guide/en/ingest-overview/current/ingest-tools.html
 ---
 
-# Ingesting timeseries data [ingest-tools]
+# Ingesting time series data [ingest-tools]
 
 Elastic and others offer tools to help you get your data from the original data source into {{es}}. Some tools are designed for particular data sources, and others are multi-purpose.

manage-data/ingest/tools/upload-data-files.md

Lines changed: 7 additions & 2 deletions
@@ -4,11 +4,16 @@ mapped_urls:
 - https://www.elastic.co/guide/en/kibana/current/connect-to-elasticsearch.html#upload-data-kibana
 ---
 
-# Upload data files
+# Upload data files [upload-data-kibana]
 
 % What needs to be done: Align serverless/stateful
 
 % Use migrated content from existing pages that map to this page:
 
 % - [ ] ./raw-migrated-files/docs-content/serverless/elasticsearch-ingest-data-file-upload.md
-% - [ ] ./raw-migrated-files/kibana/kibana/connect-to-elasticsearch.md
+% - [ ] ./raw-migrated-files/kibana/kibana/connect-to-elasticsearch.md
+
+
+
+% Note from David: I've removed the ID $$$upload-data-kibana$$$ from manage-data/ingest.md as those links should instead point to this page. So, please ensure that the following ID is included on this page. I've added it beside the title.
+

raw-migrated-files/cloud/cloud/ec-getting-started-search-use-cases-node-logs.md

Lines changed: 1 addition & 1 deletion
@@ -517,5 +517,5 @@ You can add titles to the visualizations, resize and position them as you like,
 
 2. As your final step, remember to stop Filebeat, the Node.js web server, and the client. Enter *CTRL + C* in the terminal window for each application to stop them.
 
-You now know how to monitor log files from a Node.js web application, deliver the log event data securely into an Elasticsearch Service deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md#ec-ingest-methods) to learn all about working in Elasticsearch Service.
+You now know how to monitor log files from a Node.js web application, deliver the log event data securely into an Elasticsearch Service deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md) to learn all about working in Elasticsearch Service.

raw-migrated-files/cloud/cloud/ec-getting-started-search-use-cases-python-logs.md

Lines changed: 1 addition & 1 deletion
@@ -408,5 +408,5 @@ You can add titles to the visualizations, resize and position them as you like,
 
 2. As your final step, remember to stop Filebeat and the Python script. Enter *CTRL + C* in both your Filebeat terminal and in your `elvis.py` terminal.
 
-You now know how to monitor log files from a Python application, deliver the log event data securely into an Elasticsearch Service deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md#ec-ingest-methods) to learn all about working in Elasticsearch Service.
+You now know how to monitor log files from a Python application, deliver the log event data securely into an Elasticsearch Service deployment, and then visualize the results in Kibana in real time. Consult the [Filebeat documentation](https://www.elastic.co/guide/en/beats/filebeat/current/filebeat-overview.html) to learn more about the ingestion and processing options available for your data. You can also explore our [documentation](../../../manage-data/ingest.md) to learn all about working in Elasticsearch Service.
