**docs/ad/index.md** (+1 −1)
```diff
@@ -52,7 +52,7 @@ In this case, a feature is the field in your index that you want to check for anomalies.
 
 For example, if you choose `min()`, the detector focuses on finding anomalies based on the minimum values of your feature. If you choose `average()`, the detector finds anomalies based on the average values of your feature.
 
-A multi-feature model correlates anomalies across all its features. The [curse of dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality) makes it less likely for multi-feature models to identify smaller anomalies as compared to a single-feature model. Adding more features might negatively impact the [precision and recall](https://en.wikipedia.org/wiki/Precision_and_recall) of a model. A higher proportion of noise in your data might further amplify this negative impact. We recommend adding fewer features to your detector for higher accuracy. By default, the maximum number of features for a detector is 5. You can adjust this limit with the `opendistro.anomaly_detection.max_anomaly_features` setting.
+A multi-feature model correlates anomalies across all its features. The [curse of dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality) makes it less likely for multi-feature models to identify smaller anomalies as compared to a single-feature model. Adding more features might negatively impact the [precision and recall](https://en.wikipedia.org/wiki/Precision_and_recall) of a model. A higher proportion of noise in your data might further amplify this negative impact. Selecting the optimal feature set is usually an iterative process. We recommend experimenting with a historical detector with different feature sets and checking the precision before moving on to real-time detectors. By default, the maximum number of features for a detector is 5. You can adjust this limit with the `opendistro.anomaly_detection.max_anomaly_features` setting.
 {: .note }
 
 1. On the **Model configuration** page, enter the **Feature name**.
```
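Not part of the diff: both the old and new paragraphs reference the `opendistro.anomaly_detection.max_anomaly_features` setting. As a hedged illustration, a dynamic Open Distro setting like this is typically adjusted through the cluster settings API; the value `3` below is only an example:

```json
PUT _cluster/settings
{
  "transient": {
    "opendistro.anomaly_detection.max_anomaly_features": 3
  }
}
```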
**docs/cli/index.md** (+22 −17)
```diff
@@ -9,7 +9,9 @@ has_children: false
 
 The Open Distro for Elasticsearch command line interface (odfe-cli) lets you manage your ODFE cluster from the command line and automate tasks.
 
-Currently, odfe-cli only supports the [Anomaly Detection](../ad/) plugin. You can create and delete detectors, start and stop them, and use profiles to easily access different clusters or sign requests with different credentials.
+Currently, odfe-cli supports the [Anomaly Detection](../ad/) and [k-NN](../knn/) plugins, along with arbitrary REST API paths. Among other things, you can use odfe-cli to create and delete detectors, start and stop them, and check k-NN statistics.
+
+Profiles let you easily access different clusters or sign requests with different credentials. odfe-cli supports unauthenticated requests, HTTP basic signing, and IAM signing for Amazon Web Services.
 
 This example moves a detector (`ecommerce-count-quantity`) from a staging cluster to a production cluster:
@@ -47,28 +49,25 @@ odfe-cli ad delete ecommerce-count-quantity --profile staging
 
 ## Profiles
 
-Profiles let you easily switch between different clusters and user credentials. To get started, run `odfe-cli profile create` and specify a unique profile name:
+Profiles let you easily switch between different clusters and user credentials. To get started, run `odfe-cli profile create` with the `--auth-type`, `--endpoint`, and `--name` options:
```
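The colon at the end of the new sentence suggests the docs continue with an invocation that this view truncates. A hedged sketch of what such a command could look like, built only from the three options named above (the auth type value, the endpoint, and the profile name are placeholders):

```sh
odfe-cli profile create \
    --auth-type basic \
    --endpoint https://localhost:9200 \
    --name staging
```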
**docs/security/configuration/client-auth.md** (+23 −3)
````diff
@@ -45,7 +45,7 @@ You can now assign your certificate's common name (CN) to a role. For this step,
 
 After deciding which role you want to map your certificate's CN to, you can use [Kibana](../../access-control/users-roles#map-users-to-roles), [`roles_mapping.yml`](../yaml/#roles_mappingyml), or the [REST API](../../access-control/api/#create-role-mapping) to map your certificate's CN to the role. The following example uses the `REST API` to map the common name `CLIENT1` to the role `readall`.
 
-#### Sample request
+**Sample request**
 
 ```json
 PUT _opendistro/_security/api/rolesmapping/readall
````
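The diff view cuts the JSON block off after the first line. As a hedged reconstruction, a complete role-mapping request for a certificate CN might look like the following; the `users` field is an assumption based on the role mapping API's standard fields:

```json
PUT _opendistro/_security/api/rolesmapping/readall
{
  "users": ["CLIENT1"]
}
```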
```diff
@@ -56,7 +56,7 @@ PUT _opendistro/_security/api/rolesmapping/readall
 You can also configure your Beats so that it uses a client certificate for authentication with Elasticsearch. Afterwards, it can start sending output to Elasticsearch.
+
+This output configuration specifies which settings you need for client certificate authentication:
```
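The output configuration itself falls outside the lines shown. A minimal, hedged sketch of a Beats `output.elasticsearch` section with client certificate settings (the host and file paths are placeholders):

```yml
output.elasticsearch:
  # Cluster endpoint (placeholder)
  hosts: ["https://localhost:9200"]
  # CA that signed the cluster's certificates
  ssl.certificate_authorities: ["/path/to/root-ca.pem"]
  # Client certificate whose CN (e.g. CLIENT1) is mapped to a role
  ssl.certificate: "/path/to/client-cert.pem"
  ssl.key: "/path/to/client-key.pem"
```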
**docs/trace/data-prepper-reference.md** (+28 −6)
```diff
@@ -7,7 +7,17 @@ nav_order: 25
 
 # Data Prepper configuration reference
 
-This page lists all supported Data Prepper sources, buffers, processors, and sinks, along with their associated options. For example configuration files, see [Data Prepper](../data-prepper/).
+This page lists all supported Data Prepper sources, buffers, preppers, and sinks, along with their associated options. For example configuration files, see [Data Prepper](../data-prepper/).
+
+
+## Data Prepper server options
+Option | Required | Description
+:--- | :--- | :---
+ssl | No | Boolean, indicating whether TLS should be used for server APIs. Defaults to true.
+keyStoreFilePath | No | String, path to a .jks or .p12 keystore file. Required if ssl is true.
+keyStorePassword | No | String, password for the keystore. Optional, defaults to an empty string.
+privateKeyPassword | No | String, password for the private key within the keystore. Optional, defaults to an empty string.
+serverPort | No | Integer, port number to use for server APIs. Defaults to 4900.
 
 
 ## General pipeline options
```
```diff
@@ -72,12 +82,12 @@ buffer_size | No | Integer, default 512. The maximum number of records the buffer accepts.
 batch_size | No | Integer, default 8. The maximum number of records the buffer drains after each read.
 
 
-## Processors
+## Preppers
 
-Processors perform some action on your data: filter, transform, enrich, etc.
+Preppers perform some action on your data: filter, transform, enrich, etc.
 
 
-### otel_trace_raw_processor
+### otel_trace_raw_prepper
 
 Converts OpenTelemetry data to Elasticsearch-compatible JSON documents. No options.
 
```
```diff
@@ -86,10 +96,22 @@ Converts OpenTelemetry data to Elasticsearch-compatible JSON documents. No options.
 
 Uses OpenTelemetry data to create a distributed service map for visualization in Kibana. No options.
 
+### peer_forwarder
+Forwards ExportTraceServiceRequests via gRPC to other Data Prepper instances. Required for operating Data Prepper in a clustered deployment.
+
+Option | Required | Description
+:--- | :--- | :---
+time_out | No | Integer, forwarded request timeout in seconds. Defaults to 3 seconds.
+span_agg_count | No | Integer, batch size for number of spans per request. Defaults to 48.
+discovery_mode | No | String, peer discovery mode to use. Allowable values are `static` and `dns`. Defaults to `static`.
+static_endpoints | No | List, containing string endpoints of all Data Prepper instances.
+domain_name | No | String, single domain name to query DNS against. Typically used by creating multiple DNS A records for the same domain.
+ssl | No | Boolean, indicating whether TLS should be used. Default is true.
+sslKeyCertChainFile | No | String, path to the security certificate.
 
 ### string_converter
 
-Converts strings to uppercase or lowercase. Mostly useful as an example if you want to develop your own processor.
+Converts strings to uppercase or lowercase. Mostly useful as an example if you want to develop your own prepper.
 
 Option | Required | Description
 :--- | :--- | :---
```
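Not part of the diff: to make the new `peer_forwarder` table concrete, a hedged sketch of how the prepper might be configured inside a pipeline, using only the option names documented above (the endpoints and certificate path are placeholders):

```yml
prepper:
  - peer_forwarder:
      discovery_mode: static
      # Endpoints of all Data Prepper instances in the cluster (placeholders)
      static_endpoints: ["data-prepper-1", "data-prepper-2"]
      time_out: 5         # forwarded request timeout in seconds
      span_agg_count: 48  # spans per forwarded request
      ssl: true
      sslKeyCertChainFile: /usr/share/data-prepper/cert.pem
```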
```diff
@@ -116,7 +138,7 @@ aws_region | No | String, AWS region for the cluster (e.g. `"us-east-1"`) if you
 trace_analytics_raw | No | Boolean, default false. Whether to export as trace data to the `otel-v1-apm-span-*` index pattern (alias `otel-v1-apm-span`) for use with the Trace Analytics Kibana plugin.
 trace_analytics_service_map | No | Boolean, default false. Whether to export as trace data to the `otel-v1-apm-service-map` index for use with the service map component of the Trace Analytics Kibana plugin.
 index | No | String, name of the index to export to. Only required if you don't use the `trace_analytics_raw` or `trace_analytics_service_map` presets.
-template_file | No | String, the path to a JSON [index template](https://opendistro.github.io/for-elasticsearch-docs/docs/elasticsearch/index-templates/) file (e.g. `/your/local/template-file.json`) if you don't use the `trace_analytics_raw` or `trace_analytics_service_map` presets. See [otel-v1-apm-span-index-template.json](https://github.com/opendistro-for-elasticsearch/simple-ingest-transformation-utility-pipeline/blob/master/situp-plugins/elasticsearch/src/main/resources/otel-v1-apm-span-index-template.json) for an example.
+template_file | No | String, the path to a JSON [index template](https://opendistro.github.io/for-elasticsearch-docs/docs/elasticsearch/index-templates/) file (e.g. `/your/local/template-file.json`) if you don't use the `trace_analytics_raw` or `trace_analytics_service_map` presets. See [otel-v1-apm-span-index-template.json](https://github.com/opendistro-for-elasticsearch/data-prepper/blob/main/data-prepper-plugins/elasticsearch/src/main/resources/otel-v1-apm-span-index-template.json) for an example.
 document_id_field | No | String, the field from the source data to use for the Elasticsearch document ID (e.g. `"my-field"`) if you don't use the `trace_analytics_raw` or `trace_analytics_service_map` presets.
 dlq_file | No | String, the path to your preferred dead letter queue file (e.g. `/your/local/dlq-file`). Data Prepper writes to this file when it fails to index a document on the Elasticsearch cluster.
 bulk_size | No | Integer (long), default 5. The maximum size (in MiB) of bulk requests to the Elasticsearch cluster. Values below 0 indicate an unlimited size. If a single document exceeds the maximum bulk request size, Data Prepper sends it individually.
```
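Also not part of the diff: a hedged sketch of an `elasticsearch` sink combining several of the options documented above (the host and file paths are placeholders):

```yml
sink:
  - elasticsearch:
      hosts: ["https://localhost:9200"]
      # Export to the otel-v1-apm-span-* index pattern for Trace Analytics
      trace_analytics_raw: true
      template_file: /your/local/template-file.json
      dlq_file: /your/local/dlq-file
      bulk_size: 5  # maximum bulk request size in MiB
```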
**docs/trace/data-prepper.md**

````diff
-To use Data Prepper, you define pipelines in a configuration YAML file. Each pipeline is a combination of a source, a buffer, zero or more processors, and one or more sinks:
+To use Data Prepper, you define pipelines in a configuration YAML file. Each pipeline is a combination of a source, a buffer, zero or more preppers, and one or more sinks:
 
 ```yml
 sample-pipeline:
@@ -38,8 +38,8 @@ sample-pipeline:
     bounded_blocking:
       buffer_size: 1024  # max number of records the buffer accepts
       batch_size: 256  # max number of records the buffer drains after each read
-  processor:
-    - otel_trace_raw_processor:
+  prepper:
+    - otel_trace_raw_prepper:
   sink:
     - elasticsearch:
         hosts: ["https://localhost:9200"]
````
```diff
@@ -55,9 +55,9 @@ sample-pipeline:
 
 By default, Data Prepper uses its one and only buffer, the `bounded_blocking` buffer, so you can omit this section unless you developed a custom buffer or need to tune the buffer settings.
 
-- Processors perform some action on your data: filter, transform, enrich, etc.
+- Preppers perform some action on your data: filter, transform, enrich, etc.
 
-  You can have multiple processors, which run sequentially from top to bottom, not in parallel. The `otel_trace_raw_processor` processor converts OpenTelemetry data into Elasticsearch-compatible JSON documents.
+  You can have multiple preppers, which run sequentially from top to bottom, not in parallel. The `otel_trace_raw_prepper` prepper converts OpenTelemetry data into Elasticsearch-compatible JSON documents.
 
 - Sinks define where your data goes. In this case, the sink is an Open Distro for Elasticsearch cluster.
```
```diff
 To learn more, see the [Data Prepper configuration reference](../data-prepper-reference/).
 
+## Configure the Data Prepper server
+Data Prepper itself provides administrative HTTP endpoints such as `/list` to list pipelines and `/metrics/prometheus` to provide Prometheus-compatible metrics data. The port which serves these endpoints, as well as TLS configuration, is specified by a separate YAML file. Example:
```
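The example file is cut off in this view. A hedged sketch of what it might contain, using the option names from the new Data Prepper server options table in the configuration reference (all values are placeholders):

```yml
ssl: true
keyStoreFilePath: /usr/share/data-prepper/keystore.p12  # .jks or .p12; required when ssl is true
keyStorePassword: changeit
privateKeyPassword: changeit
serverPort: 4900  # serves /list and /metrics/prometheus
```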
```diff
-For production workloads, you likely want to run Data Prepper on a dedicated machine, which makes connectivity a concern. Data Prepper uses port 21890 and must be able to connect to both the OpenTelemetry Collector and the Elasticsearch cluster. In the [sample applications](https://github.com/opendistro-for-elasticsearch/Data-Prepper/tree/master/examples), you can see that all components use the same Docker network and expose the appropriate ports.
+For production workloads, you likely want to run Data Prepper on a dedicated machine, which makes connectivity a concern. Data Prepper uses port 21890 and must be able to connect to both the OpenTelemetry Collector and the Elasticsearch cluster. In the [sample applications](https://github.com/opendistro-for-elasticsearch/Data-Prepper/tree/main/examples), you can see that all components use the same Docker network and expose the appropriate ports.
```
**docs/trace/get-started.md** (+2 −2)
```diff
@@ -7,7 +7,7 @@ nav_order: 1
 
 # Get started with Trace Analytics
 
-Open Distro for Elasticsearch Trace Analytics consists of two components---Data Prepper and the Trace Analytics Kibana plugin---that fit into the OpenTelemetry and Elasticsearch ecosystems. The Data Prepper repository has several [sample applications](https://github.com/opendistro-for-elasticsearch/Data-Prepper/tree/master/examples) to help you get started.
+Open Distro for Elasticsearch Trace Analytics consists of two components---Data Prepper and the Trace Analytics Kibana plugin---that fit into the OpenTelemetry and Elasticsearch ecosystems. The Data Prepper repository has several [sample applications](https://github.com/opendistro-for-elasticsearch/data-prepper/tree/main/examples) to help you get started.
 
 
 ## Basic flow of data
@@ -29,7 +29,7 @@ Open Distro for Elasticsearch Trace Analytics consists of two components---Data
 
 One Trace Analytics sample application is the Jaeger HotROD demo, which mimics the flow of data through a distributed application.
 
-Download or clone the [Data Prepper repository](https://github.com/opendistro-for-elasticsearch/Data-Prepper/tree/master/examples). Then navigate to `examples/jaeger-hotrod/` and open `docker-compose.yml` in a text editor. This file contains a container for each element from [Basic flow of data](#basic-flow-of-data):
+Download or clone the [Data Prepper repository](https://github.com/opendistro-for-elasticsearch/data-prepper). Then navigate to `examples/jaeger-hotrod/` and open `docker-compose.yml` in a text editor. This file contains a container for each element from [Basic flow of data](#basic-flow-of-data):
 
 - A distributed application (`jaeger-hot-rod`) with the Jaeger agent (`jaeger-agent`)
 - The [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/getting-started/) (`otel-collector`)
```