219 changes: 219 additions & 0 deletions docs/resources/elasticsearch_ml_anomaly_detection_job.md
@@ -0,0 +1,219 @@

---
# generated by https://github.com/hashicorp/terraform-plugin-docs
page_title: "elasticstack_elasticsearch_ml_anomaly_detection_job Resource - terraform-provider-elasticstack"
subcategory: "Ml"
description: |-
Creates and manages Machine Learning anomaly detection jobs. See the ML Job API documentation https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-put-job.html for more details.
---

# elasticstack_elasticsearch_ml_anomaly_detection_job (Resource)

Creates and manages Machine Learning anomaly detection jobs. See the [ML Job API documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-put-job.html) for more details.

## Example Usage

```terraform
terraform {
  required_providers {
    elasticstack = {
      source  = "elastic/elasticstack"
      version = "~> 0.11"
    }
  }
}

provider "elasticstack" {
  elasticsearch {}
}

# Basic anomaly detection job
resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "example" {
  job_id      = "example-anomaly-detector"
  description = "Example anomaly detection job for monitoring web traffic"
  groups      = ["web", "monitoring"]

  analysis_config = {
    bucket_span = "15m"
    detectors = [
      {
        function             = "count"
        detector_description = "Count anomalies in web traffic"
      },
      {
        function             = "mean"
        field_name           = "response_time"
        detector_description = "Mean response time anomalies"
      }
    ]
    influencers = ["client_ip", "status_code"]
  }

  data_description = {
    time_field  = "@timestamp"
    time_format = "epoch_ms"
  }

  analysis_limits = {
    model_memory_limit = "100mb"
  }

  model_plot_config = {
    enabled = true
  }

  model_snapshot_retention_days = 30
  results_retention_days        = 90
}
```

<!-- schema generated by tfplugindocs -->
## Schema

### Required

- `analysis_config` (Attributes) Specifies how to analyze the data. After you create a job, you cannot change the analysis configuration; all the properties are informational. (see [below for nested schema](#nestedatt--analysis_config))
- `data_description` (Attributes) Defines the format of the input data when you send data to the job by using the post data API. (see [below for nested schema](#nestedatt--data_description))
- `job_id` (String) The identifier for the anomaly detection job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.

### Optional

- `allow_lazy_open` (Boolean) Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node.
- `analysis_limits` (Attributes) Limits can be applied for the resources required to hold the mathematical models in memory. (see [below for nested schema](#nestedatt--analysis_limits))
- `background_persist_interval` (String) Advanced configuration option. The time between each periodic persistence of the model.
- `custom_settings` (String) Advanced configuration option. Contains custom metadata about the job. For example, it can contain custom URL information.
- `daily_model_snapshot_retention_after_days` (Number) Advanced configuration option, which affects the automatic removal of old model snapshots for this job.
- `description` (String) A description of the job.
- `elasticsearch_connection` (Block List, Deprecated) Elasticsearch connection configuration block. (see [below for nested schema](#nestedblock--elasticsearch_connection))
- `groups` (Set of String) A set of job groups. A job can belong to no groups or many.
- `model_plot_config` (Attributes) This advanced configuration option stores model information along with the results. It provides a more detailed view into anomaly detection. (see [below for nested schema](#nestedatt--model_plot_config))
- `model_snapshot_retention_days` (Number) Advanced configuration option, which affects the automatic removal of old model snapshots for this job.
- `renormalization_window_days` (Number) Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen.
- `results_index_name` (String) A text string that affects the name of the machine learning results index.
- `results_retention_days` (Number) Advanced configuration option. The period of time (in days) that results are retained.

### Read-Only

- `create_time` (String) The time the job was created.
- `id` (String) Internal identifier of the resource.
- `job_type` (String) Reserved for future use, currently set to anomaly_detector.
- `job_version` (String) The version of Elasticsearch when the job was created.
- `model_snapshot_id` (String) A numerical character string that uniquely identifies the model snapshot.

<a id="nestedatt--analysis_config"></a>
### Nested Schema for `analysis_config`

Required:

- `bucket_span` (String) The size of the interval that the analysis is aggregated into, typically between 15m and 1h. If the anomaly detector is expecting to see data at near real-time frequency, then the bucket_span should be set to a value around 10 times the time between ingested documents. For example, if data comes every second, bucket_span should be 10s; if data comes every 5 minutes, bucket_span should be 50m. For sparse or batch data, use larger bucket_span values.
- `detectors` (Attributes List) Detector configuration objects. Detectors identify the anomaly detection functions and the fields on which they operate. (see [below for nested schema](#nestedatt--analysis_config--detectors))

Optional:

- `categorization_field_name` (String) For categorization jobs only. The name of the field to categorize.
- `categorization_filters` (List of String) For categorization jobs only. An array of regular expressions. A categorization message is matched against each regex in the order they are listed in the array.
- `influencers` (List of String) A list of influencer field names. Typically these can be the by, over, or partition fields that are used in the detector configuration.
- `latency` (String) The size of the window in which to expect data that is out of time order. If you specify a non-zero value, it must be greater than or equal to one second.
- `model_prune_window` (String) Advanced configuration option. The time interval (in days) between pruning the model.
- `multivariate_by_fields` (Boolean) This functionality is reserved for internal use. It is not supported for use in customer environments and is not subject to the support SLA of official GA features.
- `per_partition_categorization` (Attributes) Settings related to how categorization interacts with partition fields. (see [below for nested schema](#nestedatt--analysis_config--per_partition_categorization))
- `summary_count_field_name` (String) If this property is specified, the data that is fed to the job is expected to be pre-summarized.
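
The `bucket_span` rule of thumb above (roughly 10 times the interval between ingested documents) can be sketched with a local value. This is only an illustration: the 30-second ingest interval, the local names, and the job ID are assumptions, not part of the provider.

```terraform
locals {
  # Hypothetical: documents arrive roughly every 30 seconds.
  ingest_interval_seconds = 30

  # About 10x the ingest interval, following the bucket_span guidance above.
  # Evaluates to "300s".
  bucket_span = "${local.ingest_interval_seconds * 10}s"
}

resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "derived_span" {
  job_id = "example-derived-bucket-span"

  analysis_config = {
    bucket_span = local.bucket_span
    detectors = [
      {
        function = "count"
      }
    ]
  }

  data_description = {
    time_field = "@timestamp"
  }
}
```

For sparse or batch data, a fixed, larger value is likely a better choice than a derived one.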

<a id="nestedatt--analysis_config--detectors"></a>
### Nested Schema for `analysis_config.detectors`

Required:

- `function` (String) The analysis function that is used. For example, count, rare, mean, min, max, sum.

Optional:

- `by_field_name` (String) The field used to split the data. In particular, this property is used for analyzing the splits with respect to their own history. It is used for finding unusual values in the context of the split.
- `custom_rules` (Attributes List) Custom rules enable you to customize the way detectors operate. (see [below for nested schema](#nestedatt--analysis_config--detectors--custom_rules))
- `detector_description` (String) A description of the detector.
- `exclude_frequent` (String) Contains one of the following values: all, none, by, or over.
- `field_name` (String) The field that the detector function analyzes. Some functions require a field. Functions that don't require a field are count, rare, and freq_rare.
- `over_field_name` (String) The field used to split the data. In particular, this property is used for analyzing the splits with respect to the history of all splits. It is used for finding unusual values in the population of all splits.
- `partition_field_name` (String) The field used to segment the analysis. When you use this property, you have completely independent baselines for each value of this field.
- `use_null` (Boolean) Defines whether a new series is used as the null series when there is no value for the by or partition fields.
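
As a rough sketch of how the split fields combine, the detector below analyzes each `service` value against its own history (`by_field_name`) while keeping a fully independent baseline per `region` (`partition_field_name`). The field names `response_time`, `service`, and `region` and the job ID are hypothetical.

```terraform
resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "split_example" {
  job_id = "example-split-detector"

  analysis_config = {
    bucket_span = "15m"
    detectors = [
      {
        function             = "mean"
        field_name           = "response_time" # hypothetical metric field
        by_field_name        = "service"       # hypothetical split field
        partition_field_name = "region"        # hypothetical partition field
        detector_description = "Mean response time per service, baselined per region"
      }
    ]
    # Influencers are typically the same fields used to split the analysis.
    influencers = ["service", "region"]
  }

  data_description = {
    time_field = "@timestamp"
  }
}
```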

<a id="nestedatt--analysis_config--detectors--custom_rules"></a>
### Nested Schema for `analysis_config.detectors.custom_rules`

Optional:

- `actions` (List of String) The set of actions to be triggered when the rule applies. If more than one action is specified the effects of all actions are combined.
- `conditions` (Attributes List) An array of numeric conditions when the rule applies. (see [below for nested schema](#nestedatt--analysis_config--detectors--custom_rules--conditions))

<a id="nestedatt--analysis_config--detectors--custom_rules--conditions"></a>
### Nested Schema for `analysis_config.detectors.custom_rules.conditions`

Required:

- `applies_to` (String) Specifies the result property to which the condition applies.
- `operator` (String) Specifies the condition operator.
- `value` (Number) The value that is compared against the applies_to field using the operator.
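
A minimal sketch of a custom rule, assuming the provider accepts the same values as the Elasticsearch ML API (`skip_result` as an action, `actual` for `applies_to`, `lt` as an operator). The field name, threshold, and job ID are hypothetical. The rule is intended to suppress results when the actual mean response time is below 100.

```terraform
resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "rule_example" {
  job_id = "example-custom-rule"

  analysis_config = {
    bucket_span = "15m"
    detectors = [
      {
        function   = "mean"
        field_name = "response_time" # hypothetical field
        custom_rules = [
          {
            # Assumed to mirror the Elasticsearch rule action of the same name.
            actions = ["skip_result"]
            conditions = [
              {
                applies_to = "actual"
                operator   = "lt"
                value      = 100
              }
            ]
          }
        ]
      }
    ]
  }

  data_description = {
    time_field = "@timestamp"
  }
}
```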




<a id="nestedatt--analysis_config--per_partition_categorization"></a>
### Nested Schema for `analysis_config.per_partition_categorization`

Optional:

- `enabled` (Boolean) To enable this setting, you must also set the partition_field_name property to the same value in every detector that uses the keyword mlcategory. Otherwise, job creation fails.
- `stop_on_warn` (Boolean) This setting can be set to true only if per-partition categorization is enabled.
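
The constraint on `enabled` can be illustrated as follows: every detector that uses the `mlcategory` keyword sets the same `partition_field_name`. This is a sketch only; the `message` and `service` field names and the job ID are assumptions.

```terraform
resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "categorization_example" {
  job_id = "example-per-partition-categorization"

  analysis_config = {
    bucket_span               = "15m"
    categorization_field_name = "message" # hypothetical log message field

    per_partition_categorization = {
      enabled      = true
      stop_on_warn = true # only valid because enabled is true
    }

    detectors = [
      {
        function             = "count"
        by_field_name        = "mlcategory"
        partition_field_name = "service" # must match in every mlcategory detector
      }
    ]
  }

  data_description = {
    time_field = "@timestamp"
  }
}
```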



<a id="nestedatt--data_description"></a>
### Nested Schema for `data_description`

Optional:

- `field_delimiter` (String) The character used to delimit fields in the data. Only applicable when format is delimited.
- `format` (String) Only JSON format is supported at this time.
- `quote_character` (String) The character used to quote fields in the data. Only applicable when format is delimited.
- `time_field` (String) The name of the field that contains the timestamp.
- `time_format` (String) The time format, which can be epoch, epoch_ms, or a custom pattern.


<a id="nestedatt--analysis_limits"></a>
### Nested Schema for `analysis_limits`

Optional:

- `categorization_examples_limit` (Number) The maximum number of examples stored per category in memory and in the results data store.
- `model_memory_limit` (String) The approximate maximum amount of memory resources that are required for analytical processing.


<a id="nestedblock--elasticsearch_connection"></a>
### Nested Schema for `elasticsearch_connection`

Optional:

- `api_key` (String, Sensitive) API Key to use for authentication to Elasticsearch
- `bearer_token` (String, Sensitive) Bearer Token to use for authentication to Elasticsearch
- `ca_data` (String) PEM-encoded custom Certificate Authority certificate
- `ca_file` (String) Path to a custom Certificate Authority certificate
- `cert_data` (String) PEM-encoded certificate for client auth
- `cert_file` (String) Path to a file containing the PEM-encoded certificate for client auth
- `endpoints` (List of String, Sensitive) A list of endpoints that the Terraform provider will connect to. Each endpoint must include the http(s) scheme and port number.
- `es_client_authentication` (String, Sensitive) ES Client Authentication field to be used with the JWT token
- `headers` (Map of String, Sensitive) A map of headers to be sent with each request to Elasticsearch.
- `insecure` (Boolean) Disable TLS certificate validation
- `key_data` (String, Sensitive) PEM-encoded private key for client auth
- `key_file` (String) Path to a file containing the PEM-encoded private key for client auth
- `password` (String, Sensitive) Password to use for API authentication to Elasticsearch.
- `username` (String) Username to use for API authentication to Elasticsearch.


<a id="nestedatt--model_plot_config"></a>
### Nested Schema for `model_plot_config`

Optional:

- `annotations_enabled` (Boolean) If true, enables calculation and storage of the model change annotations for each entity that is being analyzed.
- `enabled` (Boolean) If true, enables calculation and storage of the model bounds for each entity that is being analyzed.
- `terms` (String) Limits data collection to this comma-separated list of partition or by field values. If terms are not specified or it is an empty string, no filtering is applied.
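
As a hedged illustration of `terms`, the sketch below restricts model plot data to two values of the partition field. The `service` field, its values `checkout` and `search`, and the job ID are hypothetical.

```terraform
resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "model_plot_example" {
  job_id = "example-model-plot-terms"

  analysis_config = {
    bucket_span = "15m"
    detectors = [
      {
        function             = "mean"
        field_name           = "response_time" # hypothetical field
        partition_field_name = "service"       # hypothetical field
      }
    ]
  }

  data_description = {
    time_field = "@timestamp"
  }

  model_plot_config = {
    enabled             = true
    annotations_enabled = true
    # Comma-separated partition values to collect model bounds for.
    terms = "checkout,search"
  }
}
```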
@@ -0,0 +1,51 @@
terraform {
  required_providers {
    elasticstack = {
      source  = "elastic/elasticstack"
      version = "~> 0.11"
    }
  }
}

provider "elasticstack" {
  elasticsearch {}
}

# Basic anomaly detection job
resource "elasticstack_elasticsearch_ml_anomaly_detection_job" "example" {
  job_id      = "example-anomaly-detector"
  description = "Example anomaly detection job for monitoring web traffic"
  groups      = ["web", "monitoring"]

  analysis_config = {
    bucket_span = "15m"
    detectors = [
      {
        function             = "count"
        detector_description = "Count anomalies in web traffic"
      },
      {
        function             = "mean"
        field_name           = "response_time"
        detector_description = "Mean response time anomalies"
      }
    ]
    influencers = ["client_ip", "status_code"]
  }

  data_description = {
    time_field  = "@timestamp"
    time_format = "epoch_ms"
  }

  analysis_limits = {
    model_memory_limit = "100mb"
  }

  model_plot_config = {
    enabled = true
  }

  model_snapshot_retention_days = 30
  results_retention_days        = 90
}