diff --git a/deploy-manage/deploy/cloud-on-k8s/required-rbac-permissions.md b/deploy-manage/deploy/cloud-on-k8s/required-rbac-permissions.md index 4177e711ed..d739067096 100644 --- a/deploy-manage/deploy/cloud-on-k8s/required-rbac-permissions.md +++ b/deploy-manage/deploy/cloud-on-k8s/required-rbac-permissions.md @@ -73,12 +73,12 @@ These permissions are needed to manage each {{stack}} application. For example, | Name | API group | Optional? | | --- | --- | --- | -| `Elasticsearch
Elasticsearch/status
Elasticsearch/finalizers` | `elasticsearch.k8s.elastic.co` | no | -| `Kibana
Kibana/status
Kibana/finalizers` | `kibana.k8s.elastic.co` | no | -| `APMServer
APMServer/status
APMServer/finalizers` | `apm.k8s.elastic.co` | no | -| `EnterpriseSearch
EnterpriseSearch/status
EnterpriseSearch/finalizers` | `enterprisesearch.k8s.elastic.co` | no | -| `Beat
Beat/status
Beat/finalizers` | `beat.k8s.elastic.co` | no | -| `Agent
Agent/status
Agent/finalizers` | `agent.k8s.elastic.co` | no | -| `ElasticMapsServer
ElasticMapsServer/status
ElasticMapsServer/finalizers` | `maps.k8s.elastic.co` | no | -| `Logstash
Logstash/status
Logstash/finalizers` | `logstashes.k8s.elastic.co` | no | +| `Elasticsearch`
`Elasticsearch/status`
`Elasticsearch/finalizers` | `elasticsearch.k8s.elastic.co` | no | +| `Kibana`
`Kibana/status`
`Kibana/finalizers` | `kibana.k8s.elastic.co` | no | +| `APMServer`
`APMServer/status`
`APMServer/finalizers` | `apm.k8s.elastic.co` | no | +| `EnterpriseSearch`
`EnterpriseSearch/status`
`EnterpriseSearch/finalizers` | `enterprisesearch.k8s.elastic.co` | no | +| `Beat`
`Beat/status`
`Beat/finalizers` | `beat.k8s.elastic.co` | no | +| `Agent`
`Agent/status`
`Agent/finalizers` | `agent.k8s.elastic.co` | no | +| `ElasticMapsServer`
`ElasticMapsServer/status`
`ElasticMapsServer/finalizers` | `maps.k8s.elastic.co` | no | +| `Logstash`
`Logstash/status`
`Logstash/finalizers` | `logstashes.k8s.elastic.co` | no | diff --git a/deploy-manage/deploy/elastic-cloud/azure-native-isv-service.md b/deploy-manage/deploy/elastic-cloud/azure-native-isv-service.md index 0fe3408186..2b78ef03e2 100644 --- a/deploy-manage/deploy/elastic-cloud/azure-native-isv-service.md +++ b/deploy-manage/deploy/elastic-cloud/azure-native-isv-service.md @@ -204,7 +204,6 @@ $$$azure-integration-azure-tenant-info$$$What Azure tenant information does Elas * Data defined in the marketplace [Saas fulfillment Subscription APIs](https://docs.microsoft.com/en-us/azure/marketplace/partner-center-portal/pc-saas-fulfillment-subscription-api). * The following additional data: - * Marketplace subscription ID * Marketplace plan ID * Azure Account ID @@ -222,7 +221,6 @@ $$$azure-integration-cli-api$$$What other methods are available to deploy {{es}} : Use any of the following methods: * **Deploy using Azure tools** - * The Azure console * [Azure Terraform](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/elastic_cloud_elasticsearch) * The [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/elastic?view=azure-cli-latest) @@ -230,14 +228,12 @@ $$$azure-integration-cli-api$$$What other methods are available to deploy {{es}} * [PowerShell](https://docs.microsoft.com/en-us/powershell/module/az.elastic/?view=azps-8.0.0#elastic) * **Deploy using official Azure SDKs** - * [Python](https://github.com/Azure/azure-sdk-for-python/blob/main/README.md) * [Java](https://github.com/Azure/azure-sdk-for-java/blob/azure-resourcemanager-elastic_1.0.0-beta.1/README.md) * [.NET](https://github.com/Azure/azure-sdk-for-net/blob/main/README.md) * [Rust](https://github.com/Azure/azure-sdk-for-rust/blob/main/services/README.md) * **Deploy using {{ecloud}}** - * The {{ecloud}} [console](https://cloud.elastic.co?page=docs&placement=docs-body) * The {{ecloud}} [REST API](cloud://reference/cloud-hosted/ec-api-restful.md) * The {{ecloud}} [command line tool](ecctl://reference/index.md) @@ -248,7 +244,6 @@ $$$azure-integration-cli-api$$$What other methods are available to deploy {{es}} $$$azure-integration-migrate$$$How do I migrate my data from the classic Azure marketplace account to the native integration? : First create a new account configured with {{ecloud}} Azure Native ISV Service, then perform the migration as follows: - 1. From your classic Azure marketplace account, navigate to the deployment and [configure a custom snapshot repository using Azure Blog Storage](../../tools/snapshot-and-restore/ec-azure-snapshotting.md). 2. Using the newly configured snapshot repository, [create a snapshot](../../tools/snapshot-and-restore/create-snapshots.md) of the data to migrate. 3. Navigate to Azure and log in as the user that manages the {{es}} resources. diff --git a/reference/fleet/add_kubernetes_metadata-processor.md b/reference/fleet/add_kubernetes_metadata-processor.md index e39d3def6e..b40cd257bd 100644 --- a/reference/fleet/add_kubernetes_metadata-processor.md +++ b/reference/fleet/add_kubernetes_metadata-processor.md @@ -97,18 +97,72 @@ This configuration disables the default indexers and matchers, and then enables {{agent}} processors execute *before* ingest pipelines, which means that they process the raw event data rather than the final event sent to {{es}}. 
For related limitations, refer to [What are some limitations of using processors?](/reference/fleet/agent-processors.md#limitations) :::: +`host` +: (Optional) Node to scope {{agent}} to in case it cannot be accurately detected, as when running {{agent}} in host network mode. -| Name | Required | Default | Description | -| --- | --- | --- | --- | -| `host` | No | | Node to scope {{agent}} to in case it cannot be accurately detected, as when running {{agent}} in host network mode. | -| `scope` | No | `node` | Whether the processor should have visibility at the node level (`node`) or at the entire cluster level (`cluster`). | -| `namespace` | No | | Namespace to collect the metadata from. If no namespaces is specified, collects metadata from all namespaces. | -| `add_resource_metadata` | No | | Filters and configuration for adding extra metadata to the event. This setting accepts the following settings:

* `node` or `namespace`: Labels and annotations filters for the extra metadata coming from node and namespace. By default all labels are included, but annotations are not. To change the default behavior, you can set `include_labels`, `exclude_labels`, and `include_annotations`. These settings are useful when storing labels and annotations that require special handling to avoid overloading the storage output. Wildcards are supported in these settings by using `use_regex_include: true` in combination with `include_labels`, and respectively by setting `use_regex_exclude: true` in combination with `exclude_labels`. To turn off enrichment of `node` or `namespace` metadata individually, set `enabled: false`.
* `deployment`: If the resource is `pod` and it is created from a `deployment`, the deployment name is not added by default. To enable this behavior, set `deployment: true`.
* `cronjob`: If the resource is `pod` and it is created from a `cronjob`, the cronjob name is not added by default. To enable this behavior, set `cronjob: true`.

::::{dropdown} Expand this to see an example
```yaml
add_resource_metadata:
namespace:
include_labels: ["namespacelabel1"]
# use_regex_include: false
# use_regex_exclude: false
# exclude_labels: ["namespacelabel2"]
#labels.dedot: true
#annotations.dedot: true
node:
# use_regex_include: false
include_labels: ["nodelabel2"]
include_annotations: ["nodeannotation1"]
# use_regex_exclude: false
# exclude_annotations: ["nodeannotation2"]
#labels.dedot: true
#annotations.dedot: true
deployment: true
cronjob: true
```

::::
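
For instance, to keep only labels that match a pattern, the wildcard support described above can be combined with `use_regex_include`. A minimal sketch (the label pattern is illustrative):

```yaml
add_resource_metadata:
  namespace:
    use_regex_include: true
    include_labels: ["app-.*"]
```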

| -| `kube_config` | No | `KUBECONFIG` environment variable, if present | Config file to use as the configuration for the Kubernetes client. | -| `kube_client_options` | No | | Additional configuration options for the Kubernetes client. Currently client QPS and burst are supported. If this setting is not configured, the Kubernetes client’s [default QPS and burst](https://pkg.go.dev/k8s.io/client-go/rest#pkg-constants) is used.

::::{dropdown} Expand this to see an example
```yaml
kube_client_options:
qps: 5
burst: 10
```

::::
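
Putting the top-level options together, a processor definition might look like the following sketch; the namespace value and timeout are illustrative, not recommendations:

```yaml
- add_kubernetes_metadata:
    scope: cluster
    namespace: production
    cleanup_timeout: 60s
```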

| -| `cleanup_timeout` | No | `60s` | Time of inactivity before stopping the running configuration for a container. | -| `sync_period` | No | | Timeout for listing historical resources. | -| `labels.dedot` | No | `true` | Whether to replace dots (`.`) in labels with underscores (`_`).
`annotations.dedot` |
+`scope`
+: (Optional) Whether the processor should have visibility at the node level (`node`) or at the entire cluster level (`cluster`).
+
+  **Default**: `node`
+
+`namespace`
+: (Optional) Namespace to collect the metadata from. If no namespace is specified, collects metadata from all namespaces.
+
+`add_resource_metadata`
+: (Optional) Filters and configuration for adding extra metadata to the event. This setting accepts the following settings:
+  * `node` or `namespace`: Labels and annotations filters for the extra metadata coming from node and namespace. By default all labels are included, but annotations are not. To change the default behavior, you can set `include_labels`, `exclude_labels`, and `include_annotations`. These settings are useful when storing labels and annotations that require special handling to avoid overloading the storage output. Wildcards are supported in these settings by using `use_regex_include: true` in combination with `include_labels`, and respectively by setting `use_regex_exclude: true` in combination with `exclude_labels`. To turn off enrichment of `node` or `namespace` metadata individually, set `enabled: false`.
+  * `deployment`: If the resource is `pod` and it is created from a `deployment`, the deployment name is not added by default. To enable this behavior, set `deployment: true`.
+  * `cronjob`: If the resource is `pod` and it is created from a `cronjob`, the cronjob name is not added by default. To enable this behavior, set `cronjob: true`.
+
+  ::::{dropdown} Expand this to see an example
+  ```yaml
+  add_resource_metadata:
+    namespace:
+      include_labels: ["namespacelabel1"]
+      # use_regex_include: false
+      # use_regex_exclude: false
+      # exclude_labels: ["namespacelabel2"]
+      #labels.dedot: true
+      #annotations.dedot: true
+    node:
+      # use_regex_include: false
+      include_labels: ["nodelabel2"]
+      include_annotations: ["nodeannotation1"]
+      # use_regex_exclude: false
+      # exclude_annotations: ["nodeannotation2"]
+      #labels.dedot: true
+      #annotations.dedot: true
+    deployment: true
+    cronjob: true
+  ```
+  ::::
+
+`kube_config`
+: (Optional) Config file to use as the configuration for the Kubernetes client.
+
+  **Default**: the `KUBECONFIG` environment variable, if present
+
+`kube_client_options`
+: (Optional) Additional configuration options for the Kubernetes client. Currently client QPS and burst are supported. If this setting is not configured, the Kubernetes client’s [default QPS and burst](https://pkg.go.dev/k8s.io/client-go/rest#pkg-constants) is used.
+
+  ::::{dropdown} Expand this to see an example
+  ```yaml
+  kube_client_options:
+    qps: 5
+    burst: 10
+  ```
+  ::::
+
+`cleanup_timeout`
+: (Optional) Time of inactivity before stopping the running configuration for a container.
+
+  **Default**: `60s`
+
+`sync_period`
+: (Optional) Timeout for listing historical resources.
+
+`labels.dedot`
+: (Optional) Whether to replace dots (`.`) in labels with underscores (`_`).
+
+  **Default**: `true`
+
+`annotations.dedot`
+: (Optional) Whether to replace dots (`.`) in annotations with underscores (`_`).
+
+  **Default**: `true`


## Indexers and matchers [kubernetes-indexers-and-matchers]

@@ -194,12 +248,9 @@ Available matchers are:

`resource_type`
:   (Optional) Type of the resource to obtain the ID of. Valid `resource_type`:
-
    * `pod`: to make the lookup based on the Pod UID. 
When `resource_type` is set to `pod`, `logs_path` must be set as well; the supported paths in this case are:
-
      * `/var/lib/kubelet/pods/`: used to read logs from volumes mounted into the Pod; those logs end up under `/var/lib/kubelet/pods//volumes//...` To use `/var/lib/kubelet/pods/` as a `log_path`, `/var/lib/kubelet/pods` must be mounted into the filebeat Pods.
      * `/var/log/pods/` Note: when using `resource_type: 'pod'`, logs are enriched only with Pod metadata (Pod id, Pod name, and so on), not container metadata.
-
    * `container`: to make the lookup based on the container ID, `logs_path` must be set to `/var/log/containers/`. It defaults to `container`.

diff --git a/reference/fleet/add_nomad_metadata-processor.md b/reference/fleet/add_nomad_metadata-processor.md
index 852c75220b..d30d6fc76b 100644
--- a/reference/fleet/add_nomad_metadata-processor.md
+++ b/reference/fleet/add_nomad_metadata-processor.md
@@ -35,17 +35,56 @@ Each event is annotated with the following information:

 {{agent}} processors execute *before* ingest pipelines, which means that they process the raw event data rather than the final event sent to {{es}}. For related limitations, refer to [What are some limitations of using processors?](/reference/fleet/agent-processors.md#limitations)
::::

+`address`
+: (Optional) URL of the agent API used to request the metadata.

-| Name | Required | Default | Description |
-| --- | --- | --- | --- |
-| `address` | No | `http://127.0.0.1:4646` | URL of the agent API used to request the metadata. |
-| `namespace` | No | | Namespace to watch. If set, only events for allocations in this namespace are annotated. |
-| `region` | No | | Region to watch. If set, only events for allocations in this region are annotated. |
-| `secret_id` | No | | SecretID to use when connecting with the agent API. This is an example ACL policy to apply to the token.

```json
namespace "*" {
policy = "read"
}
node {
policy = "read"
}
agent {
policy = "read"
}
```
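
A sketch of passing such a token to the processor follows; the token value is a placeholder to replace with your own:

```yaml
- add_nomad_metadata:
    address: "http://127.0.0.1:4646"
    secret_id: "<your-nomad-acl-token>"
```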
| -| `refresh_interval` | No | `30s` | Interval used to update the cached metadata. | -| `cleanup_timeout` | No | `60s` | Time to wait before cleaning up an allocation’s associated resources after it has been removed.This is useful if you expect to receive events after an allocation has been removed, which can happen when collecting logs. | -| `scope` | No | `node` | Scope of the resources to watch.Specify `node` to get metadata for the allocations in a single agent, or `global`, to get metadata for allocations running on any agent. | -| `node` | No | | When using `scope: node`, use `node` to specify the name of the local node if it cannot be discovered automatically.

For example, you can use the following configuration when {{agent}} is collecting events from all the allocations in the cluster:

```yaml
- add_nomad_metadata:
scope: global
```
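
Conversely, to limit metadata collection to a single agent and name the local node explicitly, a sketch like this might be used (the node name is hypothetical):

```yaml
- add_nomad_metadata:
    scope: node
    node: nomad-client-01
```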
|
+
+  **Default**: `http://127.0.0.1:4646`
+
+`namespace`
+: (Optional) Namespace to watch. If set, only events for allocations in this namespace are annotated.
+
+`region`
+: (Optional) Region to watch. If set, only events for allocations in this region are annotated.
+
+`secret_id`
+: (Optional) SecretID to use when connecting with the agent API. The following is an example ACL policy to apply to the token.
+
+  ```json
+  namespace "*" {
+    policy = "read"
+  }
+  node {
+    policy = "read"
+  }
+  agent {
+    policy = "read"
+  }
+  ```
+
+`refresh_interval`
+: (Optional) Interval used to update the cached metadata.
+
+  **Default**: `30s`
+
+`cleanup_timeout`
+: (Optional) Time to wait before cleaning up an allocation’s associated resources after it has been removed. This is useful if you expect to receive events after an allocation has been removed, which can happen when collecting logs.
+
+  **Default**: `60s`
+
+`scope`
+: (Optional) Scope of the resources to watch. Specify `node` to get metadata for the allocations in a single agent, or `global` to get metadata for allocations running on any agent.
+
+  **Default**: `node`
+
+`node`
+: (Optional) When using `scope: node`, use `node` to specify the name of the local node if it cannot be discovered automatically.
+
+  For example, you can use the following configuration when {{agent}} is collecting events from all the allocations in the cluster:
+
+  ```yaml
+  - add_nomad_metadata:
+      scope: global
+  ```

 ## Indexers and matchers [_indexers_and_matchers]

diff --git a/reference/fleet/elastic-agent-ssl-configuration.md b/reference/fleet/elastic-agent-ssl-configuration.md
index b58b5ae72d..791c6a5d8c 100644
--- a/reference/fleet/elastic-agent-ssl-configuration.md
+++ b/reference/fleet/elastic-agent-ssl-configuration.md
@@ -17,37 +17,275 @@ There are a number of SSL configuration settings available depending on whether

 For more information about using certificates, refer to [Secure connections](/reference/fleet/secure.md).
::::

-
$$$common-ssl-options$$$

-| Setting | Description |
-| --- | --- |
-| $$$ssl.ca_sha256-common-setting$$$
`ssl.ca_sha256`
| (string) This configures a certificate pin that you can use to ensure that a specific certificate is part of the verified chain.

The pin is a base64 encoded string of the SHA-256 of the certificate.

::::{note}
This check is not a replacement for the normal SSL validation, but it adds additional validation. If this setting is used with `verification_mode` set to `none`, the check will always fail because it will not receive any verified chains.
::::
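
As an illustration, a pin is typically set alongside normal verification rather than instead of it; the pin value below is a placeholder:

```yaml
ssl.verification_mode: full
ssl.ca_sha256: "<base64-encoded SHA-256 of the CA certificate>"
```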

| -| $$$ssl.cipher_suites-common-setting$$$
`ssl.cipher_suites`
| (list) The list of cipher suites to use. The first entry has the highest priority. If this option is omitted, the Go crypto library’s [default suites](https://golang.org/pkg/crypto/tls/) are used (recommended). Note that TLS 1.3 cipher suites are not individually configurable in Go, so they are not included in this list.

The following cipher suites are available:

* ECDHE-ECDSA-AES-128-CBC-SHA
* ECDHE-ECDSA-AES-128-CBC-SHA256: TLS 1.2 only. Disabled by default.
* ECDHE-ECDSA-AES-128-GCM-SHA256: TLS 1.2 only.
* ECDHE-ECDSA-AES-256-CBC-SHA
* ECDHE-ECDSA-AES-256-GCM-SHA384: TLS 1.2 only.
* ECDHE-ECDSA-CHACHA20-POLY1305: TLS 1.2 only.
* ECDHE-ECDSA-RC4-128-SHA: Disabled by default. RC4 not recommended.
* ECDHE-RSA-3DES-CBC3-SHA
* ECDHE-RSA-AES-128-CBC-SHA
* ECDHE-RSA-AES-128-CBC-SHA256: TLS 1.2 only. Disabled by default.
* ECDHE-RSA-AES-128-GCM-SHA256: TLS 1.2 only.
* ECDHE-RSA-AES-256-CBC-SHA
* ECDHE-RSA-AES-256-GCM-SHA384: TLS 1.2 only.
* ECDHE-RSA-CHACHA20-POLY1205: TLS 1.2 only.
* ECDHE-RSA-RC4-128-SHA: Disabled by default. RC4 not recommended.
* RSA-3DES-CBC3-SHA
* RSA-AES-128-CBC-SHA
* RSA-AES-128-CBC-SHA256: TLS 1.2 only. Disabled by default.
* RSA-AES-128-GCM-SHA256: TLS 1.2 only.
* RSA-AES-256-CBC-SHA
* RSA-AES-256-GCM-SHA384: TLS 1.2 only.
* RSA-RC4-128-SHA: Disabled by default. RC4 not recommended.

Here is a list of acronyms used in defining the cipher suites:

* 3DES: Cipher suites using triple DES
* AES-128/256: Cipher suites using AES with 128/256-bit keys.
* CBC: Cipher using Cipher Block Chaining as block cipher mode.
* ECDHE: Cipher suites using Elliptic Curve Diffie-Hellman (DH) ephemeral key exchange.
* ECDSA: Cipher suites using Elliptic Curve Digital Signature Algorithm for authentication.
* GCM: Galois/Counter mode is used for symmetric key cryptography.
* RC4: Cipher suites using RC4.
* RSA: Cipher suites using RSA.
* SHA, SHA256, SHA384: Cipher suites using SHA-1, SHA-256 or SHA-384.
| -| $$$ssl.curve_types-common-setting$$$
`ssl.curve_types`
| (list) The list of curve types for ECDHE (Elliptic Curve Diffie-Hellman ephemeral key exchange).

The following elliptic curve types are available:

* P-256
* P-384
* P-521
* X25519
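
Taken together, a sketch that restricts protocols, cipher suites, and curve types might look like the following; the specific choices are illustrative, not a recommendation:

```yaml
ssl.supported_protocols: [TLSv1.2, TLSv1.3]
ssl.cipher_suites:
  - ECDHE-ECDSA-AES-256-GCM-SHA384
  - ECDHE-RSA-AES-256-GCM-SHA384
ssl.curve_types: [P-256, X25519]
```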
| -| $$$ssl.enabled-common-setting$$$
`ssl.enabled`
| (boolean) Enables or disables the SSL configuration.

**Default:** `true`

::::{note}
SSL settings are disabled if either `enabled` is set to `false` or the `ssl` section is missing.
::::

| -| $$$ssl.supported_protocols-common-setting$$$
`ssl.supported_protocols`
| (list) List of allowed SSL/TLS versions. If the SSL/TLS server supports none of the specified versions, the connection will be dropped during or after the handshake. The list of allowed protocol versions include: `TLSv1.1`, `TLSv1.2`, and `TLSv1.3`.

**Default:** `[TLSv1.2, TLSv1.3]`
|
+`ssl.ca_sha256` $$$ssl.ca_sha256-common-setting$$$
+: (string) This configures a certificate pin that you can use to ensure that a specific certificate is part of the verified chain.
+
+  The pin is a base64 encoded string of the SHA-256 of the certificate.
+
+  ::::{note}
+  This check is not a replacement for the normal SSL validation, but it adds additional validation. If this setting is used with `verification_mode` set to `none`, the check will always fail because it will not receive any verified chains.
+  ::::
+
+
+`ssl.cipher_suites` $$$ssl.cipher_suites-common-setting$$$
+: (list) The list of cipher suites to use. The first entry has the highest priority. If this option is omitted, the Go crypto library’s [default suites](https://golang.org/pkg/crypto/tls/) are used (recommended). Note that TLS 1.3 cipher suites are not individually configurable in Go, so they are not included in this list.
+
+  The following cipher suites are available:
+
+  * ECDHE-ECDSA-AES-128-CBC-SHA
+  * ECDHE-ECDSA-AES-128-CBC-SHA256: TLS 1.2 only. Disabled by default.
+  * ECDHE-ECDSA-AES-128-GCM-SHA256: TLS 1.2 only.
+  * ECDHE-ECDSA-AES-256-CBC-SHA
+  * ECDHE-ECDSA-AES-256-GCM-SHA384: TLS 1.2 only.
+  * ECDHE-ECDSA-CHACHA20-POLY1305: TLS 1.2 only.
+  * ECDHE-ECDSA-RC4-128-SHA: Disabled by default. RC4 not recommended.
+  * ECDHE-RSA-3DES-CBC3-SHA
+  * ECDHE-RSA-AES-128-CBC-SHA
+  * ECDHE-RSA-AES-128-CBC-SHA256: TLS 1.2 only. Disabled by default.
+  * ECDHE-RSA-AES-128-GCM-SHA256: TLS 1.2 only.
+  * ECDHE-RSA-AES-256-CBC-SHA
+  * ECDHE-RSA-AES-256-GCM-SHA384: TLS 1.2 only.
+  * ECDHE-RSA-CHACHA20-POLY1305: TLS 1.2 only.
+  * ECDHE-RSA-RC4-128-SHA: Disabled by default. RC4 not recommended.
+  * RSA-3DES-CBC3-SHA
+  * RSA-AES-128-CBC-SHA
+  * RSA-AES-128-CBC-SHA256: TLS 1.2 only. Disabled by default.
+  * RSA-AES-128-GCM-SHA256: TLS 1.2 only.
+  * RSA-AES-256-CBC-SHA
+  * RSA-AES-256-GCM-SHA384: TLS 1.2 only.
+  * RSA-RC4-128-SHA: Disabled by default. RC4 not recommended.
+
+  Here is a list of acronyms used in defining the cipher suites:
+
+  * 3DES: Cipher suites using triple DES
+  * AES-128/256: Cipher suites using AES with 128/256-bit keys.
+  * CBC: Cipher using Cipher Block Chaining as block cipher mode.
+  * ECDHE: Cipher suites using Elliptic Curve Diffie-Hellman (DH) ephemeral key exchange.
+  * ECDSA: Cipher suites using Elliptic Curve Digital Signature Algorithm for authentication.
+  * GCM: Galois/Counter mode is used for symmetric key cryptography.
+  * RC4: Cipher suites using RC4.
+  * RSA: Cipher suites using RSA.
+  * SHA, SHA256, SHA384: Cipher suites using SHA-1, SHA-256 or SHA-384.
+
+`ssl.curve_types` $$$ssl.curve_types-common-setting$$$
+: (list) The list of curve types for ECDHE (Elliptic Curve Diffie-Hellman ephemeral key exchange).
+
+  The following elliptic curve types are available:
+
+  * P-256
+  * P-384
+  * P-521
+  * X25519
+
+`ssl.enabled` $$$ssl.enabled-common-setting$$$
+: (boolean) Enables or disables the SSL configuration.
+
+  **Default:** `true`
+
+  ::::{note}
+  SSL settings are disabled if either `enabled` is set to `false` or the `ssl` section is missing.
+  ::::
+
+
+`ssl.supported_protocols` $$$ssl.supported_protocols-common-setting$$$
+: (list) List of allowed SSL/TLS versions. If the SSL/TLS server supports none of the specified versions, the connection will be dropped during or after the handshake. The allowed protocol versions are `TLSv1.1`, `TLSv1.2`, and `TLSv1.3`. 
+ + **Default:** `[TLSv1.2, TLSv1.3]` $$$client-ssl-options$$$ -| Setting | Description | -| --- | --- | -| $$$ssl.certificate-client-setting$$$
`ssl.certificate`
| (string) The path to the certificate for SSL client authentication. This setting is only required if `client_authentication` is specified. If `certificate` is not specified, client authentication is not available, and the connection might fail if the server requests client authentication. If the SSL server does not require client authentication, the certificate will be loaded, but not requested or used by the server.

Example:

```yaml
ssl.certificate: "/path/to/cert.pem"
```

When this setting is configured, the `ssl.key` setting is also required.

Specify a path, or embed a certificate directly in the `YAML` configuration:

```yaml
ssl.certificate: |
-----BEGIN CERTIFICATE-----
CERTIFICATE CONTENT APPEARS HERE
-----END CERTIFICATE-----
```
| -| $$$ssl.certificate_authorities-client-setting$$$
`ssl.certificate` `_authorities`
| (list) The list of root certificates for verifications (required). If `certificate_authorities` is empty or not set, the system keystore is used. If `certificate_authorities` is self-signed, the host system needs to trust that CA cert as well.

Example:

```yaml
ssl.certificate_authorities: ["/path/to/root/ca.pem"]
```

Specify a list of files that {{agent}} will read, or embed a certificate directly in the `YAML` configuration:

```yaml
ssl.certificate_authorities:
- |
-----BEGIN CERTIFICATE-----
CERTIFICATE CONTENT APPEARS HERE
-----END CERTIFICATE-----
```
| -| $$$ssl.key-client-setting$$$
`ssl.key`
| (string) The client certificate key used for client authentication. Only required if `client_authentication` is configured.

Example:

```yaml
ssl.key: "/path/to/cert.key"
```

Specify a path, or embed the private key directly in the `YAML` configuration:

```yaml
ssl.key: |
-----BEGIN PRIVATE KEY-----
KEY CONTENT APPEARS HERE
-----END PRIVATE KEY-----
```
| -| $$$ssl.key_passphrase-client-setting$$$
`ssl.key_passphrase`
| (string) The passphrase used to decrypt an encrypted key stored in the configured `key` file.
| -| $$$ssl.verification_mode-client-setting$$$
`ssl.verification` `_mode`
| (string) Controls the verification of server certificates. Valid values are:

`full`
: Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate.

`strict`
: Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate. If the Subject Alternative Name is empty, it returns an error.

`certificate`
: Verifies that the provided certificate is signed by a trusted authority (CA), but does not perform any hostname verification.

`none`
: Performs *no verification* of the server’s certificate. This mode disables many of the security benefits of SSL/TLS and should only be used after cautious consideration. It is primarily intended as a temporary diagnostic mechanism when attempting to resolve TLS errors; its use in production environments is strongly discouraged.

**Default:** `full`
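
For example, when connecting to hosts by IP address with certificates that carry no IP SANs, a sketch like this keeps chain validation while relaxing only the hostname check (path illustrative):

```yaml
ssl.certificate_authorities: ["/path/to/root/ca.pem"]
ssl.verification_mode: certificate
```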
| -| $$$ssl.ca_trusted_fingerprint$$$
`ssl.ca_trusted` `_fingerprint`
| (string) A HEX encoded SHA-256 of a CA certificate. If this certificate is present in the chain during the handshake, it will be added to the `certificate_authorities` list and the handshake will continue normally.

Example:

```yaml
ssl.ca_trusted_fingerprint: 3b24d33844d6553...826
```
| +`ssl.certificate` $$$ssl.certificate-client-setting$$$ +: (string) The path to the certificate for SSL client authentication. This setting is only required if `client_authentication` is specified. If `certificate` is not specified, client authentication is not available, and the connection might fail if the server requests client authentication. If the SSL server does not require client authentication, the certificate will be loaded, but not requested or used by the server. + + Example: + + ```yaml + ssl.certificate: "/path/to/cert.pem" + ``` + + When this setting is configured, the `ssl.key` setting is also required. + + Specify a path, or embed a certificate directly in the `YAML` configuration: + + ```yaml + ssl.certificate: | + -----BEGIN CERTIFICATE----- + CERTIFICATE CONTENT APPEARS HERE + -----END CERTIFICATE----- + ``` + +`ssl.certificate_authorities` $$$ssl.certificate_authorities-client-setting$$$ +: (list) The list of root certificates for verifications (required). If `certificate_authorities` is empty or not set, the system keystore is used. If `certificate_authorities` is self-signed, the host system needs to trust that CA cert as well. + + Example: + + ```yaml + ssl.certificate_authorities: ["/path/to/root/ca.pem"] + ``` + + Specify a list of files that {{agent}} will read, or embed a certificate directly in the `YAML` configuration: + + ```yaml + ssl.certificate_authorities: + - | + -----BEGIN CERTIFICATE----- + CERTIFICATE CONTENT APPEARS HERE + -----END CERTIFICATE----- + ``` + +`ssl.key` $$$ssl.key-client-setting$$$ +: (string) The client certificate key used for client authentication. Only required if `client_authentication` is configured. + + Example: + + ```yaml + ssl.key: "/path/to/cert.key" + ``` + + Specify a path, or embed the private key directly in the `YAML` configuration: + + ```yaml + ssl.key: | + -----BEGIN PRIVATE KEY----- + KEY CONTENT APPEARS HERE + -----END PRIVATE KEY----- + ``` + +`ssl.key_passphrase` $$$ssl.key_passphrase-client-setting$$$ +: (string) The passphrase used to decrypt an encrypted key stored in the configured `key` file. + +`ssl.verification_mode` $$$ssl.verification_mode-client-setting$$$ +: (string) Controls the verification of server certificates. Valid values are: + + `full` + : Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate. + + `strict` + : Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate. If the Subject Alternative Name is empty, it returns an error. + + `certificate` + : Verifies that the provided certificate is signed by a trusted authority (CA), but does not perform any hostname verification. + + `none` + : Performs *no verification* of the server’s certificate. This mode disables many of the security benefits of SSL/TLS and should only be used after cautious consideration. It is primarily intended as a temporary diagnostic mechanism when attempting to resolve TLS errors; its use in production environments is strongly discouraged. + + **Default:** `full` + +`ssl.ca_trusted_fingerprint` $$$ssl.ca_trusted_fingerprint$$$ +: (string) A HEX encoded SHA-256 of a CA certificate. If this certificate is present in the chain during the handshake, it will be added to the `certificate_authorities` list and the handshake will continue normally. 
+ + Example: + + ```yaml + ssl.ca_trusted_fingerprint: 3b24d33844d6553...826 + ``` $$$server-ssl-options$$$ -| Setting | Description | -| --- | --- | -| $$$ssl.certificate-server-setting$$$
`ssl.certificate`
| (string) The path to the certificate for SSL server authentication. If the certificate is not specified, startup will fail.

Example:

```yaml
ssl.certificate: "/path/to/server/cert.pem"
```

When this setting is configured, the `key` setting is also required.

Specify a path, or embed a certificate directly in the `YAML` configuration:

```yaml
ssl.certificate: |
-----BEGIN CERTIFICATE-----
CERTIFICATE CONTENT APPEARS HERE
-----END CERTIFICATE-----
```
| -| $$$ssl.certificate_authorities-server-setting$$$
`ssl.certificate` `_authorities`
| (list) The list of root certificates for client verifications is only required if `client_authentication` is configured. If `certificate_authorities` is empty or not set, and `client_authentication` is configured, the system keystore is used. If `certificate_authorities` is self-signed, the host system needs to trust that CA cert too.

Example:

```yaml
ssl.certificate_authorities: ["/path/to/root/ca.pem"]
```

Specify a list of files that {{agent}} will read, or embed a certificate directly in the `YAML` configuration:

```yaml
ssl.certificate_authorities:
- |
-----BEGIN CERTIFICATE-----
CERTIFICATE CONTENT APPEARS HERE
-----END CERTIFICATE-----
```
| -| $$$ssl.client_authentication-server-setting$$$
`ssl.client_` `authentication`
| (string) Configures client authentication. The valid options are:

`none`
: Disables client authentication.

`optional`
: When a client certificate is supplied, the server will verify it.

`required`
: Requires clients to provide a valid certificate.

**Default:** `required` (if `certificate_authorities` is set); otherwise, `none`
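
A mutual TLS server sketch that combines these settings might look like the following; all paths are illustrative:

```yaml
ssl.enabled: true
ssl.certificate: "/path/to/server/cert.pem"
ssl.key: "/path/to/server/cert.key"
ssl.certificate_authorities: ["/path/to/root/ca.pem"]
ssl.client_authentication: required
```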
| -| $$$ssl.key-server-setting$$$
`ssl.key`
| (string) The server certificate key used for authentication (required).

Example:

```yaml
ssl.key: "/path/to/server/cert.key"
```

Specify a path, or embed the private key directly in the `YAML` configuration:

```yaml
ssl.key: |
-----BEGIN PRIVATE KEY-----
KEY CONTENT APPEARS HERE
-----END PRIVATE KEY-----
```
| -| $$$ssl.key_passphrase-server-setting$$$
`ssl.key_passphrase`
| (string) The passphrase used to decrypt an encrypted key stored in the configured `key` file.
| -| $$$ssl.renegotiation-server-setting$$$
`ssl.renegotiation`
| (string) Configures the type of TLS renegotiation to support. The valid options are:

`never`
: Disables renegotiation.

`once`
: Allows a remote server to request renegotiation once per connection.

`freely`
: Allows a remote server to request renegotiation repeatedly.

**Default:** `never`
| -| $$$ssl.verification_mode-server-setting$$$
`ssl.verification` `_mode`
| (string) Controls the verification of client certificates. Valid values are:

`full`
: Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate.

`strict`
: Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate. If the Subject Alternative Name is empty, it returns an error.

`certificate`
: Verifies that the provided certificate is signed by a trusted authority (CA), but does not perform any hostname verification.

`none`
: Performs *no verification* of the server’s certificate. This mode disables many of the security benefits of SSL/TLS and should only be used after cautious consideration. It is primarily intended as a temporary diagnostic mechanism when attempting to resolve TLS errors; its use in production environments is strongly discouraged.

**Default:** `full`
| +`ssl.certificate` $$$ssl.certificate-server-setting$$$ +: (string) The path to the certificate for SSL server authentication. If the certificate is not specified, startup will fail. + + Example: + + ```yaml + ssl.certificate: "/path/to/server/cert.pem" + ``` + + When this setting is configured, the `key` setting is also required. + + Specify a path, or embed a certificate directly in the `YAML` configuration: + + ```yaml + ssl.certificate: | + -----BEGIN CERTIFICATE----- + CERTIFICATE CONTENT APPEARS HERE + -----END CERTIFICATE----- + ``` + +`ssl.certificate_authorities` $$$ssl.certificate_authorities-server-setting$$$ +: (list) The list of root certificates for client verifications is only required if `client_authentication` is configured. If `certificate_authorities` is empty or not set, and `client_authentication` is configured, the system keystore is used. If `certificate_authorities` is self-signed, the host system needs to trust that CA cert too. + + Example: + + ```yaml + ssl.certificate_authorities: ["/path/to/root/ca.pem"] + ``` + + Specify a list of files that {{agent}} will read, or embed a certificate directly in the `YAML` configuration: + + ```yaml + ssl.certificate_authorities: + - | + -----BEGIN CERTIFICATE----- + CERTIFICATE CONTENT APPEARS HERE + -----END CERTIFICATE----- + ``` + +`ssl.client_authentication` $$$ssl.client_authentication-server-setting$$$ +: (string) Configures client authentication. The valid options are: + + `none` + : Disables client authentication. + + `optional` + : When a client certificate is supplied, the server will verify it. + + `required` + : Requires clients to provide a valid certificate. + + **Default:** `required` (if `certificate_authorities` is set); otherwise, `none` + +`ssl.key` $$$ssl.key-server-setting$$$ +: (string) The server certificate key used for authentication (required). + + Example: + + ```yaml + ssl.key: "/path/to/server/cert.key" + ``` + + Specify a path, or embed the private key directly in the `YAML` configuration: + + ```yaml + ssl.key: | + -----BEGIN PRIVATE KEY----- + KEY CONTENT APPEARS HERE + -----END PRIVATE KEY----- + ``` + +`ssl.key_passphrase` $$$ssl.key_passphrase-server-setting$$$ +: (string) The passphrase used to decrypt an encrypted key stored in the configured `key` file. + +`ssl.renegotiation` $$$ssl.renegotiation-server-setting$$$ +: (string) Configures the type of TLS renegotiation to support. The valid options are: + + `never` + : Disables renegotiation. + + `once` + : Allows a remote server to request renegotiation once per connection. + + `freely` + : Allows a remote server to request renegotiation repeatedly. + + **Default:** `never` + +`ssl.verification_mode` $$$ssl.verification_mode-server-setting$$$ +: (string) Controls the verification of client certificates. Valid values are: + + `full` + : Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate. + + `strict` + : Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate. If the Subject Alternative Name is empty, it returns an error. + + `certificate` + : Verifies that the provided certificate is signed by a trusted authority (CA), but does not perform any hostname verification. + + `none` + : Performs *no verification* of the server’s certificate. 
This mode disables many of the security benefits of SSL/TLS and should only be used after cautious consideration. It is primarily intended as a temporary diagnostic mechanism when attempting to resolve TLS errors; its use in production environments is strongly discouraged. + + **Default:** `full` diff --git a/reference/fleet/elasticsearch-output.md b/reference/fleet/elasticsearch-output.md index 5250fa2533..6dea91773a 100644 --- a/reference/fleet/elasticsearch-output.md +++ b/reference/fleet/elasticsearch-output.md @@ -52,14 +52,46 @@ The `elasticsearch` output type supports the following settings, grouped by cate ## Commonly used settings [output-elasticsearch-commonly-used-settings] -| Setting | Description | -| --- | --- | -| $$$output-elasticsearch-enabled-setting$$$
`enabled`
| (boolean) Enables or disables the output. If set to `false`, the output is disabled.

**Default:** `true`
| -| $$$output-elasticsearch-hosts-setting$$$
`hosts`
| (list) The list of {{es}} nodes to connect to. The events are distributed to these nodes in round robin order. If one node becomes unreachable, the event is automatically sent to another node. Each {{es}} node can be defined as a `URL` or `IP:PORT`. For example: `http://192.15.3.2`, `https://es.found.io:9230` or `192.24.3.2:9300`. If no port is specified, `9200` is used.

::::{note}
When a node is defined as an `IP:PORT`, the *scheme* and *path* are taken from the `protocol` and `path` settings.
::::


```yaml
outputs:
default:
type: elasticsearch
hosts: ["10.45.3.2:9220", "10.45.3.1:9230"] <1>
protocol: https
path: /elasticsearch
```

1. In this example, the {{es}} nodes are available at `https://10.45.3.2:9220/elasticsearch` and `https://10.45.3.1:9230/elasticsearch`.


Note that Elasticsearch Nodes in the [{{serverless-full}}](/deploy-manage/deploy/elastic-cloud/serverless.md) environment are exposed on port 443.
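
A serverless-oriented sketch might therefore use a URL with an explicit `443` and token-based authentication; the endpoint and key are placeholders:

```yaml
outputs:
  default:
    type: elasticsearch
    hosts: ["https://my-project.es.example.cloud:443"]
    api_key: "<id>:<api-key>"
```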
| -| $$$output-elasticsearch-protocol-setting$$$
`protocol`
| (string) The name of the protocol {{es}} is reachable on. The options are: `http` or `https`. The default is `http`. However, if you specify a URL for `hosts`, the value of `protocol` is overridden by whatever scheme you specify in the URL.
| -| $$$output-elasticsearch-proxy_disable-setting$$$
`proxy_disable`
| (boolean) If set to `true`, all proxy settings, including `HTTP_PROXY` and `HTTPS_PROXY` variables, are ignored.

**Default:** `false`
| -| $$$output-elasticsearch-proxy_headers-setting$$$
`proxy_headers`
| (string) Additional headers to send to proxies during CONNECT requests.
| -| $$$output-elasticsearch-proxy_url-setting$$$
`proxy_url`
| (string) The URL of the proxy to use when connecting to the {{es}} servers. The value may be either a complete URL or a `host[:port]`, in which case the `http` scheme is assumed. If a value is not specified through the configuration file then proxy environment variables are used. See the [Go documentation](https://golang.org/pkg/net/http/#ProxyFromEnvironment) for more information about the environment variables.
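
A sketch of routing the output through an internal forward proxy (proxy address hypothetical):

```yaml
outputs:
  default:
    type: elasticsearch
    hosts: ["https://es.internal:9200"]
    proxy_url: "http://proxy.internal:3128"
```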
| +`enabled` $$$output-elasticsearch-enabled-setting$$$ +: (boolean) Enables or disables the output. If set to `false`, the output is disabled. + + **Default:** `true` + +`hosts` $$$output-elasticsearch-hosts-setting$$$ +: (list) The list of {{es}} nodes to connect to. The events are distributed to these nodes in round robin order. If one node becomes unreachable, the event is automatically sent to another node. Each {{es}} node can be defined as a `URL` or `IP:PORT`. For example: `http://192.15.3.2`, `https://es.found.io:9230` or `192.24.3.2:9300`. If no port is specified, `9200` is used. + + ::::{note} + When a node is defined as an `IP:PORT`, the *scheme* and *path* are taken from the `protocol` and `path` settings. + :::: + + + ```yaml + outputs: + default: + type: elasticsearch + hosts: ["10.45.3.2:9220", "10.45.3.1:9230"] <1> + protocol: https + path: /elasticsearch + ``` + + 1. In this example, the {{es}} nodes are available at `https://10.45.3.2:9220/elasticsearch` and `https://10.45.3.1:9230/elasticsearch`. + + + Note that Elasticsearch Nodes in the [{{serverless-full}}](/deploy-manage/deploy/elastic-cloud/serverless.md) environment are exposed on port 443. + +`protocol` $$$output-elasticsearch-protocol-setting$$$ +: (string) The name of the protocol {{es}} is reachable on. The options are: `http` or `https`. The default is `http`. However, if you specify a URL for `hosts`, the value of `protocol` is overridden by whatever scheme you specify in the URL. + +`proxy_disable` $$$output-elasticsearch-proxy_disable-setting$$$ +: (boolean) If set to `true`, all proxy settings, including `HTTP_PROXY` and `HTTPS_PROXY` variables, are ignored. + + **Default:** `false` + +`proxy_headers` $$$output-elasticsearch-proxy_headers-setting$$$ +: (string) Additional headers to send to proxies during CONNECT requests. + +`proxy_url` $$$output-elasticsearch-proxy_url-setting$$$ +: (string) The URL of the proxy to use when connecting to the {{es}} servers. The value may be either a complete URL or a `host[:port]`, in which case the `http` scheme is assumed. If a value is not specified through the configuration file then proxy environment variables are used. See the [Go documentation](https://golang.org/pkg/net/http/#ProxyFromEnvironment) for more information about the environment variables. ## Authentication settings [output-elasticsearch-authentication-settings] @@ -82,10 +114,15 @@ outputs: password: "your-password" ``` -| Setting | Description | -| --- | --- | -| $$$output-elasticsearch-password-setting$$$
`password`
| (string) The basic authentication password for connecting to {{es}}.
| -| $$$output-elasticsearch-username-setting$$$
`username`
| (string) The basic authentication username for connecting to {{es}}.

This user needs the privileges required to publish events to {{es}}.

Note that in an [{{serverless-full}}](/deploy-manage/deploy/elastic-cloud/serverless.md) environment you need to use [token-based (API key) authentication](#output-elasticsearch-apikey-authentication-settings).
| +`password` $$$output-elasticsearch-password-setting$$$ +: (string) The basic authentication password for connecting to {{es}}. + +`username` $$$output-elasticsearch-username-setting$$$ +: (string) The basic authentication username for connecting to {{es}}. + + This user needs the privileges required to publish events to {{es}}. + + Note that in an [{{serverless-full}}](/deploy-manage/deploy/elastic-cloud/serverless.md) environment you need to use [token-based (API key) authentication](#output-elasticsearch-apikey-authentication-settings). ### Token-based (API key) authentication [output-elasticsearch-apikey-authentication-settings] @@ -98,9 +135,8 @@ outputs: api_key: "KnR6yE41RrSowb0kQ0HWoA" ``` -| Setting | Description | -| --- | --- | -| $$$output-elasticsearch-api_key-setting$$$
`api_key`
| (string) Instead of using a username and password, you can use [API keys](/deploy-manage/api-keys/elasticsearch-api-keys.md) to secure communication with {{es}}. The value must be the ID of the API key and the API key joined by a colon: `id:api_key`. Token-based authentication is required in an [{{serverless-full}}](/deploy-manage/deploy/elastic-cloud/serverless.md) environment.
| +`api_key` $$$output-elasticsearch-api_key-setting$$$ +: (string) Instead of using a username and password, you can use [API keys](/deploy-manage/api-keys/elasticsearch-api-keys.md) to secure communication with {{es}}. The value must be the ID of the API key and the API key joined by a colon: `id:api_key`. Token-based authentication is required in an [{{serverless-full}}](/deploy-manage/deploy/elastic-cloud/serverless.md) environment. ### Public Key Infrastructure (PKI) certificates [output-elasticsearch-pki-certs-authentication-settings] @@ -146,34 +182,142 @@ The service principal name for the {{es}} instance is constructed from these opt `HTTP/my-elasticsearch.elastic.co@ELASTIC.CO` -| Setting | Description | -| --- | --- | -| $$$output-elasticsearch-kerberos.auth_type-setting$$$
`kerberos.auth_type`
| (string) The type of authentication to use with Kerberos KDC:

`password`
: When specified, also set `kerberos.username` and `kerberos.password`.

`keytab`
: When specified, also set `kerberos.username` and `kerberos.keytab`. The keytab must contain the keys of the selected principal, or authentication fails.

**Default:** `password`
| -| $$$output-elasticsearch-kerberos.config_path$$$
`kerberos.config_path`
| (string) Path to the `krb5.conf`. {{agent}} uses this setting to find the Kerberos KDC to retrieve a ticket.
| -| $$$output-elasticsearch-kerberos.enabled-setting$$$
`kerberos.enabled`
| (boolean) Enables or disables the Kerberos configuration.

::::{note}
Kerberos settings are disabled if either `enabled` is set to `false` or the `kerberos` section is missing.
::::

| -| $$$output-elasticsearch-kerberos.enable_krb5_fast$$$
`kerberos.enable_krb5_fast`
| (boolean) If `true`, enables Kerberos FAST authentication. This may conflict with some Active Directory installations.

**Default:** `false`
| -| $$$output-elasticsearch-kerberos.keytab$$$
`kerberos.keytab`
| (string) If `kerberos.auth_type` is `keytab`, provide the path to the keytab of the selected principal.
| -| $$$output-elasticsearch-kerberos.password$$$
`kerberos.password`
| (string) If `kerberos.auth_type` is `password`, provide a password for the selected principal.
| -| $$$output-elasticsearch-kerberos.realm$$$
`kerberos.realm`
| (string) Name of the realm where the output resides.
| -| $$$output-elasticsearch-kerberos.username$$$
`kerberos.username`
| (string) Name of the principal used to connect to the output.
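
Assembling these options, a password-based Kerberos sketch might look like the following; the principal, realm, and paths are illustrative:

```yaml
outputs:
  default:
    type: elasticsearch
    hosts: ["https://my-elasticsearch.elastic.co:9200"]
    kerberos.enabled: true
    kerberos.auth_type: password
    kerberos.username: "agent@ELASTIC.CO"
    kerberos.password: "<password>"
    kerberos.config_path: "/etc/krb5.conf"
    kerberos.realm: "ELASTIC.CO"
```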
| +`kerberos.auth_type` $$$output-elasticsearch-kerberos.auth_type-setting$$$ +: (string) The type of authentication to use with Kerberos KDC: + + `password` + : When specified, also set `kerberos.username` and `kerberos.password`. + + `keytab` + : When specified, also set `kerberos.username` and `kerberos.keytab`. The keytab must contain the keys of the selected principal, or authentication fails. + + **Default:** `password` + +`kerberos.config_path` $$$output-elasticsearch-kerberos.config_path$$$ +: (string) Path to the `krb5.conf`. {{agent}} uses this setting to find the Kerberos KDC to retrieve a ticket. + +`kerberos.enabled` $$$output-elasticsearch-kerberos.enabled-setting$$$ +: (boolean) Enables or disables the Kerberos configuration. + + ::::{note} + Kerberos settings are disabled if either `enabled` is set to `false` or the `kerberos` section is missing. + :::: + +`kerberos.enable_krb5_fast` $$$output-elasticsearch-kerberos.enable_krb5_fast$$$ +: (boolean) If `true`, enables Kerberos FAST authentication. This may conflict with some Active Directory installations. + + **Default:** `false` + +`kerberos.keytab` $$$output-elasticsearch-kerberos.keytab$$$ +: (string) If `kerberos.auth_type` is `keytab`, provide the path to the keytab of the selected principal. + +`kerberos.password` $$$output-elasticsearch-kerberos.password$$$ +: (string) If `kerberos.auth_type` is `password`, provide a password for the selected principal. + +`kerberos.realm` $$$output-elasticsearch-kerberos.realm$$$ +: (string) Name of the realm where the output resides. + +`kerberos.username` $$$output-elasticsearch-kerberos.username$$$ +: (string) Name of the principal used to connect to the output. ### Compatibility setting [output-elasticsearch-compatibility-setting] -| Setting | Description | -| --- | --- | -| $$$output-elasticsearch-allow_older_versions-setting$$$
`allow_older_versions`
| Allow {{agent}} to connect and send output to an {{es}} instance that is running an earlier version than the agent version.

Note that this setting does not affect {{agent}}'s ability to connect to {{fleet-server}}. {{fleet-server}} will not accept a connection from an agent at a later major or minor version. It will accept a connection from an agent at a later patch version. For example, an {{agent}} at version 8.14.3 can connect to a {{fleet-server}} on version 8.14.0, but an agent at version 8.15.0 or later is not able to connect.

**Default:** `true`
| +`allow_older_versions` $$$output-elasticsearch-allow_older_versions-setting$$$ +: Allow {{agent}} to connect and send output to an {{es}} instance that is running an earlier version than the agent version. + + Note that this setting does not affect {{agent}}'s ability to connect to {{fleet-server}}. {{fleet-server}} will not accept a connection from an agent at a later major or minor version. It will accept a connection from an agent at a later patch version. For example, an {{agent}} at version 8.14.3 can connect to a {{fleet-server}} on version 8.14.0, but an agent at version 8.15.0 or later is not able to connect. + + **Default:** `true` ### Data parsing, filtering, and manipulation settings [output-elasticsearch-data-parsing-settings] Settings used to parse, filter, and transform data. -| Setting | Description | -| --- | --- | -| $$$output-elasticsearch-escape_html-setting$$$
`escape_html`
| (boolean) Configures escaping of HTML in strings. Set to `true` to enable escaping.

**Default:** `false`
| -| $$$output-elasticsearch-pipeline-setting$$$
`pipeline`
| (string) A format string value that specifies the [ingest pipeline](/manage-data/ingest/transform-enrich/ingest-pipelines.md) to write events to.

```yaml
outputs:
default:
type: elasticsearchoutput.elasticsearch:
hosts: ["http://localhost:9200"]
pipeline: my_pipeline_id
```

You can set the ingest pipeline dynamically by using a format string to access any event field. For example, this configuration uses a custom field, `fields.log_type`, to set the pipeline for each event:

```yaml
outputs:
default:
type: elasticsearch hosts: ["http://localhost:9200"]
pipeline: "%{[fields.log_type]}_pipeline"
```

With this configuration, all events with `log_type: normal` are sent to a pipeline named `normal_pipeline`, and all events with `log_type: critical` are sent to a pipeline named `critical_pipeline`.

::::{tip}
To learn how to add custom fields to events, see the `fields` option.
::::


See the `pipelines` setting for other ways to set the ingest pipeline dynamically.
| -| $$$output-elasticsearch-pipelines-setting$$$
`pipelines`
| An array of pipeline selector rules. Each rule specifies the [ingest pipeline](/manage-data/ingest/transform-enrich/ingest-pipelines.md) to use for events that match the rule. During publishing, {{agent}} uses the first matching rule in the array. Rules can contain conditionals, format string-based fields, and name mappings. If the `pipelines` setting is missing or no rule matches, the `pipeline` setting is used.

Rule settings:

**`pipeline`**
: The pipeline format string to use. If this string contains field references, such as `%{[fields.name]}`, the fields must exist, or the rule fails.

**`mappings`**
: A dictionary that takes the value returned by `pipeline` and maps it to a new name.

**`default`**
: The default string value to use if `mappings` does not find a match.

**`when`**
: A condition that must succeed in order to execute the current rule.

All the conditions supported by processors are also supported here.

The following example sends events to a specific pipeline based on whether the `message` field contains the specified string:

```yaml
outputs:
default:
type: elasticsearch hosts: ["http://localhost:9200"]
pipelines:
- pipeline: "warning_pipeline"
when.contains:
message: "WARN"
- pipeline: "error_pipeline"
when.contains:
message: "ERR"
```

The following example sets the pipeline by taking the name returned by the `pipeline` format string and mapping it to a new name that’s used for the pipeline:

```yaml
outputs:
default:
type: elasticsearch
hosts: ["http://localhost:9200"]
pipelines:
- pipeline: "%{[fields.log_type]}"
mappings:
critical: "sev1_pipeline"
normal: "sev2_pipeline"
default: "sev3_pipeline"
```

With this configuration, all events with `log_type: critical` are sent to `sev1_pipeline`, all events with `log_type: normal` are sent to a `sev2_pipeline`, and all other events are sent to `sev3_pipeline`.
|
+`escape_html` $$$output-elasticsearch-escape_html-setting$$$
+: (boolean) Configures escaping of HTML in strings. Set to `true` to enable escaping.
+
+  **Default:** `false`
+
+`pipeline` $$$output-elasticsearch-pipeline-setting$$$
+: (string) A format string value that specifies the [ingest pipeline](/manage-data/ingest/transform-enrich/ingest-pipelines.md) to write events to.
+
+    ```yaml
+    outputs:
+      default:
+        type: elasticsearch
+        hosts: ["http://localhost:9200"]
+        pipeline: my_pipeline_id
+    ```
+
+    You can set the ingest pipeline dynamically by using a format string to access any event field. For example, this configuration uses a custom field, `fields.log_type`, to set the pipeline for each event:
+
+    ```yaml
+    outputs:
+      default:
+        type: elasticsearch
+        hosts: ["http://localhost:9200"]
+        pipeline: "%{[fields.log_type]}_pipeline"
+    ```
+
+    With this configuration, all events with `log_type: normal` are sent to a pipeline named `normal_pipeline`, and all events with `log_type: critical` are sent to a pipeline named `critical_pipeline`.
+
+    ::::{tip}
+    To learn how to add custom fields to events, see the `fields` option.
+    ::::
+
+    See the `pipelines` setting for other ways to set the ingest pipeline dynamically.
+
+`pipelines` $$$output-elasticsearch-pipelines-setting$$$
+: An array of pipeline selector rules. Each rule specifies the [ingest pipeline](/manage-data/ingest/transform-enrich/ingest-pipelines.md) to use for events that match the rule. During publishing, {{agent}} uses the first matching rule in the array. Rules can contain conditionals, format string-based fields, and name mappings. If the `pipelines` setting is missing or no rule matches, the `pipeline` setting is used.
+
+    Rule settings:
+
+    **`pipeline`**
+    : The pipeline format string to use. If this string contains field references, such as `%{[fields.name]}`, the fields must exist, or the rule fails.
+
+    **`mappings`**
+    : A dictionary that takes the value returned by `pipeline` and maps it to a new name.
+
+    **`default`**
+    : The default string value to use if `mappings` does not find a match.
+
+    **`when`**
+    : A condition that must succeed in order to execute the current rule.
+
+    All the conditions supported by processors are also supported here.
+
+    The following example sends events to a specific pipeline based on whether the `message` field contains the specified string:
+
+    ```yaml
+    outputs:
+      default:
+        type: elasticsearch
+        hosts: ["http://localhost:9200"]
+        pipelines:
+          - pipeline: "warning_pipeline"
+            when.contains:
+              message: "WARN"
+          - pipeline: "error_pipeline"
+            when.contains:
+              message: "ERR"
+    ```
+
+    The following example sets the pipeline by taking the name returned by the `pipeline` format string and mapping it to a new name that’s used for the pipeline:
+
+    ```yaml
+    outputs:
+      default:
+        type: elasticsearch
+        hosts: ["http://localhost:9200"]
+        pipelines:
+          - pipeline: "%{[fields.log_type]}"
+            mappings:
+              critical: "sev1_pipeline"
+              normal: "sev2_pipeline"
+            default: "sev3_pipeline"
+    ```
+
+    With this configuration, all events with `log_type: critical` are sent to `sev1_pipeline`, all events with `log_type: normal` are sent to `sev2_pipeline`, and all other events are sent to `sev3_pipeline`.

@@ -181,11 +325,26 @@ Settings used to parse, filter, and transform data.

 Settings that modify the HTTP requests sent to {{es}}.

-| Setting | Description |
-| --- | --- |
-| $$$output-elasticsearch-headers-setting$$$
`headers`
| Custom HTTP headers to add to each request created by the {{es}} output.

Example:

```yaml
outputs:
default:
type: elasticsearch
headers:
X-My-Header: Header contents
```

Specify multiple header values for the same header name by separating them with a comma.
| -| $$$output-elasticsearch-parameters-setting$$$
`parameters`
| Dictionary of HTTP parameters to pass within the URL with index operations.
| -| $$$output-elasticsearch-path-setting$$$
`path`
| (string) An HTTP path prefix that is prepended to the HTTP API calls. This is useful for the cases where {{es}} listens behind an HTTP reverse proxy that exports the API under a custom prefix.
|
+`headers` $$$output-elasticsearch-headers-setting$$$
+: Custom HTTP headers to add to each request created by the {{es}} output.
+
+  Example:
+
+  ```yaml
+  outputs:
+    default:
+      type: elasticsearch
+      headers:
+        X-My-Header: Header contents
+  ```
+
+  Specify multiple header values for the same header name by separating them with a comma.
+
+`parameters` $$$output-elasticsearch-parameters-setting$$$
+: A dictionary of HTTP parameters to pass within the URL with index operations.
+
+`path` $$$output-elasticsearch-path-setting$$$
+: (string) An HTTP path prefix that is prepended to the HTTP API calls. This is useful when {{es}} listens behind an HTTP reverse proxy that exposes the API under a custom prefix.
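+
+  For example, a minimal sketch that combines `path` and `parameters` (the `/es-proxy` prefix and the `refresh` parameter here are illustrative assumptions, not required values):
+
+  ```yaml
+  outputs:
+    default:
+      type: elasticsearch
+      hosts: ["http://localhost:9200"]
+      # All API calls are sent under the reverse proxy prefix
+      path: /es-proxy
+      # Extra query parameters appended to index operations
+      parameters:
+        refresh: "true"
+  ```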


## Memory queue settings [output-elasticsearch-memory-queue-settings]
@@ -212,11 +371,20 @@ This sample configuration forwards events to the output when there are enough ev
   queue.mem.flush.timeout: 5s
 ```
 
-| Setting | Description |
-| --- | --- |
-| $$$output-elasticsearch-queue.mem.events-setting$$$
`queue.mem.events`
| The number of events the queue can store. This value should be evenly divisible by the smaller of `queue.mem.flush.min_events` or `bulk_max_size` to avoid sending partial batches to the output.

**Default:** `3200 events`
| -| $$$output-elasticsearch-queue.mem.flush.min_events-setting$$$
`queue.mem.flush.min_events`
| `flush.min_events` is a legacy parameter, and new configurations should prefer to control batch size with `bulk_max_size`. As of 8.13, there is never a performance advantage to limiting batch size with `flush.min_events` instead of `bulk_max_size`

**Default:** `1600 events`
| -| $$$output-elasticsearch-queue.mem.flush.timeout-setting$$$
`queue.mem.flush.timeout`
| (int) The maximum wait time for `queue.mem.flush.min_events` to be fulfilled. If set to 0s, events are available to the output immediately.

**Default:** `10s`
|
+`queue.mem.events` $$$output-elasticsearch-queue.mem.events-setting$$$
+: The number of events the queue can store. This value should be evenly divisible by the smaller of `queue.mem.flush.min_events` or `bulk_max_size` to avoid sending partial batches to the output.
+
+  **Default:** `3200 events`
+
+`queue.mem.flush.min_events` $$$output-elasticsearch-queue.mem.flush.min_events-setting$$$
+: `flush.min_events` is a legacy parameter, and new configurations should prefer to control batch size with `bulk_max_size`. As of 8.13, there is never a performance advantage to limiting batch size with `flush.min_events` instead of `bulk_max_size`.
+
+  **Default:** `1600 events`
+
+`queue.mem.flush.timeout` $$$output-elasticsearch-queue.mem.flush.timeout-setting$$$
+: (int) The maximum wait time for `queue.mem.flush.min_events` to be fulfilled. If set to 0s, events are available to the output immediately.
+
+  **Default:** `10s`


## Performance tuning settings [output-elasticsearch-performance-tuning-settings]

Use the `preset` option to automatically configure the group of performance tuning settings to optimize your {{agent}} performance when sending data to an {{es}} output.

The performance tuning `preset` values take precedence over any settings that may be defined separately. If you want to change any setting, set `preset` to `custom` and specify the performance tuning settings individually.

-| Setting | Description |
-| --- | --- |
-| $$$output-elasticsearch-backoff.init-setting$$$
`backoff.init`
| (string) The number of seconds to wait before trying to reconnect to {{es}} after a network error. After waiting `backoff.init` seconds, {{agent}} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset.

**Default:** `1s`
| -| $$$output-elasticsearch-backoff.max-setting$$$
`backoff.max`
| (string) The maximum number of seconds to wait before attempting to connect to {{es}} after a network error.

**Default:** `60s`
| -| $$$output-elasticsearch-bulk_max_size-setting$$$
`bulk_max_size`
| (int) The maximum number of events to bulk in a single {{es}} bulk API index request.

Events can be collected into batches. {{agent}} will split batches larger than `bulk_max_size` into multiple batches.

Specifying a larger batch size can improve performance by lowering the overhead of sending events. However big batch sizes can also increase processing times, which might result in API errors, killed connections, timed-out publishing requests, and, ultimately, lower throughput.

Setting `bulk_max_size` to values less than or equal to 0 turns off the splitting of batches. When splitting is disabled, the queue decides on the number of events to be contained in a batch.

**Default:** `1600`
| -| $$$output-elasticsearch-compression_level-setting$$$
`compression_level`
| (int) The gzip compression level. Set this value to `0` to disable compression. The compression level must be in the range of `1` (best speed) to `9` (best compression).

Increasing the compression level reduces network usage but increases CPU usage.

**Default:** `1`
| -| $$$output-elasticsearch-max_retries-setting$$$
`max_retries`
| (int) The number of times to retry publishing an event after a publishing failure. After the specified number of retries, the events are typically dropped.

Set `max_retries` to a value less than 0 to retry until all events are published.

**Default:** `3`
| -| $$$output-elasticsearch-preset-setting$$$
`preset`
| Configures the full group of [performance tuning settings](#output-elasticsearch-performance-tuning-settings) to optimize your {{agent}} performance when sending data to an {{es}} output.

Refer to [Performance tuning settings](/reference/fleet/es-output-settings.md#es-output-settings-performance-tuning-settings) for a table showing the group of values associated with any preset, and another table showing EPS (events per second) results from testing the different preset options.

Performance tuning preset settings:

**`balanced`**
: Configure the default tuning setting values for "out-of-the-box" performance.

**`throughput`**
: Optimize the {{es}} output for throughput.

**`scale`**
: Optimize the {{es}} output for scale.

**`latency`**
: Optimize the {{es}} output to reduce latence.

**`custom`**
: Use the `custom` option to fine-tune the performance tuning settings individually.

**Default:** `balanced`
| -| $$$output-elasticsearch-timeout-setting$$$
`timeout`
| (string) The HTTP request timeout in seconds for the {{es}} request.

**Default:** `90s`
| -| $$$output-elasticsearch-worker-setting$$$
`worker`
| (int) The number of workers per configured host publishing events. Example: If you have two hosts and three workers, in total six workers are started (three for each host).

**Default:** `1`
|
+`backoff.init` $$$output-elasticsearch-backoff.init-setting$$$
+: (string) The number of seconds to wait before trying to reconnect to {{es}} after a network error. After waiting `backoff.init` seconds, {{agent}} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset.
+
+  **Default:** `1s`
+
+`backoff.max` $$$output-elasticsearch-backoff.max-setting$$$
+: (string) The maximum number of seconds to wait before attempting to connect to {{es}} after a network error.
+
+  **Default:** `60s`
+
+`bulk_max_size` $$$output-elasticsearch-bulk_max_size-setting$$$
+: (int) The maximum number of events to bulk in a single {{es}} bulk API index request.
+
+  Events can be collected into batches. {{agent}} will split batches larger than `bulk_max_size` into multiple batches.
+
+  Specifying a larger batch size can improve performance by lowering the overhead of sending events. However, big batch sizes can also increase processing times, which might result in API errors, killed connections, timed-out publishing requests, and, ultimately, lower throughput.
+
+  Setting `bulk_max_size` to values less than or equal to 0 turns off the splitting of batches. When splitting is disabled, the queue decides on the number of events to be contained in a batch.
+
+  **Default:** `1600`
+
+`compression_level` $$$output-elasticsearch-compression_level-setting$$$
+: (int) The gzip compression level. Set this value to `0` to disable compression. The compression level must be in the range of `1` (best speed) to `9` (best compression).
+
+  Increasing the compression level reduces network usage but increases CPU usage.
+
+  **Default:** `1`
+
+`max_retries` $$$output-elasticsearch-max_retries-setting$$$
+: (int) The number of times to retry publishing an event after a publishing failure. After the specified number of retries, the events are typically dropped.
+
+  Set `max_retries` to a value less than 0 to retry until all events are published.
+
+  **Default:** `3`
+
+`preset` $$$output-elasticsearch-preset-setting$$$
+: Configures the full group of [performance tuning settings](#output-elasticsearch-performance-tuning-settings) to optimize your {{agent}} performance when sending data to an {{es}} output.
+
+  Refer to [Performance tuning settings](/reference/fleet/es-output-settings.md#es-output-settings-performance-tuning-settings) for a table showing the group of values associated with any preset, and another table showing EPS (events per second) results from testing the different preset options.
+
+  Performance tuning preset settings:
+
+  **`balanced`**
+  : Configure the default tuning setting values for "out-of-the-box" performance.
+
+  **`throughput`**
+  : Optimize the {{es}} output for throughput.
+
+  **`scale`**
+  : Optimize the {{es}} output for scale.
+
+  **`latency`**
+  : Optimize the {{es}} output to reduce latency.
+
+  **`custom`**
+  : Use the `custom` option to fine-tune the performance tuning settings individually.
+
+  **Default:** `balanced`
+
+`timeout` $$$output-elasticsearch-timeout-setting$$$
+: (string) The HTTP request timeout in seconds for the {{es}} request.
+
+  **Default:** `90s`
+
+`worker` $$$output-elasticsearch-worker-setting$$$
+: (int) The number of workers per configured host publishing events. Example: If you have two hosts and three workers, in total six workers are started (three for each host).
+
+  **Default:** `1`
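+
+For example, a sketch of a `custom` configuration in the agent policy that raises the compression level and adds a worker (the values are illustrative, not tuned recommendations):
+
+```yaml
+outputs:
+  default:
+    type: elasticsearch
+    hosts: ["http://localhost:9200"]
+    # Opt out of the presets so individual settings apply
+    preset: custom
+    worker: 2
+    compression_level: 5
+```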
diff --git a/reference/fleet/es-output-settings.md b/reference/fleet/es-output-settings.md
index 346d721b09..2add2ed796 100644
--- a/reference/fleet/es-output-settings.md
+++ b/reference/fleet/es-output-settings.md
@@ -9,29 +9,125 @@ Specify these settings to send data over a secure connection to {{es}}. In the {
 
-| | |
-| --- | --- |
-| $$$es-output-hosts-setting$$$
**Hosts**
| The {{es}} URLs where {{agent}}s will send data. By default, {{es}} is exposed on the following ports:

`9200`
: Default {{es}} port for self-managed clusters

`443`
: Default {{es}} port for {{ecloud}}

**Examples:**

* `https://192.0.2.0:9200`
* `https://1d7a52f5eb344de18ea04411fe09e564.fleet.eu-west-1.aws.qa.cld.elstc.co:443`
* `https://[2001:db8::1]:9200`

Refer to the [{{fleet-server}}](/reference/fleet/fleet-server.md) documentation for default ports and other configuration details.
| -| $$$es-trusted-fingerprint-yaml-setting$$$
**{{es}} CA trusted fingerprint**
| HEX encoded SHA-256 of a CA certificate. If this certificate is present in the chain during the handshake, it will be added to the `certificate_authorities` list and the handshake will continue normally. To learn more about trusted fingerprints, refer to the [{{es}} security documentation](/deploy-manage/deploy/self-managed/installing-elasticsearch.md).
| -| $$$es-agent-proxy-output$$$
**Proxy**
| Select a proxy URL for {{agent}} to connect to {{es}}. To learn about proxy configuration, refer to [Using a proxy server with {{agent}} and {{fleet}}](/reference/fleet/fleet-agent-proxy-support.md).
| -| $$$es-output-advanced-yaml-setting$$$
**Advanced YAML configuration**
| YAML settings that will be added to the {{es}} output section of each policy that uses this output. Make sure you specify valid YAML. The UI does not currently provide validation.

See [Advanced YAML configuration](#es-output-settings-yaml-config) for descriptions of the available settings.
| -| $$$es-agent-integrations-output$$$
**Make this output the default for agent integrations**
| When this setting is on, {{agent}}s use this output to send data if no other output is set in the [agent policy](/reference/fleet/agent-policy.md).
| -| $$$es-agent-monitoring-output$$$
**Make this output the default for agent monitoring**
| When this setting is on, {{agent}}s use this output to send [agent monitoring data](/reference/fleet/monitor-elastic-agent.md) if no other output is set in the [agent policy](/reference/fleet/agent-policy.md).
| -| $$$es-agent-performance-tuning$$$
**Performance tuning**
| Choose one of the menu options to tune your {{agent}} performance when sending data to an {{es}} output. You can optimize for throughput, scale, latency, or you can choose a balanced (the default) set of performance specifications. Refer to [Performance tuning settings](#es-output-settings-performance-tuning-settings) for details about the setting values and their potential impact on performance.

You can also use the [Advanced YAML configuration](#es-output-settings-yaml-config) field to set custom values. Note that if you adjust any of the performance settings described in the following **Advanced YAML configuration*** section, the ***Performance tuning*** option automatically changes to `Custom` and cannot be changed.

Performance tuning preset values take precedence over any settings that may be defined separately. If you want to change any setting, you need to use the `Custom` ***Performance tuning*** option and specify the settings in the ***Advanced YAML configuration*** field.

For example, if you would like to use the balanced preset values except that you prefer a higher compression level, you can do so as follows:

1. In {{fleet}}, open the ***Settings*** tab.
2. In the ***Outputs*** section, select ***Add output*** to create a new output, or select the edit icon to edit an existing output.
3. In the ***Add new output*** or the ***Edit output*** flyout, set ***Performance tuning** to `Custom`.
4. Refer to the list of [performance tuning preset values](#es-output-settings-performance-tuning-settings), and add the settings you prefer into the **Advanced YAML configuration** field. For the `balanced` presets, the yaml configuration would be as shown:

```yaml
bulk_max_size: 1600
worker: 1
queue.mem.events: 3200
queue.mem.flush.min_events: 1600
queue.mem.flush.timeout: 10s
compression_level: 1
idle_connection_timeout: 3s
```

5. Adjust any settings as preferred. For example, you can update the `compression_level` setting to `4`.

When you create an {{agent}} policy using this output, the output will use the balanced preset options except with the higher compression level, as specified.
|
+
+**Hosts** $$$es-output-hosts-setting$$$
+: The {{es}} URLs where {{agent}}s will send data. By default, {{es}} is exposed on the following ports:
+
+  `9200`
+  : Default {{es}} port for self-managed clusters
+
+  `443`
+  : Default {{es}} port for {{ecloud}}
+
+  **Examples:**
+  * `https://192.0.2.0:9200`
+  * `https://1d7a52f5eb344de18ea04411fe09e564.fleet.eu-west-1.aws.qa.cld.elstc.co:443`
+  * `https://[2001:db8::1]:9200`
+
+  Refer to the [{{fleet-server}}](/reference/fleet/fleet-server.md) documentation for default ports and other configuration details.
+
+**{{es}} CA trusted fingerprint** $$$es-trusted-fingerprint-yaml-setting$$$
+: HEX encoded SHA-256 of a CA certificate. If this certificate is present in the chain during the handshake, it will be added to the `certificate_authorities` list and the handshake will continue normally. To learn more about trusted fingerprints, refer to the [{{es}} security documentation](/deploy-manage/deploy/self-managed/installing-elasticsearch.md).
+
+**Proxy** $$$es-agent-proxy-output$$$
+: Select a proxy URL for {{agent}} to connect to {{es}}. To learn about proxy configuration, refer to [Using a proxy server with {{agent}} and {{fleet}}](/reference/fleet/fleet-agent-proxy-support.md).
+
+**Advanced YAML configuration** $$$es-output-advanced-yaml-setting$$$
+: YAML settings that will be added to the {{es}} output section of each policy that uses this output. Make sure you specify valid YAML. The UI does not currently provide validation.
+
+  See [Advanced YAML configuration](#es-output-settings-yaml-config) for descriptions of the available settings.
+
+**Make this output the default for agent integrations** $$$es-agent-integrations-output$$$
+: When this setting is on, {{agent}}s use this output to send data if no other output is set in the [agent policy](/reference/fleet/agent-policy.md).
+
+**Make this output the default for agent monitoring** $$$es-agent-monitoring-output$$$
+: When this setting is on, {{agent}}s use this output to send [agent monitoring data](/reference/fleet/monitor-elastic-agent.md) if no other output is set in the [agent policy](/reference/fleet/agent-policy.md).
+
+**Performance tuning** $$$es-agent-performance-tuning$$$
+: Choose one of the menu options to tune your {{agent}} performance when sending data to an {{es}} output. You can optimize for throughput, scale, or latency, or you can choose a balanced (the default) set of performance specifications. Refer to [Performance tuning settings](#es-output-settings-performance-tuning-settings) for details about the setting values and their potential impact on performance.
+
+  You can also use the [Advanced YAML configuration](#es-output-settings-yaml-config) field to set custom values. Note that if you adjust any of the performance settings described in the following **Advanced YAML configuration** section, the **Performance tuning** option automatically changes to `Custom` and cannot be changed.
+
+  Performance tuning preset values take precedence over any settings that may be defined separately. If you want to change any setting, you need to use the `Custom` **Performance tuning** option and specify the settings in the **Advanced YAML configuration** field.
+
+  For example, if you would like to use the balanced preset values except that you prefer a higher compression level, you can do so as follows:
+
+  1. In {{fleet}}, open the **Settings** tab.
+  2. In the **Outputs** section, select **Add output** to create a new output, or select the edit icon to edit an existing output.
+  3. In the **Add new output** or the **Edit output** flyout, set **Performance tuning** to `Custom`.
+  4. Refer to the list of [performance tuning preset values](#es-output-settings-performance-tuning-settings), and add the settings you prefer into the **Advanced YAML configuration** field. For the `balanced` presets, the YAML configuration would be as shown:
+
+     ```yaml
+     bulk_max_size: 1600
+     worker: 1
+     queue.mem.events: 3200
+     queue.mem.flush.min_events: 1600
+     queue.mem.flush.timeout: 10s
+     compression_level: 1
+     idle_connection_timeout: 3s
+     ```
+
+  5. Adjust any settings as preferred. For example, you can update the `compression_level` setting to `4`.
+
+  When you create an {{agent}} policy using this output, the output will use the balanced preset options except with the higher compression level, as specified.
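+
+  For reference, the resulting custom configuration from this example is sketched below; it is identical to the `balanced` values except for `compression_level`:
+
+  ```yaml
+  bulk_max_size: 1600
+  worker: 1
+  queue.mem.events: 3200
+  queue.mem.flush.min_events: 1600
+  queue.mem.flush.timeout: 10s
+  # Raised from the balanced default of 1
+  compression_level: 4
+  idle_connection_timeout: 3s
+  ```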


## Advanced YAML configuration [es-output-settings-yaml-config]

-| Setting | Description |
-| --- | --- |
-| $$$output-elasticsearch-fleet-settings-allow_older_versions-setting$$$
`allow_older_versions`
| Allow {{agent}} to connect and send output to an {{es}} instance that is running an earlier version than the agent version.

Note that this setting does not affect {{agent}}'s ability to connect to {{fleet-server}}. {{fleet-server}} will not accept a connection from an agent at a later major or minor version. It will accept a connection from an agent at a later patch version. For example, an {{agent}} at version 8.14.3 can connect to a {{fleet-server}} on version 8.14.0, but an agent at version 8.15.0 or later is not able to connect.

**Default:** `true`
| -| $$$output-elasticsearch-fleet-settings-backoff.init-setting$$$
`backoff.init`
| (string) The number of seconds to wait before trying to reconnect to {{es}} after a network error. After waiting `backoff.init` seconds, {{agent}} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset.

**Default:** `1s`
| -| $$$output-elasticsearch-fleet-settings-backoff.max-setting$$$
`backoff.max`
| (string) The maximum number of seconds to wait before attempting to connect to {{es}} after a network error.

**Default:** `60s`
| -| $$$output-elasticsearch-fleet-settings-bulk_max_size-setting$$$
`bulk_max_size`
| (int) The maximum number of events to bulk in a single {{es}} bulk API index request.

Events can be collected into batches. {{agent}} will split batches larger than `bulk_max_size` into multiple batches.

Specifying a larger batch size can improve performance by lowering the overhead of sending events. However big batch sizes can also increase processing times, which might result in API errors, killed connections, timed-out publishing requests, and, ultimately, lower throughput.

Setting `bulk_max_size` to values less than or equal to 0 turns off the splitting of batches. When splitting is disabled, the queue decides on the number of events to be contained in a batch.

**Default:** `1600`
| -| $$$output-elasticsearch-fleet-settings-compression_level-setting$$$
`compression_level`
| (int) The gzip compression level. Set this value to `0` to disable compression. The compression level must be in the range of `1` (best speed) to `9` (best compression).

Increasing the compression level reduces network usage but increases CPU usage.
| -| $$$output-elasticsearch-fleet-settings-max_retries-setting$$$
`max_retries`
| (int) The number of times to retry publishing an event after a publishing failure. After the specified number of retries, the events are typically dropped.

Set `max_retries` to a value less than 0 to retry until all events are published.

**Default:** `3`
| -| $$$output-elasticsearch-fleet-settings-queue.mem.events-setting$$$
`queue.mem.events`
| The number of events the queue can store. This value should be evenly divisible by the smaller of `queue.mem.flush.min_events` or `bulk_max_size` to avoid sending partial batches to the output.

**Default:** `3200 events`
| -| $$$output-elasticsearch-fleet-settings-queue.mem.flush.min_events-setting$$$
`queue.mem.flush.min_events`
| `flush.min_events` is a legacy parameter, and new configurations should prefer to control batch size with `bulk_max_size`. As of 8.13, there is never a performance advantage to limiting batch size with `flush.min_events` instead of `bulk_max_size`

**Default:** `1600 events`
| -| $$$output-elasticsearch-fleet-settings-queue.mem.flush.timeout-setting$$$
`queue.mem.flush.timeout`
| (int) The maximum wait time for `queue.mem.flush.min_events` to be fulfilled. If set to 0s, events are available to the output immediately.

**Default:** `10s`
| -| $$$output-elasticsearch-fleet-settings-timeout-setting$$$
`timeout`
| (string) The HTTP request timeout in seconds for the {{es}} request.

**Default:** `90s`
| -| $$$output-elasticsearch-fleet-settings-worker-setting$$$
`worker`
| (int) The number of workers per configured host publishing events. Example: If you have two hosts and three workers, in total six workers are started (three for each host).

**Default:** `1`
|
+`allow_older_versions` $$$output-elasticsearch-fleet-settings-allow_older_versions-setting$$$
+: Allow {{agent}} to connect and send output to an {{es}} instance that is running an earlier version than the agent version.
+
+  Note that this setting does not affect {{agent}}'s ability to connect to {{fleet-server}}. {{fleet-server}} will not accept a connection from an agent at a later major or minor version. It will accept a connection from an agent at a later patch version. For example, an {{agent}} at version 8.14.3 can connect to a {{fleet-server}} on version 8.14.0, but an agent at version 8.15.0 or later is not able to connect.
+
+  **Default:** `true`
+
+`backoff.init` $$$output-elasticsearch-fleet-settings-backoff.init-setting$$$
+: (string) The number of seconds to wait before trying to reconnect to {{es}} after a network error. After waiting `backoff.init` seconds, {{agent}} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset.
+
+  **Default:** `1s`
+
+`backoff.max` $$$output-elasticsearch-fleet-settings-backoff.max-setting$$$
+: (string) The maximum number of seconds to wait before attempting to connect to {{es}} after a network error.
+
+  **Default:** `60s`
+
+`bulk_max_size` $$$output-elasticsearch-fleet-settings-bulk_max_size-setting$$$
+: (int) The maximum number of events to bulk in a single {{es}} bulk API index request.
+
+  Events can be collected into batches. {{agent}} will split batches larger than `bulk_max_size` into multiple batches.
+
+  Specifying a larger batch size can improve performance by lowering the overhead of sending events. However, big batch sizes can also increase processing times, which might result in API errors, killed connections, timed-out publishing requests, and, ultimately, lower throughput.
+
+  Setting `bulk_max_size` to values less than or equal to 0 turns off the splitting of batches. When splitting is disabled, the queue decides on the number of events to be contained in a batch.
+
+  **Default:** `1600`
+
+`compression_level` $$$output-elasticsearch-fleet-settings-compression_level-setting$$$
+: (int) The gzip compression level. Set this value to `0` to disable compression. The compression level must be in the range of `1` (best speed) to `9` (best compression).
+
+  Increasing the compression level reduces network usage but increases CPU usage.
+
+`max_retries` $$$output-elasticsearch-fleet-settings-max_retries-setting$$$
+: (int) The number of times to retry publishing an event after a publishing failure. After the specified number of retries, the events are typically dropped.
+
+  Set `max_retries` to a value less than 0 to retry until all events are published.
+
+  **Default:** `3`
+
+`queue.mem.events` $$$output-elasticsearch-fleet-settings-queue.mem.events-setting$$$
+: The number of events the queue can store. This value should be evenly divisible by the smaller of `queue.mem.flush.min_events` or `bulk_max_size` to avoid sending partial batches to the output.
+
+  **Default:** `3200 events`
+
+`queue.mem.flush.min_events` $$$output-elasticsearch-fleet-settings-queue.mem.flush.min_events-setting$$$
+: `flush.min_events` is a legacy parameter, and new configurations should prefer to control batch size with `bulk_max_size`. As of 8.13, there is never a performance advantage to limiting batch size with `flush.min_events` instead of `bulk_max_size`.
+
+  **Default:** `1600 events`
+
+`queue.mem.flush.timeout` $$$output-elasticsearch-fleet-settings-queue.mem.flush.timeout-setting$$$
+: (int) The maximum wait time for `queue.mem.flush.min_events` to be fulfilled. If set to 0s, events are available to the output immediately.
+
+  **Default:** `10s`
+
+`timeout` $$$output-elasticsearch-fleet-settings-timeout-setting$$$
+: (string) The HTTP request timeout in seconds for the {{es}} request.
+
+  **Default:** `90s`
+
+`worker` $$$output-elasticsearch-fleet-settings-worker-setting$$$
+: (int) The number of workers per configured host publishing events. Example: If you have two hosts and three workers, in total six workers are started (three for each host).
+
+  **Default:** `1`
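+
+For example, a sketch of an **Advanced YAML configuration** block that combines several of these settings (the values are illustrative, not recommendations):
+
+```yaml
+allow_older_versions: true
+backoff.init: 1s
+backoff.max: 60s
+max_retries: 5
+timeout: 120s
+```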


## Performance tuning settings [es-output-settings-performance-tuning-settings]
diff --git a/reference/fleet/kafka-output-settings.md b/reference/fleet/kafka-output-settings.md
index c7864d4c17..de91d7c15d 100644
--- a/reference/fleet/kafka-output-settings.md
+++ b/reference/fleet/kafka-output-settings.md
@@ -15,104 +15,299 @@ If you plan to use {{ls}} to modify {{agent}} output data before it’s sent to
 
 ### General settings [_general_settings]
 
-| | |
-| --- | --- |
-| $$$kafka-output-version$$$
**Kafka version**
| The Kafka protocol version that {{agent}} will request when connecting. Defaults to `1.0.0`. Currently Kafka versions from `0.8.2.0` to `2.6.0` are supported, however the latest Kafka version (`3.x.x`) is expected to be compatible when version `2.6.0` is selected. When using Kafka 4.0 and newer, the version must be set to at least `2.1.0`.
| -| $$$kafka-output-hosts$$$
**Hosts**
| The addresses your {{agent}}s will use to connect to one or more Kafka brokers. Use the format `host:port` (without any protocol `http://`). Click **Add row** to specify additional addresses.

**Examples:**

* `localhost:9092`
* `mykafkahost:9092`

Refer to the [{{fleet-server}}](/reference/fleet/fleet-server.md) documentation for default ports and other configuration details.
|
+**Kafka version** $$$kafka-output-version$$$
+: The Kafka protocol version that {{agent}} will request when connecting. Defaults to `1.0.0`. Kafka versions from `0.8.2.0` to `2.6.0` are currently supported; however, the latest Kafka version (`3.x.x`) is expected to be compatible when version `2.6.0` is selected. When using Kafka 4.0 and newer, the version must be set to at least `2.1.0`.
+
+**Hosts** $$$kafka-output-hosts$$$
+: The addresses your {{agent}}s will use to connect to one or more Kafka brokers. Use the format `host:port` (without a protocol such as `http://`). Click **Add row** to specify additional addresses.
+
+  **Examples:**
+  * `localhost:9092`
+  * `mykafkahost:9092`
+
+  Refer to the [{{fleet-server}}](/reference/fleet/fleet-server.md) documentation for default ports and other configuration details.


### Authentication settings [_authentication_settings]

Select the mechanism that {{agent}} uses to authenticate with Kafka.

-| | |
-| --- | --- |
-| $$$kafka-output-authentication-none$$$
**None**
| No authentication is used between {{agent}} and Kafka. This is the default option. In production, it’s recommended to have an authentication method selected.

Plaintext
: Set this option for traffic between {{agent}} and Kafka to be sent as plaintext, without any transport layer security.

This is the default option when no authentication is set.


Encryption
: Set this option for traffic between {{agent}} and Kafka to use transport layer security.

When **Encryption*** is selected, the ***Server SSL certificate authorities** and **Verification mode** mode options become available.

| -| $$$kafka-output-authentication-basic$$$
**Username / Password**
| Connect to Kafka with a username and password.

Provide your username and password, and select a SASL (Simple Authentication and Security Layer) mechanism for your login credentials.

When SCRAM is enabled, {{agent}} uses the [SCRAM](https://en.wikipedia.org/wiki/Salted_Challenge_Response_Authentication_Mechanism) mechanism to authenticate the user credential. SCRAM is based on the IETF RFC5802 standard which describes a challenge-response mechanism for authenticating users.

* Plain - SCRAM is not used to authenticate
* SCRAM-SHA-256 - uses the SHA-256 hashing function
* SCRAM-SHA-512 - uses the SHA-512 hashing function

To prevent unauthorized access your Kafka password is stored as a secret value. While secret storage is recommended, you can choose to override this setting and store the password as plain text in the agent policy definition. Secret storage requires {{fleet-server}} version 8.12 or higher.

Note that this setting can also be stored as a secret value or as plain text for preconfigured outputs. See [Preconfiguration settings](kibana://reference/configuration-reference/fleet-settings.md#_preconfiguration_settings_for_advanced_use_cases) in the {{kib}} Guide to learn more.
| -| $$$kafka-output-authentication-ssl$$$
**SSL**
| Authenticate using the Secure Sockets Layer (SSL) protocol. Provide the following details for your SSL certificate:

Client SSL certificate
: The certificate generated for the client. Copy and paste in the full contents of the certificate. This is the certificate that all the agents will use to connect to Kafka.

In cases where each client has a unique certificate, the local path to that certificate can be placed here. The agents will pick the certificate in that location when establishing a connection to Kafka.


Client SSL certificate key
: The private key generated for the client. This must be in PKCS 8 key. Copy and paste in the full contents of the certificate key. This is the certificate key that all the agents will use to connect to Kafka.

In cases where each client has a unique certificate key, the local path to that certificate key can be placed here. The agents will pick the certificate key in that location when establishing a connection to Kafka.

To prevent unauthorized access the certificate key is stored as a secret value. While secret storage is recommended, you can choose to override this setting and store the key as plain text in the agent policy definition. Secret storage requires {{fleet-server}} version 8.12 or higher.

Note that this setting can also be stored as a secret value or as plain text for preconfigured outputs. See [Preconfiguration settings](kibana://reference/configuration-reference/fleet-settings.md#_preconfiguration_settings_for_advanced_use_cases) in the {{kib}} Guide to learn more.

| -| **Server SSL certificate authorities**
| The CA certificate to use to connect to Kafka. This is the CA used to generate the certificate and key for Kafka. Copy and paste in the full contents for the CA certificate.

This setting is optional. This setting is not available when the authentication `None` and `Plaintext` options are selected.

Click **Add row** to specify additional certificate authories.
| -| **Verification mode**
| Controls the verification of server certificates. Valid values are:

`Full`
: Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate.

`None`
: Performs *no verification* of the server’s certificate. This mode disables many of the security benefits of SSL/TLS and should only be used after cautious consideration. It is primarily intended as a temporary diagnostic mechanism when attempting to resolve TLS errors; its use in production environments is strongly discouraged.

`Strict`
: Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate. If the Subject Alternative Name is empty, it returns an error.

`Certificate`
: Verifies that the provided certificate is signed by a trusted authority (CA), but does not perform any hostname verification.

The default value is `Full`. This setting is not available when the authentication `None` and `Plaintext` options are selected.
|
+**None** $$$kafka-output-authentication-none$$$
+: No authentication is used between {{agent}} and Kafka. This is the default option. In production, it’s recommended to have an authentication method selected.
+
+  Plaintext
+  : Set this option for traffic between {{agent}} and Kafka to be sent as plaintext, without any transport layer security.
+
+    This is the default option when no authentication is set.
+
+  Encryption
+  : Set this option for traffic between {{agent}} and Kafka to use transport layer security.
+
+    When **Encryption** is selected, the **Server SSL certificate authorities** and **Verification mode** options become available.
+
+**Username / Password** $$$kafka-output-authentication-basic$$$
+: Connect to Kafka with a username and password.
+
+  Provide your username and password, and select a SASL (Simple Authentication and Security Layer) mechanism for your login credentials.
+
+  When SCRAM is enabled, {{agent}} uses the [SCRAM](https://en.wikipedia.org/wiki/Salted_Challenge_Response_Authentication_Mechanism) mechanism to authenticate the user credential. SCRAM is based on the IETF RFC 5802 standard, which describes a challenge-response mechanism for authenticating users.
+
+  * Plain - SCRAM is not used to authenticate
+  * SCRAM-SHA-256 - uses the SHA-256 hashing function
+  * SCRAM-SHA-512 - uses the SHA-512 hashing function
+
+  To prevent unauthorized access, your Kafka password is stored as a secret value. While secret storage is recommended, you can choose to override this setting and store the password as plain text in the agent policy definition. Secret storage requires {{fleet-server}} version 8.12 or higher.
+
+  Note that this setting can also be stored as a secret value or as plain text for preconfigured outputs. See [Preconfiguration settings](kibana://reference/configuration-reference/fleet-settings.md#_preconfiguration_settings_for_advanced_use_cases) in the {{kib}} Guide to learn more.
+
+**SSL** $$$kafka-output-authentication-ssl$$$
+: Authenticate using the Secure Sockets Layer (SSL) protocol. Provide the following details for your SSL certificate:
+
+  Client SSL certificate
+  : The certificate generated for the client. Copy and paste in the full contents of the certificate. This is the certificate that all the agents will use to connect to Kafka.
+
+    In cases where each client has a unique certificate, the local path to that certificate can be placed here. The agents will pick the certificate in that location when establishing a connection to Kafka.
+
+  Client SSL certificate key
+  : The private key generated for the client. This must be a PKCS#8 key. Copy and paste in the full contents of the certificate key. This is the certificate key that all the agents will use to connect to Kafka.
+
+    In cases where each client has a unique certificate key, the local path to that certificate key can be placed here. The agents will pick the certificate key in that location when establishing a connection to Kafka.
+
+    To prevent unauthorized access, the certificate key is stored as a secret value. While secret storage is recommended, you can choose to override this setting and store the key as plain text in the agent policy definition. Secret storage requires {{fleet-server}} version 8.12 or higher.
+
+    Note that this setting can also be stored as a secret value or as plain text for preconfigured outputs. See [Preconfiguration settings](kibana://reference/configuration-reference/fleet-settings.md#_preconfiguration_settings_for_advanced_use_cases) in the {{kib}} Guide to learn more.
+
+**Server SSL certificate authorities**
+: The CA certificate to use to connect to Kafka. This is the CA used to generate the certificate and key for Kafka. Copy and paste in the full contents for the CA certificate.
+
+  This setting is optional. This setting is not available when the authentication `None` and `Plaintext` options are selected.
+
+  Click **Add row** to specify additional certificate authorities.
+
+**Verification mode**
+: Controls the verification of server certificates. Valid values are:
+
+  `Full`
+  : Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate.
+
+  `None`
+  : Performs *no verification* of the server’s certificate. This mode disables many of the security benefits of SSL/TLS and should only be used after cautious consideration. It is primarily intended as a temporary diagnostic mechanism when attempting to resolve TLS errors; its use in production environments is strongly discouraged.
+
+  `Strict`
+  : Verifies that the provided certificate is signed by a trusted authority (CA) and also verifies that the server’s hostname (or IP address) matches the names identified within the certificate. If the Subject Alternative Name is empty, it returns an error.
+
+  `Certificate`
+  : Verifies that the provided certificate is signed by a trusted authority (CA), but does not perform any hostname verification.
+
+  The default value is `Full`. This setting is not available when the authentication `None` and `Plaintext` options are selected.


### Partitioning settings [_partitioning_settings]

The number of partitions created is set automatically by the Kafka broker based on the list of topics. Records are then published to partitions either randomly, in round-robin order, or according to a calculated hash.

-| | |
-| --- | --- |
-| $$$kafka-output-partitioning-random$$$
**Random**
| Publish records to Kafka output broker event partitions randomly. Specify the number of events to be published to the same partition before the partitioner selects a new partition.
| -| $$$kafka-output-partitioning-roundrobin$$$
**Round robin**
| Publish records to Kafka output broker event partitions in a round-robin fashion. Specify the number of events to be published to the same partition before the partitioner selects a new partition.
| -| $$$kafka-output-partitioning-hash$$$
**Hash**
| Publish records to Kafka output broker event partitions based on a hash computed from the specified list of fields. If a field is not specified, the Kafka event key value is used.
|
+**Random** $$$kafka-output-partitioning-random$$$
+: Publish records to Kafka output broker event partitions randomly. Specify the number of events to be published to the same partition before the partitioner selects a new partition.
+
+**Round robin** $$$kafka-output-partitioning-roundrobin$$$
+: Publish records to Kafka output broker event partitions in a round-robin fashion. Specify the number of events to be published to the same partition before the partitioner selects a new partition.
+
+**Hash** $$$kafka-output-partitioning-hash$$$
+: Publish records to Kafka output broker event partitions based on a hash computed from the specified list of fields. If a field is not specified, the Kafka event key value is used.
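+
+For reference, a sketch of how hash partitioning could look in the [Advanced YAML configuration](#kafka-output-settings-yaml-config) (the `partition.hash` key and the field list follow the Beats kafka output convention and are assumptions here):
+
+```yaml
+partition.hash:
+  # Fields used to compute the partitioning hash value
+  hash: ['data_stream.dataset', 'data_stream.namespace']
+```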


### Topics settings [_topics_settings]

Use this option to set the Kafka topic for each {{agent}} event.

-| | |
-| --- | --- |
-| $$$kafka-output-topics-default$$$
**Default topic**
| Set a default topic to use for events sent by {{agent}} to the Kafka output.

You can set a static topic, for example `elastic-agent`, or you can choose to set a topic dynamically based on an [Elastic Common Scheme (ECS)][Elastic Common Schema (ECS)](ecs://reference/index.md)) field. Available fields include:

* `data_stream_type`
* `data_stream.dataset`
* `data_stream.namespace`
* `@timestamp`
* `event-dataset`

You can also set a custom field. This is useful if you’re using the [`add_fields` processor](/reference/fleet/add_fields-processor.md) as part of your {{agent}} input. Otherwise, setting a custom field is not recommended.
|
+**Default topic** $$$kafka-output-topics-default$$$
+: Set a default topic to use for events sent by {{agent}} to the Kafka output.
+
+  You can set a static topic, for example `elastic-agent`, or you can choose to set a topic dynamically based on an [Elastic Common Schema (ECS)](ecs://reference/index.md) field. Available fields include:
+
+  * `data_stream_type`
+  * `data_stream.dataset`
+  * `data_stream.namespace`
+  * `@timestamp`
+  * `event-dataset`
+
+  You can also set a custom field. This is useful if you’re using the [`add_fields` processor](/reference/fleet/add_fields-processor.md) as part of your {{agent}} input. Otherwise, setting a custom field is not recommended.
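+
+  For example, a sketch of a dynamic topic in the policy YAML (the `topic` format-string syntax is assumed from the Beats kafka output and is illustrative here):
+
+  ```yaml
+  # Route each event to a topic named after its dataset
+  topic: '%{[data_stream.dataset]}'
+  ```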


### Header settings [_header_settings]

A header is a key-value pair, and multiple headers can be included with the same key. Only string values are supported. These headers will be included in each produced Kafka message.

-| | |
-| --- | --- |
-| $$$kafka-output-headers-key$$$
**Key**
| The key to set in the Kafka header.
| -| $$$kafka-output-headers-value$$$
**Value**
| The value to set in the Kafka header.

Click **Add header** to configure additional headers to be included in each Kafka message.
| -| $$$kafka-output-headers-clientid$$$
**Client ID**
| The configurable ClientID used for logging, debugging, and auditing purposes. The default is `Elastic`. The Client ID is part of the protocol to identify where the messages are coming from.
| +**Key** $$$kafka-output-headers-key$$$ +: The key to set in the Kafka header. + +**Value** $$$kafka-output-headers-value$$$ +: The value to set in the Kafka header. + + Click **Add header** to configure additional headers to be included in each Kafka message. + +**Client ID** $$$kafka-output-headers-clientid$$$ +: The configurable ClientID used for logging, debugging, and auditing purposes. The default is `Elastic`. The Client ID is part of the protocol to identify where the messages are coming from. ### Compression settings [_compression_settings] You can enable compression to reduce the volume of Kafka output. -| | | -| --- | --- | -| $$$kafka-output-compression-codec$$$
**Codec**
| Select a compression codec to use. Supported codecs are `snappy`, `lz4` and `gzip`.
| -| $$$kafka-output-compression-level$$$
**Level**
| For the `gzip` codec you can choose a compression level. The level must be in the range of `1` (best speed) to `9` (best compression).

Increasing the compression level reduces the network usage but increases the CPU usage. The default value is 4.
| +**Codec** $$$kafka-output-compression-codec$$$ +: Select a compression codec to use. Supported codecs are `snappy`, `lz4` and `gzip`. + +**Level** $$$kafka-output-compression-level$$$ +: For the `gzip` codec you can choose a compression level. The level must be in the range of `1` (best speed) to `9` (best compression). + + Increasing the compression level reduces the network usage but increases the CPU usage. The default value is 4. ### Broker settings [_broker_settings] Configure timeout and buffer size values for the Kafka brokers. -| | | -| --- | --- | -| $$$kafka-output-broker-timeout$$$
**Broker timeout**
| The maximum length of time a Kafka broker waits for the required number of ACKs before timing out (see the `ACK reliability` setting further in). The default is 30 seconds.
| -| $$$kafka-output-broker-reachability-timeout$$$
**Broker reachability timeout**
| The maximum length of time that an {{agent}} waits for a response from a Kafka broker before timing out. The default is 30 seconds.
| -| $$$kafka-output-broker-ack-reliability$$$
**ACK reliability**
| The ACK reliability level required from broker. Options are:

* Wait for local commit
* Wait for all replicas to commit
* Do not wait

The default is `Wait for local commit`.

Note that if ACK reliability is set to `Do not wait` no ACKs are returned by Kafka. Messages might be lost silently in the event of an error.
|
+**Broker timeout** $$$kafka-output-broker-timeout$$$
+: The maximum length of time a Kafka broker waits for the required number of ACKs before timing out (see the **ACK reliability** setting below). The default is 30 seconds.
+
+**Broker reachability timeout** $$$kafka-output-broker-reachability-timeout$$$
+: The maximum length of time that an {{agent}} waits for a response from a Kafka broker before timing out. The default is 30 seconds.
+
+**ACK reliability** $$$kafka-output-broker-ack-reliability$$$
+: The ACK reliability level required from the broker. Options are:
+
+  * Wait for local commit
+  * Wait for all replicas to commit
+  * Do not wait
+
+  The default is `Wait for local commit`.
+
+  Note that if ACK reliability is set to `Do not wait`, no ACKs are returned by Kafka. Messages might be lost silently in the event of an error.


### Other settings [_other_settings]

-| | |
-| --- | --- |
-| $$$kafka-output-other-key$$$
**Key**
| An optional formatted string specifying the Kafka event key. If configured, the event key can be extracted from the event using a format string.

See the [Kafka documentation](https://kafka.apache.org/intro#intro_topics) for the implications of a particular choice of key; by default, the key is chosen by the Kafka cluster.
| -| $$$kafka-output-other-proxy$$$
**Proxy**
| Select a proxy URL for {{agent}} to connect to Kafka. To learn about proxy configuration, refer to [Using a proxy server with {{agent}} and {{fleet}}](/reference/fleet/fleet-agent-proxy-support.md).
| -| $$$kafka-output-advanced-yaml-setting$$$
**Advanced YAML configuration**
| YAML settings that will be added to the Kafka output section of each policy that uses this output. Make sure you specify valid YAML. The UI does not currently provide validation.

See [Advanced YAML configuration](#kafka-output-settings-yaml-config) for descriptions of the available settings.
| -| $$$kafka-output-agent-integrations$$$
**Make this output the default for agent integrations**
| When this setting is on, {{agent}}s use this output to send data if no other output is set in the [agent policy](/reference/fleet/agent-policy.md).
| -| $$$kafka-output-agent-monitoring$$$
**Make this output the default for agent monitoring**
| When this setting is on, {{agent}}s use this output to send [agent monitoring data](/reference/fleet/monitor-elastic-agent.md) if no other output is set in the [agent policy](/reference/fleet/agent-policy.md).
| +**Key** $$$kafka-output-other-key$$$ +: An optional formatted string specifying the Kafka event key. If configured, the event key can be extracted from the event using a format string. + + See the [Kafka documentation](https://kafka.apache.org/intro#intro_topics) for the implications of a particular choice of key; by default, the key is chosen by the Kafka cluster. + +**Proxy** $$$kafka-output-other-proxy$$$ +: Select a proxy URL for {{agent}} to connect to Kafka. To learn about proxy configuration, refer to [Using a proxy server with {{agent}} and {{fleet}}](/reference/fleet/fleet-agent-proxy-support.md). + +**Advanced YAML configuration** $$$kafka-output-advanced-yaml-setting$$$ +: YAML settings that will be added to the Kafka output section of each policy that uses this output. Make sure you specify valid YAML. The UI does not currently provide validation. + + See [Advanced YAML configuration](#kafka-output-settings-yaml-config) for descriptions of the available settings. + +**Make this output the default for agent integrations** $$$kafka-output-agent-integrations$$$ +: When this setting is on, {{agent}}s use this output to send data if no other output is set in the [agent policy](/reference/fleet/agent-policy.md). + +**Make this output the default for agent monitoring** $$$kafka-output-agent-monitoring$$$ +: When this setting is on, {{agent}}s use this output to send [agent monitoring data](/reference/fleet/monitor-elastic-agent.md) if no other output is set in the [agent policy](/reference/fleet/agent-policy.md). ## Advanced YAML configuration [kafka-output-settings-yaml-config] -| Setting | Description | -| --- | --- | -| $$$output-kafka-fleet-settings-backoff.init-setting$$$
`backoff.init`
| (string) The number of seconds to wait before trying to reconnect to Kafka after a network error. After waiting `backoff.init` seconds, {{agent}} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset.

**Default:** `1s`
| -| $$$output-kafka-fleet-settings-backoff.max-setting$$$
`backoff.max`
| (string) The maximum number of seconds to wait before attempting to connect to Kafka after a network error.

**Default:** `60s`
| -| $$$output-kafka-fleet-settings-bulk_max_size-setting$$$
`bulk_max_size`
| (int) The maximum number of events to bulk in a single Kafka request.

**Default:** `2048`
| -| $$$output-kafka-fleet-settings-flush_frequency-setting$$$
`bulk_flush_frequency`
| (int) Duration to wait before sending bulk Kafka request. `0` is no delay.

**Default:** `0`
| -| $$$output-kafka-fleet-settings-channel_buffer_size-setting$$$
`channel_buffer_size`
| (int) Per Kafka broker number of messages buffered in output pipeline.

**Default:** `256`
| -| $$$output-kafka-fleet-settings-client_id-setting$$$
`client_id`
| (string) The configurable ClientID used for logging, debugging, and auditing purposes.

**Default:** `Elastic Agent`
| -| $$$output-kafka-fleet-settings-codec-setting$$$
`codec`
| Output codec configuration. You can specify either the `json` or `format` codec. By default the `json` codec is used.

**`json.pretty`**: If `pretty` is set to true, events will be nicely formatted. The default is false.

**`json.escape_html`**: If `escape_html` is set to true, html symbols will be escaped in strings. The default is false.

Example configuration that uses the `json` codec with pretty printing enabled to write events to the console:

```yaml
output.console:
codec.json:
pretty: true
escape_html: false
```

**`format.string`**: Configurable format string used to create a custom formatted message.

Example configurable that uses the `format` codec to print the events timestamp and message field to console:

```yaml
output.console:
codec.format:
string: '%{[@timestamp]} %{[message]}'
```

**Default:** `json`
| -| $$$output-kafka-fleet-settings-keep_alive-setting$$$
`keep_alive`
| (string) The keep-alive period for an active network connection. If `0s`, keep-alives are disabled.

**Default:** `0s`
| -| $$$output-kafka-fleet-settings-max_message_bytes-setting$$$
`max_message_bytes`
| (int) The maximum permitted size of JSON-encoded messages. Bigger messages will be dropped. This value should be equal to or less than the broker’s `message.max.bytes`.

**Default:** `1000000` (bytes)
| -| $$$output-kafka-fleet-settings-metadata-setting$$$
`metadata`
| Kafka metadata update settings. The metadata contains information about brokers, topics, partition, and active leaders to use for publishing.

**`refresh_frequency`**
: Metadata refresh interval. Defaults to 10 minutes.

**`full`**
: Strategy to use when fetching metadata. When this option is `true`, the client will maintain a full set of metadata for all the available topics. When set to `false` it will only refresh the metadata for the configured topics. The default is false.

**`retry.max`**
: Total number of metadata update retries. The default is 3.

**`retry.backoff`**
: Waiting time between retries. The default is 250ms.
| -| $$$output-kafka-fleet-settings-queue.mem.events-setting$$$
`queue.mem.events`
| The number of events the queue can store. This value should be evenly divisible by the smaller of `queue.mem.flush.min_events` or `bulk_max_size` to avoid sending partial batches to the output.

**Default:** `3200 events`
| -| $$$output-kafka-fleet-settings-queue.mem.flush.min_events-setting$$$
`queue.mem.flush.min_events`
| `flush.min_events` is a legacy parameter, and new configurations should prefer to control batch size with `bulk_max_size`. As of 8.13, there is never a performance advantage to limiting batch size with `flush.min_events` instead of `bulk_max_size`

**Default:** `1600 events`
| -| $$$output-kafka-fleet-settings-queue.mem.flush.timeout-setting$$$
`queue.mem.flush.timeout`
| (int) The maximum wait time for `queue.mem.flush.min_events` to be fulfilled. If set to 0s, events are available to the output immediately.

**Default:** `10s`
|
+`backoff.init` $$$output-kafka-fleet-settings-backoff.init-setting$$$
+: (string) The number of seconds to wait before trying to reconnect to Kafka after a network error. After waiting `backoff.init` seconds, {{agent}} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset.
+
+  **Default:** `1s`
+
+`backoff.max` $$$output-kafka-fleet-settings-backoff.max-setting$$$
+: (string) The maximum number of seconds to wait before attempting to connect to Kafka after a network error.
+
+  **Default:** `60s`
+
+`bulk_max_size` $$$output-kafka-fleet-settings-bulk_max_size-setting$$$
+: (int) The maximum number of events to bulk in a single Kafka request.
+
+  **Default:** `2048`
+
+`bulk_flush_frequency` $$$output-kafka-fleet-settings-flush_frequency-setting$$$
+: (int) The duration to wait before sending a bulk Kafka request. `0` is no delay.
+
+  **Default:** `0`
+
+`channel_buffer_size` $$$output-kafka-fleet-settings-channel_buffer_size-setting$$$
+: (int) The number of messages buffered in the output pipeline, per Kafka broker.
+
+  **Default:** `256`
+
+`client_id` $$$output-kafka-fleet-settings-client_id-setting$$$
+: (string) The configurable ClientID used for logging, debugging, and auditing purposes.
+
+  **Default:** `Elastic Agent`
+
+`codec` $$$output-kafka-fleet-settings-codec-setting$$$
+: Output codec configuration. You can specify either the `json` or `format` codec. By default, the `json` codec is used.
+
+  **`json.pretty`**: If `pretty` is set to true, events will be nicely formatted. The default is false.
+
+  **`json.escape_html`**: If `escape_html` is set to true, HTML symbols will be escaped in strings. The default is false.
+
+  Example configuration that uses the `json` codec with pretty printing enabled to write events to the console:
+
+  ```yaml
+  output.console:
+    codec.json:
+      pretty: true
+      escape_html: false
+  ```
+
+  **`format.string`**: Configurable format string used to create a custom formatted message.
+
+  Example configuration that uses the `format` codec to print the event’s timestamp and message field to the console:
+
+  ```yaml
+  output.console:
+    codec.format:
+      string: '%{[@timestamp]} %{[message]}'
+  ```
+
+  **Default:** `json`
+
+`keep_alive` $$$output-kafka-fleet-settings-keep_alive-setting$$$
+: (string) The keep-alive period for an active network connection. If `0s`, keep-alives are disabled.
+
+  **Default:** `0s`
+
+`max_message_bytes` $$$output-kafka-fleet-settings-max_message_bytes-setting$$$
+: (int) The maximum permitted size of JSON-encoded messages. Bigger messages will be dropped. This value should be equal to or less than the broker’s `message.max.bytes`.
+
+  **Default:** `1000000` (bytes)
+
+`metadata` $$$output-kafka-fleet-settings-metadata-setting$$$
+: Kafka metadata update settings. The metadata contains information about brokers, topics, partitions, and active leaders to use for publishing.
+
+  **`refresh_frequency`**
+  : Metadata refresh interval. Defaults to 10 minutes.
+
+  **`full`**
+  : Strategy to use when fetching metadata. When this option is `true`, the client will maintain a full set of metadata for all the available topics. When set to `false`, it will only refresh the metadata for the configured topics. The default is false.
+
+  **`retry.max`**
+  : Total number of metadata update retries. The default is 3.
+
+  **`retry.backoff`**
+  : Waiting time between retries. The default is 250ms.
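+
+  For example, a sketch that refreshes metadata more often than the default (the values are illustrative, not recommendations):
+
+  ```yaml
+  metadata:
+    # Refresh every 5 minutes instead of the default 10
+    refresh_frequency: 5m
+    full: false
+    retry.max: 3
+    retry.backoff: 250ms
+  ```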
+
+`queue.mem.events` $$$output-kafka-fleet-settings-queue.mem.events-setting$$$
+: The number of events the queue can store. This value should be evenly divisible by the smaller of `queue.mem.flush.min_events` or `bulk_max_size` to avoid sending partial batches to the output.
+
+  **Default:** `3200 events`
+
+`queue.mem.flush.min_events` $$$output-kafka-fleet-settings-queue.mem.flush.min_events-setting$$$
+: `flush.min_events` is a legacy parameter; new configurations should prefer to control batch size with `bulk_max_size`. As of 8.13, there is never a performance advantage to limiting batch size with `flush.min_events` instead of `bulk_max_size`.
+
+  **Default:** `1600 events`
+
+`queue.mem.flush.timeout` $$$output-kafka-fleet-settings-queue.mem.flush.timeout-setting$$$
+: (int) The maximum wait time for `queue.mem.flush.min_events` to be fulfilled. If set to 0s, events are available to the output immediately.
+
+  **Default:** `10s`
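+
+For reference, the following sketch combines several of these settings as they might be supplied in an advanced Kafka output configuration. All values below are illustrative starting points, not recommendations:
+
+```yaml
+# Illustrative values only; tune for your workload and brokers.
+bulk_max_size: 2048          # maximum events per Kafka request
+bulk_flush_frequency: 0      # no added delay before sending a bulk request
+keep_alive: 30s              # keep idle broker connections open
+max_message_bytes: 1000000   # keep at or below the broker's message.max.bytes
+queue.mem.events: 3200
+queue.mem.flush.timeout: 10s
+```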

## Kafka output and using {{ls}} to index data to {{es}} [kafka-output-settings-ls-warning]
diff --git a/reference/fleet/kafka-output.md b/reference/fleet/kafka-output.md
index f5a062d470..16af387b47 100644
--- a/reference/fleet/kafka-output.md
+++ b/reference/fleet/kafka-output.md
@@ -76,21 +76,42 @@ The `kafka` output supports the following settings, grouped by category. Many of

## Commonly used settings [output-kafka-commonly-used-settings]

-| Setting | Description |
-| --- | --- |
-| $$$output-kafka-enabled-setting$$$
`enabled`
| (boolean) Enables or disables the output. If set to `false`, the output is disabled.
| -| $$$kafka-hosts-setting$$$
`hosts`
| The addresses your {{agent}}s will use to connect to one or more Kafka brokers.

Following is an example `hosts` setting with three hosts defined:

```yaml
hosts:
- 'localhost:9092'
- 'mykafkahost01:9092'
- 'mykafkahost02:9092'
```
| -| $$$kafka-version-setting$$$
`version`
| Kafka protocol version that {{agent}} will request when connecting. Defaults to 1.0.0.

The protocol version controls the Kafka client features available to {{agent}}; it does not prevent {{agent}} from connecting to Kafka versions newer than the protocol version.
|
+`enabled` $$$output-kafka-enabled-setting$$$
+: (boolean) Enables or disables the output. If set to `false`, the output is disabled.
+
+`hosts` $$$kafka-hosts-setting$$$
+: The addresses your {{agent}}s will use to connect to one or more Kafka brokers.
+
+  Following is an example `hosts` setting with three hosts defined:
+
+  ```yaml
+  hosts:
+    - 'localhost:9092'
+    - 'mykafkahost01:9092'
+    - 'mykafkahost02:9092'
+  ```
+
+`version` $$$kafka-version-setting$$$
+: Kafka protocol version that {{agent}} will request when connecting. Defaults to 1.0.0.
+
+  The protocol version controls the Kafka client features available to {{agent}}; it does not prevent {{agent}} from connecting to Kafka versions newer than the protocol version.
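+
+As a quick sketch, these settings might be combined as follows in a standalone agent policy. The broker addresses and protocol version are placeholders:
+
+```yaml
+outputs:
+  default:
+    type: kafka
+    enabled: true
+    hosts: ['mykafkahost01:9092', 'mykafkahost02:9092']
+    version: 2.6.0
+```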

## Authentication settings [output-kafka-authentication-settings]

-| Setting | Description |
-| --- | --- |
-| $$$kafka-username-setting$$$
`username`
| The username for connecting to Kafka. If username is configured, the password must be configured as well.
| -| $$$kafka-password-setting$$$
`password`
| The password for connecting to Kafka.
| -| $$$kafka-sasl.mechanism-setting$$$
`sasl.mechanism`
| The SASL mechanism to use when connecting to Kafka. It can be one of:

* `PLAIN` for SASL/PLAIN.
* `SCRAM-SHA-256` for SCRAM-SHA-256.
* `SCRAM-SHA-512` for SCRAM-SHA-512. If `sasl.mechanism` is not set, `PLAIN` is used if `username` and `password` are provided. Otherwise, SASL authentication is disabled.
| -| $$$kafka-ssl-setting$$$
`ssl`
| When sending data to a secured cluster through the `kafka` output, {{agent}} can use SSL/TLS. For a list of available settings, refer to [SSL/TLS](/reference/fleet/elastic-agent-ssl-configuration.md), specifically the settings under [Table 7, Common configuration options](/reference/fleet/elastic-agent-ssl-configuration.md#common-ssl-options) and [Table 8, Client configuration options](/reference/fleet/elastic-agent-ssl-configuration.md#client-ssl-options).
|
+`username` $$$kafka-username-setting$$$
+: The username for connecting to Kafka. If the username is configured, the password must be configured as well.
+
+`password` $$$kafka-password-setting$$$
+: The password for connecting to Kafka.
+
+`sasl.mechanism` $$$kafka-sasl.mechanism-setting$$$
+: The SASL mechanism to use when connecting to Kafka. It can be one of:
+
+  * `PLAIN` for SASL/PLAIN.
+  * `SCRAM-SHA-256` for SCRAM-SHA-256.
+  * `SCRAM-SHA-512` for SCRAM-SHA-512.
+
+  If `sasl.mechanism` is not set, `PLAIN` is used if `username` and `password` are provided. Otherwise, SASL authentication is disabled.
+
+`ssl` $$$kafka-ssl-setting$$$
+: When sending data to a secured cluster through the `kafka` output, {{agent}} can use SSL/TLS. For a list of available settings, refer to [SSL/TLS](/reference/fleet/elastic-agent-ssl-configuration.md), specifically the settings under [Table 7, Common configuration options](/reference/fleet/elastic-agent-ssl-configuration.md#common-ssl-options) and [Table 8, Client configuration options](/reference/fleet/elastic-agent-ssl-configuration.md#client-ssl-options).
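+
+For example, a sketch of a `kafka` output using SASL/SCRAM over TLS. The host, credentials, and certificate path are placeholders:
+
+```yaml
+outputs:
+  default:
+    type: kafka
+    hosts: ['mykafkahost01:9093']
+    username: 'agent-producer'
+    password: 'changeme'
+    sasl.mechanism: SCRAM-SHA-512
+    ssl.certificate_authorities:
+      - /etc/pki/kafka/ca.crt   # CA used to verify the brokers
+```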

## Memory queue settings [output-kafka-memory-queue-settings]
@@ -117,20 +138,28 @@ This sample configuration forwards events to the output when there are enough ev
   queue.mem.flush.timeout: 5s
 ```

-| Setting | Description |
-| --- | --- |
-| $$$output-kafka-queue.mem.events-setting$$$
`queue.mem.events`
| The number of events the queue can store. This value should be evenly divisible by the smaller of `queue.mem.flush.min_events` or `bulk_max_size` to avoid sending partial batches to the output.

**Default:** `3200 events`
| -| $$$output-kafka-queue.mem.flush.min_events-setting$$$
`queue.mem.flush.min_events`
| `flush.min_events` is a legacy parameter, and new configurations should prefer to control batch size with `bulk_max_size`. As of 8.13, there is never a performance advantage to limiting batch size with `flush.min_events` instead of `bulk_max_size`

**Default:** `1600 events`
| -| $$$output-kafka-queue.mem.flush.timeout-setting$$$
`queue.mem.flush.timeout`
| (int) The maximum wait time for `queue.mem.flush.min_events` to be fulfilled. If set to 0s, events are available to the output immediately.

**Default:** `10s`
|
+`queue.mem.events` $$$output-kafka-queue.mem.events-setting$$$
+: The number of events the queue can store. This value should be evenly divisible by the smaller of `queue.mem.flush.min_events` or `bulk_max_size` to avoid sending partial batches to the output.
+
+  **Default:** `3200 events`
+
+`queue.mem.flush.min_events` $$$output-kafka-queue.mem.flush.min_events-setting$$$
+: `flush.min_events` is a legacy parameter; new configurations should prefer to control batch size with `bulk_max_size`. As of 8.13, there is never a performance advantage to limiting batch size with `flush.min_events` instead of `bulk_max_size`.
+
+  **Default:** `1600 events`
+
+`queue.mem.flush.timeout` $$$output-kafka-queue.mem.flush.timeout-setting$$$
+: (int) The maximum wait time for `queue.mem.flush.min_events` to be fulfilled. If set to 0s, events are available to the output immediately.
+
+  **Default:** `10s`

## Topics settings [output-kafka-topics-settings]

Use these options to set the Kafka topic for each {{agent}} event.

-| Setting | Description |
-| --- | --- |
-| $$$kafka-topic-setting$$$
`topic`
| The default Kafka topic used for produced events.
|
+`topic` $$$kafka-topic-setting$$$
+: The default Kafka topic used for produced events.
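+
+For example, a minimal sketch that routes all events to a single topic. The topic name is a placeholder:
+
+```yaml
+outputs:
+  default:
+    type: kafka
+    hosts: ['mykafkahost01:9092']
+    topic: 'elastic-agent-events'
+```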

## Partition settings [output-kafka-partition-settings]
@@ -145,44 +174,129 @@ In the following example, after each event is published to a partition, the part
     group_events: 1
 ```

-| Setting | Description |
-| --- | --- |
-| $$$kafka-random.group-events-setting$$$
`random.group_events`
| Sets the number of events to be published to the same partition, before the partitioner selects a new partition by random. The default value is 1 meaning after each event a new partition is picked randomly.
| -| $$$kafka-round_robin.group_events-setting$$$
`round_robin.group_events`
| Sets the number of events to be published to the same partition, before the partitioner selects the next partition. The default value is 1 meaning after each event the next partition will be selected.
| -| $$$kafka-hash.hash-setting$$$
`hash.hash`
| List of fields used to compute the partitioning hash value from. If no field is configured, the events key value will be used.
| -| $$$kafka-hash.random-setting$$$
`hash.random`
| Randomly distribute events if no hash or key value can be computed.
|
+`random.group_events` $$$kafka-random.group-events-setting$$$
+: Sets the number of events to be published to the same partition before the partitioner selects a new partition at random. The default value is 1, meaning a new partition is picked at random after each event.
+
+`round_robin.group_events` $$$kafka-round_robin.group_events-setting$$$
+: Sets the number of events to be published to the same partition before the partitioner selects the next partition. The default value is 1, meaning the next partition is selected after each event.
+
+`hash.hash` $$$kafka-hash.hash-setting$$$
+: List of fields used to compute the partitioning hash value from. If no field is configured, the event's key value is used.
+
+`hash.random` $$$kafka-hash.random-setting$$$
+: Randomly distribute events if no hash or key value can be computed.
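+
+As a sketch, hash-based partitioning can keep all events from the same host in the same partition. The field choice is illustrative:
+
+```yaml
+outputs:
+  default:
+    type: kafka
+    hosts: ['mykafkahost01:9092']
+    partition.hash:
+      hash: ['host.name']   # fields used to compute the partition hash
+      random: true          # fall back to random distribution if no hash can be computed
+```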

## Header settings [output-kafka-header-settings]

A header is a key-value pair, and multiple headers can be included with the same key. Only string values are supported. These headers will be included in each produced Kafka message.

-| Setting | Description |
-| --- | --- |
-| $$$kafka-key-setting$$$
`key`
| The key to set in the Kafka header.
| -| $$$kafka-value-setting$$$
`value`
| The value to set in the Kafka header.
| -| $$$kafka-client_id-setting$$$
`client_id`
| The configurable ClientID used for logging, debugging, and auditing purposes. The default is `Elastic`. The Client ID is part of the protocol to identify where the messages are coming from.
|
+`key` $$$kafka-key-setting$$$
+: The key to set in the Kafka header.
+
+`value` $$$kafka-value-setting$$$
+: The value to set in the Kafka header.
+
+`client_id` $$$kafka-client_id-setting$$$
+: The configurable ClientID used for logging, debugging, and auditing purposes. The default is `Elastic`. The Client ID is part of the protocol to identify where the messages are coming from.
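+
+For example, a sketch that attaches two custom headers to every produced message. Keys and values are placeholders:
+
+```yaml
+outputs:
+  default:
+    type: kafka
+    hosts: ['mykafkahost01:9092']
+    headers:
+      - key: 'environment'
+        value: 'production'
+      - key: 'team'
+        value: 'observability'
+```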

## Other configuration settings [output-kafka-configuration-settings]

You can specify these various other options in the `kafka-output` section of the agent configuration file.

-| Setting | Description |
-| --- | --- |
-| $$$output-kafka-backoff.init-setting$$$
`backoff.init`
| (string) The number of seconds to wait before trying to reconnect to Kafka after a network error. After waiting `backoff.init` seconds, {{agent}} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset.

**Default:** `1s`
| -| $$$kafka-backoff.max-setting$$$
`backoff.max`
| (string) The maximum number of seconds to wait before attempting to connect to Kafka after a network error.

**Default:** `60s`
| -| $$$kafka-broker_timeout-setting$$$
`broker_timeout`
| The maximum length of time a Kafka broker waits for the required number of ACKs before timing out (see the `required_acks` setting further in).

**Default:** `30` (seconds)
| -| $$$kafka-bulk_flush_frequency-setting$$$
`bulk_flush_frequency`
| (int) Duration to wait before sending bulk Kafka request. `0`` is no delay.

**Default:** `0`
| -| $$$kafka-bulk_max_size-setting$$$
`bulk_max_size`
| (int) The maximum number of events to bulk in a single Kafka request.

**Default:** `2048`
| -| $$$kafka-channel_buffer_size-setting$$$
`channel_buffer_size`
| (int) Per Kafka broker number of messages buffered in output pipeline.

**Default:** `256`
| -| $$$kafka-codec-setting$$$
`codec`
| Output codec configuration. You can specify either the `json` or `format` codec. By default the `json` codec is used.

**`json.pretty`**: If `pretty` is set to true, events will be nicely formatted. The default is false.

**`json.escape_html`**: If `escape_html` is set to true, html symbols will be escaped in strings. The default is false.

Example configuration that uses the `json` codec with pretty printing enabled to write events to the console:

```yaml
output.console:
codec.json:
pretty: true
escape_html: false
```

**`format.string`**: Configurable format string used to create a custom formatted message.

Example configurable that uses the `format` codec to print the events timestamp and message field to console:

```yaml
output.console:
codec.format:
string: '%{[@timestamp]} %{[message]}'
```
| -| $$$kafka-compression-setting$$$
`compression`
| Select a compression codec to use. Supported codecs are `snappy`, `lz4` and `gzip`.
| -| $$$kafka-compression_level-setting$$$
`compression_level`
| For the `gzip` codec you can choose a compression level. The level must be in the range of `1` (best speed) to `9` (best compression).

Increasing the compression level reduces the network usage but increases the CPU usage.

**Default:** `4`.
| -| $$$kafka-keep_alive-setting$$$
`keep_alive`
| (string) The keep-alive period for an active network connection. If `0s`, keep-alives are disabled.

**Default:** `0s`
| -| $$$kafka-max_message_bytes-setting$$$
`max_message_bytes`
| (int) The maximum permitted size of JSON-encoded messages. Bigger messages will be dropped. This value should be equal to or less than the broker’s `message.max.bytes`.

**Default:** `1000000` (bytes)
| -| $$$kafka-metadata-setting$$$
`metadata`
| Kafka metadata update settings. The metadata contains information about brokers, topics, partition, and active leaders to use for publishing.

**`refresh_frequency`**
: Metadata refresh interval. Defaults to 10 minutes.

**`full`**
: Strategy to use when fetching metadata. When this option is `true`, the client will maintain a full set of metadata for all the available topics. When set to `false` it will only refresh the metadata for the configured topics. The default is false.

**`retry.max`**
: Total number of metadata update retries. The default is 3.

**`retry.backoff`**
: Waiting time between retries. The default is 250ms.
| -| $$$kafka-required_acks-setting$$$
`required_acks`
| The ACK reliability level required from broker. 0=no response, 1=wait for local commit, -1=wait for all replicas to commit. The default is 1.

Note: If set to 0, no ACKs are returned by Kafka. Messages might be lost silently on error.

**Default:** `1` (wait for local commit)
| -| $$$kafka-timeout-setting$$$
`timeout`
| The number of seconds to wait for responses from the Kafka brokers before timing out. The default is 30 (seconds).

**Default:** `1000000` (bytes)
|
+`backoff.init` $$$output-kafka-backoff.init-setting$$$
+: (string) The number of seconds to wait before trying to reconnect to Kafka after a network error. After waiting `backoff.init` seconds, {{agent}} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset.
+
+  **Default:** `1s`
+
+`backoff.max` $$$kafka-backoff.max-setting$$$
+: (string) The maximum number of seconds to wait before attempting to connect to Kafka after a network error.
+
+  **Default:** `60s`
+
+`broker_timeout` $$$kafka-broker_timeout-setting$$$
+: The maximum length of time a Kafka broker waits for the required number of ACKs before timing out (see the `required_acks` setting further in).
+
+  **Default:** `30` (seconds)
+
+`bulk_flush_frequency` $$$kafka-bulk_flush_frequency-setting$$$
+: (int) The duration to wait before sending a bulk Kafka request. `0` means no delay.
+
+  **Default:** `0`
+
+`bulk_max_size` $$$kafka-bulk_max_size-setting$$$
+: (int) The maximum number of events to bulk in a single Kafka request.
+
+  **Default:** `2048`
+
+`channel_buffer_size` $$$kafka-channel_buffer_size-setting$$$
+: (int) The number of messages buffered in the output pipeline per Kafka broker.
+
+  **Default:** `256`
+
+`codec` $$$kafka-codec-setting$$$
+: Output codec configuration. You can specify either the `json` or `format` codec. By default, the `json` codec is used.
+
+  **`json.pretty`**: If `pretty` is set to true, events will be nicely formatted. The default is false.
+
+  **`json.escape_html`**: If `escape_html` is set to true, HTML symbols will be escaped in strings. The default is false.
+
+  Example configuration that uses the `json` codec with pretty printing enabled to write events to the console:
+
+  ```yaml
+  output.console:
+    codec.json:
+      pretty: true
+      escape_html: false
+  ```
+
+  **`format.string`**: Configurable format string used to create a custom formatted message.
+
+  Example configuration that uses the `format` codec to print the event's timestamp and message field to the console:
+
+  ```yaml
+  output.console:
+    codec.format:
+      string: '%{[@timestamp]} %{[message]}'
+  ```
+
+`compression` $$$kafka-compression-setting$$$
+: Select a compression codec to use. Supported codecs are `snappy`, `lz4`, and `gzip`.
+
+`compression_level` $$$kafka-compression_level-setting$$$
+: For the `gzip` codec you can choose a compression level. The level must be in the range of `1` (best speed) to `9` (best compression).
+
+  Increasing the compression level reduces the network usage but increases the CPU usage.
+
+  **Default:** `4`
+
+`keep_alive` $$$kafka-keep_alive-setting$$$
+: (string) The keep-alive period for an active network connection. If `0s`, keep-alives are disabled.
+
+  **Default:** `0s`
+
+`max_message_bytes` $$$kafka-max_message_bytes-setting$$$
+: (int) The maximum permitted size of JSON-encoded messages. Bigger messages will be dropped. This value should be equal to or less than the broker's `message.max.bytes`.
+
+  **Default:** `1000000` (bytes)
+
+`metadata` $$$kafka-metadata-setting$$$
+: Kafka metadata update settings. The metadata contains information about brokers, topics, partitions, and active leaders to use for publishing.
+
+  `refresh_frequency`
+  : Metadata refresh interval. Defaults to 10 minutes.
+
+  `full`
+  : Strategy to use when fetching metadata. When this option is `true`, the client maintains a full set of metadata for all the available topics. When set to `false`, it only refreshes the metadata for the configured topics. The default is false.
+
+  `retry.max`
+  : Total number of metadata update retries. The default is 3.
+
+  `retry.backoff`
+  : Waiting time between retries. The default is 250ms.
+
+`required_acks` $$$kafka-required_acks-setting$$$
+: The ACK reliability level required from the broker. 0=no response, 1=wait for local commit, -1=wait for all replicas to commit. The default is 1.
+
+  Note: If set to 0, no ACKs are returned by Kafka. Messages might be lost silently on error.
+
+  **Default:** `1` (wait for local commit)
+
+`timeout` $$$kafka-timeout-setting$$$
+: The number of seconds to wait for responses from the Kafka brokers before timing out.
+
+  **Default:** `30` (seconds)
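+
+Putting several of these together, the following sketch shows a throughput-oriented configuration. All values are illustrative starting points, not recommendations:
+
+```yaml
+outputs:
+  default:
+    type: kafka
+    hosts: ['mykafkahost01:9092']
+    compression: gzip
+    compression_level: 4
+    required_acks: 1    # wait for the local commit only
+    bulk_max_size: 2048
+    broker_timeout: 30
+    timeout: 30
+```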
diff --git a/reference/fleet/logstash-output.md b/reference/fleet/logstash-output.md
index 18cf11e7eb..9970188181 100644
--- a/reference/fleet/logstash-output.md
+++ b/reference/fleet/logstash-output.md
@@ -25,7 +25,7 @@ outputs:

To receive the events in {{ls}}, you also need to create a {{ls}} configuration pipeline. The {{ls}} configuration pipeline listens for incoming {{agent}} connections, processes received events, and then sends the events to {{es}}.

-Be aware that the structure of the documents sent from {{agent}} to {{ls}} must not be modified by the pipeline. We recommend that the pipeline doesn’t edit or remove the fields and their contents. Editing the structure of the documents coming from {{agent}} can prevent the {{es}} ingest pipelines associated to the integrations in use to work correctly. We cannot guarantee that the {{es}} ingest pipelines associated to the integrations using {agent} can work with missing or modified fields.
+Be aware that the structure of the documents sent from {{agent}} to {{ls}} must not be modified by the pipeline. We recommend that the pipeline doesn’t edit or remove the fields and their contents. Editing the structure of the documents coming from {{agent}} can prevent the {{es}} ingest pipelines associated with the integrations in use from working correctly. We cannot guarantee that the {{es}} ingest pipelines associated with the integrations used by {{agent}} will work with missing or modified fields.

The following {{ls}} pipeline definition example configures a pipeline that listens on port `5044` for incoming {{agent}} connections and routes received events to {{es}}.

@@ -74,13 +74,38 @@ The `logstash` output supports the following settings, grouped by category. Many of

## Commonly used settings [output-logstash-commonly-used-settings]

-| Setting | Description |
-| --- | --- |
-| $$$output-logstash-enabled-setting$$$
`enabled`
| (boolean) Enables or disables the output. If set to `false`, the output is disabled.
| -| $$$output-logstash-escape_html-setting$$$
`escape_html`
| (boolean) Configures escaping of HTML in strings. Set to `true` to enable escaping.

**Default:** `false`
| -| $$$output-logstash-hosts-setting$$$
`hosts`
| (list) The list of known {{ls}} servers to connect to. If load balancing is disabled, but multiple hosts are configured, one host is selected randomly (there is no precedence). If one host becomes unreachable, another one is selected randomly.

All entries in this list can contain a port number. If no port is specified, `5044` is used.
| -| $$$output-logstash-proxy_url-setting$$$
`proxy_url`
| (string) The URL of the SOCKS5 proxy to use when connecting to the {{ls}} servers. The value must be a URL with a scheme of `socks5://`. The protocol used to communicate to {{ls}} is not based on HTTP, so you cannot use a web proxy.

If the SOCKS5 proxy server requires client authentication, embed a username and password in the URL as shown in the example.

When using a proxy, hostnames are resolved on the proxy server instead of on the client. To change this behavior, set `proxy_use_local_resolver`.

```yaml
outputs:
default:
type: logstash
hosts: ["remote-host:5044"]
proxy_url: socks5://user:password@socks5-proxy:2233
```
| -| $$$output-logstash-proxy_use_local_resolver-setting$$$
`proxy_use_` `local_resolver`
| (boolean) Determines whether {{ls}} hostnames are resolved locally when using a proxy. If `false` and a proxy is used, name resolution occurs on the proxy server.

**Default:** `false`
|
+`enabled` $$$output-logstash-enabled-setting$$$
+: (boolean) Enables or disables the output. If set to `false`, the output is disabled.
+
+`escape_html` $$$output-logstash-escape_html-setting$$$
+: (boolean) Configures escaping of HTML in strings. Set to `true` to enable escaping.
+
+  **Default:** `false`
+
+`hosts` $$$output-logstash-hosts-setting$$$
+: (list) The list of known {{ls}} servers to connect to. If load balancing is disabled, but multiple hosts are configured, one host is selected randomly (there is no precedence). If one host becomes unreachable, another one is selected randomly.
+
+  All entries in this list can contain a port number. If no port is specified, `5044` is used.
+
+`proxy_url` $$$output-logstash-proxy_url-setting$$$
+: (string) The URL of the SOCKS5 proxy to use when connecting to the {{ls}} servers. The value must be a URL with a scheme of `socks5://`. The protocol used to communicate to {{ls}} is not based on HTTP, so you cannot use a web proxy.
+
+  If the SOCKS5 proxy server requires client authentication, embed a username and password in the URL as shown in the example.
+
+  When using a proxy, hostnames are resolved on the proxy server instead of on the client. To change this behavior, set `proxy_use_local_resolver`.
+
+  ```yaml
+  outputs:
+    default:
+      type: logstash
+      hosts: ["remote-host:5044"]
+      proxy_url: socks5://user:password@socks5-proxy:2233
+  ```
+
+`proxy_use_local_resolver` $$$output-logstash-proxy_use_local_resolver-setting$$$
+: (boolean) Determines whether {{ls}} hostnames are resolved locally when using a proxy. If `false` and a proxy is used, name resolution occurs on the proxy server.
+
+  **Default:** `false`

## Authentication settings [output-logstash-authentication-settings]
@@ -119,29 +144,114 @@ This sample configuration forwards events to the output when there are enough ev
   queue.mem.flush.timeout: 5s
 ```

-| Setting | Description |
-| --- | --- |
-| $$$output-logstash-queue.mem.events-setting$$$
`queue.mem.events`
| The number of events the queue can store. This value should be evenly divisible by the smaller of `queue.mem.flush.min_events` or `bulk_max_size` to avoid sending partial batches to the output.

**Default:** `3200 events`
| -| $$$output-logstash-queue.mem.flush.min_events-setting$$$
`queue.mem.flush.min_events`
| `flush.min_events` is a legacy parameter, and new configurations should prefer to control batch size with `bulk_max_size`. As of 8.13, there is never a performance advantage to limiting batch size with `flush.min_events` instead of `bulk_max_size`

**Default:** `1600 events`
| -| $$$output-logstash-queue.mem.flush.timeout-setting$$$
`queue.mem.flush.timeout`
| (int) The maximum wait time for `queue.mem.flush.min_events` to be fulfilled. If set to 0s, events are available to the output immediately.

**Default:** `10s`
|
+`queue.mem.events` $$$output-logstash-queue.mem.events-setting$$$
+: The number of events the queue can store. This value should be evenly divisible by the smaller of `queue.mem.flush.min_events` or `bulk_max_size` to avoid sending partial batches to the output.
+
+  **Default:** `3200 events`
+
+`queue.mem.flush.min_events` $$$output-logstash-queue.mem.flush.min_events-setting$$$
+: `flush.min_events` is a legacy parameter; new configurations should prefer to control batch size with `bulk_max_size`. As of 8.13, there is never a performance advantage to limiting batch size with `flush.min_events` instead of `bulk_max_size`.
+
+  **Default:** `1600 events`
+
+`queue.mem.flush.timeout` $$$output-logstash-queue.mem.flush.timeout-setting$$$
+: (int) The maximum wait time for `queue.mem.flush.min_events` to be fulfilled. If set to 0s, events are available to the output immediately.
+
+  **Default:** `10s`

## Performance tuning settings [output-logstash-performance-tuning-settings]

Settings that may affect performance.

-| Setting | Description |
-| --- | --- |
-| $$$output-logstash-backoff.init-setting$$$
`backoff.init`
| (string) The number of seconds to wait before trying to reconnect to {{ls}} after a network error. After waiting `backoff.init` seconds, {{agent}} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset.

**Default:** `1s`
| -| $$$output-logstash-backoff.max-setting$$$
`backoff.max`
| (string) The maximum number of seconds to wait before attempting to connect to {{es}} after a network error.

**Default:** `60s`
| -| $$$output-logstash-bulk_max_size-setting$$$
`bulk_max_size`
| (int) The maximum number of events to bulk in a single {{ls}} request.

Events can be collected into batches. {{agent}} will split batches larger than `bulk_max_size` into multiple batches.

Specifying a larger batch size can improve performance by lowering the overhead of sending events. However big batch sizes can also increase processing times, which might result in API errors, killed connections, timed-out publishing requests, and, ultimately, lower throughput.

Set this value to `0` to turn off the splitting of batches. When splitting is turned off, the queue determines the number of events to be contained in a batch.

**Default:** `2048`
| -| $$$output-logstash-compression_level-setting$$$
`compression_level`
| (int) The gzip compression level. Set this value to `0` to disable compression. The compression level must be in the range of `1` (best speed) to `9` (best compression).

Increasing the compression level reduces network usage but increases CPU usage.

**Default:** `3`
| -| $$$output-logstash-loadbalance-setting$$$
`loadbalance`
| If `true` and multiple {{ls}} hosts are configured, the output plugin load balances published events onto all {{ls}} hosts. If `false`, the output plugin sends all events to one host (determined at random) and switches to another host if the selected one becomes unresponsive.

With `loadbalance` enabled:

* {{agent}} reads batches of events and sends each batch to one {{ls}} worker dynamically, based on a work-queue shared between the outputs.
* If a connection drops, {{agent}} takes the disconnected {{ls}} worker out of its pool.
* {{agent}} tries to reconnect. If it succeeds, it re-adds the {{ls}} worker to the pool.
* If one of the {{ls}} nodes is slow but "healthy", it sends a keep-alive signal until the full batch of data is processed. This prevents {{agent}} from sending further data until it receives an acknowledgement signal back from {{ls}}. {{agent}} keeps all events in memory until after that acknowledgement occurs.

Without `loadbalance` enabled:

* {{agent}} picks a random {{ls}} host and sends batches of events to it. Due to the random algorithm, the load on the {{ls}} nodes should be roughly equal.
* In case of any errors, {{agent}} picks another {{ls}} node, also at random. If a connection to a host fails, the host is retried only if there are errors on the new connection.

**Default:** `false`

Example:

```yaml
outputs:
default:
type: logstash
hosts: ["localhost:5044", "localhost:5045"]
loadbalance: true
```
| -| $$$output-logstash-max_retries-setting$$$
`max_retries`
| (int) The number of times to retry publishing an event after a publishing failure. After the specified number of retries, the events are typically dropped.

Set `max_retries` to a value less than 0 to retry until all events are published.

**Default:** `3`
| -| $$$output-logstash-pipelining-setting$$$
`pipelining`
| (int) The number of batches to send asynchronously to {{ls}} while waiting for an ACK from {{ls}}. The output becomes blocking after the specified number of batches are written. Specify `0` to turn off pipelining.

**Default:** `2`
| -| $$$output-logstash-slow_start-setting$$$
`slow_start`
| (boolean) If `true`, only a subset of events in a batch of events is transferred per transaction. The number of events to be sent increases up to `bulk_max_size` if no error is encountered. On error, the number of events per transaction is reduced again.

**Default:** `false`
| -| $$$output-logstash-timeout-setting$$$
`timeout`
| (string) The number of seconds to wait for responses from the {{ls}} server before timing out.

**Default:** `30s`
| -| $$$output-logstash-ttl-setting$$$
`ttl`
| (string) Time to live for a connection to {{ls}} after which the connection will be reestablished. This setting is useful when {{ls}} hosts represent load balancers. Because connections to {{ls}} hosts are sticky, operating behind load balancers can lead to uneven load distribution across instances. Specify a TTL on the connection to achieve equal connection distribution across instances.

**Default:** `0` (turns off the feature)

::::{note}
The `ttl` option is not yet supported on an asynchronous {{ls}} client (one with the `pipelining` option set).
::::

| -| $$$output-logstash-worker-setting$$$
`worker`
| (int) The number of workers per configured host publishing events. Example: If you have two hosts and three workers, in total six workers are started (three for each host).

**Default:** `1`
|
+
+`backoff.init` $$$output-logstash-backoff.init-setting$$$
+: (string) The number of seconds to wait before trying to reconnect to {{ls}} after a network error. After waiting `backoff.init` seconds, {{agent}} tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to `backoff.max`. After a successful connection, the backoff timer is reset.
+
+  **Default:** `1s`
+
+`backoff.max` $$$output-logstash-backoff.max-setting$$$
+: (string) The maximum number of seconds to wait before attempting to connect to {{ls}} after a network error.
+
+  **Default:** `60s`
+
+`bulk_max_size` $$$output-logstash-bulk_max_size-setting$$$
+: (int) The maximum number of events to bulk in a single {{ls}} request.
+
+  Events can be collected into batches. {{agent}} will split batches larger than `bulk_max_size` into multiple batches.
+
+  Specifying a larger batch size can improve performance by lowering the overhead of sending events. However, big batch sizes can also increase processing times, which might result in API errors, killed connections, timed-out publishing requests, and, ultimately, lower throughput.
+
+  Set this value to `0` to turn off the splitting of batches. When splitting is turned off, the queue determines the number of events to be contained in a batch.
+
+  **Default:** `2048`
+
+`compression_level` $$$output-logstash-compression_level-setting$$$
+: (int) The gzip compression level. Set this value to `0` to disable compression. The compression level must be in the range of `1` (best speed) to `9` (best compression).
+
+  Increasing the compression level reduces network usage but increases CPU usage.
+
+  **Default:** `3`
+
+`loadbalance` $$$output-logstash-loadbalance-setting$$$
+: If `true` and multiple {{ls}} hosts are configured, the output plugin load balances published events onto all {{ls}} hosts. If `false`, the output plugin sends all events to one host (determined at random) and switches to another host if the selected one becomes unresponsive.
+
+  With `loadbalance` enabled:
+
+  * {{agent}} reads batches of events and sends each batch to one {{ls}} worker dynamically, based on a work-queue shared between the outputs.
+  * If a connection drops, {{agent}} takes the disconnected {{ls}} worker out of its pool.
+  * {{agent}} tries to reconnect. If it succeeds, it re-adds the {{ls}} worker to the pool.
+  * If one of the {{ls}} nodes is slow but "healthy", it sends a keep-alive signal until the full batch of data is processed. This prevents {{agent}} from sending further data until it receives an acknowledgement signal back from {{ls}}. {{agent}} keeps all events in memory until after that acknowledgement occurs.
+
+  Without `loadbalance` enabled:
+
+  * {{agent}} picks a random {{ls}} host and sends batches of events to it. Due to the random algorithm, the load on the {{ls}} nodes should be roughly equal.
+  * In case of any errors, {{agent}} picks another {{ls}} node, also at random. If a connection to a host fails, the host is retried only if there are errors on the new connection.
+
+  **Default:** `false`
+
+  Example:
+
+  ```yaml
+  outputs:
+    default:
+      type: logstash
+      hosts: ["localhost:5044", "localhost:5045"]
+      loadbalance: true
+  ```
+
+`max_retries` $$$output-logstash-max_retries-setting$$$
+: (int) The number of times to retry publishing an event after a publishing failure. After the specified number of retries, the events are typically dropped.
+
+  Set `max_retries` to a value less than 0 to retry until all events are published.
+
+  **Default:** `3`
+
+`pipelining` $$$output-logstash-pipelining-setting$$$
+: (int) The number of batches to send asynchronously to {{ls}} while waiting for an ACK from {{ls}}. The output becomes blocking after the specified number of batches are written. Specify `0` to turn off pipelining.
+
+  **Default:** `2`
+
+`slow_start` $$$output-logstash-slow_start-setting$$$
+: (boolean) If `true`, only a subset of events in a batch of events is transferred per transaction. The number of events to be sent increases up to `bulk_max_size` if no error is encountered. On error, the number of events per transaction is reduced again.
+
+  **Default:** `false`
+
+`timeout` $$$output-logstash-timeout-setting$$$
+: (string) The number of seconds to wait for responses from the {{ls}} server before timing out.
+
+  **Default:** `30s`
+
+`ttl` $$$output-logstash-ttl-setting$$$
+: (string) Time to live for a connection to {{ls}}, after which the connection will be reestablished. This setting is useful when {{ls}} hosts represent load balancers. Because connections to {{ls}} hosts are sticky, operating behind load balancers can lead to uneven load distribution across instances. Specify a TTL on the connection to achieve equal connection distribution across instances.
+
+  **Default:** `0` (turns off the feature)
+
+  ::::{note}
+  The `ttl` option is not yet supported on an asynchronous {{ls}} client (one with the `pipelining` option set).
+  ::::
+
+`worker` $$$output-logstash-worker-setting$$$
+: (int) The number of workers per configured host publishing events. Example: If you have two hosts and three workers, in total six workers are started (three for each host).
+
+  **Default:** `1`
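+
+For example, a sketch that load balances across two {{ls}} hosts and raises the per-host worker count. Hosts and values are illustrative starting points:
+
+```yaml
+outputs:
+  default:
+    type: logstash
+    hosts: ["logstash01:5044", "logstash02:5044"]
+    loadbalance: true
+    worker: 2           # 2 workers per host, 4 in total
+    bulk_max_size: 2048
+    timeout: 30s
+```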
diff --git a/solutions/observability/apm/data-streams.md b/solutions/observability/apm/data-streams.md
index 574878443d..f2c7c5f651 100644
--- a/solutions/observability/apm/data-streams.md
+++ b/solutions/observability/apm/data-streams.md
@@ -43,22 +43,20 @@ Metrics

 * APM service summary metrics: `metrics-apm.service_summary.-`
 * Application metrics: `metrics-apm.app.-`

-  Application metrics include the instrumented service’s name—​defined in each {{apm-agent}}'s configuration—​in the data stream name. Service names therefore must follow certain index naming rules.
+    Application metrics include the instrumented service’s name—​defined in each {{apm-agent}}'s configuration—​in the data stream name. Service names therefore must follow certain index naming rules.

-  ::::{dropdown} Service name rules
-  * Service names are case-insensitive and must be unique. For example, you cannot have a service named `Foo` and another named `foo`.
-  * Special characters will be removed from service names and replaced with underscores (`_`). Special characters include:
+    ::::{dropdown} Service name rules
+    * Service names are case-insensitive and must be unique. For example, you cannot have a service named `Foo` and another named `foo`.
+    * Special characters will be removed from service names and replaced with underscores (`_`). Special characters include:
+    ```text
+    '\\', '/', '*', '?', '"', '<', '>', '|', ' ', ',', '#', ':', '-'
+    ```
+    ::::

-  ```text
-  '\\', '/', '*', '?', '"', '<', '>', '|', ' ', ',', '#', ':', '-'
-  ```
+    ::::{important}
+    Additional storage efficiencies provided by [Synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) are available to users with an [appropriate license](https://www.elastic.co/subscriptions).
-
-  ::::
-
-  ::::{important}
-  Additional storage efficiencies provided by [Synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md) are available to users with an [appropriate license](https://www.elastic.co/subscriptions).
-
-  ::::
+    ::::

 Logs
 : Logs include application error events and application logs. Logs are stored in the following data streams:
diff --git a/solutions/observability/synthetics/configure-lightweight-monitors.md b/solutions/observability/synthetics/configure-lightweight-monitors.md
index 305dfda90f..db2de243ae 100644
--- a/solutions/observability/synthetics/configure-lightweight-monitors.md
+++ b/solutions/observability/synthetics/configure-lightweight-monitors.md
@@ -454,8 +454,7 @@ $$$monitor-http-response$$$

 : Controls the indexing of the HTTP response body contents to the `http.response.body.contents` field.

 **`include_body`** (`"on_error"` | `"never"` | `"always"`)
-  : Set `response.include_body` to one of the options listed below.
-
+  : Set `response.include_body` to one of the options listed below:
 * `on_error`: Include the body if an error is encountered during the check. This is the default.
 * `never`: Never include the body.
 * `always`: Always include the body with checks.
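+
+For example, a lightweight monitor sketch that stores the response body only when a check fails. The monitor fields are illustrative:
+
+```yaml
+- type: http
+  id: example-http-body
+  name: Example HTTP monitor
+  urls: ['https://example.com']
+  schedule: '@every 5m'
+  response.include_body: 'on_error'
+```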