Allow users to specify dataset and namespace for rerouting data collected by kubernetes.container_logs datastream

As a follow-up to https://github.com/elastic/integrations/issues/5988, we want to avoid adding default routing rules to our integrations. Instead, we would like to let users specify labels on Kubernetes containers that would be used to reroute the traffic.

> In contrast to previous considerations, we’re not creating a data stream per service name as this would lead to a lot of overhead for customers with thousands of services. 

The reroute processor was introduced in https://github.com/elastic/elasticsearch/pull/76511 and has been available since 8.8.0.

Currently, you can add the reroute processor manually to an ingest pipeline with an `if` condition to reroute traffic to a destination dataset and namespace.
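As a rough illustration of that manual step, the sketch below builds the body of an ingest pipeline containing a single conditional `reroute` processor. The helper name `build_reroute_pipeline`, the field path `kubernetes.labels.app`, and the example values (`nginx`, `nginx.access`, `production`) are assumptions made for illustration only; the actual label path depends on how the labels are indexed in your documents.

```python
import json


def build_reroute_pipeline(label_key, label_value, dataset, namespace):
    """Build an ingest pipeline body with one conditional reroute processor.

    Hypothetical helper: label_key is assumed to be a simple identifier
    (no dots or slashes) so it can be used in Painless dotted access.
    """
    # Assumed field path; null-safe access avoids errors when labels are absent.
    condition = f"ctx.kubernetes?.labels?.{label_key} == '{label_value}'"
    return {
        "description": "Reroute container logs based on a Kubernetes label",
        "processors": [
            {
                "reroute": {
                    "if": condition,
                    "dataset": dataset,
                    "namespace": namespace,
                }
            }
        ],
    }


# Example values are illustrative only.
pipeline = build_reroute_pipeline("app", "nginx", "nginx.access", "production")
print(json.dumps(pipeline, indent=2))
```

The resulting JSON is what a user would currently have to write and maintain by hand for every routing rule.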

This is currently quite a manual process.

In order to improve the experience for the end user we could:
1. Define some standard Kubernetes labels (for example `elastic.co/dataset` and `elastic.co/namespace`) that, if present, would be used to reroute the traffic automatically, without requiring the user to define a custom pipeline. The values of those labels would end up in the fields `data_stream.dataset` and `data_stream.namespace`, and a default routing rule would use them to reroute the traffic. Since the reroute processor has to be added to an ingest pipeline, integrations that use those Kubernetes labels would need an ingest pipeline that checks for the presence of those container labels and reroutes if they are present. We will have to evaluate the performance hit of having these extra steps always running. Since the benefits are quite significant, it may be worth it.
2. We should also extract the fields `service.name` and `service.version` from the well-known Kubernetes labels `app.kubernetes.io/name` and `app.kubernetes.io/version`. If those labels are not provided, we should infer the service name from the field `container.name` and leave out the `service.version` field.
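The label handling proposed in the two points above could be sketched as follows. This is plain Python that only models the intended mapping, not an ingest pipeline; the function name `derive_fields` and the flat label dictionary are assumptions, and the sketch treats the name and version labels independently, which the proposal does not spell out.

```python
DATASET_LABEL = "elastic.co/dataset"
NAMESPACE_LABEL = "elastic.co/namespace"
NAME_LABEL = "app.kubernetes.io/name"
VERSION_LABEL = "app.kubernetes.io/version"


def derive_fields(labels, container_name):
    """Map Kubernetes labels to routing and service fields (hypothetical helper)."""
    fields = {}
    # Routing fields are set only when the corresponding label is present.
    if DATASET_LABEL in labels:
        fields["data_stream.dataset"] = labels[DATASET_LABEL]
    if NAMESPACE_LABEL in labels:
        fields["data_stream.namespace"] = labels[NAMESPACE_LABEL]
    # service.name falls back to the container name when the label is missing.
    fields["service.name"] = labels.get(NAME_LABEL, container_name)
    # service.version is left out entirely when the label is missing.
    if VERSION_LABEL in labels:
        fields["service.version"] = labels[VERSION_LABEL]
    return fields


# Illustrative label values only.
labeled = derive_fields(
    {
        "elastic.co/dataset": "checkout",
        "elastic.co/namespace": "prod",
        "app.kubernetes.io/name": "checkout-api",
        "app.kubernetes.io/version": "1.4.2",
    },
    container_name="checkout-container",
)
unlabeled = derive_fields({}, container_name="sidecar")
print(labeled)
print(unlabeled)
```

With no labels set, only `service.name` is produced (from the container name), so no rerouting would take place for that container.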
