Description
We're making progress on event routing via the `data_stream_router` processor.
I think now is the time to discuss remaining open questions on how integrations will actually use event routing rules and to start prototyping.
Open questions:
- Should there be one pipeline per input or a single global pipeline?
  - I'm unsure if a global pipeline is feasible.
    - Routing may be based on input-specific criteria like the log file name or k8s metadata.
    - However, when there's one pipeline per input, each integration would need to maintain input-specific routing rules.
- What would event routing look like on k8s, for example?
  - Will it be possible to do some kind of hint-based routing? For example, by annotating a pod with `co.elastic.integration/mysql`.
    - Or will the routing be based on things like the container name?
    - The routing conditions should be as lightweight as possible to minimize the impact on the log ingestion rate. Introspecting each log, for example with a grok processor, without a cheaper pre-flight check is likely to be prohibitively expensive.
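
To make the k8s questions more concrete, here is a rough Python sketch of the evaluation order a routing pipeline could follow: annotation hint first, container name second, and content inspection only behind a cheap substring pre-check. This is not the `data_stream_router` processor's behavior or configuration. The event field layout (`kubernetes.annotations`, `kubernetes.container.name`), the container-name map, the MySQL slow log pattern, and the returned dataset names are all assumptions made up for illustration.

```python
import re
from typing import Optional

# Annotation prefix from the example above; treating the suffix as the
# routing target is an assumption.
HINT_PREFIX = "co.elastic.integration/"

# Hypothetical container-name rules.
CONTAINER_RULES = {"nginx": "nginx"}

# Hypothetical content rules: (cheap substring pre-check, regex, target).
CONTENT_RULES = [
    ("# Time:", re.compile(r"^# Time: \d{4}-\d{2}-\d{2}T"), "mysql.slowlog"),
]


def route(event: dict) -> Optional[str]:
    """Return a routing target, or None to leave the event unrouted."""
    k8s = event.get("kubernetes", {})

    # 1. Hint-based routing: only looks at metadata, never at the log line.
    for key in k8s.get("annotations", {}):
        if key.startswith(HINT_PREFIX):
            return key[len(HINT_PREFIX):]

    # 2. Container-name routing: a single dictionary lookup.
    name = k8s.get("container", {}).get("name")
    if name in CONTAINER_RULES:
        return CONTAINER_RULES[name]

    # 3. Content-based routing: run the regex only if the cheap substring
    #    check passes, so most events never reach the expensive step.
    message = event.get("message", "")
    for needle, pattern, target in CONTENT_RULES:
        if needle in message and pattern.search(message):
            return target

    return None
```

With this ordering, an annotated pod never pays for log introspection, and unannotated pods only pay for the regex when the substring check matches.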
To answer these questions and to validate the overall approach, we should start building a prototype. We can do that even though the `data_stream_router` processor hasn't been merged yet. Just trying to define a concrete routing pipeline with a couple of example integrations should help us think through the remaining open questions.
We can also start measuring the performance impact of the routing pipeline conditions and the expected log ingestion rate. To do that, we should define Rally tracks that simulate a typical event routing scenario.
cc @ruflin