Skip to content

Commit a2274b3

Browse files
committed
Add additional proposal
1 parent 930f4a0 commit a2274b3

File tree

1 file changed

+55
-7
lines changed

1 file changed

+55
-7
lines changed

docs/design/4940-reliable-loki-pipelines.md

Lines changed: 55 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66

77
## Abstract
88

9-
Alloy's Loki pipelines currently use channels, which limits throughput due to head-of-line blocking and can causes silent log drops during config reloads or shutdowns.
9+
Alloy's Loki pipelines currently use channels, which limits throughput due to head-of-line blocking and can cause silent log drops during config reloads or shutdowns.
1010

1111
This proposal introduces a function-based pipeline using a `Consumer` interface, replacing the channel-based design. Source components will call `Consume()` directly on downstream components, enabling parallel processing and returning errors that sources can use for retry logic or proper HTTP error responses.
1212

@@ -16,7 +16,7 @@ Loki pipelines in Alloy are built using (unbuffered) channels, a design inherite
1616

1717
This comes with two big limitations:
1818
1. Throughput of each component is limited due to head-of-line blocking, where pushing to the next channel may not be possible in the presence of a slow component. An example of this is usage of [secretfilter](https://github.com/grafana/alloy/issues/3694).
19-
2. Because there is no way to signal back to the source, components can silently drop logs during config reload or shutdown and there is no to detect that.
19+
2. Because there is no way to signal back to the source, components can silently drop logs during config reload or shutdown and there is no way to detect that.
2020

2121
Consider the following simplified config:
2222
```
@@ -60,6 +60,58 @@ Solving the issues listed above.
6060

6161
A batch of entries should be considered successfully consumed when they are queued up for sending. We could try to extend this to when it was successfully sent over the wire, but that could be considered an improvement at a later stage.
6262

63+
Pros:
64+
* Increase throughput of log pipelines.
65+
* A way to signal back to the source
66+
Cons:
67+
* We need to rewrite all loki components with this new pattern and make them safe to call in parallel.
68+
* We go from an iterator like pipeline to passing slices around. Every component would have to iterate over this slice and we need to make sure it's safe to mutate because of fanout.
69+
70+
## Proposal 2: Appendable
71+
72+
The prometheus pipeline uses [Appendable](https://github.com/prometheus/prometheus/blob/main/storage/interface.go#L62).
73+
Appendable only has one method `Appender` that will return an implementation of [Appender](https://github.com/prometheus/prometheus/blob/main/storage/interface.go#L270).
74+
75+
We could adopt this pattern for loki pipelines by having:
76+
```go
77+
type Appendable interface {
78+
Appender(ctx context.Context) Appender
79+
}
80+
81+
type Appender interface {
82+
Append(entry Entry) error
83+
Commit() error
84+
Rollback() error
85+
}
86+
```
87+
88+
This approach would, like Proposal 1, solve the issues listed above with a function-based pipeline, but the pipeline would still be iterator-like (one entry at a time).
89+
90+
### How it works
91+
92+
Sources would obtain an `Appender` from the next component in the pipeline, then:
93+
1. Call `Append(entry)` for each entry in a batch
94+
2. Call `Commit()` to finalize the batch, or `Rollback()` to discard it
95+
3. Handle errors at any step
96+
97+
Processing components would:
98+
Implement `Appendable` to return an `Appender` that runs processing for each entry and forwards it to next component.
99+
100+
Sink components would:
101+
Implement `Appendable` to return an `Appender` that buffers entries until either `Commit` or `Rollback` is called.
102+
103+
Pros:
104+
* Increase throughput of log pipelines.
105+
* A way to signal back to the source
106+
* Iterator-like pipeline - one entry at a time
107+
* Transaction semantics provide better error handling
108+
* Aligns with Prometheus patterns already in use
109+
Cons:
110+
* We need to rewrite all loki components with this new pattern and make them safe to call in parallel.
111+
* More complex API
112+
113+
## Considerations for implementation
114+
63115
### Handling fan-out failures
64116

65117
Because a pipeline can "fan-out" to multiple paths, it can also partially succeed. We need to determine how to handle this.
@@ -102,8 +154,4 @@ The following components need to be updated with this new interface and we need
102154
- `loki.write`
103155
- `loki.echo`
104156

105-
Pros:
106-
* Increase throughput of log pipelines.
107-
* A way to signal back to the source
108-
Cons:
109-
* We need to rewrite all loki components with this new pattern and make them safe to call in parallel.
157+

0 commit comments

Comments
 (0)