Commit f84179f: "remove redundancies"
1 parent bae037d

1 file changed: +17 −63 lines

develop-docs/sdk/miscellaneous/telemetry-buffer.mdx
@@ -34,7 +34,7 @@ Introduce a `Buffer` layer between the `Client` and the `Transport`. This `Buffe
 ┌────────────────────────────────────────────────────────────────────────────┐
 │                                   Buffer                                   │
-│  Add(item) · Flush(timeout) · Close(timeout)
+│  Add(item) · Flush(timeout) · Close(timeout)                               │
 │                                                                            │
 │  ┌──────────────────────┐  ┌──────────────────────┐  ┌──────────────────┐  │
 │  │     Error Store      │  │    Check-in Store    │  │    Log Store     │  │
@@ -71,8 +71,8 @@ Introduce a `Buffer` layer between the `Client` and the `Transport`. This `Buffe
 ### Priorities
 - CRITICAL: Error, Feedback.
 - HIGH: Session, CheckIn.
-- MEDIUM: Log, ClientReport, Span.
-- LOW: Transaction, Profile, ProfileChunk.
+- MEDIUM: Transaction, ClientReport, Span.
+- LOW: Log, Profile, ProfileChunk.
 - LOWEST: Replay.

 Configurable via weights.
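As a rough illustration of weight-based scheduling, the sketch below allocates a per-pass drain quota to each priority in proportion to its weight, using the default weights from the Scheduler Options (CRITICAL=5 through LOWEST=1). The `drainQuota` helper and type names are hypothetical, not part of the sentry-go API.

```go
package main

import "fmt"

// Priority levels for telemetry categories.
type Priority int

const (
	Lowest Priority = iota + 1
	Low
	Medium
	High
	Critical
)

// weights mirrors the Scheduler Options defaults: CRITICAL=5 ... LOWEST=1.
var weights = map[Priority]int{
	Critical: 5, High: 4, Medium: 3, Low: 2, Lowest: 1,
}

// drainQuota returns how many items a store at priority p may send in one
// scheduler pass of `budget` total items, proportional to its weight
// (hypothetical helper for illustration only).
func drainQuota(p Priority, budget int) int {
	total := 0
	for _, w := range weights {
		total += w
	}
	return budget * weights[p] / total
}

func main() {
	// With a budget of 15 items per pass, CRITICAL gets 5x the share of LOWEST.
	fmt.Println(drainQuota(Critical, 15)) // 5
	fmt.Println(drainQuota(Lowest, 15))   // 1
}
```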
@@ -90,8 +90,8 @@ Each telemetry category maintains a store interface; a fixed-size circular array
 - **Batching configuration**:
   - `batchSize`: Number of items to combine into a single batch (1 for errors, transactions, and monitors; 100 for logs).
   - `timeout`: Maximum time to wait before sending a partial batch (5 seconds for logs).
-- **Bucketed Storage Support**: The storage interface should satisfy both bucketed and single-item implementations, allowing sending spans per trace id.
-- **Observability**: Each store tracks offered, accepted, and dropped item counts for client reports.
+- **Bucketed Storage Support**: The storage interface should satisfy both bucketed and single-item implementations, allowing spans to be sent per trace id (required for Span First).
+- **Observability**: Each store tracks dropped item counts for client reports.

 ##### Single-item ring buffer (default)
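The default single-item store is a fixed-size circular array. A minimal Go sketch of such a ring buffer with `drop_oldest` overflow follows; the names are illustrative, not the sentry-go implementation.

```go
package main

import "fmt"

// RingBuffer is a minimal fixed-capacity circular buffer illustrating the
// default single-item store (illustrative sketch, not the sentry-go code).
type RingBuffer[T any] struct {
	items []T
	head  int // index of the oldest item
	size  int
}

func NewRingBuffer[T any](capacity int) *RingBuffer[T] {
	return &RingBuffer[T]{items: make([]T, capacity)}
}

// Add stores an item; when full it overwrites the oldest (drop_oldest)
// and reports whether an existing item was dropped.
func (r *RingBuffer[T]) Add(item T) (dropped bool) {
	if r.size == len(r.items) {
		r.items[r.head] = item
		r.head = (r.head + 1) % len(r.items)
		return true
	}
	r.items[(r.head+r.size)%len(r.items)] = item
	r.size++
	return false
}

// Drain removes and returns up to n of the oldest items in FIFO order.
func (r *RingBuffer[T]) Drain(n int) []T {
	if n > r.size {
		n = r.size
	}
	out := make([]T, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, r.items[r.head])
		r.head = (r.head + 1) % len(r.items)
		r.size--
	}
	return out
}

func main() {
	rb := NewRingBuffer[int](3)
	rb.Add(1)
	rb.Add(2)
	rb.Add(3)
	dropped := rb.Add(4)              // buffer full: the oldest item (1) is overwritten
	fmt.Println(dropped, rb.Drain(3)) // true [2 3 4]
}
```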

@@ -104,15 +104,20 @@ Each telemetry category maintains a store interface; a fixed-size circular array

 ##### Bucketed-by-trace storage (spans)

-- Purpose: keep spans from the same trace together and flush them as a unit to avoid partial-trace delivery under pressure.
-- Grouping: a new bucket is created per trace id; a map (`traceIndex`) provides O(1) lookup. Items without a trace id are accepted but grouped without an index.
-- Capacity model: two limits are enforced: overall `itemCapacity` and a derived `bucketCapacity ~= capacity/10` (minimum 10). Additionally, a `perBucketItemLimit` (100) prevents a single trace from monopolizing storage.
-- Readiness: when total buffered items reach `batchSize` or `timeout` elapses, the entire oldest bucket is flushed to preserve trace coherence.
-- Overflow behavior:
-  - `drop_oldest`: evict the oldest bucket (dropping all its items) and invoke the dropped callback for each (`buffer_full_drop_oldest_bucket`). Preferred for spans to drop an entire trace.
+- **Purpose**: keep spans from the same trace together and flush them as a unit to avoid partial-trace delivery under pressure. This addresses a gap in standard implementations, where dropping individual spans can create incomplete traces.
+- **Grouping**: a new bucket is created per trace id; a map (`traceIndex`) provides O(1) lookup.
+- **Capacity model**: two limits are enforced: an overall `itemCapacity` and a derived `bucketCapacity ~= capacity/10` (minimum 10).
+- **Readiness**: when total buffered items reach `batchSize` or `timeout` elapses, the entire oldest bucket is flushed to preserve trace coherence.
+- **Overflow behavior**:
+  - `drop_oldest`: evict the oldest bucket (dropping all its items). Preferred for spans, since it drops an entire trace at once.
   - `drop_newest`: reject the incoming item (`buffer_full_drop_newest`).
 - Lifecycle: empty buckets are removed and their trace ids are purged from the index; `MarkFlushed()` updates `lastFlushTime`.

+##### Trace Consistency Trade-offs
+
+A small subset of cases can still produce partial traces: either an old trace bucket was dropped and a new span for the same trace arrived later, or an incoming span of a buffered trace was rejected.
+In most cases the preferred overflow behavior is `drop_oldest`, since it produces the fewest incomplete traces across these two scenarios.
+
 Stores are mapped to [DataCategories](https://github.com/getsentry/relay/blob/master/relay-base-schema/src/data_category.rs), which determine their scheduling priority and rate limits.
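The bucketed-by-trace behavior (per-trace buckets, a `traceIndex` map for O(1) lookup, and whole-bucket eviction under `drop_oldest`) could be sketched as follows. This is a simplified illustration with hypothetical names, not the sentry-go implementation.

```go
package main

import "fmt"

// bucket groups spans that share a trace id.
type bucket struct {
	traceID string
	items   []string
}

// BucketedStore is an illustrative sketch of bucketed-by-trace storage:
// buckets keep insertion order; traceIndex gives O(1) lookup by trace id.
type BucketedStore struct {
	buckets      []*bucket
	traceIndex   map[string]*bucket
	itemCapacity int
	itemCount    int
}

func NewBucketedStore(itemCapacity int) *BucketedStore {
	return &BucketedStore{traceIndex: map[string]*bucket{}, itemCapacity: itemCapacity}
}

// Add places a span in its trace's bucket, evicting the oldest whole
// bucket when itemCapacity would be exceeded (the drop_oldest policy).
// It returns the trace id of the evicted bucket, if any.
func (s *BucketedStore) Add(traceID, span string) (droppedTrace string) {
	if s.itemCount == s.itemCapacity {
		oldest := s.buckets[0]
		s.buckets = s.buckets[1:]
		delete(s.traceIndex, oldest.traceID)
		s.itemCount -= len(oldest.items)
		droppedTrace = oldest.traceID
	}
	b, ok := s.traceIndex[traceID]
	if !ok {
		b = &bucket{traceID: traceID}
		s.buckets = append(s.buckets, b)
		s.traceIndex[traceID] = b
	}
	b.items = append(b.items, span)
	s.itemCount++
	return droppedTrace
}

// FlushOldest removes and returns the oldest bucket as one unit, so a
// trace's spans are always delivered together.
func (s *BucketedStore) FlushOldest() []string {
	if len(s.buckets) == 0 {
		return nil
	}
	b := s.buckets[0]
	s.buckets = s.buckets[1:]
	delete(s.traceIndex, b.traceID)
	s.itemCount -= len(b.items)
	return b.items
}

func main() {
	st := NewBucketedStore(3)
	st.Add("trace-a", "span-a1")
	st.Add("trace-a", "span-a2")
	st.Add("trace-b", "span-b1")
	dropped := st.Add("trace-b", "span-b2") // over capacity: all of trace-a is evicted
	fmt.Println(dropped, st.FlushOldest())  // trace-a [span-b1 span-b2]
}
```

Note how eviction removes an entire trace rather than a single span, which is the trade-off the design prefers under pressure.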
#### Scheduler
@@ -135,18 +140,14 @@ The transport layer handles HTTP communication with Sentry's ingestion endpoints
 ### Configuration

 #### Buffer Options
-- **Capacity**: 100 items for errors, logs, and monitors; 1000 for transactions.
+- **Capacity**: 100 items for errors and check-ins, 10*BATCH_SIZE for logs, 1000 for transactions.
 - **Overflow policy**: `drop_oldest`.
 - **Batch size**: 1 for errors and monitors (immediate send), 100 for logs.
 - **Batch timeout**: 5 seconds for logs.

 #### Scheduler Options
 - **Priority weights**: CRITICAL=5, HIGH=4, MEDIUM=3, LOW=2, LOWEST=1.

-#### Transport Options
-- **Queue size**: 1000 envelopes for AsyncTransport.
-- **HTTP timeout**: 30 seconds.
-
 ### Implementation Example (Go)

 The `sentry-go` SDK provides a reference implementation of this architecture:
@@ -177,17 +178,6 @@ type Storage[T any] interface {
 	// Category/Priority
 	Category() ratelimit.Category
 	Priority() ratelimit.Priority
-
-	// Metrics
-	OfferedCount() int64
-	DroppedCount() int64
-	AcceptedCount() int64
-	DropRate() float64
-	GetMetrics() BufferMetrics
-
-	// Configuration
-	SetDroppedCallback(callback func(item T, reason string))
-	Clear()
 }
@@ -298,39 +288,3 @@ func (b *Buffer) Flush(timeout time.Time) {
 	transport.flush(timeout)
 }
 ```
-
-### Batching Policies
-
-Different telemetry types use batching strategies optimized for their characteristics:
-
-- **Errors**: Single-item envelopes for immediate delivery (latency-sensitive).
-- **Monitors**: Single-item envelopes to maintain check-in timing accuracy.
-- **Logs**: Batches of up to 100 items or a 5-second timeout, whichever comes first (volume-optimized).
-- **Transactions**: Single-item envelopes (trace-aware batching is a future enhancement).
-
-#### Batch Processing Details
-
-For high-volume telemetry like logs, the buffer uses time- and count-based batching:
-
-**Timeout-based flushing**:
-- When the first item enters an empty log buffer, a timeout starts (5 seconds).
-- When the timeout expires, all buffered log items are sent regardless of batch size.
-- The timeout resets after each flush.
-
-**Count-based flushing**:
-- When the number of buffered log items reaches the batch size (100), they are sent immediately.
-
-**Ordering and lifecycle**:
-- Filtering and sampling happen before buffering to avoid wasting buffer space.
-- Rate limiting is checked before dispatch; if limited, items remain buffered.
-- Items are batched into a single envelope with multiple entries of the same type (logs).
-
-### Observability
-
-The buffer system exposes metrics to help you understand telemetry flow and identify issues:
-
-- **Per-category counters**: Items offered, sent successfully, and dropped.
-- **Drop reasons**: Distinguish between buffer overflow and rate limit drops.
-- **Buffer utilization**: Current size vs. capacity for each category.
-
-These metrics enable dashboards that visualize why events are being dropped, helping you tune buffer sizes or identify rate limiting issues.
