Commit 3f8c511

feat(relay): Add document describing span and transaction quotas (#11563)
1 parent fb8e875 commit 3f8c511

1 file changed: 227 additions & 0 deletions
@@ -0,0 +1,227 @@
---
title: Transaction and Span Rate Limiting
---

<Alert level="warning" title="Important">
This document describes a desired state that is not yet fully implemented in Relay.
</Alert>

Relay enforces quotas defined in Sentry and propagates them as rate limits to clients. In most cases,
this is a simple one-to-one mapping of data category to envelope item.

The transaction and span rate limits are more complicated. Spans and transactions both
have a "total" and an additional "indexed" category. They are also closely related: a transaction is a container for spans,
so a dropped transaction results in dropped spans.

This document describes how transaction and span quotas interact with each other within Relay.

<Alert level="info" title="Important">
This section describes how Relay has to interpret quotas. SDKs and other consumers
should read [SDK Development / Rate Limiting](/sdk/rate-limiting/) instead.
</Alert>

Related Documentation:

- [The Indexed Outcome Category](/application/dynamic-sampling/outcomes/)

## Enforcement

The following rules hold true:

- A quota for the transaction category is also enforced for the span category.
- A quota for the span category is also enforced for the transaction category.
- When a transaction is dropped, all the contained spans should be reported in outcomes.
- If either transactions or spans are rate limited, clients should receive a limit for both categories.
- Indexed quotas only affect the payload for their respective category.

These rules can be visualized using this table:

| Rate Limit / Item       | Transaction Payload | Transaction Metric | Span Payload | Span Metric |
| ----------------------- | ------------------- | ------------------ | ------------ | ----------- |
| **Transaction**         | ❌                  | ❌                 | ❌           | ❌          |
| **Transaction Indexed** | ❌                  | ✅                 | ✅           | ✅          |
| **Span**                | ❌                  | ❌                 | ❌           | ❌          |
| **Span Indexed**        | ✅                  | ✅                 | ❌           | ✅          |

- ❌: Item rejected.
- ✅: Item accepted.
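
The enforcement rules above can be expressed as a small lookup. The following is a minimal sketch, not Relay's actual implementation; the `Category` and `Item` enums, the `REJECTED` mapping, and the `accepted_items` function are hypothetical names introduced only for illustration.

```python
from enum import Enum


class Category(Enum):
    """Data categories a rate limit can apply to (names from this document)."""
    TRANSACTION = "transaction"
    TRANSACTION_INDEXED = "transaction_indexed"
    SPAN = "span"
    SPAN_INDEXED = "span_indexed"


class Item(Enum):
    """Hypothetical labels for the four kinds of data in the table."""
    TRANSACTION_PAYLOAD = "transaction_payload"
    TRANSACTION_METRIC = "transaction_metric"
    SPAN_PAYLOAD = "span_payload"
    SPAN_METRIC = "span_metric"


ALL_ITEMS = frozenset(Item)

# Items rejected by an active rate limit on each category:
# total limits reject everything, indexed limits only their own payload.
REJECTED = {
    Category.TRANSACTION: ALL_ITEMS,
    Category.SPAN: ALL_ITEMS,
    Category.TRANSACTION_INDEXED: frozenset({Item.TRANSACTION_PAYLOAD}),
    Category.SPAN_INDEXED: frozenset({Item.SPAN_PAYLOAD}),
}


def accepted_items(active_limits):
    """Return the items that survive the given active rate limits."""
    rejected = set()
    for category in active_limits:
        rejected |= REJECTED[category]
    return ALL_ITEMS - rejected
```

For example, `accepted_items([Category.SPAN_INDEXED])` keeps the transaction payload, the transaction metric, and the span metric, while `accepted_items([Category.TRANSACTION])` keeps nothing.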

## Outcomes

Relay must generate outcomes for every dropped span and transaction. Usually, a dropped
span or transaction results in outcomes for its respective total and indexed categories. Refer to
[The Indexed Outcome Category](/application/dynamic-sampling/outcomes/) for details.

This is straightforward for indexed rate limits: they only drop the payload, which results in a single
negative outcome in the indexed category of the item. A `transaction_indexed` rate limit does not
cause any spans to be dropped, and a `span_indexed` rate limit does not cause any transactions to be dropped.

A span quota and the resulting span rate limit are also trivial for standalone spans received by Relay:
the standalone span is dropped and a single outcome is generated.

### Transaction Outcomes

Transactions are containers for spans until the spans are extracted in Relay. Span extraction can happen at any
Relay stage: customer-managed Relay, PoP-Relay, or Processing-Relay. Until spans are extracted from the transaction, a dropped
transaction should count its contained spans and generate a span outcome with a quantity of the contained span count + 1.
The additional 1 accounts for the segment span that would be generated from the transaction itself.

<Note>
While it is desirable to extract correct span counts from dropped transactions, this may not be feasible
at every stage of the processing pipeline. For example, it may be impossible (malformed transactions)
or simply too expensive to compute.
</Note>

After spans have been extracted, the transaction is no longer a container of span items and only represents itself.
A dropped transaction whose spans have already been extracted therefore only generates outcomes for the total transaction and
indexed transaction categories.
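
The counting rule for a transaction dropped by a transaction (or span) rate limit can be sketched as follows. This is an illustration under stated assumptions, not Relay's actual implementation: the function name `dropped_transaction_outcomes` and the `spans_extracted` flag are hypothetical, and the transaction is assumed to be already parsed into a dictionary.

```python
from collections import Counter


def dropped_transaction_outcomes(event: dict, spans_extracted: bool) -> Counter:
    """Negative outcome quantities per category for one dropped transaction."""
    # The transaction itself always counts once in both transaction categories.
    outcomes = Counter({"transaction": 1, "transaction_indexed": 1})

    if not spans_extracted:
        # The transaction is still a container: count the contained spans
        # plus 1 for the segment span that would be generated from it.
        span_count = len(event.get("spans", [])) + 1
        outcomes["span"] += span_count
        outcomes["span_indexed"] += span_count

    return outcomes
```

For a transaction with 3 child spans whose spans have not been extracted yet, this yields 1 for `transaction`, 1 for `transaction_indexed`, 4 for `span`, and 4 for `span_indexed`. Once spans have been extracted, only the two transaction categories remain.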

<Note>
Span quotas are equivalent to transaction quotas, so all of the above also applies to a span quota.
</Note>

## Examples

### Example: Span Indexed Quota

**Quota**:

```json
{
  "categories": ["span_indexed"],
  "limit": 0
  // ...
}
```

**Transaction**:

```json
{
  "type": "transaction",
  "spans": [
    { .. },
    { .. },
    { .. }
  ],
  // ...
}
```

An envelope containing a transaction with 3 child spans generates 4 outcomes for rate-limited spans in the
`span_indexed` category: 1 for the segment span generated from the transaction and 3 for the
contained spans. The transaction itself is still ingested.

**Ingestion**:

| Transaction Payload | Transaction Metrics | Span Payload | Span Metrics |
| ------------------- | ------------------- | ------------ | ------------ |
| ✅                  | ✅                  | ❌           | ✅           |

**Negative Outcomes**:

| `transaction` | `transaction_indexed` | `span` | `span_indexed` |
| ------------- | --------------------- | ------ | -------------- |
| 0             | 0                     | 0      | 4              |

**Rate Limits propagated to SDKs:** None.

### Example: Transaction Quota

**Quota**:

```json
{
  "categories": ["transaction"],
  "limit": 0
  // ...
}
```

**Transaction**:

```json
{
  "type": "transaction",
  "spans": [
    { .. },
    { .. },
    { .. }
  ],
  // ...
}
```

**Ingestion**:

| Transaction Payload | Transaction Metrics | Span Payload | Span Metrics |
| ------------------- | ------------------- | ------------ | ------------ |
| ❌                  | ❌                  | ❌           | ❌           |

**Negative Outcomes**:

| `transaction` | `transaction_indexed` | `span` | `span_indexed` |
| ------------- | --------------------- | ------ | -------------- |
| 1             | 1                     | 4      | 4              |

**Rate Limits propagated to SDKs:** Transaction, Span.

## FAQ / Reasoning

### Why do transaction limits need to be propagated as span limits and vice versa?

At first glance, this is not obvious. Spans can exist without transactions,
and transactions can exist without (standalone) spans, so why can the quotas
be used interchangeably?

The reasoning lies in the Sentry product and how the information is used within Sentry.
Because of Relay, Sentry can safely assume spans exist if the user/SDK sends transactions,
and more and more of the product is built on the basis of spans. From this point of view,
transactions and spans convey the same information; it is just represented differently.

The logical conclusion and simplification is to treat span and transaction rate limits
equally. Note that only the limits are treated the same; the quotas are still tracked separately.
Enforcing these limits in SDKs then requires no extra logic: it is all contained
within Relay and works for any SDK version, present and future.
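
The propagation rule can be sketched as a small set expansion. This is a hedged illustration, not Relay's actual code: the function name `client_rate_limits` and the `LINKED` constant are hypothetical, and categories are represented as plain strings.

```python
# Categories whose rate limits are treated as equivalent for clients.
LINKED = {"transaction", "span"}


def client_rate_limits(enforced):
    """Return the categories reported to SDKs for the enforced limits."""
    # Indexed limits cannot be propagated to clients, so they are dropped here.
    propagated = {c for c in enforced if not c.endswith("_indexed")}

    # A limit on either transactions or spans is reported for both.
    if propagated & LINKED:
        propagated |= LINKED

    return propagated
```

With this sketch, a `span_indexed` limit results in nothing being reported to SDKs, while a `transaction` limit alone is reported as both `transaction` and `span`.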

### Why can span outcomes and transaction outcomes become inconsistent?

There can be multiple reasons for this across the entire pipeline. These are some of the reasons for
differences caused by Relay:

- Parsing the span count from a transaction can be too expensive and may be omitted. This is an extremely
  important property of Relay: abuse cases must be handled as fast as possible and at as little cost (money and resources)
  as possible. In some cases, it may not be feasible to parse the transaction JSON to extract a span count.
- An envelope item may be malformed. Outcomes are still generated for the inferred data category (span or transaction),
  but the span count cannot be recovered from an invalid transaction.
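
Both cases can be illustrated with a best-effort span counter that gives up instead of failing. This is only a sketch under assumptions: the function name `try_span_count` and the size budget `MAX_PARSE_BYTES` are hypothetical and do not describe how Relay decides that parsing is too expensive.

```python
import json
from typing import Optional

# Hypothetical budget: payloads above this size are not parsed at all.
MAX_PARSE_BYTES = 1_000_000


def try_span_count(payload: bytes, max_bytes: int = MAX_PARSE_BYTES) -> Optional[int]:
    """Return contained spans + 1 (segment span), or None if unknown."""
    if len(payload) > max_bytes:
        # Too expensive to parse: the span count is omitted.
        return None

    try:
        event = json.loads(payload)
    except ValueError:
        # Malformed payload: the span count cannot be recovered.
        return None

    if not isinstance(event, dict):
        return None

    spans = event.get("spans", [])
    if not isinstance(spans, list):
        return None

    return len(spans) + 1
```

When the function returns `None`, outcomes can still be generated for the transaction itself, but the span categories will not reflect the contained spans, which is one source of the inconsistency described above.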

### I want Relay to enforce a quota, which category should I use?

From top to bottom:

- Do you want to disable ingestion completely? Configure the limit for the `span` and `transaction` data categories.
- Is it billing related? Use the data category of the billing unit for the limit. For example,
  spike protection operates on the category of the billing unit.
- Otherwise, use the total (not indexed) category that makes the most sense to you. When in doubt, ask the Relay team.
212+
213+
### Should I ever use an indexed category in a Quota?
214+
215+
No, unless the intention is to protect infrastructure (abuse limits).
216+
217+
Indexed quotas are only useful to protect downstream infrastructure through abuse quotas.
218+
They are inherently more expensive to enforce, cannot be propagated to clients and are generally a sign
219+
of misconfiguration.
220+
221+
Dynamic-, smart- or client side-sampling should prevent any indexed quota from being enforced.

### What does this mean for standalone spans?

Standalone spans are not treated differently from extracted spans. After metrics extraction, which may happen in customer
Relays, there is no longer any distinction. Having no special treatment for standalone spans also means we do not
need any special logic in the SDKs.
