|
| 1 | +--- |
| 2 | +title: Transaction and Span Rate Limiting |
| 3 | +--- |
| 4 | + |
| 5 | +<Alert level="warning" title="Important"> |
| 6 | +This document describes a desired state; it is not yet fully implemented in Relay. |
| 7 | +</Alert> |
| 8 | + |
| 9 | + |
| 10 | +Relay enforces quotas defined in Sentry and propagates them as rate limits to clients. In most cases, |
| 11 | +this is a simple one-to-one mapping of data category to envelope item. |
| 12 | + |
| 13 | +The transaction and span rate limits are more complicated. Spans and transactions both |
| 14 | +have a "total" and an additional "indexed" category. They are also closely related: a transaction is a container for spans, |
| 15 | +so a dropped transaction results in dropped spans. |
| 16 | + |
| 17 | +This document describes how transaction and span quotas interact with each other within Relay. |
| 18 | + |
| 19 | +<Alert level="info" title="Important"> |
| 20 | +This section describes how Relay has to interpret quotas. SDKs and other consumers |
| 21 | +should read [SDK Development / Rate Limiting](/sdk/rate-limiting/) instead. |
| 22 | +</Alert> |
| 23 | + |
| 24 | + |
| 25 | +Related Documentation: |
| 26 | +- [The Indexed Outcome Category](/application/dynamic-sampling/outcomes/) |
| 27 | + |
| 28 | + |
| 29 | + |
| 30 | +## Enforcement |
| 31 | + |
| 32 | +The following rules hold true: |
| 33 | +- A quota for the transaction category is also enforced for the span category. |
| 34 | +- A quota for the span category is also enforced for the transaction category. |
| 35 | +- When a transaction is dropped, all the contained spans should be reported in outcomes. |
| 36 | +- If either transactions or spans are rate limited, clients should receive a limit for both categories. |
| 37 | +- Indexed quotas only affect the payload for their respective category. |
| 38 | + |
| 39 | +These rules can be visualized using this table: |
| 40 | + |
| 41 | +| Rate Limit / Item | Transaction Payload | Transaction Metric | Span Payload | Span Metric | |
| 42 | +| ----------------------- | ------------------- | ------------------ | ------------ | ----------- | |
| 43 | +| **Transaction** | ❌ | ❌ | ❌ | ❌ | |
| 44 | +| **Transaction Indexed** | ❌ | ✅ | ✅ | ✅ | |
| 45 | +| **Span** | ❌ | ❌ | ❌ | ❌ | |
| 46 | +| **Span Indexed** | ✅ | ✅ | ❌ | ✅ | |
| 47 | + |
| 48 | +- ❌: Item rejected. |
| 49 | +- ✅: Item accepted. |
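
The table above can also be read as a lookup from an active rate limit to the items it rejects. The following is a minimal sketch of that reading, not Relay's implementation: the names `Category`, `ITEMS`, `REJECTED`, `accepted_items`, and `limits_for_clients` are hypothetical and exist only for illustration.

```python
from enum import Enum


class Category(Enum):
    """Data categories a rate limit can apply to (hypothetical names)."""
    TRANSACTION = "transaction"
    TRANSACTION_INDEXED = "transaction_indexed"
    SPAN = "span"
    SPAN_INDEXED = "span_indexed"


# Item kinds, in the column order of the enforcement table.
ITEMS = ("transaction_payload", "transaction_metric", "span_payload", "span_metric")

# Items rejected by an active rate limit on each category (one table row each).
REJECTED = {
    Category.TRANSACTION: set(ITEMS),
    Category.SPAN: set(ITEMS),
    Category.TRANSACTION_INDEXED: {"transaction_payload"},
    Category.SPAN_INDEXED: {"span_payload"},
}


def accepted_items(active_limits):
    """Return the item kinds that survive every active rate limit."""
    rejected = set()
    for category in active_limits:
        rejected |= REJECTED[category]
    return [item for item in ITEMS if item not in rejected]


def limits_for_clients(active_limits):
    """Return the categories reported to clients.

    Total limits are always reported as a transaction/span pair.
    Indexed limits are not propagated to clients.
    """
    if Category.TRANSACTION in active_limits or Category.SPAN in active_limits:
        return ["transaction", "span"]
    return []
```

For instance, `accepted_items([Category.SPAN_INDEXED])` keeps everything except the span payload, and `limits_for_clients([Category.SPAN])` reports both total categories.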
| 50 | + |
| 51 | +## Outcomes |
| 52 | + |
| 53 | +Relay must generate outcomes for every span and transaction that is dropped. Usually, a dropped |
| 54 | +span or transaction results in outcomes for both its total and its indexed category. Refer to |
| 55 | +[The Indexed Outcome Category](/application/dynamic-sampling/outcomes/) for details. |
| 56 | + |
| 57 | +This is straightforward for all indexed rate limits: they only drop the payload, which results in a single |
| 58 | +negative outcome in the indexed category of the item. A `transaction_indexed` rate limit does not |
| 59 | +cause any spans to be dropped and vice versa. |
| 60 | + |
| 61 | +A span quota and the resulting span rate limit are also trivial for standalone spans received by Relay: |
| 62 | +the standalone span is dropped, and a single outcome is generated. |
| 63 | + |
| 64 | +### Transaction Outcomes |
| 65 | + |
| 66 | +Transactions are containers for spans until the spans are extracted in Relay. This span extraction can happen at any |
| 67 | +Relay stage: customer-managed Relay, PoP-Relay, or Processing-Relay. Until the spans are extracted from the transaction, a dropped transaction |
| 68 | +should count the contained spans and generate an outcome with the contained span quantity + 1; the extra 1 accounts for the segment span |
| 69 | +which would be generated from the transaction itself. |
| 70 | + |
| 71 | +<Note> |
| 72 | +While it is desirable to have span counts correctly extracted from dropped transactions, it may not be feasible |
| 73 | +to do so at every stage of the processing pipeline. For example, the count may be impossible to recover (malformed transactions) |
| 74 | +or simply too expensive to compute. |
| 75 | +</Note> |
| 76 | + |
| 77 | +After spans have been extracted, the transaction is no longer a container of span items and just represents itself. |
| 78 | +A dropped transaction with spans already extracted therefore only generates outcomes for the total transaction and |
| 79 | +indexed transaction categories. |
| 80 | + |
| 81 | +<Note> |
| 82 | +Span quotas are equivalent to transaction quotas, so all of the above also applies to a span quota. |
| 83 | +</Note> |
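
The counting rules in this section can be sketched as a small function. This is an illustrative model under stated assumptions, not Relay code: the function name `negative_outcomes` and its parameters are hypothetical, and it assumes the span count could actually be read from the transaction (see the note above on malformed or expensive payloads).

```python
def negative_outcomes(limited_category, child_spans, spans_extracted=False):
    """Count negative outcomes per category for one rate limited transaction.

    limited_category: "transaction", "span", "transaction_indexed" or "span_indexed".
    child_spans: number of spans contained in the transaction.
    spans_extracted: True once the spans were extracted into separate items;
        the transaction then only represents itself.
    """
    outcomes = {
        "transaction": 0,
        "transaction_indexed": 0,
        "span": 0,
        "span_indexed": 0,
    }
    # The + 1 accounts for the segment span generated from the transaction itself.
    span_count = 0 if spans_extracted else child_spans + 1

    if limited_category in ("transaction", "span"):
        # Total limits are interchangeable and drop everything.
        outcomes["transaction"] = 1
        outcomes["transaction_indexed"] = 1
        outcomes["span"] = span_count
        outcomes["span_indexed"] = span_count
    elif limited_category == "transaction_indexed":
        # Only the transaction payload is dropped.
        outcomes["transaction_indexed"] = 1
    elif limited_category == "span_indexed":
        # Only the span payloads are dropped.
        outcomes["span_indexed"] = span_count
    else:
        raise ValueError(f"unknown category: {limited_category}")
    return outcomes
```

Calling it with `"span_indexed"` or `"transaction"` and 3 child spans yields the counts shown in the negative outcome tables of the examples in this document.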
| 84 | + |
| 85 | + |
| 86 | +## Examples |
| 87 | + |
| 88 | +### Example: Span Indexed Quota |
| 89 | + |
| 90 | +**Quota**: |
| 91 | + |
| 92 | +```json |
| 93 | +{ |
| 94 | + "categories": ["span_indexed"], |
| 95 | + "limit": 0 |
| 96 | + // ... |
| 97 | +} |
| 98 | +``` |
| 99 | + |
| 100 | +**Transaction**: |
| 101 | + |
| 102 | +```json |
| 103 | +{ |
| 104 | + "type": "transaction", |
| 105 | + "spans": [ |
| 106 | + { .. }, |
| 107 | + { .. }, |
| 108 | + { .. } |
| 109 | + ], |
| 110 | + // ... |
| 111 | +} |
| 112 | +``` |
| 113 | + |
| 114 | +An envelope containing a transaction with 3 child spans generates 4 outcomes for rate limited spans in the |
| 115 | +`span_indexed` category: 1 count for the segment span generated from the transaction and 3 counts for the |
| 116 | +contained spans. The transaction itself will still be ingested. |
| 117 | + |
| 118 | +**Ingestion**: |
| 119 | + |
| 120 | +| Transaction Payload | Transaction Metrics | Span Payload | Span Metrics | |
| 121 | +| ------------------- | ------------------- | ------------ | ------------ | |
| 122 | +| ✅ | ✅ | ❌ | ✅ | |
| 123 | + |
| 124 | +**Negative Outcomes**: |
| 125 | + |
| 126 | +| `transaction` | `transaction_indexed` | `span` | `span_indexed` | |
| 127 | +| ------------- | --------------------- | ------ | -------------- | |
| 128 | +| 0 | 0 | 0 | 4 | |
| 129 | + |
| 130 | +**Rate Limits propagated to SDKs:** None. |
| 131 | + |
| 132 | + |
| 133 | +### Example: Transaction Quota |
| 134 | + |
| 135 | +**Quota**: |
| 136 | + |
| 137 | +```json |
| 138 | +{ |
| 139 | + "categories": ["transaction"], |
| 140 | + "limit": 0 |
| 141 | + // ... |
| 142 | +} |
| 143 | +``` |
| 144 | + |
| 145 | +**Transaction**: |
| 146 | + |
| 147 | +```json |
| 148 | +{ |
| 149 | + "type": "transaction", |
| 150 | + "spans": [ |
| 151 | + { .. }, |
| 152 | + { .. }, |
| 153 | + { .. } |
| 154 | + ], |
| 155 | + // ... |
| 156 | +} |
| 157 | +``` |
| 158 | + |
| 159 | +**Ingestion**: |
| 160 | + |
| 161 | +| Transaction Payload | Transaction Metrics | Span Payload | Span Metrics | |
| 162 | +| ------------------- | ------------------- | ------------ | ------------ | |
| 163 | +| ❌ | ❌ | ❌ | ❌ | |
| 164 | + |
| 165 | +**Negative Outcomes**: |
| 166 | + |
| 167 | +| `transaction` | `transaction_indexed` | `span` | `span_indexed` | |
| 168 | +| ------------- | --------------------- | ------ | -------------- | |
| 169 | +| 1 | 1 | 4 | 4 | |
| 170 | + |
| 171 | +**Rate Limits propagated to SDKs:** Transaction, Span. |
| 172 | + |
| 173 | + |
| 174 | +## FAQ / Reasoning |
| 175 | + |
| 176 | +### Why do transaction limits need to be propagated as span limits and vice versa? |
| 177 | + |
| 178 | +At first glance this is non-obvious. Spans can exist without transactions, |
| 179 | +and transactions can exist without (standalone) spans, so why can the quotas |
| 180 | +be used interchangeably? |
| 181 | + |
| 182 | +The reasoning lies in the Sentry product and how the information is used within Sentry. |
| 183 | +Because Relay extracts spans from transactions, Sentry can safely assume spans exist if the user/SDK sends transactions, |
| 184 | +and more and more of the product is built on the basis of spans. From this point of view, |
| 185 | +transactions and spans convey the same information; it is just represented differently. |
| 186 | + |
| 187 | +The logical conclusion and simplification is to treat span and transaction rate limits |
| 188 | +equally (important: the limits are treated the same, but the quotas are still tracked separately). |
| 189 | +Enforcing these limits on SDKs then does not require extra logic. It is all contained |
| 190 | +within Relay and works for any SDK version, present and future. |
| 191 | + |
| 192 | +### Why can span outcomes and transaction outcomes become inconsistent? |
| 193 | + |
| 194 | +There can be multiple reasons for this across the entire pipeline. These are some of the reasons for |
| 195 | +differences caused by Relay: |
| 196 | + |
| 197 | +- Parsing the span count from a transaction can be too expensive and may be omitted. This is an extremely |
| 198 | +important property of Relay: abuse cases must be handled as fast as possible and with as little cost (money and resources) |
| 199 | +as possible. In some cases, it may not be feasible to JSON parse the transaction to extract a span count. |
| 200 | +- An envelope item may be malformed. Outcomes will be generated for the inferred data category (span or transaction), |
| 201 | +but the span count cannot be recovered from an invalid transaction. |
| 202 | + |
| 203 | + |
| 204 | +### I want Relay to enforce a quota, which category should I use? |
| 205 | + |
| 206 | +From top to bottom: |
| 207 | + |
| 208 | +- Do you want to completely disable ingestion? Configure the limit for the `span` and `transaction` data categories. |
| 209 | +- Is it billing related? Use the data category of the billing unit for the limit. For example, spike protection |
| 210 | +operates on the category of the billing unit. |
| 211 | +- Otherwise, use the total category (not indexed) that makes the most sense to you. When in doubt, ask the Relay team. |
| 212 | + |
| 213 | +### Should I ever use an indexed category in a Quota? |
| 214 | + |
| 215 | +No, unless the intention is to protect infrastructure (abuse limits). |
| 216 | + |
| 217 | +Indexed quotas are only useful to protect downstream infrastructure through abuse quotas. |
| 218 | +They are inherently more expensive to enforce, cannot be propagated to clients and are generally a sign |
| 219 | +of misconfiguration. |
| 220 | + |
| 221 | +Dynamic, smart, or client-side sampling should prevent any indexed quota from being enforced. |
| 222 | + |
| 223 | +### What does this mean for standalone spans? |
| 224 | + |
| 225 | +They are not treated differently from extracted spans. After metrics extraction, which may happen in customer |
| 226 | +Relays, there is no longer any distinction. Having no special treatment for standalone spans also means we do not |
| 227 | +need any special logic in the SDKs. |