
Alextest #13145

288 changes: 288 additions & 0 deletions docs/platforms/python/tracing/configure-sampling/index.mdx
@@ -0,0 +1,288 @@
---
title: Configure Sampling
description: "Learn how to configure sampling in your app."
sidebar_order: 30
---

Sentry's tracing functionality helps you monitor application performance by capturing distributed traces, attaching attributes, and adding span performance metrics across your application. However, capturing traces for every transaction can generate significant volumes of data. Sampling lets you control the number of spans your application sends to Sentry.

Effective sampling is key to getting the most value from Sentry's performance monitoring while minimizing overhead. The `traces_sampler` function gives you precise control over which transactions to record, allowing you to focus on the most important parts of your application.

## Sampling Configuration Options

The Python SDK provides two main options for controlling the sampling rate:

1. Uniform Sample Rate (`traces_sample_rate`)

This option sets a fixed percentage of transactions to be captured:

<PlatformContent includePath="/performance/traces-sample-rate" />

With `traces_sample_rate` set to `0.25`, approximately 25% of transactions will be recorded and sent to Sentry. This provides an even cross-section of transactions regardless of where in your app they occur.
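
For reference, a minimal configuration along these lines (the DSN value is a placeholder) records roughly a quarter of all transactions:

```python
import sentry_sdk

sentry_sdk.init(
    dsn="your-dsn",  # placeholder DSN
    traces_sample_rate=0.25,  # record ~25% of transactions
)
```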

2. Sampling Function (`traces_sampler`)

For more granular control, you can use the `traces_sampler` function. This approach allows you to:

- Apply different sampling rates to different types of transactions
- Filter out specific transactions entirely
- Make sampling decisions based on transaction data
- Control the inheritance of sampling decisions in distributed traces

<PlatformContent includePath="/performance/traces-sampler-as-sampler" />

### Trace Sampler Examples

1. Prioritizing Critical User Flows

```python
import sentry_sdk

def traces_sampler(sampling_context):
    ctx = sampling_context.get("transaction_context", {})
    name = ctx.get("name", "")

    # Sample all checkout transactions
    if name and ("/checkout" in name or ctx.get("op") == "checkout"):
        return 1.0

    # Sample 50% of login transactions
    if name and ("/login" in name or ctx.get("op") == "login"):
        return 0.5

    # Sample 10% of everything else
    return 0.1

sentry_sdk.init(
    dsn="your-dsn",
    traces_sampler=traces_sampler,
)
```

2. Handling Different Environments and Error Rates

```python
import os

import sentry_sdk

def traces_sampler(sampling_context):
    ctx = sampling_context.get("transaction_context", {})
    environment = os.environ.get("ENVIRONMENT", "development")

    # Sample all transactions in development
    if environment == "development":
        return 1.0

    # Sample more transactions if there are recent errors
    if ctx.get("data", {}).get("hasRecentErrors"):
        return 0.8

    # Sample based on environment
    if environment == "production":
        return 0.05  # 5% in production
    elif environment == "staging":
        return 0.2  # 20% in staging

    return 0.1  # 10% default

sentry_sdk.init(
    dsn="your-dsn",
    traces_sampler=traces_sampler,
)
```

3. Controlling Sampling Based on User and Transaction Properties

```python
import sentry_sdk

def traces_sampler(sampling_context):
    ctx = sampling_context.get("transaction_context", {})
    data = ctx.get("data", {})

    # Always sample for premium users
    if data.get("user", {}).get("tier") == "premium":
        return 1.0

    # Sample more transactions for users experiencing errors
    if data.get("hasRecentErrors"):
        return 0.8

    # Sample less for high-volume, low-value paths
    if ctx.get("name", "").startswith("/api/metrics"):
        return 0.01

    # Sample more for slow transactions
    if data.get("duration_ms", 0) > 1000:  # Transactions over 1 second
        return 0.5

    # If there's a parent sampling decision, respect it
    if sampling_context.get("parent_sampled") is not None:
        return sampling_context["parent_sampled"]

    # Default sampling rate
    return 0.2

sentry_sdk.init(
    dsn="your-dsn",
    traces_sampler=traces_sampler,
)
```

4. Complex Business Logic Sampling

```python
import sentry_sdk

def traces_sampler(sampling_context):
    ctx = sampling_context.get("transaction_context", {})
    data = ctx.get("data", {})

    # Always sample critical business operations
    if ctx.get("op") in ["payment.process", "order.create", "user.verify"]:
        return 1.0

    # Sample based on user segment
    user_segment = data.get("user", {}).get("segment")
    if user_segment == "enterprise":
        return 0.8
    elif user_segment == "premium":
        return 0.5

    # Sample based on transaction value
    transaction_value = data.get("transaction", {}).get("value", 0)
    if transaction_value > 1000:  # High-value transactions
        return 0.7

    # Sample based on the error rate in the service
    error_rate = data.get("service", {}).get("error_rate", 0)
    if error_rate > 0.05:  # Error rate above 5%
        return 0.9

    # Inherit parent sampling decision if available
    if sampling_context.get("parent_sampled") is not None:
        return sampling_context["parent_sampled"]

    # Default sampling rate
    return 0.1

sentry_sdk.init(
    dsn="your-dsn",
    traces_sampler=traces_sampler,
)
```

5. Performance-Based Sampling

```python
import sentry_sdk

def traces_sampler(sampling_context):
    ctx = sampling_context.get("transaction_context", {})
    data = ctx.get("data", {})

    # Sample all slow transactions
    if data.get("duration_ms", 0) > 2000:  # Over 2 seconds
        return 1.0

    # Sample more transactions with high memory usage
    if data.get("memory_usage_mb", 0) > 500:  # Over 500 MB
        return 0.8

    # Sample more transactions with high CPU usage
    if data.get("cpu_percent", 0) > 80:  # Over 80% CPU
        return 0.8

    # Sample more transactions with high database load
    if data.get("db_connections", 0) > 100:  # Over 100 connections
        return 0.7

    # Default sampling rate
    return 0.1

sentry_sdk.init(
    dsn="your-dsn",
    traces_sampler=traces_sampler,
)
```

## The Sampling Context Object

When the `traces_sampler` function is called, the Sentry SDK passes a `sampling_context` object with information from the relevant span to help make sampling decisions:

```python
{
    "transaction_context": {
        "name": str,  # transaction name at creation time
        "op": str,    # short description of transaction type, like "http.request"
        # other transaction data...
    },
    "parent_sampled": bool,       # whether the parent transaction was sampled (if any)
    "parent_sample_rate": float,  # the sample rate used by the parent (if any)
    # Custom context as passed to start_transaction
}
```

The sampling context contains:

- `transaction_context`: Includes the transaction name, operation type, and other metadata
- `parent_sampled`: Whether the parent transaction was sampled (for distributed tracing)
- `parent_sample_rate`: The sample rate used in the parent transaction
- Any custom sampling context data passed to `start_transaction`
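
To make fields like `hasRecentErrors` from the examples above visible to your sampler, you can attach them when starting a transaction. A minimal sketch, assuming your SDK version supports the `custom_sampling_context` argument to `start_transaction` (the field name and transaction name are illustrative):

```python
import sentry_sdk

with sentry_sdk.start_transaction(
    op="task",
    name="process_order",  # illustrative transaction name
    custom_sampling_context={"hasRecentErrors": True},  # illustrative custom field
):
    # The custom keys are merged into the sampling_context passed to traces_sampler
    ...
```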

## Inheritance in Distributed Tracing

In distributed systems, trace information is propagated between services. You can implement inheritance logic like this:

```python
def traces_sampler(sampling_context):
    # Examine provided context data
    if "transaction_context" in sampling_context:
        name = sampling_context["transaction_context"].get("name", "")

        # Apply specific rules first
        if "critical-path" in name:
            return 1.0  # Always sample

    # Inherit parent sampling decision if available
    if sampling_context.get("parent_sampled") is not None:
        return sampling_context["parent_sampled"]

    # Otherwise use a default rate
    return 0.1
```

This approach ensures consistent sampling decisions across your entire distributed trace. All transactions in a given trace will share the same sampling decision, preventing broken or incomplete traces.

## Sampling Decision Precedence

When multiple sampling mechanisms could apply, Sentry follows this order of precedence:

1. If a sampling decision is passed to `start_transaction`, that decision is used (see the example below)
2. If `traces_sampler` is defined, its decision is used (it can take the parent's sampling decision into account)
3. If no `traces_sampler` is defined but a parent sampling decision is available, the parent's decision is used
4. If none of the above applies, `traces_sample_rate` is used
5. If `traces_sample_rate` is not set either, no transactions are sampled (0%)
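
To illustrate the first rule: an explicit decision passed to `start_transaction` takes precedence over both `traces_sampler` and `traces_sample_rate`. A minimal sketch, assuming your SDK version supports the `sampled` argument (the transaction name is illustrative):

```python
import sentry_sdk

# Force this transaction to be sampled regardless of the configured rates
with sentry_sdk.start_transaction(op="task", name="nightly-cleanup", sampled=True):
    ...
```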

## How Sampling Propagates in Distributed Traces

Sentry uses a "head-based" sampling approach:

- A sampling decision is made in the originating service (the "head")
- This decision is propagated to all downstream services via HTTP headers

The two key headers are:
- `sentry-trace`: Contains trace ID, span ID, and sampling decision
- `baggage`: Contains additional trace metadata including sample rate

The Sentry Python SDK automatically attaches these headers to outgoing HTTP requests when using auto-instrumentation with libraries like `requests`, `urllib3`, or `httpx`. For other communication channels, you can manually propagate trace information:

```python
import json

import sentry_sdk

# Get the outgoing trace headers for the current scope
sentry_trace_header = sentry_sdk.get_traceparent()
baggage_header = sentry_sdk.get_baggage()

# Add them to your custom request (example using a message queue)
message = {
    "data": "Your data here",
    "metadata": {
        "sentry_trace": sentry_trace_header,
        "baggage": baggage_header,
    },
}
queue.send(json.dumps(message))  # `queue` is a placeholder for your producer client
```
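
On the receiving side, the downstream service can read these values and continue the same trace, so the head's sampling decision is inherited. A minimal sketch, assuming `sentry_sdk.continue_trace` is available in your SDK version and that messages have the shape produced above:

```python
import json

import sentry_sdk

def handle_message(raw_message):
    message = json.loads(raw_message)
    headers = {
        "sentry-trace": message["metadata"]["sentry_trace"],
        "baggage": message["metadata"]["baggage"],
    }

    # Continue the incoming trace; the sampling decision propagates with it
    transaction = sentry_sdk.continue_trace(
        headers, op="queue.process", name="handle_message"
    )
    with sentry_sdk.start_transaction(transaction):
        ...  # process the message
```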

By implementing a thoughtful sampling strategy, you'll get the performance insights you need without overwhelming your systems or your Sentry quota.
@@ -1,10 +1,10 @@
 ---
-title: Trace Propagation
+title: Set Up Distributed Tracing
 description: "Learn how to connect events across applications/services."
-sidebar_order: 3000
+sidebar_order: 20
 ---
 
-If the overall application landscape that you want to observe with Sentry consists of more than just a single service or application, distributed tracing can add a lot of value.
+<PlatformContent includePath="distributed-tracing/explanation" />
 
 ## What is Distributed Tracing?
 