Work in review comments

shellmayr · shellmayr · commit 2706eeff1a08 · 2024-11-25T09:56:57.000+01:00
diff --git a/develop-docs/application-architecture/dynamic-sampling/fidelity-and-biases.mdx b/develop-docs/application-architecture/dynamic-sampling/fidelity-and-biases.mdx
@@ -15,12 +15,19 @@ A sample rate is a number in the interval `[0.0, 1.0]` that will determine the l
 
 At the core of Dynamic Sampling there is the concept of **fidelity**, which translates to an overall **target sample rate** that should be applied across all events of an organization.
 
-### Target Sample Rate Adjustment: Automatic Mode and Manual Mode
+### Dynamic Sampling Modes
 There are two available modes to govern the target sample rates for Dynamic Sampling: Automatic Mode and Manual Mode.
 - **Automatic mode** dynamically manages the target sample rate for each project based on the target sample rate for the organization, prioritizing lower volume projects to increase visibility.
 - **Manual mode** allows the user to set static target sample rates on a per-project basis that serve as the baseline sample rate before applying the dynamic biases outlined below. Target sample rates are not adjusted by the system. 
 
-Within this target sample rate, Dynamic Sampling can create a **bias toward more meaningful data**. This is achieved by constantly updating and communicating special rules to Relay, via a project configuration, which then applies targeted sampling to every event.
+Internally, Automatic Mode is called Organization Mode, while Manual Mode is called Project Mode. The settings around the mode and the sample rates are implemented using organization and project options. The [DynamicSamplingMode object](https://github.com/getsentry/sentry/blob/9b98be6b97323a78809a829e06dcbef26a16365c/src/sentry/dynamic_sampling/types.py#L7-L12) defines the available modes and their string representations to be set in the options. The dynamic sampling mode is set using the organization option `sentry:sampling_mode`. 
+
+If `sentry:sampling_mode` == `organization`, the **organization** option `sentry:target_sample_rate` defines the organization target sample rate.
+If `sentry:sampling_mode` == `project`, the **project** option `sentry:target_sample_rate` defines the project target sample rate for each project.
+
+On switching between modes, the current target sample rates are preserved unless the user changes them explicitly. For example, if the user switches from Automatic Mode to Manual Mode, the current target sample rates are preserved by setting the project option `sentry:target_sample_rate` of each project to the project target sample rate calculated during Automatic Mode. Conversely, if the user switches from Manual Mode to Automatic Mode, the project target sample rates are recalculated based on the overall organization target sample rate.
+
+The [target sample rates are periodically recalibrated](https://github.com/getsentry/sentry/blob/9b98be6b97323a78809a829e06dcbef26a16365c/src/sentry/dynamic_sampling/rules/biases/recalibration_bias.py#L11-L44) to ensure that the overall target sample rate is met. This recalibration is done on a project level or organization level, depending on the dynamic sampling mode. Within the target sample rate, Dynamic Sampling **biases towards more meaningful data**. This is achieved by constantly updating and communicating special rules to Relay, via a project configuration, which then applies targeted sampling to every event. 
 
 ![Concept of Fidelity](./images/fidelityAndPriorities.png)
 
@@ -82,7 +89,7 @@ Since the adoption of a release is not constant, we created a system of _decayin
 
 ![Sample Rate and Adoption](./images/sampleRateAndAdoption.png)
 
-The latest release bias uses a decaying rule to interpolate between a starting sample rate and an ending sample rate over a time window that is statically defined for each platform (the list of time to adoptions is define [here](https://github.com/getsentry/sentry/blob/master/src/sentry/dynamic_sampling/rules/helpers/time_to_adoptions.py#L26-L26). For example, Android has a bigger time window than Javascript because on average Android apps take more time to get adopted by users.
+The latest release bias uses a decaying rule to interpolate between a starting sample rate and an ending sample rate over a time window that is statically defined for each platform (the list of times to adoption is defined [here](https://github.com/getsentry/sentry/blob/9b98be6b97323a78809a829e06dcbef26a16365c/src/sentry/dynamic_sampling/rules/helpers/time_to_adoptions.py#L25)). For example, Android has a bigger time window than JavaScript because, on average, Android apps take more time to get adopted by users.
 
 ### Prioritize Dev Environments
 
@@ -109,7 +116,11 @@ For prioritizing dev environments, we use a sample rate of `1.0` (100%), which r
 ### Prioritize Low Volume Transactions
 This bias is used to prioritize low-volume transactions that can be drowned out by high-volume transactions. The goal is to rebalance sample rates of the individual transactions so that low-volume transactions are more likely to have representative samples. The bias is of type trace, which means that the transaction considered for rebalancing will be the root transaction of the trace.
 
-In order to rebalance transactions, the system computes the counts of the transactions for each project and runs an algorithm that, given the sample rate of the organization and the counts of each transaction, computes a new sample rate for each transaction assuming an ideal distribution of the counts.
+Prioritization of low-volume transactions works slightly differently depending on the dynamic sampling mode:
+- In **Automatic Mode** (`sentry:sampling_mode` == `organization`), the organization target sample rate is used as the base sample rate for the balancing algorithm.
+- In **Manual Mode** (`sentry:sampling_mode` == `project`), the project target sample rate is used as the base sample rate for the balancing algorithm.
+
+In order to rebalance transactions, the system retrieves the counts of the transactions for each project and calculates a new sample rate for each transaction. 
 
 <Alert title="✨ Note" level="info">
 
diff --git a/develop-docs/application-architecture/dynamic-sampling/the-big-picture.mdx b/develop-docs/application-architecture/dynamic-sampling/the-big-picture.mdx
@@ -18,7 +18,7 @@ Dynamic Sampling currently operates on either spans or transactions, based on th
 
 Dynamic Sampling occurs at the edge of our ingestion pipeline, precisely in [Relay](https://github.com/getsentry/relay).
 
-When events arrive, in a simplified model, they go through the following steps (some of which won't apply if you self-host Sentry):
+When events arrive, in a simplified model, they go through the following steps:
 
 1. **Inbound data filters**: every event runs through inbound data filters as configured in project settings, such as legacy browsers or denied releases. Events dropped here are not counted towards quota and are not included in "total events" data.
 2. **Quota enforcement**: Sentry charges for all further events sent in, before they are passed on to dynamic sampling.
@@ -65,7 +65,7 @@ For example, if there is a trace from `Project A` to `Project B` and `Project B`
 
 Clients have their own [traces sample rate](https://docs.sentry.io/platforms/javascript/tracing/#configure). The client sample rate is a number in the range `[0.0, 1.0]` (from 0% to 100%) that controls **how many events arrive at Sentry**. While documentation will generally suggest a sample rate of `1.0`, for some use cases it might be better to reduce it.
 
-Dynamic Sampling further reduces how many events get stored internally. **While most graphs and numbers in Sentry are based on total events**, accessing spans and tags requires stored events. The sample rates apply on top of each other.
+Dynamic Sampling further reduces how many events get stored internally. **While most graphs and numbers in Sentry are based on metrics**, accessing spans and tags requires stored events. The sample rates apply on top of each other.
 
 An example of client side sampling and Dynamic Sampling starting from 100k events which results in 15k stored events is shown below: