
Commit f467d12

Add Experiment Outcomes/Errors (#859)

1 parent 6758efb · commit f467d12

File tree

1 file changed: +14 −2 lines changed

docs/platform/experimentation/how-metrics-are-calculated.md

Lines changed: 14 additions & 2 deletions
@@ -40,7 +40,7 @@ This Metric type calculates the total number of times a unique user (or service)
 
 **% Difference** - This is a simple calculation of the difference between the Events/Denominator of the variation and the Control variation.
 
-**Statistical Significance** - An icon that indicates whether the Feature has reached statistical significance or not at a 95% confidence interval.
+**Statistical Significance** - An icon that indicates whether the Feature has reached statistical significance at a 95% confidence interval (0.05 significance level). A checkmark indicates a positive significant result, a cross indicates a negative significant result, and an ellipsis indicates that the result is not significant.
 
 
 ## Value Optimization Metrics (Numerical Metrics)

@@ -64,4 +64,16 @@ Similar to the sum per user, the average for user also uses the numerical value
 
 **% Difference** - Simple difference check against the Control value.
 
-**Statistical Significance** - An icon that indicates whether the Feature has reached statistical significance or not at a 95% confidence interval.
+**Statistical Significance** - An icon that indicates whether the Feature has reached statistical significance at a 95% confidence interval (0.05 significance level). A checkmark indicates a positive significant result, a cross indicates a negative significant result, and an ellipsis indicates that the result is not significant.
+
+## Interpreting Experiment Outcomes
+
+With any controlled experiment, you should anticipate three possible outcomes:
+
+- Accurate result – There is a genuine difference between the baseline and the variation, and the data identifies a winner or a loser accordingly. Conversely, when there is no real difference, the data shows an inconclusive result.
+- False positive (Type I error) – Your test data shows a significant difference between the original and the variation, but it is merely random noise; there is no real difference between the two.
+- False negative (Type II error) – Your test shows an inconclusive result, but your variation genuinely differs from your baseline, whether positively or negatively.
+
+DevCycle balances experiment sensitivity and reliability, enabling product and engineering teams to make informed, data-driven decisions and to continuously improve the user experience based on trustworthy insights.
+
+Remember that the statistical tests used in A/B testing provide a mathematical framework for making informed decisions. Like all statistical tests, however, they are not infallible and rest on certain assumptions; violations of those assumptions can lead to misleading results. It is crucial to understand the conditions of the tests and to ensure that your data meets them as closely as possible in order to draw valid conclusions from your A/B tests.
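The dashboard behavior the added lines describe (a checkmark, cross, or ellipsis at a 0.05 significance level) can be sketched with a standard two-proportion z-test. The docs don't specify which test DevCycle actually runs, so the sketch below is an illustrative assumption rather than the platform's implementation; `significance_icon` and `pct_difference` are hypothetical names, not DevCycle APIs.

```python
import math

def pct_difference(control_value, variation_value):
    """% Difference as described above: variation relative to Control."""
    return (variation_value - control_value) / control_value * 100.0

def significance_icon(control_conv, control_n, var_conv, var_n, alpha=0.05):
    """Map a pooled two-proportion z-test to the icons described in the doc.

    Illustrative assumption only -- not DevCycle's actual test statistic.
    """
    p1 = control_conv / control_n
    p2 = var_conv / var_n
    pooled = (control_conv + var_conv) / (control_n + var_n)
    se = math.sqrt(pooled * (1 - pooled) * (1 / control_n + 1 / var_n))
    if se == 0:
        return "…"  # no variation at all: nothing to conclude
    z = (p2 - p1) / se
    # two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    if p_value >= alpha:
        return "…"  # non-significant: ellipsis
    return "✓" if p2 > p1 else "✗"  # positive vs negative significant result
```

For example, 150 conversions out of 1,000 in the variation against 100 out of 1,000 in Control is a large enough lift to clear the 0.05 threshold, while 105 against 100 is not.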
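The Type I error rate from the bullet list can be made concrete with an A/A simulation: run the same test many times when both arms share the same true conversion rate, and roughly alpha (5%) of runs come out "significant" purely by chance. The helper below uses a self-contained pooled two-proportion z-test; all names and parameters are illustrative, not part of DevCycle.

```python
import math
import random

def two_sided_p(c1, n1, c2, n2):
    """p-value of a pooled two-proportion z-test (illustrative sketch)."""
    pooled = (c1 + c2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (c2 / n2 - c1 / n1) / se
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

def aa_false_positive_rate(trials=1000, n=400, p=0.1, alpha=0.05, seed=7):
    """A/A test: both arms share the same true rate, so every
    'significant' result is, by definition, a Type I error."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        control = sum(rng.random() < p for _ in range(n))
        variation = sum(rng.random() < p for _ in range(n))
        if two_sided_p(control, n, variation, n) < alpha:
            hits += 1
    return hits / trials
```

Running this yields a false-positive rate near 0.05, which is exactly what a 0.05 significance level promises; Type II errors, by contrast, depend on sample size and effect size (statistical power), which is why underpowered experiments tend to read as inconclusive.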
