
Commit 9275f79

Add summary section + updates to ROPE section
1 parent d00a9e8 commit 9275f79

File tree

2 files changed

+80, -30 lines changed

2 files changed

+80
-30
lines changed

examples/howto/hypothesis_testing.ipynb

Lines changed: 49 additions & 24 deletions
Large diffs are not rendered by default.

examples/howto/hypothesis_testing.myst.md

Lines changed: 31 additions & 6 deletions
@@ -133,7 +133,7 @@ p_mu_greater
 The HDI gives an interval of highest probability density. If zero is outside the HDI, it’s unlikely the parameter is near zero.
 
 ```{code-cell} ipython3
-hdi_mu = az.hdi(idata, var_names=["mu"])["mu"].data
+hdi_mu = az.hdi(idata.posterior["mu"])["mu"].data
 hdi_mu
 ```
 
@@ -151,24 +151,24 @@ az.plot_posterior(idata, var_names=["mu"], figsize=(14, 3));
 
 If the probability that the parameter is within a certain range is high, we can say that the parameter is practically equivalent to that value. This is a useful way to express that we don't care about small differences.
 
-One proposal is that we now examine the HDI's but compare them to the ROPE and not zero.
+For example, if we state that values between $-0.1$ and $0.1$ (this region need not be symmetric) are practically equivalent to zero, we can compute the probability that $\mu$ is within this range. If this probability is high enough then we can say that the mean is practically equivalent to zero.
 
 ```{code-cell} ipython3
 rope = [-0.1, 0.1]
 p_in_rope = np.mean((mu_samples > rope[0]) & (mu_samples < rope[1]))
 p_in_rope
 ```
 
+So there is only a 2.2% probability that the mean is practically equivalent to zero, which is sufficiently low that we can reject that hypothesis.
+
++++
+
 Third time in a row, `arviz` has our back and can plot the ROPE and HDIs.
 
 ```{code-cell} ipython3
 az.plot_posterior(idata, var_names=["mu"], rope=rope, figsize=(14, 3));
 ```
 
-This shows that there is only a 2.6% chance that the mean is within the chosen ROPE.
-
-+++
-
 {ref}`kruschke2018rejecting` outlines the HDI+ROPE decision rule, which is summarised in the figure taken from that paper:
 
 ![](hdi_plus_rope_decision_rule.png)
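As a standalone illustration of the ROPE computation in the hunk above (outside the diff itself), here is a minimal NumPy sketch. The draws are a hypothetical stand-in for the notebook's `mu_samples`, not its actual posterior, with parameters chosen so that only a small fraction of the mass falls inside the ROPE:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical posterior draws standing in for the notebook's `mu_samples`.
mu_samples = rng.normal(loc=0.5, scale=0.2, size=10_000)

# Region of practical equivalence around zero, as in the diff above.
rope = [-0.1, 0.1]

# Posterior probability that mu is practically equivalent to zero:
# the fraction of draws that fall inside the ROPE.
p_in_rope = np.mean((mu_samples > rope[0]) & (mu_samples < rope[1]))
print(f"P(mu in ROPE) = {p_in_rope:.3f}")  # small: most mass lies above 0.1
```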
@@ -200,6 +200,31 @@ We can see that the probability of $\mu=0$ has gone _down_ after observing the data
 
 ## Summary
 
+**Posterior Probability Statements**
+- *Idea:* Compute $P(\theta > \delta \mid \text{data})$ directly from the posterior.
+- *Pros:* Simple, intuitive, no special tools needed.
+- *Cons:* Requires choosing a threshold $\delta$.
+
+**Highest Density Intervals (HDIs)**
+- *Idea:* Identify the range of values containing a fixed portion (e.g., 95%) of the posterior mass.
+- *Pros:* Provides a clear summary of where the parameter lies; easy to interpret.
+- *Cons:* By itself, doesn’t encode a decision rule; must still choose what HDI exclusion implies.
+
+**ROPE (Region of Practical Equivalence)**
+- *Idea:* Define a small interval around the null (e.g., zero) representing negligible effect size and assess posterior mass within it.
+- *Pros:* Focuses on practical rather than just statistical significance; flexible.
+- *Cons:* Requires subjective definition of what counts as negligible.
+
+**ROPE + HDI Decision Rule**
+- *Idea:* Combine ROPE with HDI to classify results as negligible, meaningful, or inconclusive.
+- *Pros:* Offers a three-way decision with practical interpretation; balances interval uncertainty and practical thresholds.
+- *Cons:* Still needs careful definition of ROPE bounds and HDI level.
+
+**Bayes Factors**
+- *Idea:* Compare evidence for one model/hypothesis against another by the ratio of their marginal likelihoods.
+- *Pros:* Provides a direct measure of relative evidence; can be viewed as updating prior odds.
+- *Cons:* Sensitive to priors; can be computationally challenging; interpreting BF scales can be tricky.
+
 +++
 
 ## Authors
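The ROPE + HDI entry in the summary above can be made concrete with a short sketch of the three-way decision rule. This is a hand-rolled illustration on synthetic draws, not `arviz`'s implementation: `hdi` below is a simple sorted-window estimator of the narrowest interval containing the requested probability mass.

```python
import numpy as np

def hdi(samples, prob=0.95):
    """Narrowest interval containing `prob` of the draws (sorted-window estimator)."""
    x = np.sort(samples)
    n_in = int(np.ceil(prob * len(x)))
    # Width of every contiguous window holding n_in draws; pick the narrowest.
    widths = x[n_in - 1:] - x[:len(x) - n_in + 1]
    i = np.argmin(widths)
    return x[i], x[i + n_in - 1]

def hdi_rope_decision(samples, rope, prob=0.95):
    """Three-way HDI+ROPE rule: accept the null, reject it, or withhold."""
    lo, hi = hdi(samples, prob)
    if rope[0] <= lo and hi <= rope[1]:
        return "accept null (practically equivalent)"  # HDI entirely inside ROPE
    if hi < rope[0] or lo > rope[1]:
        return "reject null"                           # HDI entirely outside ROPE
    return "withhold decision"                         # HDI and ROPE overlap

# Hypothetical posterior draws well away from zero.
rng = np.random.default_rng(1)
print(hdi_rope_decision(rng.normal(0.5, 0.1, 10_000), rope=[-0.1, 0.1]))  # reject null
```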
