inference-many-means.qmd
4 additions & 4 deletions
@@ -86,7 +86,7 @@ Investigating groups IV, V, and VI, we see the differences in the groups' center
 #| Two sets of side by side dot plots. The first set shows three
 #| groups of observations where the variability within a group is so large
 #| that it swamps out any variability across the groups. The second set
-#| shows three groups of observations where the variabilit within a group is
+#| shows three groups of observations where the variability within a group is
 #| much smaller and the center of the groups appears different.
 #| fig-asp: 0.5
 toy_anova |>
@@ -385,7 +385,7 @@ classdata |>
 #| label: fig-boxplotThreeVersionsOfExams
 #| fig-cap: Exam scores for students given one of three different exams.
 #| fig-alt: |
-#| Side-by-side box plots of exam score boken down by exam A, exam B, or
+#| Side-by-side box plots of exam score broken down by exam A, exam B, or
 #| exam C. Exam C's median is above 80 which is higher than exam A with a median
 #| around 74 and exam B with a median around 72.
 classdata |>
@@ -507,7 +507,7 @@ While it is temping to say that exam C is harder than the other two (given the i
 When the null hypothesis is true, random variability that exists in nature sometimes produces data with p-values less than 0.05.
 How often does that happen?
 5% of the time.
-That is to say, if you use 20 different models applied to the same data where there is no signal (i.e., the null hypothesis is true), you are reasonably likely to to get a p-value less than 0.05 in one of the tests you run.
+That is to say, if you use 20 different models applied to the same data where there is no signal (i.e., the null hypothesis is true), you are reasonably likely to get a p-value less than 0.05 in one of the tests you run.
 The details surrounding the ideas of this problem, called a **multiple comparisons test** or **multiple comparisons problem**, are outside the scope of this textbook, but should be something that you keep in the back of your head.
 To best mitigate any extra Type I errors, we suggest that you set up your hypotheses and testing protocol before running any analyses.
 Once the conclusions have been reached, you should report your findings instead of running a different type of test on the same data.
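
Note on the hunk above: the "reasonably likely" claim for 20 tests can be checked with a short calculation. The sketch below is not part of the commit. It assumes the 20 tests are independent, which is a simplification (different models fit to the same data are usually correlated), so the number is only a rough guide.

```python
# Chance of seeing at least one p-value below alpha across m tests when
# every null hypothesis is true.
# ASSUMPTION (not from the source): the m tests are independent.
alpha = 0.05  # significance level used in the text
m = 20        # number of tests run on data with no signal

# P(no test falls below alpha) = (1 - alpha)^m, so take the complement.
p_at_least_one = 1 - (1 - alpha) ** m

# Expected count of tests that fall below alpha purely by chance.
expected_false_positives = alpha * m

print(f"P(at least one p < {alpha}) = {p_at_least_one:.3f}")
print(f"Expected false positives    = {expected_false_positives:.1f}")
```

Under that independence assumption the chance of at least one spurious result is roughly 64%, with one false positive expected on average, which is consistent with the wording in the changed line.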
@@ -536,7 +536,7 @@ If $H_0$ is true and the model conditions are satisfied, an $F$-statistic follow
 :::
 
 ::: {.guidedpractice data-latex=""}
-For the baseball data, $MSG = 0.00803$ and $MSE=0.00158.$ Identify the degrees of freedom associated with MSG and MSE and verify the $F$-statistic is approximately 5.077.[^22-inference-many-means-5]
+For the baseball data, $MSG = 0.00803$ and $MSE=0.00158$. Identify the degrees of freedom associated with MSG and MSE and verify the $F$-statistic is approximately 5.077.[^22-inference-many-means-5]
 :::
 
 [^22-inference-many-means-5]: There are $k = 3$ groups, so $df_{G} = k - 1 = 2.$ There are $n = n_1 + n_2 + n_3 = 429$ total observations, so $df_{E} = n - k = 426.$ Then the $F$-statistic is computed as the ratio of $MSG$ and $MSE:$ $F = \frac{MSG}{MSE} = \frac{0.00803}{0.00158} = 5.082 \approx 5.077.$ $(F = 5.077$ was computed by using values for $MSG$ and $MSE$ that were not rounded.)
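
Note on the hunk above: the arithmetic in the footnote can be reproduced in a few lines. This is a minimal sketch and not part of the commit. It uses only the rounded values quoted in the guided practice and footnote; the variable names are illustrative.

```python
# Values quoted in the guided practice and its footnote (rounded).
msg = 0.00803  # mean square between groups, MSG
mse = 0.00158  # mean square error, MSE
k = 3          # number of groups
n = 429        # total number of observations

df_g = k - 1        # degrees of freedom for MSG
df_e = n - k        # degrees of freedom for MSE
f_stat = msg / mse  # F-statistic as the ratio MSG / MSE

print(f"df_G = {df_g}, df_E = {df_e}, F = {f_stat:.3f}")
```

The ratio of the rounded inputs comes out near 5.082 rather than 5.077. The footnote attributes that gap to rounding in $MSG$ and $MSE$.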