\subsubsection*{Generalized linear mixed models}\label{subsec:glmm}

Fundamental Forces}, \textit{Birth of Stars}, or general physics knowledge.
Note that with our coding scheme, identifiers for each \texttt{question} are
implicitly nested within levels of \texttt{lecture} and do not require explicit
nesting in our model formula. We then iteratively removed random effects from
the maximal model until it successfully converged with a full rank (i.e., non-singular)
random effects variance-covariance matrix.
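The iterative simplification described above can be sketched as follows. This is an illustrative sketch, not the paper's actual code: \texttt{fits\_nonsingular} is a hypothetical stand-in for re-fitting the GLMM and checking whether its random-effects variance-covariance matrix is full rank.

```python
# Illustrative sketch (not the paper's pipeline): drop random-effect
# terms from a maximal model specification, one at a time, until the
# model "converges" with a full-rank random-effects covariance matrix.
# `fits_nonsingular` is a hypothetical callback standing in for
# re-fitting the GLMM and inspecting its variance-covariance matrix.

def simplify_random_effects(terms, fits_nonsingular):
    """Remove random-effect terms from the end of `terms` until
    `fits_nonsingular(terms)` reports a non-degenerate fit."""
    terms = list(terms)
    while terms and not fits_nonsingular(terms):
        terms.pop()  # drop the last remaining term and try again
    return terms

# Toy stand-in: pretend the fit is only non-singular once the two
# random-slope terms have been removed.
maximal = ["(1 | participant)", "(1 | question)",
           "(condition | participant)", "(condition | question)"]
kept = simplify_random_effects(maximal, lambda t: len(t) <= 2)
print(kept)  # → ['(1 | participant)', '(1 | question)']
```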
%% JRM NOTE: do we need this next paragraph? Commenting out for now...
% When inspection of the model's random effect estimates revealed multiple terms estimated at the boundary of their parameter space (i.e., variance components of 0 or correlation terms of $\pm 1$), we found that the order in which we eliminated these terms typically did not affect which terms needed to be removed for the model to converge to a non-degenerate solution.
To assess the predictive value of our knowledge estimates, we compared each
GLMM's ability to discriminate between correctly and incorrectly answered
questions to that of an analogous model that did \textit{not} consider estimated
knowledge. Specifically, we used the same sets of observations with which we
fit each ``full'' model to fit a second ``null'' model, with the formula:
where ``\texttt{accuracy}'', ``\texttt{participant}'', and ``\texttt{question}'' are as defined above.
As with our full models, the null models we fit for the ``All questions'' version of the analysis for each quiz contained an additional term, $\mathtt{(1\ \vert\ lecture)}$, where ``\texttt{lecture}'' is as defined above.
We then compared each full model to its reduced (null) equivalent using a likelihood-ratio test (LRT).
Because the typical asymptotic $\chi^2_d$ approximation of the null distribution for the LRT statistic ($\lambda_{LR}$) is anti-conservative for models that differ in their random slope terms~\citep{GoldSimo00,ScheEtal08b,SnijBosk11}, we computed $p$-values for these tests using a parametric bootstrapping procedure~\citep{HaleHojs14}.
For each of 1,000 bootstraps, we used the fitted null model to simulate a sample of observations of equal size to our original sample.
We then re-fit both the null and full models to this simulated sample and compared them via an LRT.
This yielded a distribution of $\lambda_{LR}$ statistics we would expect to observe under our null hypothesis.
Following~\citet{DaviHink97,NortEtal02}, we computed a corrected $p$-value for our observed $\lambda_{LR}$ as $\frac{r + 1}{n + 1}$, where $r$ is the number of simulated model comparisons that yielded a $\lambda_{LR}$ greater than or equal to our observed value and $n$ is the number of simulations we ran (1,000).
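The bootstrap correction described above amounts to the following sketch. The log-likelihoods and the simulated $\lambda_{LR}$ values here are made-up toy numbers, not our actual fits; in practice each simulated statistic comes from re-fitting both models to data simulated from the fitted null model.

```python
# Sketch of the bootstrapped LRT p-value with hypothetical numbers.

def lrt_statistic(loglik_null, loglik_full):
    # lambda_LR = 2 * (log L_full - log L_null)
    return 2.0 * (loglik_full - loglik_null)

def corrected_p(observed, simulated):
    # p = (r + 1) / (n + 1), where r counts simulated statistics at
    # least as extreme as the observed one and n is the number of sims.
    r = sum(1 for lam in simulated if lam >= observed)
    return (r + 1) / (len(simulated) + 1)

observed = lrt_statistic(-1024.3, -1017.9)  # ~12.8 (toy log-likelihoods)
simulated = [0.4, 2.1, 5.7, 9.3, 13.5]      # stand-in for 1,000 sims
print(corrected_p(observed, simulated))     # → 0.3333333333333333
```

Note that the $+1$ in numerator and denominator guarantees the reported $p$-value is never exactly zero, which would be an impossible claim from a finite number of simulations.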
\subsubsection*{Estimating the ``smoothness'' of knowledge}\label{subsec:smoothness}