
Commit f77c435

finished GLMM methods section
1 parent 9aeff7a

2 files changed: +12, -90 lines

paper/main.pdf (4.24 KB): binary file not shown.

paper/main.tex

Lines changed: 12 additions & 90 deletions
@@ -1502,11 +1502,9 @@ \subsubsection*{Generalized linear mixed models}\label{subsec:glmm}
 Fundamental Forces}, \textit{Birth of Stars}, or general physics knowledge.
 Note that with our coding scheme, identifiers for each \texttt{question} are
 implicitly nested within levels of \texttt{lecture} and do not require explicit
-nesting in our model formula.
-
-% We then iteratively removed random effects from the maximal model until it
-% successfully converged with a full rank (i.e., non-singular) random effects
-% variance-covariance matrix.
+nesting in our model formula. We then iteratively removed random effects from
+the maximal model until it successfully converged with a full rank (i.e., non-singular)
+random effects variance-covariance matrix.
 
 %% JRM NOTE: do we need this next paragraph? Commenting out for now...
 % %When inspecting the model's random effect estimates revealed multiple terms estimated at the boundary of their parameter space (i.e., variance components of 0 or correlation terms of $\pm 1$), we found that the order in which we eliminated these terms typically did not affect which terms did and did not need to be removed in order for the model to converge to a non-degenerate solution.
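The sentence added in this hunk describes a maximal-model reduction workflow: fit the most richly specified random-effects structure, then prune random-effects terms until the model converges to a non-singular solution. The paper does not name its software, but the model formulas use lme4-style syntax, so a minimal R sketch under that assumption follows; the data frame quiz_data and the particular simplification path are hypothetical stand-ins rather than the authors' actual code.

library(lme4)

# Candidate formulas, ordered from maximal to most reduced. Because question
# identifiers are coded uniquely across lectures, (1 | question) is already
# implicitly nested within lecture and needs no explicit lecture:question term.
candidates <- c(
  accuracy ~ knowledge + (1 + knowledge | participant) + (1 + knowledge | question),
  accuracy ~ knowledge + (1 + knowledge | participant) + (1 | question),
  accuracy ~ knowledge + (1 | participant) + (1 | question)
)

# Fit each candidate in turn, keeping the first fit whose random-effects
# variance-covariance matrix is full rank (i.e., non-singular).
for (f in candidates) {
  full <- glmer(f, data = quiz_data, family = binomial)
  if (!isSingular(full)) break
}

Here isSingular() flags exactly the degenerate solutions the commented-out note above describes: variance components estimated at 0 or correlation terms at ±1.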
@@ -1531,96 +1529,20 @@ \subsubsection*{Generalized linear mixed models}\label{subsec:glmm}
 
 To assess the predictive value of our knowledge estimates, we compared each
 GLMM's ability to discriminate between correctly and incorrectly answered
-questions to that of an analogous model that did not consider estimated
+questions to that of an analogous model that did \textit{not} consider estimated
 knowledge. Specifically, we used the same sets of observations with which we
 fit each ``full'' model to fit a second ``null'' model, with the formula:
 \[
 \mathtt{accuracy \sim (1\ \vert\ participant) + (1\ \vert\ question)}
 \]
-where the terms are as defined above.
-
-
-
-
-
-
-
-
-
-%In order to assess the predictive value of the knowledge estimates, we then fit a second set of ``null'' models to the same sets of observations used to fit our full GLMMs. These
-%
-%
-%used the same sets of observations used to fit these GLMMs to fit a second set of ``null'' models.
-%
-%
-%
-%Next, in order to assess the predictive value of the knowledge estimates
-%
-%
-%
-%In order to assess the predictive value of the knowledge estimates we used to fit each GLMM, we then fit a \textit{second} model to the same data as each of the 15
-%
-%In order to assess whether the knowledge estimates we used to fit each GLMM could reliably predict participants' success on held-out questions, we then fit a second GLMM to the observations
-
-
-%$$
-%\begin{aligned}
-% \operatorname{accuracy}_{i} &\sim \operatorname{Binomial}(n = 1, \operatorname{prob}_{\operatorname{accuracy} = \operatorname{correct}} = \widehat{P}) \\
-% \log\left[\frac{\hat{P}}{1 - \hat{P}} \right] &=\alpha_{j[i],k[i]} \\
-%\left(
-% \begin{array}{c}
-% \begin{aligned}
-% &\alpha_{j} \\
-% &\gamma_{1j}
-% \end{aligned}
-% \end{array}
-%\right)
-% &\sim N \left(
-%\left(
-% \begin{array}{c}
-% \begin{aligned}
-% &\gamma_{0}^{\alpha} + \gamma_{1k[i]}^{\alpha}(\operatorname{knowledge}) \\
-% &\mu_{\gamma_{1j}}
-% \end{aligned}
-% \end{array}
-%\right)
-%,
-%\left(
-% \begin{array}{cc}
-% \sigma^2_{\alpha_{j}} & \rho_{\alpha_{j}\gamma_{1j}} \\
-% \rho_{\gamma_{1j}\alpha_{j}} & \sigma^2_{\gamma_{1j}}
-% \end{array}
-%\right)
-% \right)
-% \text{, for participant j = 1,} \dots \text{,J} \\
-%\left(
-% \begin{array}{c}
-% \begin{aligned}
-% &\alpha_{k} \\
-% &\gamma_{1k}
-% \end{aligned}
-% \end{array}
-%\right)
-% &\sim N \left(
-%\left(
-% \begin{array}{c}
-% \begin{aligned}
-% &\mu_{\alpha_{k}} \\
-% &\mu_{\gamma_{1k}}
-% \end{aligned}
-% \end{array}
-%\right)
-%,
-%\left(
-% \begin{array}{cc}
-% \sigma^2_{\alpha_{k}} & \rho_{\alpha_{k}\gamma_{1k}} \\
-% \rho_{\gamma_{1k}\alpha_{k}} & \sigma^2_{\gamma_{1k}}
-% \end{array}
-%\right)
-% \right)
-% \text{, for question k = 1,} \dots \text{,K}
-%\end{aligned}
-%$$
+where ``\texttt{accuracy}'', ``\texttt{participant}'', and ``\texttt{question}'' are as defined above.
+As with our full models, the null models we fit for the ``All questions'' version of the analysis for each quiz contained an additional term, $\mathtt{(1\ \vert\ lecture)}$, where ``\texttt{lecture}'' is as defined above.
+We then compared each full model to its reduced (null) equivalent using a likelihood-ratio test (LRT).
+Because the typical asymptotic $\chi^2_d$ approximation of the null distribution for the LRT statistic ($\lambda_{LR}$) is anti-conservative for models that differ in their random slope terms~\citep{GoldSimo00,ScheEtal08b,SnijBosk11}, we computed $p$-values for these tests using a parametric bootstrapping procedure~\citep{HaleHojs14}.
+For each of 1,000 bootstraps, we used the fitted null model to simulate a sample of observations equal in size to our original sample.
+We then re-fit both the null and full models to this simulated sample and compared them via an LRT.
+This yielded a distribution of $\lambda_{LR}$ statistics that we would expect to observe under our null hypothesis.
+Following~\citep{DaviHink97,NortEtal02}, we computed a corrected $p$-value for our observed $\lambda_{LR}$ as $\frac{r + 1}{n + 1}$, where $r$ is the number of simulated model comparisons that yielded a $\lambda_{LR}$ greater than or equal to our observed value and $n$ is the number of simulations we ran (1,000).
 
 \subsubsection*{Estimating the ``smoothness'' of knowledge}\label{subsec:smoothness}
 
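The model-comparison procedure in this hunk's added lines (fit a null model, run a likelihood-ratio test, and bootstrap the null distribution with a (r + 1)/(n + 1) correction) can be sketched under the same lme4 assumption. If the HaleHojs14 citation is Halekoh & Højsgaard's (2014) pbkrtest paper, then PBmodcomp(full, null, nsim = 1000) packages the whole test; the explicit loop below spells out the same logic. As before, quiz_data is a hypothetical stand-in, and full is the converged full model from the earlier sketch.

library(lme4)

# Null model: the same observations as the full model, but no knowledge terms.
# (For the "All questions" analyses, both formulas would carry the additional
# (1 | lecture) term, as the added text notes.)
null <- glmer(accuracy ~ (1 | participant) + (1 | question),
              data = quiz_data, family = binomial)

# Observed likelihood-ratio statistic, lambda_LR.
lr_obs <- as.numeric(2 * (logLik(full) - logLik(null)))

# Parametric bootstrap of the null distribution: simulate responses from the
# fitted null model, refit both models to each simulated sample of the same
# size as the original, and recompute the LRT statistic.
set.seed(1)
n_sim <- 1000
lr_sim <- replicate(n_sim, {
  y_sim  <- simulate(null)[[1]]
  null_b <- refit(null, y_sim)
  full_b <- refit(full, y_sim)
  as.numeric(2 * (logLik(full_b) - logLik(null_b)))
})

# Corrected p-value: (r + 1) / (n + 1), where r is the number of simulated
# statistics greater than or equal to the observed one.
r <- sum(lr_sim >= lr_obs)
p_boot <- (r + 1) / (n_sim + 1)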
