
Commit c9fdb09

updated fig 6 caption
1 parent 0195543 commit c9fdb09

File tree

2 files changed: +17 additions, -40 deletions

paper/main.pdf

230 Bytes (binary file not shown)

paper/main.tex

Lines changed: 17 additions & 40 deletions
@@ -543,55 +543,32 @@ \section*{Results}
 content tested by a given question, our estimates of their knowledge should
 carry some predictive information about whether they are likely to answer that
 question correctly or incorrectly. We developed a statistical approach to test this claim.
-For each quiz question a participant answered, in turn, we used Equation~\ref{eqn:prop} to estimate their knowledge at the given question's embedding space coordinate, based on other questions that participant answered on the same quiz.
+For each quiz question a participant answered, in turn, we used Equation~\ref{eqn:prop} to estimate their knowledge at the given question's embedding space coordinate based on other questions that participant answered on the same quiz.
 We repeated this for all participants, and for each of the three quizzes.
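The estimation step described here can be sketched as a leave-one-out, distance-weighted average of correctness in embedding space. Equation~\ref{eqn:prop} is not reproduced in this diff, so the Gaussian kernel, the bandwidth, and every name below are illustrative assumptions rather than the paper's actual formula:

```python
import numpy as np

def estimate_knowledge(target_coord, other_coords, other_correct, bandwidth=1.0):
    """Estimate knowledge at one question's embedding coordinate as a
    Gaussian-weighted average of correctness (0/1) on the other questions.
    Illustrative stand-in only; the paper's Equation (eqn:prop) may differ.
    """
    dists = np.linalg.norm(other_coords - target_coord, axis=1)
    weights = np.exp(-0.5 * (dists / bandwidth) ** 2)
    return float(np.sum(weights * other_correct) / np.sum(weights))

# Leave-one-out over one participant's quiz: estimate knowledge at each
# question's coordinate from the remaining questions on the same quiz.
coords = np.array([[0.0, 0.0], [0.1, 0.0], [3.0, 3.0]])  # hypothetical embeddings
correct = np.array([1, 1, 0])                            # hypothetical responses
estimates = [
    estimate_knowledge(coords[i],
                       np.delete(coords, i, axis=0),
                       np.delete(correct, i),
                       bandwidth=1.0)
    for i in range(len(coords))
]
```

Questions near correctly answered neighbors receive estimates near 1; questions isolated from any correct answers receive estimates near 0.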
-Then, separately for each quiz, we fit a generalized linear mixed model (GLMM) with a logistic link function to explain the likelihood of correctly answering a question as a function of estimated knowledge for its embedding coordinate, while accounting for random variation among participants and questions (see Sec.~\nameref{subsec:glmm}).
-To assess the predictive value of the knowledge estimates, we performed likelihood-ratio tests comparing each GLMM to an analogous (i.e., nested) ``null'' model that did not consider estimated knowledge.
+Then, separately for each quiz, we fit a generalized linear mixed model (GLMM) with a logistic link function to explain the likelihood of correctly answering a question as a function of estimated knowledge for its embedding coordinate, while accounting for random variation among participants and questions (see \nameref{subsec:glmm}).
+To assess the predictive value of the knowledge estimates, we compared each GLMM to an analogous (i.e., nested) ``null'' model that did not consider estimated knowledge using parametric bootstrap likelihood-ratio tests.
 
 \begin{figure}[tp]
 \centering
+% TODO: adjust width to max possible based on finalized caption
 \includegraphics[width=0.7\textwidth]{figs/predict-knowledge-questions}
-
-\caption{\textbf{Predicting knowledge at the embedding coordinates of
-held-out questions.} Separately for each quiz (column), we plot the
-distributions of predicted knowledge at the embedding coordinates of each
-held-out correctly (blue) or incorrectly (red) answered question. The
-Mann-Whitney $\U$-tests reported in each panel are between the
-distributions of predicted knowledge at the coordinates of correctly and
-incorrectly answered held-out questions. In the top row (``All
-questions''), we used all quiz questions (from each quiz, for each
-participant) except one to predict knowledge at the held-out question's
-embedding coordinate. In the middle rows (``Across-lecture''), we used all
-questions about one lecture to predict knowledge at the embedding
-coordinate of a held-out question about the \textit{other} lecture. In the
-bottom row (``Within-lecture''), we used all but one question about one
-lecture to predict knowledge at the embedding coordinate of a held-out
-question about the \textit{same} lecture. We repeated each of these
-analyses using all possible held-out questions for each quiz and
-participant. The arrows at the tops of each panel indicate whether the
-average predicted knowledge was higher for held-out correctly answered
-(left) or incorrectly answered (right) questions.}
+\caption{\textbf{Predicting success on held-out questions using estimated knowledge.}
+We used generalized linear mixed models (GLMMs) to model the likelihood of correctly answering a quiz question as a function of estimated knowledge for its embedding coordinate (see \nameref{subsec:glmm}).
+Separately for each quiz (column), we examined this relationship based on three different sets of knowledge estimates: knowledge for each question based on all other questions the same participant answered on the same quiz (``All questions''; top row), knowledge for each question about one lecture based on all questions (from the same participant and quiz) about the \textit{other} lecture (``Across-lecture''; middle rows), and knowledge for each question about one lecture based on all other questions (from the same participant and quiz) about the \textit{same} lecture (``Within-lecture''; bottom rows).
+The background in each panel displays the relative density of observed correctly (blue) versus incorrectly (red) answered questions over the range of knowledge estimates.
+The black curves display the (population-level) GLMM-predicted probabilities of correctly answering a question as a function of estimated knowledge.
+Error ribbons denote 95\% confidence intervals.}
 
 \label{fig:predictions}
 \end{figure}
 
-We carried out these analyses in three different ways. First, we used all (but one) of
-the questions from a given quiz (and participant) to predict knowledge at the
-embedding coordinate of a held-out question (``All questions'' in
-Fig.~\ref{fig:predictions}). This test was intended to serve as an overall
-baseline for the predictive power of our approach. Second, we used questions
-about one lecture to predict knowledge at the embedding coordinate of a held-out
-question about the \textit{other} lecture, from the same quiz and participant
-(``Across-lecture'' in Fig.~\ref{fig:predictions}). This test was intended to
-test the \textit{generalizability} of our approach by asking whether our
-knowledge predictions held across the content areas of the two lectures. Third,
-we used questions about one lecture to predict knowledge at the embedding
-coordinate of a held-out question about the \textit{same} lecture, from the same
-quiz and participant (``Within-lecture'' in Fig.~\ref{fig:predictions}). This
-test was intended to test the \textit{specificity} of our approach by asking
-whether our knowledge predictions could distinguish between questions about
-different content covered by the same lecture. We repeated each of these
-analyses using all possible held-out questions for each quiz and participant.
+We carried out three different versions of the analyses described above, wherein we considered different sources of information in our estimates of participants' knowledge for each quiz question.
+First, we estimated knowledge at each question's embedding coordinate using \textit{all other} questions answered by the same participant on the same quiz (``All questions''; Fig.~\ref{fig:predictions}, top row).
+This test was intended to assess the overall predictive power of our approach.
+Second, we estimated knowledge for each question about one lecture using only questions (from the same participant and quiz) about the \textit{other} lecture (``Across-lecture''; Fig.~\ref{fig:predictions}, middle rows).
+This test was intended to assess the \textit{generalizability} of our approach by asking whether our predictions held across the content areas of the two lectures.
+Third, we estimated knowledge for each question about a given lecture using only the other questions (from the same participant and quiz) about that \textit{same} lecture (``Within-lecture''; Fig.~\ref{fig:predictions}, bottom rows).
+This test was intended to assess the \textit{specificity} of our approach by asking whether our predictions could distinguish between questions about different content covered by the same lecture.
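The three held-out estimation schemes in the rewritten paragraph differ only in which questions inform each estimate. A minimal sketch of the corresponding index sets, with hypothetical lecture labels and question counts:

```python
import numpy as np

# Which lecture each of one participant's quiz questions covers
# (labels and counts are hypothetical).
lecture = np.array(["A", "A", "A", "B", "B", "B"])
n = len(lecture)

def source_questions(i, scheme):
    """Indices of the questions used to estimate knowledge at held-out
    question i under each of the three analysis schemes."""
    idx = np.arange(n)
    if scheme == "all":       # every other question on the same quiz
        return idx[idx != i]
    if scheme == "across":    # all questions about the *other* lecture
        return idx[lecture != lecture[i]]
    if scheme == "within":    # other questions about the *same* lecture
        return idx[(lecture == lecture[i]) & (idx != i)]
    raise ValueError(f"unknown scheme: {scheme}")
```

Repeating this over every question, participant, and quiz yields the held-out estimates that feed the GLMMs.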
 
 For the initial quizzes participants took (prior to watching either lecture),
 predicted knowledge tended to be low overall, and relatively
