Commit 040af4f

updates to fig 6 text description
1 parent c9fdb09 commit 040af4f

File tree

2 files changed: +33, -3 lines


paper/main.pdf

3.87 KB
Binary file not shown.

paper/main.tex

Lines changed: 33 additions & 3 deletions
@@ -568,15 +568,45 @@ \section*{Results}
Second, we estimated knowledge for each question about one lecture using only questions (from the same participant and quiz) about the \textit{other} lecture (``Across-lecture''; Fig.~\ref{fig:predictions}, middle rows).
This test was intended to assess the \textit{generalizability} of our approach by asking whether our predictions held across the content areas of the two lectures.
Third, we estimated knowledge for each question about a given lecture using only the other questions (from the same participant and quiz) about that \textit{same} lecture (``Within-lecture''; Fig.~\ref{fig:predictions}, bottom rows).
This test was intended to assess the \textit{specificity} of our approach by asking whether our predictions could distinguish between questions about different content covered by the same lecture.
Our null hypothesis in performing these analyses is that the knowledge estimates we compute based on the quiz questions' embedding coordinates do \textit{not} provide useful information about participants' abilities to answer those questions.
What result might we expect to see if this is the case?
To provide an intuition, consider the expected outcome if we carried out these same analyses using a simple proportion-correct measure in lieu of our knowledge estimates.
Suppose a participant correctly answered $n$ out of 13 questions on a given quiz.
If we held out a single correctly answered question and computed the proportion of remaining questions answered correctly, that proportion would be $(n - 1) / 12$.
If we instead held out a single \textit{incorrectly} answered question and did the same, that proportion would be $n / 12$.
Thus, for a given participant and quiz, a ``knowledge estimate'' computed as the simple (i.e., unweighted) remaining proportion correct is perfectly inversely related to success on a held-out question: it will always be \textit{lower} for correctly answered questions than for incorrectly answered questions.
Given that our knowledge estimates are computed as a weighted version of this same proportion-correct score (where each held-in question's weight reflects its embedding-space distance from the held-out question; see Eqn.~\ref{eqn:prop}), if these weights are uninformative (e.g., randomly distributed), then we would expect, on average, to see this same inverse relationship.
It is only if the spatial relationships among the quiz questions' embedding coordinates map onto participants' knowledge in a meaningful way that we would expect this relationship to reverse.
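The held-out computation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the uniform-weight default are hypothetical, and the weighted form merely mirrors the verbal description of Eqn.~\ref{eqn:prop}.

```python
def heldout_knowledge(correct, held_out, weights=None):
    """Proportion of remaining questions answered correctly, optionally
    weighted (e.g., by embedding-space proximity to the held-out question).
    With uniform weights this reduces to the simple proportion correct."""
    others = [j for j in range(len(correct)) if j != held_out]
    if weights is None:
        weights = {j: 1.0 for j in others}  # unweighted (null-hypothesis) case
    total = sum(weights[j] for j in others)
    return sum(weights[j] * correct[j] for j in others) / total

# A participant who answered n = 8 of 13 questions correctly
# (1 = correct, 0 = incorrect):
answers = [1] * 8 + [0] * 5

estimate_for_correct = heldout_knowledge(answers, 0)     # (8 - 1) / 12
estimate_for_incorrect = heldout_knowledge(answers, 12)  # 8 / 12

# Under uninformative weights, the estimate is always lower when the
# held-out question was answered correctly -- the inverse relationship:
assert estimate_for_correct < estimate_for_incorrect
```

Only informative (embedding-derived) weights can break this built-in inversion, which is what the analyses below test.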
When we fit a GLMM to estimates of participants' knowledge for each Quiz~1 question based on all other Quiz~1 questions, we observed the inverse relationship expected under our null hypothesis.
Specifically, higher estimated knowledge at the embedding coordinate of a held-out Quiz~1 question was associated with a lower likelihood of answering the question correctly (odds ratio $(OR) = 0.136$, likelihood-ratio test statistic $(\lambda_{LR}) = 19.749$, 95\% $\textnormal{CI} = [14.352,\ 26.545]$, $p = 0.001$).
However, when we repeated this analysis for Quizzes~2 and~3, the direction of this relationship reversed: higher estimated knowledge for a given question predicted a greater likelihood of answering it correctly (Quiz~2: $OR = 2.905$, $\lambda_{LR} = 17.333$, 95\% $\textnormal{CI} = [14.966,\ 29.309]$, $p = 0.002$; Quiz~3: $OR = 3.238$, $\lambda_{LR} = 6.882$, 95\% $\textnormal{CI} = [6.228,\ 8.184]$, $p = 0.017$).
Taken together, these results suggest that our knowledge estimates can reliably predict participants' likelihood of success on individual quiz questions, provided participants have at least some structured knowledge about the underlying concepts being tested.
In other words, when participants' correct responses primarily arise from knowledge about the content probed by each question (e.g., after watching one or both lectures), these successes can be predicted from their ability to answer other questions about conceptually similar content (as captured by embedding-space distance).
However, when a sufficiently large portion of participants' correct responses (presumably) reflects successful random guessing (such as on a multiple-choice quiz taken before viewing either lecture), our approach fails to predict these successes, since they do not map onto embedding-space distances in a meaningful way.
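To unpack the odds ratios: in a logistic model, each unit increase in the predictor multiplies the odds of a correct answer by $OR$. A toy sketch of that interpretation (illustrative only; this is not the fitted GLMM, and the base odds of 1.0 is an arbitrary assumption):

```python
def scaled_odds(base_odds, odds_ratio, knowledge):
    """Odds of answering correctly under a logistic model: each unit
    increase in estimated knowledge multiplies the odds by odds_ratio."""
    return base_odds * odds_ratio ** knowledge

# Quiz 1 (OR = 0.136 < 1): higher estimated knowledge -> lower odds of success.
assert scaled_odds(1.0, 0.136, 1.0) < scaled_odds(1.0, 0.136, 0.0)

# Quizzes 2 and 3 (OR > 1): higher estimated knowledge -> higher odds of success.
assert scaled_odds(1.0, 2.905, 1.0) > scaled_odds(1.0, 2.905, 0.0)
```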
Taken together, the results of these analyses suggest that our knowledge estimates can reliably predict participants' abilities to answer individual quiz questions, generalize across content areas, and distinguish between questions about similar content, provided that a basic set of assumptions about that content (as described above) holds.

% our approach works when participants have a minimal baseline level of knowledge about content predicted and used to predict
% our approach generalizes when knowledge of content used to predict can be assumed to be a reasonable indicator of knowledge of content predicted
% our approach has enough specificity to distinguish between content within the same lecture when it was just watched -- maybe when people forget a little bit they forget "randomly"?
% two conditions: participants have at least some knowledge of content being tested, and knowledge of content used to predict is good indicator of knowledge of content predicted / participants can be expected to have same level of knowledge for content used to predict and content predicted

For the initial quizzes participants took (prior to watching either lecture),
predicted knowledge tended to be low overall, and relatively
unstructured (Fig.~\ref{fig:predictions}, left column). When we held out
individual questions and predicted their knowledge at the held-out questions'
embedding coordinates, we found no reliable differences in the predictions when
the held-out question had been correctly versus incorrectly answered. This
``null'' effect persisted when we used \textit{all} of the Quiz~1 questions
from a given participant to predict a held-out question (``All questions''; $\U
= 50587,~p = 0.723$), when we used questions from one lecture to predict
knowledge at the embedding coordinate of a held-out question about the
@@ -785,7 +815,7 @@ \section*{Results}
watched prior to taking Quiz~2. This localization is non-trivial: these
knowledge estimates are informed only by the embedded coordinates of the
\textit{quiz questions}, not by the embeddings of either lecture (see
Eqn.~\ref{eqn:rbf-knowledge}). Finally, the knowledge map estimated from Quiz~3
responses shows a second increase in knowledge, localized to the region
surrounding the embedding of the \textit{Birth of Stars} lecture participants
watched immediately prior to taking Quiz~3.
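For intuition about how such maps can be built from question embeddings alone, a radial basis function estimate of the general form below assigns each location in embedding space a proximity-weighted average of question outcomes. This is a sketch only: the symbols $x_i$, $c_i$, and $\sigma$ are illustrative, and the paper's actual Eqn.~\ref{eqn:rbf-knowledge} may differ in detail.

```latex
\[
\hat{k}(x) \;=\;
\frac{\sum_i \exp\!\left(-\lVert x - x_i \rVert^2 / \sigma^2\right) c_i}
     {\sum_i \exp\!\left(-\lVert x - x_i \rVert^2 / \sigma^2\right)},
\]
where $x_i$ is the embedding coordinate of question $i$, $c_i \in \{0, 1\}$
indicates whether it was answered correctly, and $\sigma$ controls the
spatial smoothness of the resulting map.
```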
