Commit 0b8ff62

Merge pull request #106 from jeremymanning/master
updated fig 6 description of across-lecture predictions
2 parents: ffecc6c + 8a5bfee

File tree

2 files changed: +65 −30 lines

paper/main.pdf (2.69 KB)

Binary file not shown.

paper/main.tex

Lines changed: 65 additions & 30 deletions
@@ -672,37 +672,72 @@ \section*{Results}
 responses to extract meaningful information about both what they know and what
 they do not know.
 
-Finally, when we estimated participants' knowledge for each question about one
+Finally, we estimated participants' knowledge for each question about one
 lecture using their performance on questions (from the same quiz) about the
-\textit{other} lecture, we observed a somewhat different pattern of results.
-Here we found that before viewing either lecture (i.e., on Quiz~1),
-participants' abilities to answer \textit{Four Fundamental Forces}-related
-questions could not be predicted from their responses to \textit{Birth of
-Stars}-related questions ($OR = 1.896,\ 95\%\ \textnormal{CI} = [0.419,\
-9.088],\ \lambda_{LR} = 0.712,\ p = 0.404$), nor could their abilities to
-answer \textit{Birth of Stars}-related questions be predicted from their
-responses to \textit{Four Fundamental Forces}-related questions ($OR = 1.522,\
-95\%\ \textnormal{CI} = [0.332,\ 6.835],\ \lambda_{LR} = 0.286,\ p = 0.611$).
-Similarly, we found that participants' performance on questions about either
-lecture could not be predicted given their responses to questions about the
-other lecture after viewing \textit{Four Fundamental Forces} but before viewing
-\textit{Birth of Stars} (i.e., on Quiz~2; \textit{Four Fundamental Forces}
-questions given \textit{Birth of Stars} questions: $OR = 3.49,\ 95\%\
-\textnormal{CI} = [0.739,\ 12.849],\ \lambda_{LR} = 3.266,\ p = 0.083$;
-\textit{Birth of Stars} questions given \textit{Four Fundamental Forces}
-questions: $OR = 2.199,\ 95\%\ \textnormal{CI} = [0.711,\ 5.623],\
-\lambda_{LR} = 2.304,\ p = 0.141$). Only after viewing \textit{both} lectures
-(i.e., on Quiz~3) did these across-lecture knowledge estimates reliably predict
-participants' success on individual quiz questions (\textit{Four Fundamental
-Forces} questions given \textit{Birth of Stars} questions: $OR = 11.294,\
-95\%\ \textnormal{CI} = [1.375,\ 47.744],\ \lambda_{LR} = 10.396,\ p < 0.001$;
-\textit{Birth of Stars} questions given \textit{Four Fundamental Forces}
-questions: $OR = 7.302,\ 95\%\ \textnormal{CI} = [1.077,\ 44.879],\
-\lambda_{LR} = 4.708,\ p = 0.038$). Taken together, these results suggest that
-our knowledge estimates can be used to predict participants' success across
-content areas once they have received some training on both the content about
-which their knowledge is estimated and the content used to construct these
-estimates.
+\textit{other} lecture. This is an especially stringent test of our approach.
+Our primary assumption in building our knowledge estimates is that knowledge
+about a given concept is similar to knowledge about other concepts that are
+nearby in the embedding space. However, our analyses in Figure~\ref{fig:topics}
+and Supplementary Figure~\topicWeights~show that the embeddings of content from
+the two lectures are largely distinct. Therefore, any predictive power of the
+knowledge estimates must overcome large distances in the embedding space. To
+put this in concrete terms, this test requires predicting participants'
+performance on individual highly specific questions about the formation of
+stars, using each participant's responses to just five multiple-choice
+questions about the fundamental forces of the universe (and vice versa).
+
+We found that, before viewing either lecture (i.e., on Quiz~1), participants'
+abilities to answer \textit{Four Fundamental Forces}-related questions could
+not be predicted from their responses to \textit{Birth of Stars}-related
+questions ($OR = 1.896,\ 95\%\ \textnormal{CI} = [0.419,\ 9.088],\ \lambda_{LR}
+= 0.712,\ p = 0.404$), nor could their abilities to answer \textit{Birth of
+Stars}-related questions be predicted from their responses to \textit{Four
+Fundamental Forces}-related questions ($OR = 1.522,\ 95\%\ \textnormal{CI} =
+[0.332,\ 6.835],\ \lambda_{LR} = 0.286,\ p = 0.611$). Similarly, we found that
+participants' performance on questions about either lecture could not be
+predicted given their responses to questions about the other lecture after
+viewing \textit{Four Fundamental Forces} but before viewing \textit{Birth of
+Stars} (i.e., on Quiz~2; \textit{Four Fundamental Forces} questions given
+\textit{Birth of Stars} questions: $OR = 3.49,\ 95\%\ \textnormal{CI} =
+[0.739,\ 12.849],\ \lambda_{LR} = 3.266,\ p = 0.083$; \textit{Birth of Stars}
+questions given \textit{Four Fundamental Forces} questions: $OR = 2.199,\ 95\%\
+\textnormal{CI} = [0.711,\ 5.623],\ \lambda_{LR} = 2.304,\ p = 0.141$). Only
+after viewing \textit{both} lectures (i.e., on Quiz~3) did these across-lecture
+knowledge estimates reliably predict participants' success on individual quiz
+questions (\textit{Four Fundamental Forces} questions given \textit{Birth of
+Stars} questions: $OR = 11.294,\ 95\%\ \textnormal{CI} = [1.375,\ 47.744],\
+\lambda_{LR} = 10.396,\ p < 0.001$; \textit{Birth of Stars} questions given
+\textit{Four Fundamental Forces} questions: $OR = 7.302,\ 95\%\ \textnormal{CI}
+= [1.077,\ 44.879],\ \lambda_{LR} = 4.708,\ p = 0.038$). Taken together, our
+results suggest that our ability to form estimates solely across different
+content areas is more limited than our ability to form estimates that
+incorporate responses to questions across both content areas (as in
+Fig.~\ref{fig:predictions}, ``All questions'') or within a single content area
+(as in Fig.~\ref{fig:predictions}, ``Within-lecture''). However, if
+participants have recently received some training on both content areas, the
+knowledge estimates appear to be informative even across content areas.
+
+We speculate that these ``Across-lecture'' results might relate to some of our
+earlier work on the nature of semantic representations~\citep{MannKaha12}. In
+that work, we asked whether semantic similarities could be captured through
+behavioral measures, even if participants' ``true'' internal representations
+differed from the embeddings used to \textit{characterize} participants'
+behaviors. We found that mismatches between someone's internal representation
+of a set of concepts and the representation used to characterize their
+behaviors can lead to underestimates of how semantically driven their behaviors
+are. Along similar lines, we suspect that in our current study, participants'
+conceptual representations may initially differ from the representations
+learned by our topic model. (The topic models must nonetheless be
+\textit{related} to participants' initial internal representations; otherwise,
+knowledge estimates derived from Quiz~1 and Quiz~2 responses would have had no
+predictive power in the other tests we conducted.) After watching both
+lectures, however, participants' internal representations may become more
+aligned with the embeddings used to estimate their knowledge (since those
+embeddings were trained on the lecture transcripts). This could help explain
+why the knowledge estimates derived from Quizzes~1 and~2 (before both lectures
+had been watched) do not reliably predict performance across content areas,
+whereas estimates derived from Quiz~3 \textit{do} reliably predict performance
+across content areas.
 
 That the knowledge predictions derived from the text embedding space reliably
 distinguish between held-out correctly versus incorrectly answered questions
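
A note on the assumption highlighted in the new first paragraph: the knowledge estimates rest on the idea that a participant's accuracy generalizes to questions that are nearby in the topic-embedding space. The sketch below illustrates that idea in its simplest form; it is a hypothetical illustration rather than the paper's code, and the function and variable names are invented.

# Hypothetical sketch (not the paper's actual code): estimate a participant's
# knowledge of a held-out question as a similarity-weighted average of their
# accuracy on observed questions, assuming knowledge generalizes to nearby
# points in the topic-embedding space.
import numpy as np

def estimate_knowledge(heldout_emb, observed_embs, observed_acc):
    # Cosine similarity between the held-out question and each observed question
    a = heldout_emb / np.linalg.norm(heldout_emb)
    b = observed_embs / np.linalg.norm(observed_embs, axis=1, keepdims=True)
    weights = np.clip(b @ a, 0.0, None)   # ignore dissimilar (negative-similarity) questions
    if weights.sum() == 0:                # nothing nearby: fall back to overall accuracy
        return float(observed_acc.mean())
    return float(np.average(observed_acc, weights=weights))

# Toy example: three observed questions in a four-dimensional topic space
observed_embs = np.array([[0.9, 0.1, 0.0, 0.0],
                          [0.2, 0.7, 0.1, 0.0],
                          [0.0, 0.1, 0.8, 0.1]])
observed_acc = np.array([1.0, 1.0, 0.0])
print(estimate_knowledge(np.array([0.8, 0.2, 0.0, 0.0]), observed_embs, observed_acc))

In the across-lecture test described in the revised text, the observed questions all come from the other lecture, so the similarities (and therefore the weights) are small; that is what makes the test especially stringent.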
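The reported statistics (OR, 95% CI, lambda_LR, and p) have the form produced by a logistic regression of question-level accuracy on the knowledge estimates, with a likelihood-ratio test against an intercept-only model. The following is a minimal sketch of that kind of computation on simulated data; it is an assumption about the general recipe, not the paper's analysis code, and it omits any participant-level (mixed) effects the actual model may include.

# Hypothetical sketch (not the paper's analysis code): relate knowledge
# estimates to question-level accuracy with a logistic regression, then report
# an odds ratio, 95% CI, and likelihood-ratio test in the style of the text.
# Simulated data; the real analysis may include participant-level effects.
import numpy as np
import statsmodels.api as sm
from scipy import stats

rng = np.random.default_rng(0)
n = 200
knowledge = rng.uniform(0, 1, n)                        # per-question knowledge estimates
p_correct = 1 / (1 + np.exp(-(2.0 * knowledge - 1.0)))  # simulated true relationship
correct = rng.binomial(1, p_correct)                    # 1 = answered correctly

full = sm.Logit(correct, sm.add_constant(knowledge)).fit(disp=False)  # accuracy ~ knowledge
null = sm.Logit(correct, np.ones((n, 1))).fit(disp=False)             # intercept-only model

odds_ratio = np.exp(full.params[1])
ci_low, ci_high = np.exp(full.conf_int()[1])     # 95% CI on the odds ratio
lambda_lr = 2 * (full.llf - null.llf)            # likelihood-ratio statistic
p_value = stats.chi2.sf(lambda_lr, df=1)

print(f"OR = {odds_ratio:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}], "
      f"lambda_LR = {lambda_lr:.3f}, p = {p_value:.3f}")

Here the odds ratio is exp(beta) for the knowledge-estimate coefficient, and lambda_LR is twice the difference between the full and intercept-only log-likelihoods, compared against a chi-squared distribution with one degree of freedom.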
