Commit 5d1d354

merge jeremy's updates to fig 6 across-lecture results
2 parents: e5a3964 + 0b8ff62

File tree: 2 files changed (+65, -30 lines)


paper/main.pdf (2.63 KB)
Binary file not shown.

paper/main.tex

Lines changed: 65 additions & 30 deletions
@@ -670,37 +670,72 @@ \section*{Results}
 participants' quiz responses to extract meaningful information about both what
 they know and what they do not know.
 
-Finally, when we estimated participants' knowledge for each question about one
+Finally, we estimated participants' knowledge for each question about one
 lecture using their performance on questions (from the same quiz) about the
-\textit{other} lecture, we observed a somewhat different pattern of results.
-Here we found that before viewing either lecture (i.e., on Quiz~1),
-participants' abilities to answer \textit{Four Fundamental Forces}-related
-questions could not be predicted from their responses to \textit{Birth of
-Stars}-related questions ($OR = 1.896,\ 95\%\ \textnormal{CI} = [0.419,\
-9.088],\ \lambda_{LR} = 0.712,\ p = 0.404$), nor could their abilities to
-answer \textit{Birth of Stars}-related questions be predicted from their
-responses to \textit{Four Fundamental Forces}-related questions ($OR = 1.522,\
-95\%\ \textnormal{CI} = [0.332,\ 6.835],\ \lambda_{LR} = 0.286,\ p = 0.611$).
-Similarly, we found that participants' performance on questions about either
-lecture could not be predicted given their responses to questions about the
-other lecture after viewing \textit{Four Fundamental Forces} but before viewing
-\textit{Birth of Stars} (i.e., on Quiz~2; \textit{Four Fundamental Forces}
-questions given \textit{Birth of Stars} questions: $OR = 3.49,\ 95\%\
-\textnormal{CI} = [0.739,\ 12.849],\ \lambda_{LR} = 3.266,\ p = 0.083$;
-\textit{Birth of Stars} questions given \textit{Four Fundamental Forces}
-questions: $OR = 2.199,\ 95\%\ \textnormal{CI} = [0.711,\ 5.623],\
-\lambda_{LR} = 2.304,\ p = 0.141$). Only after viewing \textit{both} lectures
-(i.e., on Quiz~3) did these across-lecture knowledge estimates reliably predict
-participants' success on individual quiz questions (\textit{Four Fundamental
-Forces} questions given \textit{Birth of Stars} questions: $OR = 11.294,\
-95\%\ \textnormal{CI} = [1.375,\ 47.744],\ \lambda_{LR} = 10.396,\ p < 0.001$;
-\textit{Birth of Stars} questions given \textit{Four Fundamental Forces}
-questions: $OR = 7.302,\ 95\%\ \textnormal{CI} = [1.077,\ 44.879],\
-\lambda_{LR} = 4.708,\ p = 0.038$). Taken together, these results suggest that
-our knowledge estimates can be used to predict participants' success across
-content areas once they have received some training on both the content about
-which their knowledge is estimated and the content used to construct these
-estimates.
+\textit{other} lecture. This is an especially stringent test of our approach.
+Our primary assumption in building our knowledge estimates is that knowledge
+about a given concept is similar to knowledge about other concepts that are
+nearby in the embedding space. However, our analyses in Figure~\ref{fig:topics}
+and Supplementary Figure~\topicWeights~show that the embeddings of content from
+the two lectures are largely distinct. Therefore, any predictive power of the
+knowledge estimates must overcome large distances in the embedding space. To
+put this in concrete terms, this test requires predicting participants'
+performance on individual highly specific questions about the formation of
+stars, using each participant's responses to just five multiple-choice
+questions about the fundamental forces of the universe (and vice versa).
+
+We found that, before viewing either lecture (i.e., on Quiz~1), participants'
+abilities to answer \textit{Four Fundamental Forces}-related questions could
+not be predicted from their responses to \textit{Birth of Stars}-related
+questions ($OR = 1.896,\ 95\%\ \textnormal{CI} = [0.419,\ 9.088],\ \lambda_{LR}
+= 0.712,\ p = 0.404$), nor could their abilities to answer \textit{Birth of
+Stars}-related questions be predicted from their responses to \textit{Four
+Fundamental Forces}-related questions ($OR = 1.522,\ 95\%\ \textnormal{CI} =
+[0.332,\ 6.835],\ \lambda_{LR} = 0.286,\ p = 0.611$). Similarly, we found that
+participants' performance on questions about either lecture could not be
+predicted given their responses to questions about the other lecture after
+viewing \textit{Four Fundamental Forces} but before viewing \textit{Birth of
+Stars} (i.e., on Quiz~2; \textit{Four Fundamental Forces} questions given
+\textit{Birth of Stars} questions: $OR = 3.49,\ 95\%\ \textnormal{CI} =
+[0.739,\ 12.849],\ \lambda_{LR} = 3.266,\ p = 0.083$; \textit{Birth of Stars}
+questions given \textit{Four Fundamental Forces} questions: $OR = 2.199,\ 95\%\
+\textnormal{CI} = [0.711,\ 5.623],\ \lambda_{LR} = 2.304,\ p = 0.141$). Only
+after viewing \textit{both} lectures (i.e., on Quiz~3) did these across-lecture
+knowledge estimates reliably predict participants' success on individual quiz
+questions (\textit{Four Fundamental Forces} questions given \textit{Birth of
+Stars} questions: $OR = 11.294,\ 95\%\ \textnormal{CI} = [1.375,\ 47.744],\
+\lambda_{LR} = 10.396,\ p < 0.001$; \textit{Birth of Stars} questions given
+\textit{Four Fundamental Forces} questions: $OR = 7.302,\ 95\%\ \textnormal{CI}
+= [1.077,\ 44.879],\ \lambda_{LR} = 4.708,\ p = 0.038$). Taken together, our
+results suggest that our ability to form estimates solely across different
+content areas is more limited than our ability to form estimates that
+incorporate responses to questions across both content areas (as in
+Fig.~\ref{fig:predictions}, ``All questions'') or within a single content area (as
+in Fig.~\ref{fig:predictions}, ``Within-lecture''). However, if participants have recently
+received some training on both content areas, the knowledge estimates appear to be informative
+even across content areas.
+
+We speculate that these ``Across-lecture'' results might relate to some of our
+earlier work on the nature of semantic representations~\citep{MannKaha12}. In
+that work, we asked whether semantic similarities could be captured through
+behavioral measures, even if participants' ``true'' internal representations
+differed from the embeddings used to \textit{characterize} participants'
+behaviors. We found that mismatches between someone's internal representation
+of a set of concepts and the representation used to characterize their
+behaviors can lead to underestimates of how semantically driven their behaviors
+are. Along similar lines, we suspect that in our current study, participants'
+conceptual representations may initially differ from the representations
+learned by our topic model. (Although the topic models are still
+\textit{related} to participants' initial internal representations; otherwise
+we would have found that knowledge estimates derived from Quiz 1 and 2
+responses would have no predictive power in the other tests we conducted.)
+After watching both lectures, however, participants' internal representations
+may become more aligned with the embeddings used to estimate their knowledge
+(since those embeddings were trained on the lecture transcripts). This could
+help explain why the knowledge estimates derived from Quizzes 1 and 2 (before
+both lectures had been watched) do not reliably predict performance across
+content areas, whereas estimates derived from Quiz 3 \textit{do} reliably
+predict performance across content areas.
 
 That the knowledge predictions derived from the text embedding space reliably
 distinguish between held-out correctly versus incorrectly answered questions
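
For readers who want to see how statistics of the form reported in the added text ($OR$, 95% CI, $\lambda_{LR}$, $p$) are conventionally obtained, the sketch below fits a logistic model predicting per-question correctness from an across-lecture knowledge estimate and runs a likelihood-ratio test against an intercept-only baseline. This is not the authors' analysis code (none is included in this commit): the long-format data layout, the column names, and the single-predictor model with no participant-level terms are all assumptions made purely for illustration, and the paper's actual model may differ.

# Illustrative only; not the authors' code. Assumes one row per participant x question,
# with a binary 'correct' column and a hypothetical 'knowledge_estimate' column holding
# the across-lecture knowledge estimate for that question.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from scipy import stats

def across_lecture_test(df: pd.DataFrame):
    """Return (odds ratio, 95% CI, lambda_LR, p) for predicting per-question
    correctness from the across-lecture knowledge estimate."""
    # Full model: correctness ~ intercept + knowledge estimate
    X = sm.add_constant(df[["knowledge_estimate"]])
    full = sm.Logit(df["correct"], X).fit(disp=0)

    # Null model: intercept only
    null = sm.Logit(df["correct"], np.ones((len(df), 1))).fit(disp=0)

    # Exponentiate the coefficient and its interval to get the odds ratio and 95% CI
    beta = full.params["knowledge_estimate"]
    lo, hi = full.conf_int().loc["knowledge_estimate"]
    odds_ratio, ci = float(np.exp(beta)), (float(np.exp(lo)), float(np.exp(hi)))

    # Likelihood-ratio statistic (lambda_LR) with a chi-squared p-value (1 df)
    lambda_lr = 2.0 * (full.llf - null.llf)
    p = float(stats.chi2.sf(lambda_lr, df=1))
    return odds_ratio, ci, lambda_lr, p

Exponentiating the logistic coefficient and its interval is what yields odds ratios and confidence intervals of the form quoted above, and the likelihood-ratio statistic compares the fitted model against the intercept-only baseline; how the knowledge estimates themselves are constructed from the embedding space is described elsewhere in the paper and is not reproduced here.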
