 quantify the content of each moment of video and each quiz question. We use
 these embeddings, along with participants' quiz responses, to track how the
 learners' knowledge changed after watching each video. Our findings show how a
-small set of quiz questions may be used to obtain rich and meaningful, high-resolution
-insights into what each learner knows, and how their knowledge
+small set of quiz questions may be used to obtain rich and meaningful
+high-resolution insights into what each learner knows, and how their knowledge
 changes over time as they learn.

 \textbf{Keywords: education, learning, knowledge, concepts, natural language processing}
@@ -252,7 +252,7 @@ \section*{Results}
 We also wrote a set of multiple-choice quiz questions that we hoped would
 enable us to evaluate participants' knowledge about each individual lecture,
 along with related knowledge about physics concepts not specifically presented in either
-video (see Supplementary Tab.~\questions~for the full list of questions in our stimulus
+video (see Supp. Tab.~\questions~for the full list of questions in our stimulus
 pool). Participants answered questions randomly drawn from each content area
 (Lecture~1, Lecture~2, and general physics knowledge) on each of the three
 quizzes. Quiz~1 was intended to assess participants' ``baseline'' knowledge
@@ -275,7 +275,7 @@ \section*{Results}
  windows parsed from its transcript. \textbf{C. Embedding multiple lectures and
  questions in a shared space.} We apply the same model (trained on the two
  lectures' windows) to both lectures, along with the text of each question in
- our pool (Supplementary Tab.~\questions), to project them into a shared text embedding space.
+ our pool (Supp. Tab.~\questions), to project them into a shared text embedding space.
  This results in one trajectory per lecture and one coordinate for each
  question. Here, we have projected the 15-dimensional embeddings onto their first
  3 principal components for visualization.}
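The projection step this caption describes can be sketched as follows. This is a minimal illustration with synthetic data, not the authors' code; the array sizes, window/question counts, and variable names are assumptions, and PCA is implemented directly via SVD:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the fitted model's outputs: one 15-dimensional
# topic vector per sliding window (the lecture trajectory) and per question.
lecture_windows = rng.random((200, 15))   # 200 windows x 15 topics
question_vectors = rng.random((30, 15))   # 30 questions x 15 topics

# Embed everything in a shared space, then project onto the first 3 principal
# components for visualization: center the combined data and take the top 3
# right singular vectors (equivalent to PCA).
combined = np.vstack([lecture_windows, question_vectors])
centered = combined - combined.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
projected = centered @ vt[:3].T           # (230, 3) shared coordinates

lecture_3d = projected[:200]              # trajectory points for the lecture
question_3d = projected[200:]             # one 3D coordinate per question
```

Fitting the projection on the combined matrix (rather than on the lectures alone) keeps the lecture trajectories and question coordinates in the same visual space.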
@@ -319,7 +319,7 @@ \section*{Results}
 lecture transcripts and questions, since the lectures and questions used
 different words. Simply comparing the average topic weights from each lecture
 and question set (averaging across time and questions, respectively) reveals a
-striking correspondence (Supplementary Fig.~\topicWeights). Specifically, the average topic
+striking correspondence (Supp. Fig.~\topicWeights). Specifically, the average topic
 weights from Lecture~1 are strongly correlated with the average topic weights
 from Lecture~1 questions ($r(13) = 0.809,~p < 0.001$, 95\% confidence interval
 (CI)~$= [0.633,~0.962]$), and the average topic weights from Lecture~2 are
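The comparison in this hunk can be sketched as follows, using synthetic topic weights (the dimensions and counts are assumptions; the reported $r$ values are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 15-dimensional topic weights: one row per timepoint for the
# lecture, one row per question for its question set.
lecture1_weights = rng.random((600, 15))           # 600 timepoints x 15 topics
lecture1_question_weights = rng.random((12, 15))   # 12 questions x 15 topics

# Average across time and across questions, respectively...
avg_lecture = lecture1_weights.mean(axis=0)
avg_questions = lecture1_question_weights.mean(axis=0)

# ...then correlate the two 15-dimensional average profiles. With 15 topic
# dimensions, the correlation has 15 - 2 = 13 degrees of freedom, matching
# the r(13) reported in the text.
r = np.corrcoef(avg_lecture, avg_questions)[0, 1]
```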
@@ -516,7 +516,7 @@ \section*{Results}
 content tested by a given question, our estimates of their knowledge should carry some
 predictive information about whether the participant is likely to answer that
 question correctly or incorrectly. We developed a statistical approach to test this claim.
-For each question in turn, we used equation~\ref{eqn:prop} to estimate each
+For each question in turn, we used Equation~\ref{eqn:prop} to estimate each
 participant's knowledge at the given question's embedding space coordinate,
 using all \textit{other} questions that participant answered on the same quiz.
 % For each question in turn, for each
@@ -955,7 +955,7 @@ \subsubsection*{Constructing text embeddings of multiple lectures and questions}
 We transformed each sliding window's text into a topic vector, and then used
 linear interpolation (independently for each topic dimension) to resample the
 resulting timeseries to one vector per second. We also used the fitted model to
-obtain topic vectors for each question in our pool (see Supplementary Tab.~\questions). Taken
+obtain topic vectors for each question in our pool (see Supp. Tab.~\questions). Taken
 together, we obtained a \textit{trajectory} for each video, describing its path
 through topic space, and a single coordinate for each question
 (Fig.~\ref{fig:sliding-windows}C). Embedding both videos and all of the
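The resampling step described here can be sketched as follows. This is a minimal illustration assuming each window is tagged with its midpoint time; the window counts, times, and dimensions are synthetic stand-ins, not the authors' values:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical topic vectors for overlapping transcript windows, each tagged
# with the midpoint time (in seconds) of its window.
window_times = np.linspace(5.0, 595.0, 60)   # 60 windows over a ~10-min video
window_topics = rng.random((60, 15))         # one 15-d topic vector per window

# Resample to one topic vector per second via linear interpolation,
# applied independently to each topic dimension.
seconds = np.arange(0, 600)
trajectory = np.column_stack([
    np.interp(seconds, window_times, window_topics[:, k])
    for k in range(window_topics.shape[1])
])  # shape: (600, 15) -- one topic vector per second of video
```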
@@ -997,7 +997,7 @@ \subsubsection*{Estimating dynamic knowledge traces}\label{subsec:traces}
 between timepoint $t$'s topic vector and the topic vectors for each question.
 The normalization step (i.e., using $\mathrm{ncorr}$ instead of the raw
 correlations) ensures that every question
-contributes some non-zero amount to the knowledge estimate.
+contributes some non-negative amount to the knowledge estimate.
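One normalization with the property described here can be sketched as follows. This is an assumption for illustration only: the manuscript's exact $\mathrm{ncorr}$ definition accompanies Equation eqn:prop, which is not shown in this excerpt, so the mapping below (rescaling correlations from $[-1, 1]$ to $[0, 1]$) and the weighted-average estimator are hypothetical:

```python
import numpy as np

def ncorr(r):
    # Map raw correlations from [-1, 1] onto [0, 1] so that every question
    # receives a non-negative weight. (Hypothetical; the paper's ncorr may
    # be defined differently.)
    return (np.asarray(r) + 1.0) / 2.0

def knowledge_estimate(t_vector, question_vectors, question_scores):
    # Estimate knowledge at timepoint t as a weighted combination of question
    # scores, weighted by the normalized correlation between t's topic vector
    # and each question's topic vector.
    r = np.array([np.corrcoef(t_vector, q)[0, 1] for q in question_vectors])
    w = ncorr(r)
    return np.sum(w * question_scores) / np.sum(w)
```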

 \subsubsection*{Creating knowledge and learning map visualizations}\label{subsec:knowledge-maps}
10031003
@@ -1006,7 +1006,7 @@ \subsubsection*{Creating knowledge and learning map visualizations}\label{subsec
10061006their knowledge about \textit {any } content expressible by the embedding
10071007model---not solely the content explicitly probed by the quiz questions, or even
10081008appearing in the lectures. To visualize these estimates
1009- (Fig.~\ref {fig:knowledge-maps }, Supplementary Figs.~\individualKnowledgeMapsA ,~\individualKnowledgeMapsB ,~\individualKnowledgeMapsC ,~\individualLearningMapsA ,
1009+ (Fig.~\ref {fig:knowledge-maps }, Supp. Figs.~\individualKnowledgeMapsA ,~\individualKnowledgeMapsB ,~\individualKnowledgeMapsC ,~\individualLearningMapsA ,
10101010and~\individualLearningMapsB ), we used Uniform Manifold Approximation and
10111011Projection (UMAP; \citealp {McInEtal18a , McInEtal18b }) to construct a 2D projection of the
10121012text embedding space. Sampling the original 100-dimensional space at high