
Commit 17c501f
JRM final pass through
Parent: 30e3b70

File tree: 5 files changed (+17, -14 lines)

paper/figs/sliding_windows.pdf
Binary file changed (-761 Bytes); not shown.

paper/main.pdf
Binary file changed (-626 Bytes); not shown.

paper/main.tex

Lines changed: 9 additions & 9 deletions
@@ -59,8 +59,8 @@
 quantify the content of each moment of video and each quiz question. We use
 these embeddings, along with participants' quiz responses, to track how the
 learners' knowledge changed after watching each video. Our findings show how a
-small set of quiz questions may be used to obtain rich and meaningful, high-resolution
-insights into what each learner knows, and how their knowledge
+small set of quiz questions may be used to obtain rich and meaningful
+high-resolution insights into what each learner knows, and how their knowledge
 changes over time as they learn.
 
 \textbf{Keywords: education, learning, knowledge, concepts, natural language processing}
@@ -252,7 +252,7 @@ \section*{Results}
 We also wrote a set of multiple-choice quiz questions that we hoped would
 enable us to evaluate participants' knowledge about each individual lecture,
 along with related knowledge about physics concepts not specifically presented in either
-video (see Supplementary Tab.~\questions~for the full list of questions in our stimulus
+video (see Supp. Tab.~\questions~for the full list of questions in our stimulus
 pool). Participants answered questions randomly drawn from each content area
 (Lecture~1, Lecture~2, and general physics knowledge) on each of the three
 quizzes. Quiz~1 was intended to assess participants' ``baseline'' knowledge
@@ -275,7 +275,7 @@ \section*{Results}
 windows parsed from its transcript. \textbf{C. Embedding multiple lectures and
 questions in a shared space.} We apply the same model (trained on the two
 lectures' windows) to both lectures, along with the text of each question in
-our pool (Supplementary Tab.~\questions), to project them into a shared text embedding space.
+our pool (Supp. Tab.~\questions), to project them into a shared text embedding space.
 This results in one trajectory per lecture and one coordinate for each
 question. Here, we have projected the 15-dimensional embeddings onto their first
 3 principal components for visualization.}
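As context for the caption edited above: projecting the shared 15-dimensional embeddings onto their first three principal components is a standard visualization step. A minimal sketch with placeholder data follows; the variable names and array contents are illustrative, not taken from the paper's code.

    # Sketch: project 15-D topic embeddings onto their first 3 principal
    # components, fitting PCA on lectures and questions jointly so both
    # land in the same 3-D projection. Placeholder data throughout.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    lecture_trajectory = rng.random((600, 15))  # T seconds x 15 topics (placeholder)
    question_coords = rng.random((30, 15))      # Q questions x 15 topics (placeholder)

    pca = PCA(n_components=3)
    pca.fit(np.vstack([lecture_trajectory, question_coords]))

    lecture_3d = pca.transform(lecture_trajectory)  # (T, 3) trajectory
    questions_3d = pca.transform(question_coords)   # (Q, 3) coordinates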
@@ -319,7 +319,7 @@ \section*{Results}
 lecture transcripts and questions, since the lectures and questions used
 different words. Simply comparing the average topic weights from each lecture
 and question set (averaging across time and questions, respectively) reveals a
-striking correspondence (Supplementary Fig.~\topicWeights). Specifically, the average topic
+striking correspondence (Supp. Fig.~\topicWeights). Specifically, the average topic
 weights from Lecture~1 are strongly correlated with the average topic weights
 from Lecture~1 questions ($r(13) = 0.809,~p < 0.001$, 95\% confidence interval
 (CI)~$= [0.633,~0.962]$), and the average topic weights from Lecture~2 are
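The correlation reported in this hunk compares two 15-dimensional vectors of average topic weights, so the test has 15 - 2 = 13 degrees of freedom, matching the reported $r(13)$. A minimal sketch of that comparison, with placeholder arrays standing in for the paper's data:

    # Sketch: correlate a lecture's average topic weights with the average
    # topic weights of its question set (cf. the r(13) statistics above).
    import numpy as np
    from scipy.stats import pearsonr

    rng = np.random.default_rng(1)
    lecture_topics = rng.random((600, 15))   # timepoints x topics (placeholder)
    question_topics = rng.random((20, 15))   # questions x topics (placeholder)

    mean_lecture = lecture_topics.mean(axis=0)     # average over time
    mean_questions = question_topics.mean(axis=0)  # average over questions

    r, p = pearsonr(mean_lecture, mean_questions)  # df = 15 - 2 = 13
    print(f"r(13) = {r:.3f}, p = {p:.3g}")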
@@ -516,7 +516,7 @@ \section*{Results}
 content tested by a given question, our estimates of their knowledge should carry some
 predictive information about whether the participant is likely to answer that
 question correctly or incorrectly. We developed a statistical approach to test this claim.
-For each question in turn, we used equation~\ref{eqn:prop} to estimate each
+For each question in turn, we used Equation~\ref{eqn:prop} to estimate each
 participant's knowledge at the given question's embedding space coordinate,
 using all \textit{other} questions that participant answered on the same quiz.
 %For each question in turn, for each
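The hold-one-out logic described in this hunk can be sketched as follows. Here estimate_knowledge is a stand-in for the paper's Equation~\ref{eqn:prop} (a hedged version appears after the ncorr hunk further below), and the data layout is an assumption for illustration.

    # Sketch of the hold-one-out test described above: estimate knowledge at
    # each question's embedding coordinate from all OTHER questions the
    # participant answered on the same quiz.
    import numpy as np

    def leave_one_out_estimates(question_coords, correct, estimate_knowledge):
        """question_coords: (Q, K) topic vectors; correct: (Q,) 0/1 responses."""
        estimates = np.empty(len(correct), dtype=float)
        for q in range(len(correct)):
            others = np.arange(len(correct)) != q  # hold out question q
            estimates[q] = estimate_knowledge(
                question_coords[q],       # where to estimate knowledge
                question_coords[others],  # the other questions' coordinates
                correct[others],          # the other questions' outcomes
            )
        return estimates  # compare with `correct` to test predictive information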
@@ -955,7 +955,7 @@ \subsubsection*{Constructing text embeddings of multiple lectures and questions}
 We transformed each sliding window's text into a topic vector, and then used
 linear interpolation (independently for each topic dimension) to resample the
 resulting timeseries to one vector per second. We also used the fitted model to
-obtain topic vectors for each question in our pool (see Supplementary Tab.~\questions). Taken
+obtain topic vectors for each question in our pool (see Supp. Tab.~\questions). Taken
 together, we obtained a \textit{trajectory} for each video, describing its path
 through topic space, and a single coordinate for each question
 (Fig.~\ref{fig:sliding-windows}C). Embedding both videos and all of the
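A hedged sketch of the pipeline step this hunk touches: embed sliding windows as topic vectors, resample each topic dimension to one vector per second, and reuse the fitted model on question text. The choice of scikit-learn's LDA, the toy transcripts, and all variable names are assumptions for illustration, not the paper's confirmed implementation.

    # Sketch: sliding-window transcripts -> topic vectors -> per-second
    # trajectory via linear interpolation on each topic dimension.
    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation
    from sklearn.feature_extraction.text import CountVectorizer

    windows = [  # toy stand-ins for transcript windows
        "force equals mass times acceleration",
        "acceleration is the change in velocity over time",
        "velocity describes speed and direction of motion",
    ]
    window_times = np.array([0.0, 15.0, 30.0])  # midpoint (s) of each window

    vectorizer = CountVectorizer()
    counts = vectorizer.fit_transform(windows)
    lda = LatentDirichletAllocation(n_components=15, random_state=0)
    topic_vectors = lda.fit_transform(counts)  # one 15-D vector per window

    # Resample to one topic vector per second, one dimension at a time.
    seconds = np.arange(window_times.min(), window_times.max() + 1)
    trajectory = np.column_stack([
        np.interp(seconds, window_times, topic_vectors[:, k])
        for k in range(topic_vectors.shape[1])
    ])

    # The same fitted model embeds a question's text at a single coordinate:
    question_vec = lda.transform(vectorizer.transform(["what is acceleration"]))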
@@ -997,7 +997,7 @@ \subsubsection*{Estimating dynamic knowledge traces}\label{subsec:traces}
 between timepoint $t$'s topic vector and the topic vectors for each question.
 The normalization step (i.e., using $\mathrm{ncorr}$ instead of the raw
 correlations) insures that every question
-contributes some non-zero amount to the knowledge estimate.
+contributes some non-negative amount to the knowledge estimate.
 
 \subsubsection*{Creating knowledge and learning map visualizations}\label{subsec:knowledge-maps}
 
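The wording fix in this hunk is apt: a knowledge estimate built from similarity-weighted question outcomes gives non-negative, not strictly non-zero, contributions. A sketch under one stated assumption: that $\mathrm{ncorr}$ rescales Pearson correlations from [-1, 1] into [0, 1]; the exact form of Equation~\ref{eqn:prop} is paraphrased, not quoted.

    # Sketch of an ncorr-weighted knowledge estimate. Assumption (flagged
    # above): ncorr maps Pearson r from [-1, 1] into [0, 1], so each
    # question's contribution is non-negative.
    import numpy as np

    def ncorr(x, y):
        """Normalized correlation: Pearson r mapped into [0, 1] (assumed form)."""
        r = np.corrcoef(x, y)[0, 1]
        return (r + 1.0) / 2.0

    def estimate_knowledge(coord, question_coords, correct):
        """Average of 0/1 question outcomes, weighted by each question's
        (normalized) topic-space similarity to `coord`."""
        weights = np.array([ncorr(coord, q) for q in question_coords])
        return float(np.dot(weights, correct) / weights.sum())

Plugged into the leave_one_out_estimates sketch earlier, this yields one predicted-knowledge value per held-out question.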
@@ -1006,7 +1006,7 @@ \subsubsection*{Creating knowledge and learning map visualizations}\label{subsec
 their knowledge about \textit{any} content expressible by the embedding
 model---not solely the content explicitly probed by the quiz questions, or even
 appearing in the lectures. To visualize these estimates
-(Fig.~\ref{fig:knowledge-maps}, Supplementary Figs.~\individualKnowledgeMapsA,~\individualKnowledgeMapsB,~\individualKnowledgeMapsC,~\individualLearningMapsA,
+(Fig.~\ref{fig:knowledge-maps}, Supp. Figs.~\individualKnowledgeMapsA,~\individualKnowledgeMapsB,~\individualKnowledgeMapsC,~\individualLearningMapsA,
 and~\individualLearningMapsB), we used Uniform Manifold Approximation and
 Projection (UMAP; \citealp{McInEtal18a, McInEtal18b}) to construct a 2D projection of the
 text embedding space. Sampling the original 100-dimensional space at high
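UMAP is the one named tool in this hunk; a minimal sketch of the 2D projection step using the umap-learn package, where the 100-dimensional sample array is a placeholder, not the paper's data:

    # Sketch: 2-D UMAP projection of a 100-dimensional text embedding space
    # (McInnes et al., 2018). Requires the umap-learn package.
    import numpy as np
    import umap

    rng = np.random.default_rng(2)
    samples = rng.random((5000, 100))  # dense sample of the 100-D space (placeholder)

    reducer = umap.UMAP(n_components=2, random_state=0)
    embedding_2d = reducer.fit_transform(samples)  # (5000, 2) map coordinates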

paper/supplement.pdf
Binary file changed (10 Bytes); not shown.

paper/supplement.tex

Lines changed: 8 additions & 5 deletions
@@ -129,11 +129,14 @@
 \end{tabular}
 \end{tiny}
 
-\caption{\textbf{Topics.} We fit a topic model with (up to) $k = 15$ topics to sliding
-windows parsed from the two course video transcripts (see~\topicModelMethods),
-and identified 14 topics with non-zero weights. The table displays the 10
-top-weighted words (columns) from each of the topics (rows) discovered by the
-model. The weights of these words are shown in Supplementary Figure~\ref{fig:topic-word-distributions}.}
+\caption{\textbf{Topics.} We fit a topic model with (up to) $k = 15$ topics
+to sliding windows parsed from the two course video transcripts
+(see~\topicModelMethods), and identified 14 topics with non-zero weights.
+The table displays the 10 top-weighted words (columns) from each of the
+topics (rows) discovered by the model. The weights of these words are shown
+in Supplementary Figure~\ref{fig:topic-word-distributions}.}
+
+
 \label{tab:topics}
 \end{table}
 
