Commit daf469f

finished final edits to main text
1 parent ef4979b commit daf469f

2 files changed: +36 -37 lines changed


paper/main.pdf

39 Bytes
Binary file not shown.

paper/main.tex

Lines changed: 36 additions & 37 deletions
@@ -1238,7 +1238,7 @@ \subsection*{Analysis}
 \subsubsection*{Statistics}
 
 All of the statistical tests performed in our study were two-sided. The 95\%
-confidence intervals we reported for each correlation were estimated from
+confidence intervals we report for each correlation were estimated from
 bootstrap distributions of 10,000 correlation coefficients obtained by
 sampling (with replacement) from the observed data.
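The bootstrap procedure in the hunk above is standard; a minimal sketch (not the authors' code; the use of NumPy/SciPy, the Pearson correlation, and all names below are assumptions) of estimating a 95% confidence interval from 10,000 resampled correlations:

import numpy as np
from scipy import stats

def bootstrap_corr_ci(x, y, n_boot=10_000, ci=95, seed=0):
    # Resample (x, y) pairs with replacement and recompute the correlation each time.
    rng = np.random.default_rng(seed)
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(x), size=len(x))
        boot[b] = stats.pearsonr(x[idx], y[idx])[0]  # Pearson r assumed for illustration
    half = (100 - ci) / 2
    return np.percentile(boot, [half, 100 - half])  # e.g., the 2.5th and 97.5th percentiles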

@@ -1270,10 +1270,10 @@ \subsubsection*{Constructing text embeddings of multiple lectures and questions}
 Supplementary Figure~\topicWordWeights, and each topic's top-weighted words may
 be found in Supplementary Table~\topics.
 
-As illustrated in Figure~\ref{fig:sliding-windows}A, we start by building up a
-corpus of documents using overlapping sliding windows that span each video's
+As illustrated in Figure~\ref{fig:sliding-windows}A, we started by building up a
+corpus of documents using overlapping sliding windows that spanned each lecture's
 transcript. Khan Academy provides professionally created, manual transcriptions
-of all videos for closed captioning. However, such transcripts would not be
+of all lecture videos for closed captioning. However, such transcripts would not be
 readily available in all contexts to which our framework could potentially be
 applied. Khan Academy videos are hosted on the YouTube platform, which
 additionally provides automated captions. We opted to use these automated
@@ -1283,9 +1283,9 @@ \subsubsection*{Constructing text embeddings of multiple lectures and questions}
 it more directly extensible and adaptable by others in the future.
 
 We fetched these automated transcripts using the
-\texttt{youtube-transcript-api} Python package~\citep{Depo18}. The transcripts
+\texttt{youtube-transcript-api} Python package~\citep{Depo18}. Each transcript
 consisted of one timestamped line of text for every few seconds (mean: 2.34~s;
-standard deviation: 0.83~s) of spoken content in the video (i.e., corresponding
+standard deviation: 0.83~s) of spoken content in the lecture (i.e., corresponding
 to each individual caption that would appear on-screen if viewing the lecture
 via YouTube, and when those lines would appear). We defined a sliding window
 length of (up to) $w = 30$ transcript lines and assigned each window a
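A hedged sketch of this step follows (not the paper's pipeline; the video ID is a placeholder, the one-line window stride is an assumption, and the get_transcript call reflects the package's classic pre-1.0 interface):

from youtube_transcript_api import YouTubeTranscriptApi

# Each returned entry has 'text', 'start', and 'duration' fields (seconds).
lines = YouTubeTranscriptApi.get_transcript("VIDEO_ID")  # placeholder video ID

# Overlapping sliding windows of up to w = 30 caption lines.
w = 30
windows = []
for start in range(len(lines)):
    chunk = lines[start:start + w]
    windows.append({
        "text": " ".join(entry["text"] for entry in chunk),
        "onset": chunk[0]["start"],  # timestamp assigned to the window (assumed: its first line)
    })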
@@ -1307,13 +1307,13 @@ \subsubsection*{Constructing text embeddings of multiple lectures and questions}
 approaches suggested by~\citet{BoydEtal14}: ``actual,'' ``actually,'' ``also,''
 ``bit,'' ``could,'' ``e,'' ``even,'' ``first,'' ``follow,'' ``following,''
 ``four,'' ``let,'' ``like,'' ``mc,'' ``really,'' ``saw,'' ``see,'' ``seen,''
-``thing,'' and ``two.'' This yielded sliding windows with an average of 73.8
-remaining words, and lasting for an average of 62.22~seconds. We treated the
+``thing,'' and ``two.'' This yielded sliding windows containing an average of 73.8
+remaining words, and spanning an average of 62.22~seconds. We treated the
 text from each sliding window as a single ``document'' and combined these
-documents across the two videos' windows to create a single training corpus for
+documents across the two lectures' windows to create a single training corpus for
 the topic model.
 
-After fitting the topic model to the two videos' transcripts, we could use the
+After fitting the topic model to the two lectures' transcripts, we could use the
 trained model to transform arbitrary (potentially new) documents into
 $k$-dimensional topic vectors. A convenient property of these topic vectors is
 that documents that reflect similar blends of topics (i.e., documents that
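As one concrete (but assumed) realization of this step, the corpus of window documents could be used to fit a topic model with scikit-learn; the number of topics and all names below are placeholders, and the paper's actual model and hyperparameters are described elsewhere in the manuscript:

from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer, ENGLISH_STOP_WORDS

extra_stop_words = ["actual", "actually", "also", "bit", "could", "e", "even", "first",
                    "follow", "following", "four", "let", "like", "mc", "really", "saw",
                    "see", "seen", "thing", "two"]

# `windows` is the combined corpus of sliding-window documents from both lectures
# (see the transcript sketch above).
window_texts = [win["text"] for win in windows]

vectorizer = CountVectorizer(stop_words=list(ENGLISH_STOP_WORDS) + extra_stop_words)
counts = vectorizer.fit_transform(window_texts)

k = 20  # placeholder; the paper specifies its own number of topics
lda = LatentDirichletAllocation(n_components=k, random_state=0).fit(counts)

# The fitted model maps any (potentially new) document onto a k-dimensional topic vector.
window_topic_vectors = lda.transform(counts)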
@@ -1326,13 +1326,13 @@ \subsubsection*{Constructing text embeddings of multiple lectures and questions}
 We transformed each sliding window's text into a topic vector, and then used
 linear interpolation (independently for each topic dimension) to resample the
 resulting time series to one vector per second. We also used the fitted model to
-obtain topic vectors for each question in our pool (see Supp.~Tab.~\questions).
-Taken together, we obtained a \textit{trajectory} for each video, describing
+obtain topic vectors for each quiz question in our pool (see Supp.~Tab.~\questions).
+Taken together, we obtained a \textit{trajectory} for each lecture video, describing
 its path through topic space, and a single coordinate for each question
-(Fig.~\ref{fig:sliding-windows}C). Embedding both videos and all of the
+(Fig.~\ref{fig:sliding-windows}C). Embedding both lectures and all of the
 questions using a common model enables us to compare the content from different
-moments of videos, compare the content across videos, and estimate potential
-associations between specific questions and specific moments of video.
+moments of the lectures, compare the content across lectures, and estimate potential
+associations between specific questions and specific moments of lecture content.
 
 
 \subsubsection*{Estimating dynamic knowledge traces}\label{subsec:traces}
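The resampling described in the hunk above could be sketched as follows (assumed names carried over from the earlier sketches; the window onset times serve as the interpolation timestamps):

import numpy as np

window_times = np.array([win["onset"] for win in windows])  # seconds into the lecture

# Linearly interpolate each topic dimension independently onto a one-second grid,
# giving the lecture's trajectory through topic space (one topic vector per second).
seconds = np.arange(0, int(window_times.max()) + 1)
trajectory = np.column_stack([
    np.interp(seconds, window_times, window_topic_vectors[:, dim])
    for dim in range(window_topic_vectors.shape[1])
])

# Question coordinates come from transforming each question's text with the same model
# (`question_texts` is assumed to hold one string per quiz question).
question_vectors = lda.transform(vectorizer.transform(question_texts))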
@@ -1349,12 +1349,12 @@ \subsubsection*{Estimating dynamic knowledge traces}\label{subsec:traces}
 \mathrm{ncorr}(x, y) = \frac{\mathrm{corr}(x, y) - \mathrm{mincorr}}{\mathrm{maxcorr} - \mathrm{mincorr}},
 \end{equation}
 and where $\mathrm{mincorr}$ and $\mathrm{maxcorr}$ are the minimum and maximum
-correlations between any lecture timepoint and question, taken over all
+correlations between the topic vectors for any lecture timepoint and quiz question, taken over all
 timepoints in the given lecture and all questions \textit{about} that
 lecture appearing on the given quiz. We also define $f(s, \Omega)$ as the
 $s$\textsuperscript{th} topic vector from the set of topic vectors $\Omega$.
-Here $t$ indexes the set of lecture topic vectors $L$, and $i$ and $j$ index
-the topic vectors of questions $Q$ used to estimate the knowledge trace. Note
+Here $t$ indexes the time series of lecture topic vectors $L$, and $i$ and $j$ index
+the topic vectors of questions $Q$ used to estimate the participant's knowledge. Note
 that ``$\mathrm{correct}$'' denotes the set of indices of the questions the
 participant answered correctly on the given quiz.
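Only the normalization term appears in this hunk; the full trace equation is defined earlier in the manuscript and is not reproduced here. With that caveat, a rough sketch of the quantity being described, assuming the trace at each lecture timepoint is a proportion-correct score weighted by these normalized correlations (all names below are placeholders), might look like:

import numpy as np

def knowledge_trace(lecture_topics, question_topics, correct):
    # Correlations between every lecture timepoint's topic vector and every question's topic vector.
    T = len(lecture_topics)
    corrs = np.corrcoef(lecture_topics, question_topics)[:T, T:]
    # ncorr: rescale by the min/max correlation over all timepoints and questions.
    ncorrs = (corrs - corrs.min()) / (corrs.max() - corrs.min())
    # Assumed form of the trace: weighted proportion correct at each timepoint,
    # where `correct` holds the indices of correctly answered questions.
    return ncorrs[:, list(correct)].sum(axis=1) / ncorrs.sum(axis=1)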

@@ -1391,17 +1391,17 @@ \subsubsection*{Generalized linear mixed models}\label{subsec:glmm}
 
 In performing these analyses, our null hypothesis is that the knowledge
 estimates we compute based on the quiz questions' embedding coordinates do
-\textit{not} provide useful information about participants' abilities to answer
+\textit{not} provide useful information about participants' abilities to correctly answer
 those questions---in other words, that there is no meaningful difference (on
 average) between the knowledge estimates we compute for questions participants
-answered correctly and those they answered incorrectly. Specifically, since we
+answered correctly versus incorrectly. Specifically, since we
 estimate knowledge for a given embedding coordinate as a weighted
 proportion-correct score (where each question's weight reflects its
 embedding-space distance from the target coordinate; see Eqn.~\ref{eqn:prop}),
 if these weights are uninformative (e.g., randomly distributed), then our
 estimates of participants' knowledge should be equivalent (on average) to the
 \textit{unweighted} proportion of correctly answered questions used to compute
-them. In general, for a given participant and quiz, this expected value (i.e.,
+them. In general, for a given participant and quiz, this expected null value (i.e.,
 that participant's proportion-correct score on that quiz) is the same for any
 coordinate in the embedding space (e.g., any lecture timepoint, quiz question,
 etc.). However, in the ``All questions'' and ``Within-lecture'' versions of the
@@ -1413,42 +1413,41 @@ \subsubsection*{Generalized linear mixed models}\label{subsec:glmm}
 available to estimate their knowledge for it. For example, suppose a participant
 correctly answered $n$ out of $q$ questions on a given quiz. If we hold out a
 single \textit{correctly} answered question as the target, the proportion of
-remaining questions answered correctly would be $\frac{n - 1}{q - 1}$. Whereas
+remaining questions answered correctly would be $\frac{n - 1}{q - 1}$, whereas
 if we hold out a single \textit{incorrectly} answered question, the proportion
 of remaining questions answered correctly would be $\frac{n}{q - 1}$. Thus, the
 proportion of correctly answered remaining questions (and therefore the
 null-hypothesized value of a knowledge estimate computed from them) is always
 \textit{lower} for target questions a participant answered correctly than for
 those they answered incorrectly.
 
-To correct for this baseline inverse relationship between a participant's
-success on a target question and their estimated knowledge for it, we used a
+To correct for this baseline difference under our null hypothesis, we used a
 rebalancing procedure that ensured our knowledge estimates for questions each
 participant answered correctly and incorrectly were computed from the
 \textit{same} proportion of correctly answered questions. For each target
-question on a given participant's quiz, we identified all remaining questions
+question on a given participant's quiz, we first identified all remaining questions
 with the opposite ``correctness'' label (i.e., if the target question was
 answered correctly, we identified all remaining incorrectly answered questions,
 and vice versa). We then held out each of these opposite-label questions, in
 turn, along with the target question, and estimated the participant's knowledge
 for the target question using all \textit{other} remaining questions. Since each
 of these subsets of remaining questions was constructed by holding out one
 correctly answered question and one incorrectly answered question from the
-participant's quiz, if the participant correctly answered $n$ out of $q$
+participant's quiz responses, if the participant correctly answered $n$ out of $q$
 questions total, then their proportion-correct score on each subset of questions
-used to estimate their knowledge for the target question is $\frac{n-1}{q-2}$,
+used to estimate their knowledge would be $\frac{n-1}{q-2}$,
 regardless of whether they answered the target question correctly or
-incorrectly. Finally, averaging over these per-subset knowledge estimates
-yielded a rebalanced estimate of the participant's knowledge for the target
+incorrectly. Finally, we averaged over these per-subset knowledge estimates
+to obtain a rebalanced estimate of the participant's knowledge for the target
 question that leveraged information from all remaining questions' embedding
 coordinates, but whose expected value under our null hypothesis was the same as
 that of each individual subset ($\frac{n-1}{q-2}$). By equalizing the
 null-hypothesized values of knowledge estimates for correctly and incorrectly
 answered questions, this procedure ensures that any meaningful relationships we
 observe between participants' estimated knowledge for individual quiz questions
-and their abilities to correctly answer them are attributable to the predictive
-power of the embedding-space distances used to weight questions' contributions
-to the knowledge estimates, rather than an artifact of our estimation procedure.
+and their abilities to correctly answer them reflect the predictive
+power of the embedding-space distances we use to weight questions' contributions
+to the knowledge estimates, rather than an artifact of our testing procedure.
 Note that if a participant answered all or no questions on a given quiz
 correctly, their responses contained no opposite-label questions with which to
 perform this rebalancing, and we therefore excluded their data from our analyses
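A sketch of this rebalancing loop is given below (placeholder names; estimate_knowledge stands in for the weighted proportion-correct estimator described above, which is not reproduced here). For intuition: with n = 6 correct answers out of q = 10 questions, the naive hold-one-out baselines are 5/9 (about 0.56) for correct targets and 6/9 (about 0.67) for incorrect targets, whereas every rebalanced subset has baseline 5/8 = 0.625.

def rebalanced_knowledge(target, answers, estimate_knowledge):
    # answers maps question index -> True (answered correctly) / False (incorrectly);
    # estimate_knowledge(target, remaining) is a placeholder for the weighted
    # proportion-correct estimator described in the text.
    opposite = [q for q, was_correct in answers.items()
                if q != target and was_correct != answers[target]]
    if not opposite:
        return None  # all-correct or all-incorrect quizzes cannot be rebalanced

    estimates = []
    for held_out in opposite:
        # Hold out the target plus one opposite-label question; estimate from the rest.
        remaining = [q for q in answers if q not in (target, held_out)]
        estimates.append(estimate_knowledge(target, remaining))
    return sum(estimates) / len(estimates)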
@@ -1503,9 +1502,9 @@ \subsubsection*{Generalized linear mixed models}\label{subsec:glmm}
 that of an analogous model which assumed (as we assume under our null
 hypothesis) that knowledge estimates for correctly and incorrectly answered
 questions did \textit{not} systematically differ, on average. Specifically, we
-used the same sets of observations with which we fit each ``full'' model to fit
-a second ``null'' model that had the same random effects structure, but in which
-the coefficient for the fixed effect of ``\texttt{knowledge}'' was fixed at 0
+used the same sets of observations to which we fit each ``full'' model to fit
+a second ``null'' model with the same random effects structure, but with
+the coefficient for the fixed effect of ``\texttt{knowledge}'' constrained to zero
 (i.e., we removed this term from the null model). We then compared each full
 model to its reduced (null) equivalent using a likelihood-ratio test (LRT).
 Because the standard asymptotic $\chi^2_d$ approximation of the null
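The likelihood-ratio comparison itself reduces to a few lines; this generic asymptotic sketch uses placeholder log-likelihoods, and the hunk is cut off before the paper explains how it handles cases where the chi-squared approximation is unreliable, so treat it purely as an illustration:

from scipy import stats

llf_full, llf_null = -512.3, -518.9  # placeholder fitted log-likelihoods
df_diff = 1  # the null model drops the single fixed-effect coefficient for knowledge

lrt_stat = 2 * (llf_full - llf_null)
p_asymptotic = stats.chi2.sf(lrt_stat, df=df_diff)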
@@ -1581,7 +1580,7 @@ \subsubsection*{Estimating the ``smoothness'' of knowledge}\label{subsec:smoothn
 \subsubsection*{Creating knowledge and learning map visualizations}\label{subsec:knowledge-maps}
 
 An important feature of our approach is that, given a trained text embedding
-model and participants' quiz performance on each question, we can estimate
+model and participants' performance on each quiz question, we can estimate
 their knowledge about \textit{any} content expressible by the embedding
 model---not solely the content explicitly probed by the quiz questions, or even
 appearing in the lectures. To visualize these estimates
@@ -1636,7 +1635,7 @@ \subsubsection*{Creating knowledge and learning map visualizations}\label{subsec
 To generate our estimates, we placed a set of 39 radial basis functions (RBFs)
 throughout the embedding space, centered on the 2D projections for each
 question (i.e., we included one RBF for each question). At coordinate $x$, the
-value of an RBF centered on a question's coordinate $\mu$, is given by:
+value of an RBF centered on a question's coordinate $\mu$ is given by:
 \begin{equation}
 \mathrm{RBF}(x, \mu, \lambda) = \exp\left\{-\frac{||x - \mu||^2}{\lambda}\right\}.
 \label{eqn:rbf}
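The RBF in the equation above translates directly into code; a minimal sketch (NumPy, with an arbitrary bandwidth chosen purely for illustration):

import numpy as np

def rbf(x, mu, lam):
    # Value at coordinate x of a radial basis function centered on mu with bandwidth lambda.
    x, mu = np.asarray(x, dtype=float), np.asarray(mu, dtype=float)
    return np.exp(-np.sum((x - mu) ** 2) / lam)

# Example: a 2D question coordinate evaluated at a nearby point.
value = rbf([0.2, 0.7], [0.0, 0.5], lam=0.5)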
