Commit a441ac3: "checking in minor edits"
1 parent: b085c8d
File tree

5 files changed: +44, -53 lines


paper/changes.pdf

-572 Bytes
Binary file not shown.

paper/changes.tex

Lines changed: 25 additions & 29 deletions
@@ -1,7 +1,7 @@
 \documentclass[10pt]{article}
 %DIF LATEXDIFF DIFFERENCE FILE
 %DIF DEL old.tex Mon Feb 19 07:49:49 2024
-%DIF ADD main.tex Wed Feb 21 19:20:24 2024
+%DIF ADD main.tex Wed Feb 21 21:04:59 2024
 %DIF 2a2
 \usepackage{amsmath} %DIF >
 %DIF -------
@@ -776,9 +776,9 @@ \section*{Results}
 knowledge at the embedding coordinate of a }\DIFdelend \DIFaddbegin \DIFadd{question was answered correctly than when it was answered
 incorrectly. Because our knowledge estimates are computed as a weighted version
 of this same proportion-correct score (where each held-in question's weight
-reflects its embedding-space distance from the }\DIFaddend held-out \DIFdelbegin \DIFdel{question about the }\textit{\DIFdel{other}} %DIFAUXCMD
-\DIFdel{lecture (``Across-lecture''; predicting knowledge
-for }\DIFdelend \DIFaddbegin \DIFadd{question; see
+reflects its embedding-space distance from the }\DIFaddend held-out \DIFdelbegin \DIFdel{question about the
+}\textit{\DIFdel{other}} %DIFAUXCMD
+\DIFdel{lecture (``Across-lecture''; predicting knowledge for }\DIFdelend \DIFaddbegin \DIFadd{question; see
 Eqn.~\ref{eqn:prop}), if these weights are uninformative (e.g., randomly
 distributed), then we should expect to see this same inverse relationship
 between estimated knowledge and performance, on average. On the other hand, if
@@ -797,23 +797,23 @@ \section*{Results}
 \DIFadd{Before presenting our results, it is worth considering three possible
 explanations of why a participant might answer a given question correctly or
 incorrectly. One possibility is that the participant simply }\textit{\DIFadd{guessed}}
-\DIFadd{the answer. A second is that they selected an answer by mistake, despite
-``knowing'' the correct answer. In both of these scenarios, the participant's
-knowledge about the question's content should be uninformative about their
-observed response. A third possibility is that the participant's response
-reflects their }\textit{\DIFadd{actual}} \DIFadd{knowledge about the question's content. In this
-case, we }\textit{\DIFadd{might}} \DIFadd{expect to see a positive relationship between the
-participant's knowledge and their likelihood of answering the question
-correctly. However, in order to see this positive relationship, the
-participant's knowledge must be structured in a way that is reflected (at least
-partially) by the embedding space. In other words, if the participant's
-performance reflects their true knowledge, but our text embedding space does
-not sufficiently capture the structure of that knowledge, then the }\DIFaddend knowledge
-\DIFaddbegin \DIFadd{estimates we generate will not be predictive of the participant's performance.
-In the extreme, if the embedding space is completely unstructured with respect
-to the content of the quiz questions, then we would expect to see the negative
-relationship between estimated knowledge and performance that we described
-above.
+\DIFadd{the answer. A second is that they selected the incorrect answer by mistake,
+despite ``knowing'' the correct answer (or vice versa). In both of these
+scenarios, the participant's knowledge about the question's content should be
+uninformative about their observed response. A third possibility is that the
+participant's response reflects their }\textit{\DIFadd{actual}} \DIFadd{knowledge about the
+question's content. In this case, we }\textit{\DIFadd{might}} \DIFadd{expect to see a positive
+relationship between the participant's knowledge and their likelihood of
+answering the question correctly. However, in order to see this positive
+relationship, the participant's knowledge must be structured in a way that is
+reflected (at least partially) by the embedding space. In other words, if the
+participant's performance reflects their true knowledge, but our text embedding
+space does not sufficiently capture the structure of that knowledge, then the
+}\DIFaddend knowledge \DIFaddbegin \DIFadd{estimates we generate will not be predictive of the participant's
+performance. In the extreme, if the embedding space is completely unstructured
+with respect to the content of the quiz questions, then we would expect to see
+the negative relationship between estimated knowledge and performance that we
+described above.
 }
 
 \DIFadd{When we fit a GLMM to estimates of participants' knowledge for each Quiz~1
@@ -973,7 +973,7 @@ \section*{Results}
 95\%\ \textnormal{CI} = [3.033, 3.866],\ p = 0.094$). These ``prediction
 failures'' appear to come from the fact that any signal derived from
 participants' knowledge about the content of the }\textit{\DIFadd{Birth of Stars}}
-\DIFadd{lecture (prior to watching it) is swamped by the much more dramatic increase in
+\DIFadd{lecture (prior to watching it) is overwhelmed by the much more dramatic increase in
 their knowledge about the content of the }\textit{\DIFadd{Four Fundamental Forces}}
 \DIFadd{(which they watched just prior to taking Quiz~2). This is reflected in their
 Quiz~2 performance for questions about each lecture (mean proportion correct
@@ -988,15 +988,11 @@ \section*{Results}
 p = 0.017$) using responses to questions about the other lecture's content.
 Across all three versions of these analyses, our results suggest that (by and
 large) our knowledge estimates can reliably predict participants' abilities to
-answer individual quiz questions, }\DIFaddend distinguish between questions about \DIFdelbegin \DIFdel{more subtly different contentwithin the same lecture}\DIFdelend \DIFaddbegin \DIFadd{similar
+answer individual quiz questions, }\DIFaddend distinguish between questions about \DIFdelbegin \DIFdel{more subtly different contentwithin the
+same lecture}\DIFdelend \DIFaddbegin \DIFadd{similar
 content, and generalize across content areas, provided that participants' quiz
 responses reflect a minimum level of ``real'' knowledge about both content on
-which these predictions are based and that for which they are made. Our results
-also indicate some important limitations of our approach: if participants' quiz
-performance does not reflect what they know (e.g., when they ``guess''), or if
-their knowledge is not structured in a way that is reflected by the embedding
-space, then our knowledge estimates will not be predictive of their
-performance}\DIFaddend .
+which these predictions are based and that for which they are made}\DIFaddend .
 
 %DIF > our approach works when participants have a minimal baseline level of knowledge about content predicted and used to predict
 %DIF > our approach generalizes when knowledge of content used to predict can be assumed to be a reasonable indicator of knowledge of content predicted
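The diff above describes the paper's knowledge estimates as a distance-weighted version of a proportion-correct score: each held-in question's correctness is weighted by its embedding-space proximity to the held-out question (the paper's Eqn.~\ref{eqn:prop}). A minimal Python sketch of that idea, using a Gaussian (RBF) weighting as an illustrative assumption; the function name, kernel choice, and `width` parameter are hypothetical, not taken from the paper:

```python
import numpy as np

def estimate_knowledge(heldout_emb, heldin_embs, heldin_correct, width=1.0):
    """Distance-weighted proportion-correct estimate for one held-out question.

    heldout_emb: embedding vector of the held-out question, shape (d,)
    heldin_embs: embeddings of the held-in questions, shape (n, d)
    heldin_correct: 0/1 correctness of the held-in responses, shape (n,)
    """
    # Embedding-space distance from each held-in question to the held-out one
    dists = np.linalg.norm(heldin_embs - heldout_emb, axis=1)
    # Gaussian (RBF) weights: nearer questions count more. The kernel is an
    # assumption here; the paper's Eqn. (prop) defines the actual weights.
    weights = np.exp(-(dists ** 2) / (2 * width ** 2))
    # Weighted proportion correct
    return float(np.sum(weights * heldin_correct) / np.sum(weights))
```

If the weights are uninformative (e.g., uniform or randomly distributed), this reduces to the plain proportion-correct score discussed in the passage.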

paper/main.pdf

-304 Bytes
Binary file not shown.

paper/main.tex

Lines changed: 19 additions & 24 deletions
@@ -634,23 +634,23 @@ \section*{Results}
 Before presenting our results, it is worth considering three possible
 explanations of why a participant might answer a given question correctly or
 incorrectly. One possibility is that the participant simply \textit{guessed}
-the answer. A second is that they selected an answer by mistake, despite
-``knowing'' the correct answer. In both of these scenarios, the participant's
-knowledge about the question's content should be uninformative about their
-observed response. A third possibility is that the participant's response
-reflects their \textit{actual} knowledge about the question's content. In this
-case, we \textit{might} expect to see a positive relationship between the
-participant's knowledge and their likelihood of answering the question
-correctly. However, in order to see this positive relationship, the
-participant's knowledge must be structured in a way that is reflected (at least
-partially) by the embedding space. In other words, if the participant's
-performance reflects their true knowledge, but our text embedding space does
-not sufficiently capture the structure of that knowledge, then the knowledge
-estimates we generate will not be predictive of the participant's performance.
-In the extreme, if the embedding space is completely unstructured with respect
-to the content of the quiz questions, then we would expect to see the negative
-relationship between estimated knowledge and performance that we described
-above.
+the answer. A second is that they selected the incorrect answer by mistake,
+despite ``knowing'' the correct answer (or vice versa). In both of these
+scenarios, the participant's knowledge about the question's content should be
+uninformative about their observed response. A third possibility is that the
+participant's response reflects their \textit{actual} knowledge about the
+question's content. In this case, we \textit{might} expect to see a positive
+relationship between the participant's knowledge and their likelihood of
+answering the question correctly. However, in order to see this positive
+relationship, the participant's knowledge must be structured in a way that is
+reflected (at least partially) by the embedding space. In other words, if the
+participant's performance reflects their true knowledge, but our text embedding
+space does not sufficiently capture the structure of that knowledge, then the
+knowledge estimates we generate will not be predictive of the participant's
+performance. In the extreme, if the embedding space is completely unstructured
+with respect to the content of the quiz questions, then we would expect to see
+the negative relationship between estimated knowledge and performance that we
+described above.
 
 When we fit a GLMM to estimates of participants' knowledge for each Quiz~1
 question based on all other Quiz~1 questions, we observed an outcome consistent
@@ -752,7 +752,7 @@ \section*{Results}
 95\%\ \textnormal{CI} = [3.033, 3.866],\ p = 0.094$). These ``prediction
 failures'' appear to come from the fact that any signal derived from
 participants' knowledge about the content of the \textit{Birth of Stars}
-lecture (prior to watching it) is swamped by the much more dramatic increase in
+lecture (prior to watching it) is overwhelmed by the much more dramatic increase in
 their knowledge about the content of the \textit{Four Fundamental Forces}
 (which they watched just prior to taking Quiz~2). This is reflected in their
 Quiz~2 performance for questions about each lecture (mean proportion correct
@@ -770,12 +770,7 @@ \section*{Results}
 answer individual quiz questions, distinguish between questions about similar
 content, and generalize across content areas, provided that participants' quiz
 responses reflect a minimum level of ``real'' knowledge about both content on
-which these predictions are based and that for which they are made. Our results
-also indicate some important limitations of our approach: if participants' quiz
-performance does not reflect what they know (e.g., when they ``guess''), or if
-their knowledge is not structured in a way that is reflected by the embedding
-space, then our knowledge estimates will not be predictive of their
-performance.
+which these predictions are based and that for which they are made.
 
 % our approach works when participants have a minimal baseline level of knowledge about content predicted and used to predict
 % our approach generalizes when knowledge of content used to predict can be assumed to be a reasonable indicator of knowledge of content predicted
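The passage above relates knowledge estimates to quiz performance by fitting a GLMM. As a simplified, hypothetical illustration of the core test (whether estimated knowledge positively predicts answering correctly), here is a plain logistic regression fit by gradient descent; it omits the GLMM's random effects, and all names and parameters are illustrative rather than the paper's actual model:

```python
import numpy as np

def fit_logistic(x, y, lr=0.1, steps=2000):
    """Minimal one-predictor logistic regression via gradient descent.

    x: knowledge estimates (one per question), shape (n,)
    y: 0/1 correctness outcomes, shape (n,)
    Returns (slope, intercept); a positive slope indicates that higher
    estimated knowledge predicts a higher chance of a correct answer.
    """
    w, b = 0.0, 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w * x + b)))  # predicted P(correct)
        # Gradients of the mean negative log-likelihood
        w -= lr * np.mean((p - y) * x)
        b -= lr * np.mean(p - y)
    return w, b
```

In the paper's framing, a reliably positive slope corresponds to the "third possibility" discussed above, where responses reflect actual, embedding-structured knowledge.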

paper/supplement.pdf

0 Bytes
Binary file not shown.
