@@ -192,7 +192,7 @@ \subsubsection{Investigating the contributions of function words, content
192192 \includegraphics [width=\textwidth ]{figs/loss_all_authors.pdf}
193193
194194
195- \caption {\textbf {Cross-entropy loss across models and authors. }\textbf {A. }
195+ \caption {\textbf {Cross-entropy loss across models and authors. } \textbf {A. }
196196Average cross-entropy loss on \textit {Train }ing data and held-out test data
197197from each author, plotted as a function of the number of training epochs. Each
198198color denotes a model trained on a single author's work. Error ribbons denote
@@ -235,20 +235,20 @@ \subsection{Predictive comparison testing of eight classic authors}
235235 \includegraphics [width=\textwidth ]{figs/t_stats.pdf}
236236
237237
238- \caption {\textbf {Same vs. other author comparisons, by
239- model. } \textbf { A. } Each curve denotes, as a function of the
240- number of training epochs, the the $ t$ -statistic from a $ t$ -test
241- comparing the distribution of losses (across random seeds)
242- assigned to held-out texts from the given author (color) versus
243- held-out texts from all other authors. \textbf { B. } The average
244- $ t$ -statistic across all eight authors, as a function of the
245- number of training epochs . Error ribbons denote
246- bootstrap-estimated 95 \% confidence intervals across authors. See Supplementary
247- Materials for analogous plots using models trained on only content words (Supp.
248- Fig.~\ttestsContent ), only function words (Supp.
249- Fig.~\ttestsFunction ), and only parts of speech (Supp.
250- Fig.~ \ttestsPOS ).}
251- \label {fig:t-stats }
238+ \caption {\textbf {Same vs. other author comparisons, by model. } \textbf { A. } Each
239+ curve denotes, as a function of the number of training epochs, the
240+ $ t$ -statistic from a $ t$ -test comparing the distribution of losses (across
241+ random seeds) assigned to held-out texts from the given author (color) versus
242+ held-out texts from all other authors. \textbf { B. } The average $ t $ -statistic
243+ across all eight authors, as a function of the number of training epochs. The
244+ black curves in both panels indicate the average $ t$ -value corresponding to $ p
245+ = 0.001 $ , for each epoch . Error ribbons denote bootstrap-estimated 95 \%
246+ confidence intervals across authors. See Supplementary Materials for analogous
247+ plots using models trained on only content words (Supp. Fig.~ \ttestsContent ),
248+ only function words (Supp. Fig.~\ttestsFunction ), and only parts of speech
249+ (Supp. Fig.~\ttestsPOS ).}
250+
251+ \label {fig:t-stats }
252252\end {figure* }
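The comparison plotted in Fig.~\ref{fig:t-stats}A is a two-sample $t$-test on per-seed losses. As a minimal illustrative sketch (assuming Welch's unequal-variance form and notation introduced here for illustration; the caption does not specify the variant or the sign convention), the statistic for a given author and epoch would be
\[
  t \;=\; \frac{\bar{\ell}_{\mathrm{same}} - \bar{\ell}_{\mathrm{other}}}
               {\sqrt{\dfrac{s^{2}_{\mathrm{same}}}{n_{\mathrm{same}}} + \dfrac{s^{2}_{\mathrm{other}}}{n_{\mathrm{other}}}}},
\]
where $\bar{\ell}$, $s^{2}$, and $n$ denote the mean, sample variance, and count of held-out loss values (one per random seed) for texts by the same author versus texts by all other authors; the sign of $t$ simply reflects which group is subtracted.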
253253
254254We also wondered how many training epochs were required for the models to
@@ -623,9 +623,10 @@ \subsection{Concluding remarks}
623623
624624\section* {Acknowledgments }
625625
626- We acknowledge helpful discussions with Jacob Bacus, Hung-Tu Chen,
627- and Paxton Fitzpatrick. This research was supported in part by
628- National Science Foundation Grant 2145172 to JRM.
626+ We acknowledge helpful discussions with Jacob Bacus, Hung-Tu Chen, and Paxton
627+ Fitzpatrick. This research was supported in part by National Science Foundation
628+ Grant 2145172 to JRM and by a GPU cluster generously donated by the estate of
629+ Daniel J. Milstein.
629630
630631\section* {Data and code availability }
631632