Commit ca09853

minor corrections
1 parent b4353a1 commit ca09853

9 files changed, +15 -15 lines changed

additive.pdf

19 Bytes
Binary file not shown.

additive.tex

Lines changed: 2 additions & 2 deletions
@@ -153,7 +153,7 @@ \subsection{Weighting different orders of interaction}
 
 On different datasets, the dominant order of interaction estimated by the additive model varies widely.
 In some cases, the variance is concentrated almost entirely onto a single order of interaction.
-This may may be a side-effect of using the same lengthscales for all orders of interaction.; lengthscales appropriate for low-dimensional regression might not be appropriate for high-dimensional regression.
+This may may be a side-effect of using the same lengthscales for all orders of interaction; lengthscales appropriate for low-dimensional regression might not be appropriate for high-dimensional regression.
 A re-scaling of lengthscales which preserves relative average distances between datapoints might be expected to improve the model.
 %An additive \gp{} with all of its variance coming from the 1st order is equivalent to a sum of one-dimensional functions.
 %An additive \gp{} with all its variance coming from the $D$th order is equivalent to a \gp{} with an \seard{} kernel.
@@ -219,7 +219,7 @@ \subsubsection{Evaluation of derivatives}
 \label{eq:additive-derivatives}
 \end{align}
 %
-\Cref{eq:additive-derivatives} gives the terms that $k_j$ is multiplied by in the original polynomial, which are the terms required by the chain rule.
+\Cref{eq:additive-derivatives} gives all terms that $k_j$ is multiplied by in the original polynomial, which are exactly the terms required by the chain rule.
 These derivatives allow gradient-based optimization of the base kernel parameters with respect to the marginal likelihood.
 
 

grammar.pdf

0 Bytes
Binary file not shown.

grammar.tex

Lines changed: 1 addition & 1 deletion
@@ -676,7 +676,7 @@ \subsection{Structure recovery on synthetic data}
 
 
 \Cref{tbl:synthetic} shows the results.
-For the highest signal-to-noise ratio, \procedurename{} usually recoveres the correct structure.
+For the highest signal-to-noise ratio, \procedurename{} usually recovers the correct structure.
 The reported additional linear structure in the last row can be explained the fact that functions sampled from \kSE{} kernels with long length-scales occasionally have near-linear trends.
 As the noise increases, our method generally backs off to simpler structures rather than reporting spurious structure.
 

intro.pdf

-7 Bytes
Binary file not shown.

intro.tex

Lines changed: 2 additions & 2 deletions
@@ -25,7 +25,7 @@ \chapter{Introduction}
 
 %This thesis will be concerned with finding structure in functions.
 %The types of structure examined in this thesis
-One can construct models of functions having many different types of structure, such as additivity, symmetry, periodicity, changepoints, or combinations of these, using Gaussian processes (\gp{}s).
+One can construct models of functions that have many different types of structure, such as additivity, symmetry, periodicity, changepoints, or combinations of these, using Gaussian processes (\gp{}s).
 %To be able to learn a wide variety of structures, we would like to have an expressive language of models of functions.
 %We would like to be able to represent simple kinds of functions, such as linear functions or polynomials.
 %We would also like to have models of arbitrarily complex functions, specified in terms of high-level properties such as how smooth they are, whether they repeat over time, or which symmetries they have.
@@ -35,7 +35,7 @@ \chapter{Introduction}
 %
 %This chapter will introduce the basic properties of \gp{}s.
 Chapter \ref{ch:kernels} will describe how to model these different types of structure using \gp{}s.
-This short chapter will introduce the basic properties of \gp{}s, and provide an outline of the thesis.
+This short chapter introduces the basic properties of \gp{}s, and provides an outline of the thesis.
 %Chapter \ref{ch:grammar} will show how searching over many
 
 

kernels.pdf

121 Bytes
Binary file not shown.

kernels.tex

Lines changed: 10 additions & 10 deletions
@@ -146,15 +146,15 @@ \subsection{Combining properties through multiplication}
 Here, we discuss a few examples:
 
 \begin{itemize}
-\item {\bf Locally Periodic Functions.}
-In univariate data, multiplying a kernel by \kSE{} gives a way of converting global structure to local structure.
-For example, $\Per$ corresponds to exactly periodic structure, whereas $\Per \kerntimes \SE$ corresponds to locally periodic structure, as shown in the second column of \cref{fig:kernels_times}.
-
 \item {\bf Polynomial Regression.}
 By multiplying together $T$ linear kernels, we obtain a prior on polynomials of degree $T$.
 %This class of functions also has a simple parametric form.
 The first column of \cref{fig:kernels_times} shows a quadratic kernel.
 
+\item {\bf Locally Periodic Functions.}
+In univariate data, multiplying a kernel by \kSE{} gives a way of converting global structure to local structure.
+For example, $\Per$ corresponds to exactly periodic structure, whereas $\Per \kerntimes \SE$ corresponds to locally periodic structure, as shown in the second column of \cref{fig:kernels_times}.
+
 \item {\bf Functions with Growing Amplitude.}
 Multiplying by a linear kernel means that the marginal standard deviation of the function being modeled grows linearly away from the location given by kernel parameter $c$.
 The third and fourth columns of \cref{fig:kernels_times} show two examples.
@@ -381,7 +381,7 @@ \subsection{Example: An additive model of concrete strength}
 
 To illustrate how additive kernels give rise to interpretable models, we built an additive model of the strength of concrete as a function of the amount of seven different ingredients (cement, slag, fly ash, water, plasticizer, coarse aggregate and fine aggregate), and the age of the concrete \citep{yeh1998modeling}.
 %We model measurements of the compressive strength of concrete, as a function of the concentration of 7 ingredients, plus the age of the concrete.
-Our simple model is a sum of 8 different one-dimensional functions, each depending on one of these variables:
+Our simple model is a sum of 8 different one-dimensional functions, each depending on only one of these quantities:
 %
 \begin{align}
 f(\vx) & =
@@ -521,12 +521,12 @@ \subsubsection{Posterior covariance of additive components}
 \def\incpic#1{\includegraphics[width=0.11\columnwidth]{../figures/decomp/concrete-#1}}
 \begin{tabular}{p{2mm}*{6}{c}}
 & {cement} & {slag} & {fly ash} & {water} & \parbox{0.1\columnwidth}{plasticizer} & {age} \\
-\rotatebox{90}{{$\;\;$cement}} & \incpic{Cement-Cement} & \incpic{Cement-Slag} & \incpic{Cement-Fly-Ash} & \incpic{Cement-Water} & \incpic{Cement-Plasticizer} & \incpic{Cement-Age} \\
-\rotatebox{90}{{$\;\;$$\;\;$slag}} & \incpic{Slag-Cement} & \incpic{Slag-Slag} & \incpic{Slag-Fly-Ash} & \incpic{Slag-Water} & \incpic{Slag-Plasticizer} & \incpic{Slag-Age} \\
+\rotatebox{90}{$\;\;$ cement} & \incpic{Cement-Cement} & \incpic{Cement-Slag} & \incpic{Cement-Fly-Ash} & \incpic{Cement-Water} & \incpic{Cement-Plasticizer} & \incpic{Cement-Age} \\
+\rotatebox{90}{$\;\;\;\;$ slag} & \incpic{Slag-Cement} & \incpic{Slag-Slag} & \incpic{Slag-Fly-Ash} & \incpic{Slag-Water} & \incpic{Slag-Plasticizer} & \incpic{Slag-Age} \\
 \rotatebox{90}{{$\;\;$fly ash}} & \incpic{Fly-Ash-Cement} & \incpic{Fly-Ash-Slag} & \incpic{Fly-Ash-Fly-Ash} & \incpic{Fly-Ash-Water} & \incpic{Fly-Ash-Plasticizer} & \incpic{Fly-Ash-Age} \\
 \rotatebox{90}{{$\quad$water}} & \incpic{Water-Cement} & \incpic{Water-Slag} & \incpic{Water-Fly-Ash} & \incpic{Water-Water} & \incpic{Water-Plasticizer} & \incpic{Water-Age} \\
 \rotatebox{90}{{plasticizer}} & \incpic{Plasticizer-Cement} & \incpic{Plasticizer-Slag} & \incpic{Plasticizer-Fly-Ash} & \incpic{Plasticizer-Water} & \incpic{Plasticizer-Plasticizer} & \incpic{Plasticizer-Age}\\
-\rotatebox{90}{$\;\;$$\;\;$\phantom{t}age} & \incpic{Age-Cement} & \incpic{Age-Slag} & \incpic{Age-Fly-Ash} & \incpic{Age-Water} & \incpic{Age-Plasticizer} & \incpic{Age-Age} \\
+\rotatebox{90}{$\;\;\;$ \phantom{t}age} & \incpic{Age-Cement} & \incpic{Age-Slag} & \incpic{Age-Fly-Ash} & \incpic{Age-Water} & \incpic{Age-Plasticizer} & \incpic{Age-Age} \\
 \end{tabular}
 \fbox{
 \begin{tabular}{c}
@@ -539,7 +539,7 @@ \subsubsection{Posterior covariance of additive components}
 %Each plot shows the posterior correlations between the height of two functions, evaluated across the range of the data upon which they depend.
 %Color indicates the amount of correlation between the function value of the two components.
 Red indicates high correlation, teal indicates no correlation, and blue indicates negative correlation.
-Plots on the diagonal show posterior correlations within each function.
+Plots on the diagonal show posterior correlations between different values of the same function.
 Correlations are evaluated over the same input ranges as in \cref{fig:interpretable functions}.
 %Off-diagonal plots show posterior covariance between each pair of functions, as a function of both inputs.
 %Negative correlation means that one function is high and the other low, but which one is uncertain.
@@ -549,7 +549,7 @@ \subsubsection{Posterior covariance of additive components}
 \end{figure}
 %
 For example, \cref{fig:interpretable interactions} shows the posterior correlation between all non-zero components of the concrete model.
-This figure shows that most of the correlation occurs within components, but there is also negative correlation between the ``cement'' and ``slag'' variables.
+This figure shows that most of the correlation occurs within components, but there is also negative correlation between the height of $f_1(\textnormal{cement})$ and $f_2(\textnormal{slag})$.
 This reflects an ambiguity in the model about which one of these functions is high and the other low.
 
 

thesis.pdf

137 Bytes
Binary file not shown.
