
Commit 11008d6

st and devmotion authored
Documentation improvements (inc. lengthscale explanation) and Matern12Kernel alias (#213)
* various edits for clarity and typos
* remove reference to not-yet-implemented feature (#38)
* adds Matern12Kernel as alias for ExponentialKernel (in line with the explicitly defined Matern32Kernel and Matern52Kernel) and gives all aliases docstrings
* incorporates the lengthscales explanation from #212.

Co-authored-by: David Widmann <[email protected]>
1 parent 83a7f5f commit 11008d6

File tree

13 files changed: +157 −104 lines changed

docs/make.jl

Lines changed: 1 addition & 1 deletion
@@ -16,7 +16,7 @@ makedocs(
         "User Guide" => "userguide.md",
         "Examples"=>"example.md",
         "Kernel Functions"=>"kernels.md",
-        "Transform"=>"transform.md",
+        "Input Transforms"=>"transform.md",
         "Metrics"=>"metrics.md",
         "Theory"=>"theory.md",
         "Custom Kernels"=>"create_kernel.md",

docs/src/create_kernel.md

Lines changed: 5 additions & 5 deletions
@@ -2,9 +2,9 @@

 KernelFunctions.jl contains the most popular kernels already but you might want to make your own!

-Here are a few ways depending on how complicated your kernel is :
+Here are a few ways depending on how complicated your kernel is:

-### SimpleKernel for kernels function depending on a metric
+### SimpleKernel for kernel functions depending on a metric

 If your kernel function is of the form `k(x, y) = f(d(x, y))` where `d(x, y)` is a `PreMetric`,
 you can construct your custom kernel by defining `kappa` and `metric` for your kernel.
@@ -20,15 +20,15 @@ KernelFunctions.metric(::MyKernel) = SqEuclidean()
 ### Kernel for more complex kernels

 If your kernel does not satisfy such a representation, all you need to do is define `(k::MyKernel)(x, y)` and inherit from `Kernel`.
-For example we recreate here the `NeuralNetworkKernel`
+For example, we recreate here the `NeuralNetworkKernel`:

 ```julia
 struct MyKernel <: KernelFunctions.Kernel end

 (::MyKernel)(x, y) = asin(dot(x, y) / sqrt((1 + sum(abs2, x)) * (1 + sum(abs2, y))))
 ```

-Note that `BaseKernel` do not use `Distances.jl` and can therefore be a bit slower.
+Note that the fallback implementation of the base `Kernel` evaluation does not use `Distances.jl` and can therefore be a bit slower.

 ### Additional Options

@@ -37,7 +37,7 @@ Finally there are additional functions you can define to bring in more features:
 - `KernelFunctions.dim(x::MyDataType)`: by default the dimension of the inputs will only be checked for vectors of type `AbstractVector{<:Real}`. If you want to check the dimensionality of your inputs, dispatch the `dim` function on your datatype. Note that `0` is the default.
 - `dim` is called within `KernelFunctions.validate_inputs(x::MyDataType, y::MyDataType)`, which can instead be directly overloaded if you want to run special checks for your input types.
 - `kernelmatrix(k::MyKernel, ...)`: you can redefine the diverse `kernelmatrix` functions to eventually optimize the computations.
-- `Base.print(io::IO, k::MyKernel)`: if you want to specialize the printing of your kernel
+- `Base.print(io::IO, k::MyKernel)`: if you want to specialize the printing of your kernel.

 KernelFunctions uses [Functors.jl](https://github.com/FluxML/Functors.jl) for specifying trainable kernel parameters
 in a way that is compatible with the [Flux ML framework](https://github.com/FluxML/Flux.jl).
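The `SimpleKernel` recipe described in this hunk (define `kappa` and `metric`, inherit from `SimpleKernel`) can be sketched end to end; the kernel name `MySqExpKernel` below is purely illustrative and not part of the package:

```julia
using KernelFunctions, Distances

# Hypothetical example kernel: k(x, y) = exp(-d(x, y)) with d the squared Euclidean distance.
struct MySqExpKernel <: KernelFunctions.SimpleKernel end

KernelFunctions.kappa(::MySqExpKernel, d::Real) = exp(-d)
KernelFunctions.metric(::MySqExpKernel) = SqEuclidean()

k = MySqExpKernel()
x = [1.0, 2.0]
y = [1.5, 1.0]
k(x, y)  # exp(-((1.0 - 1.5)^2 + (2.0 - 1.0)^2)) = exp(-1.25)
```

Because the kernel is a `SimpleKernel` backed by a Distances.jl metric, `kernelmatrix` can use the fast pairwise-distance path automatically.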

docs/src/index.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # KernelFunctions.jl

-Model agnostic kernel functions compatible with automatic differentiation
+Model-agnostic kernel functions compatible with automatic differentiation

 **KernelFunctions.jl** is a general purpose kernel package.
 It aims at providing a flexible framework for creating kernels and manipulating them.

docs/src/kernels.md

Lines changed: 40 additions & 31 deletions
@@ -4,7 +4,7 @@

 # Base Kernels

-These are the basic kernels without any transformation of the data. They are the building blocks of KernelFunctions
+These are the basic kernels without any transformation of the data. They are the building blocks of KernelFunctions.


 ## Constant Kernels
@@ -86,21 +86,20 @@ The [`FBMKernel`](@ref) is defined as
 k(x,x';h) = \frac{|x|^{2h} + |x'|^{2h} - |x-x'|^{2h}}{2},
 ```

-where $h$ is the [Hurst index](https://en.wikipedia.org/wiki/Hurst_exponent#Generalized_exponent) and $0<h<1$.
+where $h$ is the [Hurst index](https://en.wikipedia.org/wiki/Hurst_exponent#Generalized_exponent) and $0 < h < 1$.

 ## Gabor Kernel

 The [`GaborKernel`](@ref) is defined as

 ```math
-k(x,x'; l,p) =& h(x-x';l,p)\\
-h(u;l,p) =& \exp\left(-\cos\left(\pi \sum_i \frac{u_i}{p_i}\right)\sum_i \frac{u_i^2}{l_i^2}\right),
+k(x,x'; l,p) = \exp\left(-\cos\left(\pi \sum_i \frac{x_i - x'_i}{p_i}\right)\sum_i \frac{(x_i - x'_i)^2}{l_i^2}\right),
 ```
-where $l_i >0 $ is the lengthscale and $p_i>0$ is the period.
+where $l_i > 0$ is the lengthscale and $p_i > 0$ is the period.

-## Matern Kernels
+## Matérn Kernels

-### Matern Kernel
+### General Matérn Kernel

 The [`MaternKernel`](@ref) is defined as

@@ -110,15 +109,23 @@ The [`MaternKernel`](@ref) is defined as

 where $\nu > 0$.

-### Matern 3/2 Kernel
+### Matérn 1/2 Kernel
+
+The Matérn 1/2 kernel is defined as
+```math
+k(x,x') = \exp\left(-|x-x'|\right),
+```
+equivalent to the Exponential kernel. `Matern12Kernel` is an alias for [`ExponentialKernel`](@ref).
+
+### Matérn 3/2 Kernel

 The [`Matern32Kernel`](@ref) is defined as

 ```math
 k(x,x') = \left(1+\sqrt{3}|x-x'|\right)\exp\left(-\sqrt{3}|x-x'|\right).
 ```

-### Matern 5/2 Kernel
+### Matérn 5/2 Kernel

 The [`Matern52Kernel`](@ref) is defined as

@@ -128,7 +135,7 @@ The [`Matern52Kernel`](@ref) is defined as

 ## Neural Network Kernel

-The [`NeuralNetworkKernel`](@ref) (as in the kernel for an infinitely wide neural network interpretated as a Gaussian process) is defined as
+The [`NeuralNetworkKernel`](@ref) (as in the kernel for an infinitely wide neural network interpreted as a Gaussian process) is defined as

 ```math
 k(x, x') = \arcsin\left(\frac{\langle x, x'\rangle}{\sqrt{(1+\langle x, x\rangle)(1+\langle x',x'\rangle)}}\right).
@@ -142,19 +149,23 @@ The [`PeriodicKernel`](@ref) is defined as
 k(x,x';r) = \exp\left(-0.5 \sum_i (\sin(\pi(x_i - x'_i))/r_i)^2\right),
 ```

-where $r$ has the same dimension as $x$ and $r_i >0$.
+where $r$ has the same dimension as $x$ and $r_i > 0$.

 ## Piecewise Polynomial Kernel

-The [`PiecewisePolynomialKernel`](@ref) is defined as
-
+The [`PiecewisePolynomialKernel`](@ref) is defined for $x, x'\in \mathbb{R}^D$, a positive-definite matrix $P \in \mathbb{R}^{D \times D}$, and $V \in \{0,1,2,3\}$ as
 ```math
-k(x,x'; P, V) =& \max(1 - r, 0)^{j + V} f(r, j),\\
-r =& x^\top P x',\\
-j =& \lfloor \frac{D}{2}\rfloor + V + 1,
+k(x,x'; P, V) = \max(1 - \sqrt{x^\top P x'}, 0)^{j + V} f_V(\sqrt{x^\top P x'}, j),
+```
+where $j = \lfloor \frac{D}{2}\rfloor + V + 1$, and $f_V$ are polynomials defined as follows:
+```math
+\begin{aligned}
+f_0(r, j) &= 1, \\
+f_1(r, j) &= 1 + (j + 1) r, \\
+f_2(r, j) &= 1 + (j + 2) r + ((j^2 + 4j + 3) / 3) r^2, \\
+f_3(r, j) &= 1 + (j + 3) r + ((6 j^2 + 36j + 45) / 15) r^2 + ((j^3 + 9 j^2 + 23j + 15) / 15) r^3.
+\end{aligned}
 ```
-where $x\in \mathbb{R}^D$, $V \in \{0,1,2,3\} and $P$ is a positive definite matrix.
-$f$ is a piecewise polynomial (see source code).

 ## Polynomial Kernels

@@ -166,7 +177,7 @@ The [`LinearKernel`](@ref) is defined as
 k(x,x';c) = \langle x,x'\rangle + c,
 ```

-where $c \in \mathbb{R}$
+where $c \in \mathbb{R}$.

 ### Polynomial Kernel

@@ -176,7 +187,7 @@ The [`PolynomialKernel`](@ref) is defined as
 k(x,x';c,d) = \left(\langle x,x'\rangle + c\right)^d,
 ```

-where $c \in \mathbb{R}$ and $d>0$
+where $c \in \mathbb{R}$ and $d>0$.


 ## Rational Quadratic
@@ -223,43 +234,41 @@ where $i\in\{-1,0,1,2,3\}$ and coefficients $a_i$, $b_i$ are fixed and residuals

 ### Transformed Kernel

-The [`TransformedKernel`](@ref) is a kernel where input are transformed via a function `f`
+The [`TransformedKernel`](@ref) is a kernel where inputs are transformed via a function `f`:

 ```math
-k(x,x';f,\widetile{k}) = \widetilde{k}(f(x),f(x')),
+k(x,x';f,\widetilde{k}) = \widetilde{k}(f(x),f(x')),
 ```
-
-Where $\widetilde{k}$ is another kernel and $f$ is an arbitrary mapping.
+where $\widetilde{k}$ is another kernel and $f$ is an arbitrary mapping.

 ### Scaled Kernel

 The [`ScaledKernel`](@ref) is defined as

 ```math
-k(x,x';\sigma^2,\widetilde{k}) = \sigma^2\widetilde{k}(x,x')
+k(x,x';\sigma^2,\widetilde{k}) = \sigma^2\widetilde{k}(x,x'),
 ```
-
-Where $\widetilde{k}$ is another kernel and $\sigma^2 > 0$.
+where $\widetilde{k}$ is another kernel and $\sigma^2 > 0$.

 ### Kernel Sum

-The [`KernelSum`](@ref) is defined as a sum of kernels
+The [`KernelSum`](@ref) is defined as a sum of kernels:

 ```math
 k(x, x'; \{k_i\}) = \sum_i k_i(x, x').
 ```

-### KernelProduct
+### Kernel Product

-The [`KernelProduct`](@ref) is defined as a product of kernels
+The [`KernelProduct`](@ref) is defined as a product of kernels:

 ```math
 k(x,x';\{k_i\}) = \prod_i k_i(x,x').
 ```

 ### Tensor Product

-The [`TensorProduct`](@ref) is defined as :
+The [`TensorProduct`](@ref) is defined as:

 ```math
 k(x,x';\{k_i\}) = \prod_i k_i(x_i,x'_i)
 ```

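The `Matern12Kernel` alias introduced by this commit can be checked directly; the identity below (Matérn 1/2 coinciding with the exponential kernel) follows from the formulas in the kernels.md hunk above:

```julia
using KernelFunctions

# Matern12Kernel is an alias for ExponentialKernel, so both evaluate to exp(-‖x - y‖).
k1 = Matern12Kernel()
k2 = ExponentialKernel()

x = [1.0, 0.0]
y = [0.0, 0.0]
k1(x, y)  # exp(-‖x - y‖) = exp(-1.0)
```

Both constructors return the same kernel type, so the two evaluations agree exactly.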
docs/src/metrics.md

Lines changed: 9 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,16 +1,19 @@
11
# Metrics
22

3-
KernelFunctions.jl relies on [Distances.jl](https://github.com/JuliaStats/Distances.jl) for computing the pairwise matrix.
4-
To do so a distance measure is needed for each kernel. Two very common ones can already be used : `SqEuclidean` and `Euclidean`.
5-
However all kernels do not rely on distances metrics respecting all the definitions. That's why additional metrics come with the package such as `DotProduct` (`<x,y>`) and `Delta` (`δ(x,y)`).
6-
Note that every `SimpleKernel` must have a defined metric defined as :
3+
`SimpleKernel` implementations rely on [Distances.jl](https://github.com/JuliaStats/Distances.jl) for efficiently computing the pairwise matrix.
4+
This requires a distance measure or metric, such as the commonly used `SqEuclidean` and `Euclidean`.
5+
6+
The metric used by a given kernel type is specified as
77
```julia
8-
KernelFunctions.metric(::CustomKernel) = SqEuclidean()
8+
KernelFunctions.metric(::CustomKernel) = SqEuclidean()
99
```
1010

11+
However, there are kernels that can be implemented efficiently using "metrics" that do not respect all the definitions expected by Distances.jl. For this reason, KernelFunctions.jl provides additional "metrics" such as `DotProduct` ($\langle x, y \rangle$) and `Delta` ($\delta(x,y)$).
12+
13+
1114
## Adding a new metric
1215

13-
If you want to create a new distance just implement the following :
16+
If you want to create a new "metric" just implement the following:
1417

1518
```julia
1619
struct Delta <: Distances.PreMetric

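The "new metric" pattern from the metrics.md hunk above (subtype `Distances.PreMetric` and make it callable) can be sketched as follows; the name `MyDelta` and its evaluation rule are illustrative assumptions, not the package's exact `Delta` code:

```julia
using Distances

# Hypothetical custom "metric": 1.0 when the inputs are equal, 0.0 otherwise.
# Note this violates the identity-of-indiscernibles axiom, which is exactly why
# the docs put "metric" in quotes for such objects.
struct MyDelta <: Distances.PreMetric end

(::MyDelta)(a::AbstractVector, b::AbstractVector) = a == b ? 1.0 : 0.0

d = MyDelta()
d([1.0, 2.0], [1.0, 2.0])  # 1.0
d([1.0, 2.0], [1.0, 3.0])  # 0.0
```

With this in place, a `SimpleKernel` whose `metric` returns `MyDelta()` would evaluate via `kappa` applied to these 0/1 values.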
docs/src/transform.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,9 @@
1-
# Transform
1+
# Input Transforms
22

33
`Transform` is the object that takes care of transforming the input data before distances are being computed. It can be as standard as `IdentityTransform` returning the same input, or multiplying the data by a scalar with `ScaleTransform` or by a vector with `ARDTransform`.
4-
There is a more general `Transform`: `FunctionTransform` that uses a function and apply it on each vector via `mapslices`.
5-
You can also create a pipeline of `Transform` via `TransformChain`. For example `LowRankTransform(rand(10,5))∘ScaleTransform(2.0)`.
4+
There is a more general `Transform`: `FunctionTransform` that uses a function and applies it on each vector via `mapslices`.
5+
You can also create a pipeline of `Transform` via `TransformChain`. For example, `LowRankTransform(rand(10,5))∘ScaleTransform(2.0)`.
66

7-
One apply a transformation on a matrix or a vector via `KernelFunctions.apply(t::Transform,v::AbstractVecOrMat)`
7+
A transformation `t` can be applied to a matrix or a vector `v` via `KernelFunctions.apply(t, v)`.
88

9-
Check the list on the [API page](@ref Transforms)
9+
Check the full list of provided transforms on the [API page](@ref Transforms).
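The `KernelFunctions.apply(t, v)` entry point documented in the transform.md hunk above can be sketched with `ScaleTransform`, which multiplies its input by a scalar:

```julia
using KernelFunctions

t = ScaleTransform(2.0)
v = [1.0, 2.0, 3.0]
KernelFunctions.apply(t, v)  # scales each entry: [2.0, 4.0, 6.0]
```

The same call works on matrices of observations, and transforms compose with kernels via `TransformedKernel`.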
