Skip to content

Commit b2c9f7d

Browse files
committed
Add interpolation functionality, closes #9
1 parent 3b67db7 commit b2c9f7d

File tree

7 files changed

+84
-7
lines changed

7 files changed

+84
-7
lines changed

README.md

Lines changed: 46 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -9,21 +9,35 @@ Kernel density estimators for julia.
99

1010
### Univariate
1111
The main accessor function is `kde`:
12+
1213
```
1314
kde(data)
1415
```
15-
will construct a `UnivariateKDE` object from the real vector `data`. The optional keyword arguments are
16+
17+
will construct a `UnivariateKDE` object from the real vector `data`. The
18+
optional keyword arguments are
1619
* `boundary`: the lower and upper limits of the kde as a tuple. Due to the
1720
fourier transforms used internally, there should be sufficient spacing to
1821
prevent wrap-around at the boundaries.
1922
* `npoints`: the number of interpolation points to use. The function uses
2023
fast Fourier transforms (FFTs) internally, so for optimal efficiency this
2124
should be a power of 2 (default = 2048).
22-
* `kernel`: the distributional family to use as the kernel (default =
23-
`Normal`). To add your own kernel, extend the internal `kernel_dist` function.
25+
* `kernel`: the distributional family from
26+
[Distributions.jl](https://github.com/JuliaStats/Distributions.jl) to use as
27+
the kernel (default = `Normal`). To add your own kernel, extend the internal
28+
`kernel_dist` function.
2429
* `bandwidth`: the bandwidth of the kernel. Default is to use Silverman's
2530
rule.
2631

32+
A related function
33+
34+
``` kde_lscv(data) ```
35+
36+
will construct a `UnivariateKDE` object, with the bandwidth selected by
37+
least-squares cross validation. It accepts the above keyword arguments, except
38+
`bandwidth`.
39+
40+
2741
There are also some slightly more advanced interfaces:
2842
```
2943
kde(data, midpoints::Range)
@@ -56,6 +70,35 @@ kde(datamatrix)
5670
Similarly, the optional arguments all now take tuple arguments:
5771
e.g. `boundary` now takes a tuple of tuples `((xlo,xhi),(ylo,yhi))`.
5872

73+
### Interpolation
74+
75+
The KDE objects are stored as gridded density values, with attached
76+
coordinates. These are typically sufficient for plotting (see below), but
77+
intermediate values can be interpolated using the
78+
[Grid.jl](https://github.com/timholy/Grid.jl) package via the `pdf` method
79+
(extended from Distributions.jl).
80+
81+
```
82+
pdf(k::UnivariateKDE, x)
83+
pdf(k::BivariateKDE, x, y)
84+
```
85+
86+
where `x` and `y` are real numbers or arrays.
87+
88+
If you are making multiple calls to `pdf`, it will be more efficient to
89+
construct an intermediate `InterpKDE` to store the intermediate `InterpGrid`
90+
object:
91+
92+
```
93+
ik = InterpKDE(k)
94+
pdf(ik, x)
95+
```
96+
97+
`InterpKDE` can also take additional arguments specifying the
98+
`BoundaryCondition` (default=`BCnan`) and `InterpType` (default=
99+
`InterpQuadratic`).
100+
101+
59102
## Plotting
60103

61104
The [Winston.jl](https://github.com/nolta/Winston.jl) and

REQUIRE

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2,3 +2,4 @@ julia 0.3-
22
StatsBase
33
Distributions 0.4.6
44
Optim
5+
Grid

src/KernelDensity.jl

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -3,15 +3,17 @@ module KernelDensity
33
using StatsBase
44
using Distributions
55
using Optim
6+
using Grid
67

78
import Base: conv
89
import StatsBase: RealVector, RealMatrix
9-
import Distributions: twoπ
10+
import Distributions: twoπ, pdf
1011

11-
export kde, kde_lscv, UnivariateKDE, BivariateKDE
12+
export kde, kde_lscv, UnivariateKDE, BivariateKDE, InterpKDE, pdf
1213

1314
include("univariate.jl")
1415
include("bivariate.jl")
16+
include("interp.jl")
1517

1618
macro glue(pkg)
1719
path = joinpath(dirname(Base.source_path(nothing)),"glue",string(pkg,".jl"))

src/bivariate.jl

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
# Store both grid and density for KDE over R2
2-
immutable BivariateKDE{Rx<:Range,Ry<:Range}
2+
type BivariateKDE{Rx<:Range,Ry<:Range}
33
x::Rx
44
y::Ry
55
density::Matrix{Float64}

src/interp.jl

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
type InterpKDE{K,G}
2+
kde::K
3+
grid::G
4+
end
5+
6+
function InterpKDE{BC<:BoundaryCondition, IT<:InterpType}(k::UnivariateKDE, bc::Type{BC}=BCnan,it::Type{IT}=InterpQuadratic)
7+
g = CoordInterpGrid(k.x, k.density, bc, it)
8+
InterpKDE(k,g)
9+
end
10+
function InterpKDE{BC<:BoundaryCondition, IT<:InterpType}(k::BivariateKDE, bc::Type{BC}=BCnan,it::Type{IT}=InterpQuadratic)
11+
g = CoordInterpGrid((k.x,k.y),k.density,bc,it)
12+
InterpKDE{typeof(k),typeof(g)}(k,g)
13+
end
14+
15+
16+
pdf(ik::InterpKDE,x...) = ik.grid[x...]
17+
18+
pdf(k::UnivariateKDE,x) = pdf(InterpKDE(k),x)
19+
pdf(k::BivariateKDE,x,y) = pdf(InterpKDE(k),x,y)

src/univariate.jl

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
# Store both grid and density for KDE over the real line
2-
immutable UnivariateKDE{R<:Range}
2+
type UnivariateKDE{R<:Range}
33
x::R
44
density::Vector{Float64}
55
end

test/interp.jl

Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
using Base.Test
2+
using KernelDensity
3+
using Grid
4+
5+
X = randn(100)
6+
Y = randn(100)
7+
8+
k = kde(X)
9+
@test_approx_eq pdf(k, k.x) k.density
10+
11+
k = kde((X,Y))
12+
@test_approx_eq pdf(k, k.x, k.y) k.density

0 commit comments

Comments
 (0)