Skip to content

Commit 89d3e97

Browse files
rework readme
1 parent 8cbc6e5 commit 89d3e97

File tree

3 files changed

+87
-76
lines changed

3 files changed

+87
-76
lines changed

README.md

Lines changed: 59 additions & 59 deletions
Original file line numberDiff line numberDiff line change
@@ -4,71 +4,100 @@
44

55
`FastTransforms.jl` allows the user to conveniently work with orthogonal polynomials with degrees well into the millions.
66

7-
Transforms include conversion between Jacobi polynomial expansions, with Chebyshev, Legendre, and ultraspherical polynomial transforms as special cases. For the signal processor, all three types of nonuniform fast Fourier transforms available. As well, spherical harmonic transforms and transforms between orthogonal polynomials on the triangle allow for the efficient simulation of partial differential equations of evolution.
7+
This package provides a Julia wrapper for the [C library](https://github.com/MikaelSlevinsky/FastTransforms) of the same name. Additionally, all three types of nonuniform fast Fourier transforms available, as well as the Padua transform.
88

9-
Algorithms include methods based on asymptotic formulae to relate the transforms to a small number of fast Fourier transforms, matrix factorizations based on the Hadamard product, hierarchical matrix decompositions à la Fast Multipole Method, and the butterfly algorithm.
9+
## Installation
1010

11-
## The Chebyshev—Legendre Transform
12-
13-
The Chebyshev—Legendre transform allows the fast conversion of Chebyshev expansion coefficients to Legendre expansion coefficients and back.
11+
The build script, which works on macOS and Linux systems with x86_64 processors, downloads precompiled binaries of the latest version of [FastTransforms](https://github.com/MikaelSlevinsky/FastTransforms). This library depends on `FFTW`, `MPFR`, and `OpenBLAS` (on Linux). Therefore, installation may be as straightforward as:
1412

1513
```julia
16-
julia> Pkg.add("FastTransforms")
14+
julia> if Sys.isapple()
15+
run(`brew install gcc@8 fftw mpfr`)
16+
elseif Sys.islinux()
17+
run(`apt-get gcc-8 libblas-dev libopenblas-base libfftw3-dev libmpfr-dev`)
18+
end
19+
20+
pkg> build FastTransforms
1721

1822
julia> using FastTransforms
1923

20-
julia> c = rand(10001);
24+
```
25+
26+
## Fast orthogonal polynomial transforms
27+
28+
The 26 orthogonal polynomial transforms are listed in `FastTransforms.kind2string.(0:25)`. Univariate transforms may be planned with the standard normalization or with orthonormalization. For multivariate transforms, the standard normalization may be too severe for floating-point computations, so it is omitted. Here are two examples:
29+
30+
### The Chebyshev--Legendre transform
31+
32+
```julia
33+
julia> c = rand(8192);
2134

2235
julia> leg2cheb(c);
2336

2437
julia> cheb2leg(c);
2538

26-
julia> norm(cheb2leg(leg2cheb(c))-c)
27-
5.564168202018823e-13
39+
julia> norm(cheb2leg(leg2cheb(c; normcheb=true); normcheb=true)-c)/norm(c)
40+
1.1866591414786334e-14
41+
2842
```
2943

30-
The implementation separates pre-computation into a type of plan. This type is constructed with either `plan_leg2cheb` or `plan_cheb2leg`. Let's see how much faster it is if we pre-compute.
44+
The implementation separates pre-computation into an `FTPlan`. This type is constructed with either `plan_leg2cheb` or `plan_cheb2leg`. Let's see how much faster it is if we pre-compute.
3145

3246
```julia
3347
julia> p1 = plan_leg2cheb(c);
3448

3549
julia> p2 = plan_cheb2leg(c);
3650

3751
julia> @time leg2cheb(c);
38-
0.082615 seconds (11.94 k allocations: 31.214 MiB, 6.75% gc time)
52+
0.433938 seconds (9 allocations: 64.641 KiB)
3953

4054
julia> @time p1*c;
41-
0.004297 seconds (6 allocations: 78.422 KiB)
55+
0.007927 seconds (79 allocations: 68.563 KiB)
4256

4357
julia> @time cheb2leg(c);
44-
0.110388 seconds (11.94 k allocations: 31.214 MiB, 8.16% gc time)
58+
0.423865 seconds (9 allocations: 64.641 KiB)
4559

4660
julia> @time p2*c;
47-
0.004500 seconds (6 allocations: 78.422 KiB)
61+
0.009164 seconds (89 allocations: 69.672 KiB)
62+
63+
```
64+
65+
Furthermore, for orthogonal polynomial connection problems that are degree-preserving, we should expect to be able to apply the transforms in-place:
66+
67+
```julia
68+
julia> lmul!(p1, c);
69+
70+
julia> lmul!(p2, c);
71+
72+
julia> ldiv!(p1, c);
73+
74+
julia> ldiv!(p2, c);
75+
4876
```
4977

50-
## The Chebyshev—Jacobi Transform
78+
### The spherical harmonic transform
5179

52-
The Chebyshev—Jacobi transform allows the fast conversion of Chebyshev expansion coefficients to Jacobi expansion coefficients and back.
80+
Let `F` be an array of spherical harmonic expansion coefficients with columns arranged by increasing order in absolute value, alternating between negative and positive orders. Then `sph2fourier` converts the representation into a bivariate Fourier series, and `fourier2sph` converts it back. Once in a bivariate Fourier series on the sphere, `plan_sph_synthesis` converts the coefficients to function samples on an equiangular grid that does not sample the poles, and `plan_sph_analysis` converts them back.
5381

5482
```julia
55-
julia> c = rand(10001);
83+
julia> F = sphrandn(Float64, 1024, 2047); # convenience method
84+
85+
julia> P = plan_sph2fourier(F);
86+
87+
julia> PS = plan_sph_synthesis(F);
5688

57-
julia> @time norm(icjt(cjt(c, 0.1, -0.2), 0.1, -0.2) - c, Inf)
58-
0.258390 seconds (431 allocations: 6.278 MB)
59-
1.4830359162942841e-12
89+
julia> PA = plan_sph_analysis(F);
6090

61-
julia> p1 = plan_cjt(c, 0.1, -0.2);
91+
julia> G = PS*(P*F);
6292

63-
julia> p2 = plan_icjt(c, 0.1, -0.2);
93+
julia> H = P\(PA*G);
6494

65-
julia> @time norm(p2*(p1*c) - c, Inf)
66-
0.244842 seconds (17 allocations: 469.344 KB)
67-
1.4830359162942841e-12
95+
julia> norm(F-H)/norm(F)
96+
2.1541073345177038e-15
6897

6998
```
7099

71-
Composition of transforms allows the Jacobi—Jacobi transform, computed via `jjt`. The remainder in Hahn's asymptotic expansion is valid for the half-open square `(α,β) ∈ (-1/2,1/2]^2`. Therefore, the fast transform works best when the parameters are inside. If the parameters `(α,β)` are not exceptionally beyond the square, then increment/decrement operators are used with linear complexity (and linear conditioning) in the degree.
100+
Due to the structure of the spherical harmonic connection problem, these transforms may also be performed in-place with `lmul!` and `ldiv!`.
72101

73102
## Nonuniform fast Fourier transforms
74103

@@ -124,41 +153,12 @@ julia> N = div((n+1)*(n+2), 2);
124153

125154
julia> v = rand(N); # The length of v is the number of Padua points
126155

127-
julia> @time norm(ipaduatransform(paduatransform(v)) - v)
128-
0.006571 seconds (846 allocations: 1.746 MiB)
129-
3.123637691861415e-14
156+
julia> @time norm(ipaduatransform(paduatransform(v)) - v)/norm(v)
157+
0.007373 seconds (543 allocations: 1.733 MiB)
158+
3.925164683252905e-16
130159

131160
```
132161

133-
## The Spherical Harmonic Transform
134-
135-
Let `F` be a matrix of spherical harmonic expansion coefficients with columns arranged by increasing order in absolute value, alternating between negative and positive orders. Then `sph2fourier` converts the representation into a bivariate Fourier series, and `fourier2sph` converts it back.
136-
137-
```julia
138-
julia> F = sphrandn(Float64, 256, 256);
139-
140-
julia> G = sph2fourier(F);
141-
142-
julia> H = fourier2sph(G);
143-
144-
julia> norm(F-H)
145-
4.950645831278297e-14
146-
147-
julia> F = sphrandn(Float64, 1024, 1024);
148-
149-
julia> G = sph2fourier(F; sketch = :none);
150-
Pre-computing...100%|███████████████████████████████████████████| Time: 0:00:04
151-
152-
julia> H = fourier2sph(G; sketch = :none);
153-
Pre-computing...100%|███████████████████████████████████████████| Time: 0:00:04
154-
155-
julia> norm(F-H)
156-
1.1510623098225283e-12
157-
158-
```
159-
160-
As with other fast transforms, `plan_sph2fourier` saves effort by caching the pre-computation. Be warned that for dimensions larger than `1,000`, this is no small feat!
161-
162162
# References:
163163

164164
[1] B. Alpert and V. Rokhlin. <a href="http://dx.doi.org/10.1137/0912009">A fast algorithm for the evaluation of Legendre expansions</a>, *SIAM J. Sci. Stat. Comput.*, **12**:158—179, 1991.
@@ -171,7 +171,7 @@ As with other fast transforms, `plan_sph2fourier` saves effort by caching the pr
171171

172172
[5] R. M. Slevinsky. <a href="https://doi.org/10.1093/imanum/drw070">On the use of Hahn's asymptotic formula and stabilized recurrence for a fast, simple, and stable Chebyshev—Jacobi transform</a>, *IMA J. Numer. Anal.*, **38**:102—124, 2018.
173173

174-
[6] R. M. Slevinsky. <a href="https://doi.org/10.1016/j.acha.2017.11.001">Fast and backward stable transforms between spherical harmonic expansions and bivariate Fourier series</a>, in press at *Appl. Comput. Harmon. Anal.*, 2017.
174+
[6] R. M. Slevinsky. <a href="https://doi.org/10.1016/j.acha.2017.11.001">Fast and backward stable transforms between spherical harmonic expansions and bivariate Fourier series</a>, *Appl. Comput. Harmon. Anal.*, **47**:585—606, 2019.
175175

176176
[7] R. M. Slevinsky, <a href="https://arxiv.org/abs/1711.07866">Conquering the pre-computation in two-dimensional harmonic polynomial transforms</a>, arXiv:1711.07866, 2017.
177177

examples/sphere.jl

Lines changed: 12 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -26,6 +26,13 @@
2626
# For the storage pattern of the arrays, please consult the documentation.
2727
#############
2828

29+
function threshold!(A::AbstractArray, ϵ)
30+
for i in eachindex(A)
31+
if abs(A[i]) < ϵ A[i] = 0 end
32+
end
33+
A
34+
end
35+
2936
using FastTransforms
3037

3138
# The colatitudinal grid (mod π):
@@ -51,19 +58,19 @@ P4 = x -> (35*x^4-30*x^2+3)/8
5158
# On the tensor product grid, our function samples are:
5259
F = [(P4(z(θ,φ)y) - P4(xy))/(z(θ,φ)y - xy) for θ in θ, φ in φ]
5360

54-
P = plan_sph2fourier(Float64, N)
55-
PA = plan_sph_analysis(Float64, N, M)
61+
P = plan_sph2fourier(F)
62+
PA = plan_sph_analysis(F)
5663

5764
# Its spherical harmonic coefficients demonstrate that it is degree-3:
5865
V = PA*F
59-
U3 = P\V
66+
U3 = threshold!(P\V, 400*eps())
6067

6168
# Similarly, on the tensor product grid, the Legendre polynomial P₄(z⋅y) is:
6269
F = [P4(z(θ,φ)y) for θ in θ, φ in φ]
6370

6471
# Its spherical harmonic coefficients demonstrate that it is exact-degree-4:
6572
V = PA*F
66-
U4 = P\V
73+
U4 = threshold!(P\V, 3*eps())
6774

6875
nrm1 = norm(U4);
6976

@@ -73,7 +80,7 @@ F = [P4(z(θ,φ)⋅x) for θ in θ, φ in φ]
7380
# It only has one nonnegligible spherical harmonic coefficient.
7481
# Can you spot it?
7582
V = PA*F
76-
U4 = P\V
83+
U4 = threshold!(P\V, 3*eps())
7784

7885
# That nonnegligible coefficient should be approximately √(2π/(4+1/2)),
7986
# since the convention in this library is to orthonormalize.

src/libfasttransforms.jl

Lines changed: 16 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -1,18 +1,17 @@
11
# This file shows how to call `libfasttransforms` from Julia.
2+
# It is normally built by downloading precompiled binaries assuming that
3+
# dependencies are already installed. However, it may also be built from source.
24

3-
# Step 1: In this repository, `git clone -b v0.2.1 https://github.com/MikaelSlevinsky/FastTransforms.git deps/FastTransforms`
5+
# Step 1: In this repository,
6+
# `git clone -b v0.2.6 https://github.com/MikaelSlevinsky/FastTransforms.git deps/FastTransforms`
47

5-
# Step 2: use a version of gcc that supports OpenMP: on OS X, this means using a
6-
# version of `gcc` from Homebrew, `brew install gcc`; on linux, `gcc-4.6` and up should work.
7-
# `export CC=gcc-"the-right-version"`.
8+
# Step 2: Get the dependencies. On macOS, run `brew install gcc@8 fftw mpfr`.
9+
# On linux, run `apt-get gcc-8 libblas-dev libopenblas-base libfftw3-dev libmpfr-dev`.
810

9-
# Step 3: get the remaining dependencies: On OS X, either `brew install openblas`
10-
# or change the Make.inc to use `BLAS=APPLEBLAS` instead of `BLAS=OPENBLAS`.
11-
# Furthermore, `brew install fftw mpfr`. For linux, see the `Travis.yml` file.
12-
# For Windows, see the `Appveyor.yml` file.
11+
# Step 3: Build the library. On macOS, run `make CC=gcc-8 FT_USE_APPLEBLAS=1`.
12+
# On linux, run `make CC=gcc-8`.
1313

14-
# Step 4: run `make` and check the tests by running `./test_drivers 3 3 0`.
15-
# All the errors should be roughly on the order of machine precision.
14+
# Step 4: move `libfastfransforms.dylib` out of the folder to be found by ↓.
1615

1716
const libfasttransforms = joinpath(dirname(@__DIR__), "deps", "libfasttransforms")
1817

@@ -220,15 +219,15 @@ for f in (:leg2cheb, :cheb2leg, :ultra2ultra, :jac2jac,
220219
plan_f = Symbol("plan_", f)
221220
@eval begin
222221
$plan_f(x::AbstractArray{T}, y...; z...) where T = $plan_f(T, size(x, 1), y...; z...)
223-
$f(x::AbstractArray{T}, y...; z...) where T = $plan_f(x, y...; z...)*x
222+
$f(x::AbstractArray, y...; z...) = $plan_f(x, y...; z...)*x
224223
end
225224
end
226225

227226
for (f, plan_f) in ((:fourier2sph, :plan_sph2fourier), (:fourier2sphv, :plan_sphv2fourier),
228227
(:cxf2disk2, :plan_disk2cxf), (:cheb2tri, :plan_tri2cheb),
229228
(:cheb2tet, :plan_tet2cheb))
230229
@eval begin
231-
$f(x::AbstractArray{T}, y...; z...) where T = $plan_f(x, y...; z...)\x
230+
$f(x::AbstractArray, y...; z...) = $plan_f(x, y...; z...)\x
232231
end
233232
end
234233

@@ -435,6 +434,7 @@ for (fJ, fC, fE, K) in ((:plan_sph_synthesis, :ft_plan_sph_synthesis, :ft_execut
435434
(:plan_tri_synthesis, :ft_plan_tri_synthesis, :ft_execute_tri_synthesis, TRIANGLESYNTHESIS),
436435
(:plan_tri_analysis, :ft_plan_tri_analysis, :ft_execute_tri_analysis, TRIANGLEANALYSIS))
437436
@eval begin
437+
$fJ(x::Matrix{T}) where T = $fJ(T, size(x, 1), size(x, 2))
438438
function $fJ(::Type{Float64}, n::Integer, m::Integer)
439439
plan = ccall(($(string(fC)), libfasttransforms), Ptr{ft_plan_struct}, (Cint, Cint), n, m)
440440
return FTPlan{Float64, 2, $K}(plan, n, m)
@@ -449,6 +449,8 @@ for (fJ, fC, fE, K) in ((:plan_sph_synthesis, :ft_plan_sph_synthesis, :ft_execut
449449
end
450450
end
451451

452+
plan_tet_synthesis(x::Array{T, 3}) where T = plan_tet_synthesis(T, size(x, 1), size(x, 2), size(x, 3))
453+
452454
function plan_tet_synthesis(::Type{Float64}, n::Integer, l::Integer, m::Integer)
453455
plan = ccall((:ft_plan_tet_synthesis, libfasttransforms), Ptr{ft_plan_struct}, (Cint, Cint, Cint), n, l, m)
454456
return FTPlan{Float64, 3, TETRAHEDRONSYNTHESIS}(plan, n, l, m)
@@ -462,6 +464,8 @@ function lmul!(p::FTPlan{Float64, 3, TETRAHEDRONSYNTHESIS}, x::Array{Float64, 3}
462464
return x
463465
end
464466

467+
plan_tet_analysis(x::Array{T, 3}) where T = plan_tet_analysis(T, size(x, 1), size(x, 2), size(x, 3))
468+
465469
function plan_tet_analysis(::Type{Float64}, n::Integer, l::Integer, m::Integer)
466470
plan = ccall((:ft_plan_tet_analysis, libfasttransforms), Ptr{ft_plan_struct}, (Cint, Cint, Cint), n, l, m)
467471
return FTPlan{Float64, 3, TETRAHEDRONANALYSIS}(plan, n, l, m)

0 commit comments

Comments
 (0)