Experimental FFT library with an emphasis on runtime performance.
-The goal of this library is to hand-unroll all loops in a procedural macro
-for optimal SIMD throughput.
+# Monarch Butterfly

-This currently only works on powers of two.
+Experimental FFT library where all FFTs are proc-macro generated const-evaluation functions. It is meant for cases where the FFT size is known at compile time; knowing the size up front is what makes the large speedups possible.

-This library implements FFTs for both `f32` and `f64` for the following sizes:
-```
-1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048
-```
+This library implements FFTs for both `f32` and `f64` for sizes `1-200`. The FFTs are auto-generated, so this limit could be increased above 200 at the expense of compile time.

-This library will use all SIMD features your CPU has available including AVX512,
-assuming you compile with those features (`RUSTFLAGS="-C target-cpu=native" cargo build`).
+## Features

-The larger the FFT sizes, the larger speed boost this library will give you.
+- All functions are auto-generated with proc-macros with unrolled loops
+- Zero `unsafe` code
+- Completely portable
+- Const-evaluation functions

-As an example of AVX512 instructions, here is an example on just an FFT
-of size 128: https://godbolt.org/z/rz48azEsd (`Ctrl+F` for "zmm" instructions)
+## Limitations
+
+- FFT size must be known at compile time
+- By default, only FFTs up to size 200 are generated

-If a larger FFT size is needed, just clone the repo and add the needed
-sizes to the top of `crates\monarch-derive\src\lib.rs` and larger FFTs
-will be generated. However, this comes at the cost of a longer compile time.
+

 ```
 use monarch_butterfly::*;
 use num_complex::Complex;

 let input: Vec<_> = (0..8).map(|i| Complex::new(i as f32, 0.0)).collect();
-let output_slice = fft8(&input);
-let output_vec = fft8(input);
+let output = fft::<8, _, _>(input);
 ```

+This library will use all SIMD features your CPU has available including AVX512,
+assuming you compile with those features (`RUSTFLAGS="-C target-cpu=native" cargo build`).
+
+The larger the FFT sizes, the larger speed boost this library will give you.
+
+As an example of AVX512 instructions, here is an example on just an FFT
+of size 128: https://godbolt.org/z/Y58eh1x5a (`Ctrl+F` for "zmm" instructions)
+
 The FFTs before unrolling are heavily inspired by [RustFFT](https://github.com/ejmahler/RustFFT).
 Credit is given to Elliott Mahler as the RustFFT original author.
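
To make the "auto-generated with unrolled loops" feature above concrete, here is a minimal hand-written sketch of what an unrolled size-4 FFT can look like next to a loop-based reference DFT. It is not taken from this library or its proc-macro output: the names `C`, `dft_naive` and `fft4_unrolled` are hypothetical, and complex numbers are plain `(re, im)` tuples so the snippet needs no external crates.

```rust
// Hypothetical illustration only; this is not code generated by the library.
type C = (f64, f64); // (real, imaginary)

fn add(a: C, b: C) -> C { (a.0 + b.0, a.1 + b.1) }
fn sub(a: C, b: C) -> C { (a.0 - b.0, a.1 - b.1) }
fn mul(a: C, b: C) -> C { (a.0 * b.0 - a.1 * b.1, a.0 * b.1 + a.1 * b.0) }
// Multiplying by -i maps (re, im) to (im, -re); no real multiplications needed.
fn mul_neg_i(a: C) -> C { (a.1, -a.0) }

/// Loop-based reference: X[k] = sum over j of x[j] * exp(-2*pi*i*k*j/n).
fn dft_naive(x: &[C]) -> Vec<C> {
    let n = x.len();
    (0..n)
        .map(|k| {
            let mut acc = (0.0, 0.0);
            for (j, &v) in x.iter().enumerate() {
                let angle = -2.0 * std::f64::consts::PI * ((k * j) % n) as f64 / n as f64;
                acc = add(acc, mul(v, (angle.cos(), angle.sin())));
            }
            acc
        })
        .collect()
}

/// Size-4 FFT with every loop and twiddle factor written out by hand.
/// The only twiddles for size 4 are 1, -i, -1 and i, so the whole
/// transform reduces to additions, subtractions and one swap-and-negate.
fn fft4_unrolled(x: [C; 4]) -> [C; 4] {
    let s02 = add(x[0], x[2]);
    let d02 = sub(x[0], x[2]);
    let s13 = add(x[1], x[3]);
    let d13 = mul_neg_i(sub(x[1], x[3]));
    [add(s02, s13), add(d02, d13), sub(s02, s13), sub(d02, d13)]
}
```

Calling `fft4_unrolled(x)` and `dft_naive(&x)` on the same four samples should agree up to floating-point rounding. Straight-line code like this gives the compiler fixed-size arithmetic to vectorize, which is presumably the effect the generated functions aim for at larger sizes.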
//! Experimental FFT library where all FFTs are proc-macro generated const-evaluation functions.
//! It is meant for cases where the FFT size is known at compile time; knowing the size up front
//! is what makes the large speedups possible.
+//!
+//! This library implements FFTs for both `f32` and `f64` for sizes `1-200`. The FFTs are auto-generated, so this limit could be increased above 200 at the expense of compile time.
+//!
+//! ## Features
+//!
+//! - All functions are auto-generated with proc-macros with unrolled loops
+//! - Zero `unsafe` code
+//! - Completely portable
+//! - Const-evaluation functions
+//!
+//! ## Limitations
+//!
+//! - FFT size must be known at compile time
+//! - By default, only FFTs up to size 200 are generated
+//!
+//!
 //! ```
 //! use monarch_butterfly::*;
 //! use num_complex::Complex;
-//!
+//!
 //! let input: Vec<_> = (0..8).map(|i| Complex::new(i as f32, 0.0)).collect();
-//! let output_slice = fft::<8, _, _>(&input);
-//! let output_vec = fft::<8, _, _>(input);
+//! let output = fft::<8, _, _>(input);
 //! ```
+//!
+//! The top level functions are [`fft`] and [`ifft`].
+//!
+//! This library will use all SIMD features your CPU has available including AVX512,
+//! assuming you compile with those features (`RUSTFLAGS="-C target-cpu=native" cargo build`).
+//!
+//! The larger the FFT sizes, the larger speed boost this library will give you.
+//!
+//! As an example of AVX512 instructions, here is an example on just an FFT
+//! of size 128: <https://godbolt.org/z/Y58eh1x5a> (`Ctrl+F` for "zmm" instructions)
+//!
+//! The FFTs before unrolling are heavily inspired by [`RustFFT`](<https://github.com/ejmahler/RustFFT>).
+//! Credit is given to Elliott Mahler as the RustFFT original author.
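
The call `fft::<8, _, _>(input)` in the example passes the transform size as a const generic argument. The sketch below shows that pattern in isolation, assuming nothing about the library's internals: `dft_fixed` is a hypothetical stand-in that computes a plain loop-based DFT on `(re, im)` tuples, with the size `N` fixed at compile time.

```rust
/// Hypothetical stand-in for a compile-time-sized transform; not this library's `fft`.
/// Because `N` is a const generic, both loop bounds are compile-time constants
/// and the output can be a fixed-size array instead of a heap allocation.
fn dft_fixed<const N: usize>(input: &[(f64, f64)]) -> [(f64, f64); N] {
    // The slice length is only known at run time, so check it against N.
    assert_eq!(input.len(), N, "input length must match the compile-time size");
    let mut out = [(0.0, 0.0); N];
    for k in 0..N {
        for n in 0..N {
            // Twiddle factor exp(-2*pi*i*k*n/N), reduced modulo N for accuracy.
            let angle = -2.0 * std::f64::consts::PI * ((k * n) % N) as f64 / N as f64;
            let (c, s) = (angle.cos(), angle.sin());
            let (re, im) = input[n];
            out[k].0 += re * c - im * s;
            out[k].1 += re * s + im * c;
        }
    }
    out
}
```

A call such as `dft_fixed::<8>(&samples)` mirrors the turbofish syntax in the example. A size that is only known at run time cannot be passed this way, which is the "FFT size must be known at compile time" limitation listed above.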