|
1 | 1 | # Fast Fast Hadamard Transform |
2 | 2 |
|
3 | | -FFHT (Fast Fast Hadamard Transform) is a library that provides a heavily |
4 | | -optimized C99 implementation of the Fast Hadamard Transform. FFHT also provides |
5 | | -a thin Python wrapper that performs the Fast Hadamard Transform on |
6 | | -one-dimensional [NumPy](http://www.numpy.org/) arrays. |
7 | | - |
8 | | -The Hadamard Transform is a linear orthogonal map defined on real vectors whose |
9 | | -length is a _power of two_. For the precise definition, see the |
10 | | -[Wikipedia entry](https://en.wikipedia.org/wiki/Hadamard_transform). The |
11 | | -Hadamard Transform has recently been widely used in machine learning |
12 | | -and numerical algorithms. |
13 | | - |
14 | | -FFHT uses [AVX](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) |
15 | | -to speed up the computation. |
16 | | - |
17 | | -The header file `fht.h` exports two functions: `int fht_float(float *buf, int |
18 | | -log_n)` and `int fht_double(double *buf, int log_n)`. The |
19 | | -only difference between them is the type of vector entries. So, in what follows, |
20 | | -we describe how the version for floats `fht_float` works. |
21 | | - |
22 | | -The function `fht_float` takes two parameters: |
23 | | - |
24 | | -* `buf` is a pointer to the data on which one needs to perform the Fast |
25 | | -Hadamard Transform. |
26 | | -* `log_n` is the binary logarithm of the length of `buf`. |
27 | | -That is, the length is equal to `2^log_n`. |
28 | | - |
29 | | -The return value is -1 if the input is invalid and is zero otherwise. |
30 | | - |
31 | | -A header-only version of the library is provided in `fht_header_only.h`. |
32 | | - |
33 | | -In addition to the Fast Hadamard Transform, we provide two auxiliary programs: |
34 | | -`test_float` and `test_double`, which are implemented in C99. They exhaustively |
35 | | -test and benchmark the library. |
36 | | - |
37 | | -FFHT has been tested on 64-bit versions of Linux, OS X and Windows (the latter |
38 | | -is via Cygwin). |
39 | | - |
40 | | -To install the Python package, run `python setup.py install`. The script |
41 | | -`example.py` shows how to use FFHT from Python. |
42 | | - |
43 | | -## Benchmarks |
44 | | - |
45 | | -Below are the times for the Fast Hadamard Transform for vectors of |
46 | | -various lengths. The benchmarks were run on a machine with Intel |
47 | | -Core i7-6700K and 2133 MHz DDR4 RAM. We compare FFHT, |
48 | | -[FFTW 3.3.6](http://fftw.org/), and |
49 | | -[fht](https://github.com/nbarbey/fht) by |
50 | | -[Nicolas Barbey](https://github.com/nbarbey). |
51 | | - |
52 | | -Let us stress that FFTW is a great versatile tool, and the authors of FFTW did |
53 | | -not try to optimize the performance of the Fast Hadamard Transform. On the other |
54 | | -hand, FFHT does one thing (the Fast Hadamard Transform), but does it extremely |
55 | | -well. |
56 | | - |
57 | | -Vector size | FFHT (float) | FFHT (double) | FFTW 3.3.6 (float) | FFTW 3.3.6 (double) | fht (float) | fht (double) |
58 | | -:---: | :---: | :---: | :---: | :---: | :---: | :---: |
59 | | -2<sup>10</sup> | 0.31 us | 0.49 us | 4.48 us | 7.72 us | 17.4 us | 19.3 us |
60 | | -2<sup>20</sup> | 0.68 ms | 1.39 ms | 8.81 ms | 17.07 ms | 29.8 ms | 35.0 ms |
61 | | -2<sup>27</sup> | 0.22 s | 0.50 s | 2.08 s | 3.57 s | 6.89 s | 7.49 s |
62 | | - |
63 | | -## Troubleshooting |
64 | | - |
65 | | -For some versions of OS X the native `clang` compiler (that mimics `gcc`) may |
66 | | -not recognize the availability of AVX. A solution for this problem is to use a |
67 | | -genuine `gcc` (say from [Homebrew](http://brew.sh/)) or to use `-march=corei7-avx` |
68 | | -instead of `-march=native` for compiler flags. |
69 | | - |
70 | | -A symptom of the above is that the macro `__AVX__` is undefined. |
71 | | - |
72 | | -## Related Work |
73 | | - |
74 | | -FFHT has been created as a part of |
75 | | -[FALCONN](https://github.com/falconn-lib/falconn): a library for similarity |
76 | | -search over high-dimensional data. FALCONN's underlying algorithms are described |
77 | | -and analyzed in the following research paper: |
78 | | - |
79 | | -> Alexandr Andoni, Piotr Indyk, Thijs Laarhoven, Ilya Razenshteyn and Ludwig |
80 | | -> Schmidt, "Practical and Optimal LSH for Angular Distance", NIPS 2015, full |
81 | | -> version available at [arXiv:1509.02897](http://arxiv.org/abs/1509.02897) |
82 | | - |
83 | | -This is the right paper to cite, if you use FFHT for your research projects. |
84 | | - |
85 | | -## Acknowledgments |
86 | | - |
87 | | -We thank Ruslan Savchenko for useful discussions. |
88 | | - |
89 | | -Thanks to: |
90 | | - |
91 | | -* Clement Canonne |
92 | | -* Michal Forisek |
93 | | -* Rati Gelashvili |
94 | | -* Daniel Grier |
95 | | -* Dhiraj Holden |
96 | | -* Justin Holmgren |
97 | | -* Aleksandar Ivanovic |
98 | | -* Vladislav Isenbaev |
99 | | -* Jacob Kogler |
100 | | -* Ilya Kornakov |
101 | | -* Anton Lapshin |
102 | | -* Rio LaVigne |
103 | | -* Oleg Martynov |
104 | | -* Linar Mikeev |
105 | | -* Cameron Musco |
106 | | -* Sam Park |
107 | | -* Sunoo Park |
108 | | -* Amelia Perry |
109 | | -* Andrew Sabisch |
110 | | -* Abhishek Sarkar |
111 | | -* Ruslan Savchenko |
112 | | -* Vadim Semenov |
113 | | -* Arman Yessenamanov |
114 | | - |
115 | | -for helping us with testing FFHT. |
| 3 | +This directory contains a fork of https://github.com/FALCONN-LIB/FFHT |
| 4 | +(License: https://github.com/FALCONN-LIB/FFHT/blob/master/LICENSE.md) |
| 5 | +focused on ARM64 NEON code generation. |
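
For reference, the removed README text documents `fht_float(buf, log_n)` as transforming `2^log_n` floats in place and returning -1 on invalid input and zero otherwise. The sketch below is a plain scalar illustration of that contract, not the library's AVX code nor this fork's NEON code. The name `ref_fht_float`, the validity rule (non-null `buf`, `0 <= log_n <= 30`), and the choice to leave the result unnormalized are assumptions made for the example.

```c
#include <stddef.h>

/* Hypothetical scalar reference routine; NOT part of FFHT.
 * Performs an unnormalized in-place Hadamard transform on 2^log_n floats.
 * Returns -1 on invalid input and 0 otherwise (validity rule is assumed). */
int ref_fht_float(float *buf, int log_n) {
    if (buf == NULL || log_n < 0 || log_n > 30) {
        return -1;
    }
    size_t n = (size_t)1 << log_n;
    /* Each pass combines elements that are `half` apart (butterfly step). */
    for (size_t half = 1; half < n; half <<= 1) {
        for (size_t start = 0; start < n; start += 2 * half) {
            for (size_t i = start; i < start + half; ++i) {
                float a = buf[i];
                float b = buf[i + half];
                buf[i] = a + b;        /* sum goes to the lower slot */
                buf[i + half] = a - b; /* difference goes to the upper slot */
            }
        }
    }
    return 0;
}
```

Such a routine can serve as a baseline when comparing the output of a vectorized build: run both on the same power-of-two-length buffer and compare entries within a small tolerance. Dividing every output by `sqrt(2^log_n)` would give the orthonormal form of the transform.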