@@ -10,14 +10,46 @@ What's included in this project?
 - a standalone C++ API (`libhaste`)
 - a TensorFlow Python API (`haste_tf`)
 - examples for writing your own custom C++ inference / training code using `libhaste`
+- benchmarking programs to evaluate the performance of RNN implementations
 
 For questions or feedback about Haste, please open an issue on GitHub or send us an email at [haste@lmnt.com](mailto:haste@lmnt.com).
 
+## Performance
+Our LSTM benchmark indicates that Haste has the fastest publicly available implementation for nearly all problem sizes.
+<table>
+<tr><td><img src="https://lmnt.com/assets/haste/benchmark/report_n=16_c=128.png"></td><td><img src="https://lmnt.com/assets/haste/benchmark/report_n=32_c=256.png"></td></tr>
+<tr></tr>
+<tr><td><img src="https://lmnt.com/assets/haste/benchmark/report_n=64_c=128.png"></td><td><img src="https://lmnt.com/assets/haste/benchmark/report_n=128_c=256.png"></td></tr>
+</table>
+
+Here is our complete benchmark result grid:
+<br>
+[`N=1 C=64`](https://lmnt.com/assets/haste/benchmark/report_n=1_c=64.png)
+[`N=1 C=128`](https://lmnt.com/assets/haste/benchmark/report_n=1_c=128.png)
+[`N=1 C=256`](https://lmnt.com/assets/haste/benchmark/report_n=1_c=256.png)
+[`N=1 C=512`](https://lmnt.com/assets/haste/benchmark/report_n=1_c=512.png)
+<br>
+[`N=32 C=64`](https://lmnt.com/assets/haste/benchmark/report_n=32_c=64.png)
+[`N=32 C=128`](https://lmnt.com/assets/haste/benchmark/report_n=32_c=128.png)
+[`N=32 C=256`](https://lmnt.com/assets/haste/benchmark/report_n=32_c=256.png)
+[`N=32 C=512`](https://lmnt.com/assets/haste/benchmark/report_n=32_c=512.png)
+<br>
+[`N=64 C=64`](https://lmnt.com/assets/haste/benchmark/report_n=64_c=64.png)
+[`N=64 C=128`](https://lmnt.com/assets/haste/benchmark/report_n=64_c=128.png)
+[`N=64 C=256`](https://lmnt.com/assets/haste/benchmark/report_n=64_c=256.png)
+[`N=64 C=512`](https://lmnt.com/assets/haste/benchmark/report_n=64_c=512.png)
+<br>
+[`N=128 C=64`](https://lmnt.com/assets/haste/benchmark/report_n=128_c=64.png)
+[`N=128 C=128`](https://lmnt.com/assets/haste/benchmark/report_n=128_c=128.png)
+[`N=128 C=256`](https://lmnt.com/assets/haste/benchmark/report_n=128_c=256.png)
+[`N=128 C=512`](https://lmnt.com/assets/haste/benchmark/report_n=128_c=512.png)
+
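The grid links added in this hunk all share one file-name pattern, `report_n=<N>_c=<C>.png`, under a common base URL. As a minimal sketch, the same sixteen links could be generated rather than listed by hand. The function name `benchmark_report_urls` and the constants are hypothetical, introduced only for illustration; the base URL and the `N`/`C` values are copied from the links above.

```python
from itertools import product

# Base URL copied from the grid links in the README diff.
BASE_URL = "https://lmnt.com/assets/haste/benchmark"

# N and C values that appear in the grid (rows are N, columns are C).
N_VALUES = (1, 32, 64, 128)
C_VALUES = (64, 128, 256, 512)


def benchmark_report_urls(n_values=N_VALUES, c_values=C_VALUES):
    """Return {(n, c): url} for every N/C combination, in row-major grid order.

    Hypothetical helper: it only reproduces the URL pattern seen in the
    README and does not check that the images exist.
    """
    return {
        (n, c): f"{BASE_URL}/report_n={n}_c={c}.png"
        for n, c in product(n_values, c_values)
    }


if __name__ == "__main__":
    # Print one Markdown link per grid cell.
    for (n, c), url in benchmark_report_urls().items():
        print(f"[`N={n} C={c}`]({url})")
```

Note that two of the images embedded in the table (`n=16_c=128` and the other table cells) are referenced separately and are not all part of this grid, so the sketch covers only the sixteen grid links.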
 ## Install
 Here's what you'll need to get started:
-- a [CUDA Compute Capability](https://developer.nvidia.com/cuda-gpus) 6.0+ GPU
-- [TensorFlow GPU](https://www.tensorflow.org/install/gpu) 1.14+ or 2.0+ for TensorFlow integration
-- [Eigen 3](http://eigen.tuxfamily.org/) to build the C++ examples
+- a [CUDA Compute Capability](https://developer.nvidia.com/cuda-gpus) 6.0+ GPU (required)
+- [TensorFlow GPU](https://www.tensorflow.org/install/gpu) 1.14+ or 2.0+ for TensorFlow integration (optional)
+- [Eigen 3](http://eigen.tuxfamily.org/) to build the C++ examples (optional)
+- [cuDNN Developer Library](https://developer.nvidia.com/rdp/cudnn-archive) to build benchmarking programs (optional)
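The TensorFlow prerequisite above ("1.14+ or 2.0+") amounts to a minimum version of 1.14. A minimal sketch of that check follows, assuming version strings of the form `major.minor[.patch]`; the helper names `parse_major_minor` and `meets_tensorflow_requirement` are hypothetical and are not part of Haste or TensorFlow.

```python
def parse_major_minor(version):
    """Extract (major, minor) as integers from a string such as '1.15.2'.

    Assumes the first two dot-separated fields are plain integers.
    """
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def meets_tensorflow_requirement(version, minimum=(1, 14)):
    """Return True if `version` is at least 1.14.

    Any 2.x version compares greater than (1, 14), so this single
    comparison covers both the '1.14+' and '2.0+' cases.
    """
    return parse_major_minor(version) >= minimum
```

With TensorFlow installed, the check could be run as `meets_tensorflow_requirement(tf.__version__)` after `import tensorflow as tf`. It does not verify the GPU build or the CUDA Compute Capability requirement.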
 
 Once you have the prerequisites, run the following to build the code and install the TensorFlow API:
 ```
@@ -41,6 +73,7 @@ The TensorFlow Python API is documented in [`docs/tf/haste_tf.md`](docs/tf/haste
 The C++ API is documented in [`lib/haste.h`](lib/haste.h) and there are code samples in [`examples/`](examples/).
 
 ## Code layout
+- [`benchmarks/`](benchmarks): programs to evaluate performance of RNN implementations
 - [`docs/tf/`](docs/tf): API reference documentation for `haste_tf`
 - [`examples/`](examples): examples for writing your own C++ inference / training code using `libhaste`
 - [`frameworks/tf/`](frameworks/tf): TensorFlow Python API and custom op code