Commit adfac20

Update README.md
1 parent 10aac3d commit adfac20

1 file changed: +5 -1 lines changed

README.md

Lines changed: 5 additions & 1 deletion
@@ -332,6 +332,8 @@ The final version of `matrixMultiply` will have good performance for both small
 
 ## Benchmark
 
+We created some benchmarks for Compute.scala and ND4J on NVIDIA and AMD GPUs in immutable style.
+
 * [Compute.scala vs ND4J on a NVIDIA Titan X GPU](http://jmh.morethan.io/?source=https://thoughtworksinc.github.io/Compute.scala/benchmarks/nvidia-gpu.json)
 * [Compute.scala on a AMD RX480 GPU](http://jmh.morethan.io/?source=https://thoughtworksinc.github.io/Compute.scala/benchmarks/amd-gpu.json)
 
@@ -340,7 +342,9 @@ Some information can be found in the benchmark result:
 * Apparently, Compute.scala supports both NVIDIA GPU and AMD GPU, while ND4J does not support AMD GPU.
 * Compute.scala is faster than ND4J on large arrays or complex expressions.
 * ND4J is faster than Compute.scala when performing one simple primary operation on very small arrays.
-* ND4J's `permute` and `broadcast` are extremely slow, causing very low score in the convolution benchmark (unlike this benchmark, Deeplearning4j's convolution operation internally uses some undocumented variant of `permute` and `broadcast` in ND4J, which are not extremely slow).
+* ND4J's `permute` and `broadcast` are extremely slow, causing a very low score in the convolution benchmark.
+
+Note that the above ND4J results do not reflect performance in Deeplearning4j, because Deeplearning4j uses ND4J in mutable style (i.e. `a *= b; a += c` instead of `a * b + c`), and ND4J has some undocumented optimizations for `permute` and `broadcast` when they are invoked with certain special parameters from Deeplearning4j.
 
 ## Future work
 
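The note added in this commit contrasts immutable style (`a * b + c`) with mutable style (`a *= b; a += c`). The difference can be sketched roughly as follows. This is only an illustration under stated assumptions: plain `Array[Float]` values stand in for GPU tensors, and `EvaluationStyles`, `immutableStyle`, and `mutableStyle` are hypothetical names, not part of Compute.scala or ND4J.

```scala
// Hypothetical sketch: plain arrays stand in for tensors.
// None of these names belong to Compute.scala or ND4J.
object EvaluationStyles {

  /** Immutable style, i.e. `a * b + c`: every operation returns a new array
    * and the inputs are left untouched.
    */
  def immutableStyle(a: Array[Float], b: Array[Float], c: Array[Float]): Array[Float] = {
    val product = a.zip(b).map { case (x, y) => x * y } // new buffer holding a * b
    product.zip(c).map { case (x, y) => x + y }         // another new buffer holding the sum
  }

  /** Mutable style, i.e. `a *= b; a += c`: the result overwrites `a` in place
    * and no additional buffer is allocated.
    */
  def mutableStyle(a: Array[Float], b: Array[Float], c: Array[Float]): Unit = {
    var i = 0
    while (i < a.length) { a(i) *= b(i); i += 1 } // a *= b
    i = 0
    while (i < a.length) { a(i) += c(i); i += 1 } // a += c
  }
}
```

Both functions compute the same values. The immutable version allocates intermediate buffers and leaves its inputs intact, while the mutable version reuses `a`. That difference is one reason a benchmark written in immutable style may not predict the performance of code written in mutable style.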
