This package also provides optimized functions to compute column-wise and pairwise distances.
* Euclidean distance
* Squared Euclidean distance
* Periodic Euclidean distance
* Cityblock distance
* Total variation distance
* Jaccard distance
* Root mean squared deviation
* Normalized root mean squared deviation
* Bray-Curtis dissimilarity
* Bregman divergence
For ``Euclidean distance``, ``Squared Euclidean distance``, ``Cityblock distance``, ``Minkowski distance``, and ``Hamming distance``, a weighted version is also provided.
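As a brief illustration (a minimal sketch; `WeightedEuclidean` and the `weuclidean` shorthand follow the package's documented API, though exact names and signatures may vary across versions):

```julia
using Distances

x, y = rand(4), rand(4)
w = rand(4)  # one non-negative weight per dimension

# Weighted Euclidean distance via the distance type...
d1 = evaluate(WeightedEuclidean(w), x, y)

# ...or via the convenience function; both compute sqrt(sum(w .* abs2.(x - y)))
d2 = weuclidean(x, y, w)

d1 ≈ d2  # true
```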
## Basic use
The library supports three ways of computation: *computing the distance between two vectors*, *column-wise computation*, and *pairwise computation*.
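Concretely (a minimal sketch following the package's documented API; exact signatures may differ across versions):

```julia
using Distances

x, y = rand(3), rand(3)        # two vectors
X, Y = rand(3, 5), rand(3, 5)  # data matrices, one sample per column

evaluate(Euclidean(), x, y)          # distance between two vectors
colwise(Euclidean(), X, Y)           # 5-vector: distance between matching columns
pairwise(Euclidean(), X, Y, dims=2)  # 5x5 matrix: distances between all column pairs
```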
Each distance corresponds to a distance type. The type name and the corresponding mathematical definition of each distance are listed in the following table.
**Note:** The formulas above use *Julia*'s functions. They are meant to convey the mathematical concepts concisely; the actual implementation may use a faster method. The arguments `x` and `y` are arrays of real numbers; `k` and `l` are arrays of distinct elements of any kind; `a` and `b` are arrays of `Bool`s; and finally, `p` and `q` are arrays forming a discrete probability distribution and are therefore both expected to sum to one.
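For instance, the definitional formula and the optimized routine agree (a small check, assuming the `sqeuclidean` convenience function from the package):

```julia
using Distances

x, y = rand(5), rand(5)

# The concise formula from the table vs. the optimized implementation
sum(abs2, x - y) ≈ sqeuclidean(x, y)  # true
```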
We can see that using ``colwise`` instead of a simple loop yields a considerable gain (2x to 4x), especially when the internal computation of each distance is simple. However, when the computation of a single distance is heavy enough (e.g. *KLDivergence*, *RenyiDivergence*), the gain is less significant.
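The comparison being made is between the following two equivalent computations (a hypothetical micro-benchmark setup, not the package's own benchmark script):

```julia
using Distances

X, Y = rand(10, 1_000), rand(10, 1_000)
dist = Euclidean()

# A simple loop over columns...
r_loop = [evaluate(dist, view(X, :, i), view(Y, :, i)) for i in axes(X, 2)]

# ...versus the optimized column-wise routine (same result, typically faster)
r_col = colwise(dist, X, Y)

r_loop ≈ r_col  # true
```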
The table below compares the performance (measured in terms of average elapsed time) of the different ways of computing pairwise distances.
For distances for which a major part of the computation is a quadratic form (e.g. *Euclidean*, *CosineDist*, *Mahalanobis*), the performance can be improved drastically by restructuring the computation and delegating the core part to ``GEMM`` in *BLAS*. This strategy can easily yield a 100x performance gain over simple loops (see the highlighted part of the table above).
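To make the restructuring concrete, here is a minimal sketch of the quadratic-form trick for pairwise squared Euclidean distances (the helper `pairwise_sqeuclidean` is hypothetical, written here only for illustration):

```julia
using LinearAlgebra

# Pairwise squared Euclidean distances between the columns of X and Y,
# using the identity ‖x - y‖² = ‖x‖² + ‖y‖² - 2x'y so that the dominant
# cost is a single matrix multiply handled by BLAS GEMM.
function pairwise_sqeuclidean(X::AbstractMatrix, Y::AbstractMatrix)
    sx = sum(abs2, X; dims=1)   # 1 × m squared norms of X's columns
    sy = sum(abs2, Y; dims=1)   # 1 × n squared norms of Y's columns
    G = X' * Y                  # m × n Gram matrix (the GEMM call)
    D = sx' .+ sy .- 2 .* G     # broadcast the norm terms over G
    return max.(D, 0)           # clamp tiny negatives caused by round-off
end
```

This is the same restructuring idea described above; in practice, the package's own ``pairwise`` routines are the recommended way to benefit from it.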