Commit 4d36d59

Merge pull request #82 from alyst/fix_chisq
Fix ChiSq distance around (0,0)
2 parents dbbacca + 1ec7304 commit 4d36d59

3 files changed: 9 additions and 7 deletions

README.md

Lines changed: 6 additions & 6 deletions
@@ -137,7 +137,7 @@ Each distance corresponds to a distance type. The type name and the correspondin
 | Chebyshev | `chebyshev(x, y)` | `max(abs(x - y))` |
 | Minkowski | `minkowski(x, y, p)` | `sum(abs(x - y).^p) ^ (1/p)` |
 | Hamming | `hamming(k, l)` | `sum(k .!= l)` |
-| Rogers-Tanimoto | `rogerstanimoto(a, b)` | `2(sum(a&!b) + sum(!a&b)) / (2(sum(a&!b) + sum(!a&b)) + sum(a&b) + sum(!a&!b))` |
+| RogersTanimoto | `rogerstanimoto(a, b)` | `2(sum(a&!b) + sum(!a&b)) / (2(sum(a&!b) + sum(!a&b)) + sum(a&b) + sum(!a&!b))` |
 | Jaccard | `jaccard(x, y)` | `1 - sum(min(x, y)) / sum(max(x, y))` |
 | CosineDist | `cosine_dist(x, y)` | `1 - dot(x, y) / (norm(x) * norm(y))` |
 | CorrDist | `corr_dist(x, y)` | `cosine_dist(x - mean(x), y - mean(y))` |
@@ -146,12 +146,12 @@ Each distance corresponds to a distance type. The type name and the correspondin
 | GenKLDivergence | `gkl_divergence(x, y)` | `sum(p .* log(p ./ q) - p + q)` |
 | RenyiDivergence | `renyi_divergence(p, q, k)`| `log(sum( p .* (p ./ q) .^ (k - 1))) / (k - 1)` |
 | JSDivergence | `js_divergence(p, q)` | `KL(p, m) / 2 + KL(p, m) / 2 with m = (p + q) / 2` |
-| SpanNormDist | `spannorm_dist(x, y)` | `max(x - y) - min(x - y )` |
+| SpanNormDist | `spannorm_dist(x, y)` | `max(x - y) - min(x - y)` |
 | BhattacharyyaDist | `bhattacharyya(x, y)` | `-log(sum(sqrt(x .* y) / sqrt(sum(x) * sum(y)))` |
 | HellingerDist | `hellinger(x, y) ` | `sqrt(1 - sum(sqrt(x .* y) / sqrt(sum(x) * sum(y))))` |
 | Haversine | `haversine(x, y, r)` | [Haversine formula](https://en.wikipedia.org/wiki/Haversine_formula) |
 | Mahalanobis | `mahalanobis(x, y, Q)` | `sqrt((x - y)' * Q * (x - y))` |
-| SqMahalanobis | `sqmahalanobis(x, y, Q)` | ` (x - y)' * Q * (x - y)` |
+| SqMahalanobis | `sqmahalanobis(x, y, Q)` | `(x - y)' * Q * (x - y)` |
 | MeanAbsDeviation | `meanad(x, y)` | `mean(abs.(x - y))` |
 | MeanSqDeviation | `msd(x, y)` | `mean(abs2.(x - y))` |
 | RMSDeviation | `rmsd(x, y)` | `sqrt(msd(x, y))` |
@@ -194,9 +194,9 @@ julia> pairwise(Euclidean(1e-12), x, x)

 ## Benchmarks

-The implementation has been carefully optimized based on benchmarks. The script in `benchmark/benchmarks.jl` defines a benchmark suite
-for a variety of distances, under column-wise and pairwise settings.
-
+The implementation has been carefully optimized based on benchmarks. The script in `benchmark/benchmarks.jl` defines a benchmark suite
+for a variety of distances, under column-wise and pairwise settings.
+
 Here are benchmarks obtained running Julia 0.6 on a computer with a quad-core Intel Core i5-2500K processor @ 3.3 GHz.
 The tables below can be replicated using the script in `benchmark/print_table.jl`.

src/metrics.jl

Lines changed: 1 addition & 1 deletion
@@ -239,7 +239,7 @@ corr_dist(a::AbstractArray, b::AbstractArray) = evaluate(CorrDist(), a, b)
 result_type(::CorrDist, a::AbstractArray, b::AbstractArray) = result_type(CosineDist(), a, b)

 # ChiSqDist
-@inline eval_op(::ChiSqDist, ai, bi) = abs2(ai - bi) / (ai + bi)
+@inline eval_op(::ChiSqDist, ai, bi) = (d = abs2(ai - bi) / (ai + bi); ifelse(ai != bi, d, zero(d)))
 @inline eval_reduce(::ChiSqDist, s1, s2) = s1 + s2
 chisq_dist(a::AbstractArray, b::AbstractArray) = evaluate(ChiSqDist(), a, b)
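
The effect of the one-line change to `eval_op` can be illustrated with a standalone sketch that does not depend on the package. The names `chisq_term_old`, `chisq_term_new`, and `chisq_sketch` are hypothetical and exist only for this illustration; the two term expressions are copied from the removed and added lines of the diff. When both elements are zero the quotient is `0.0 / 0.0`, which is `NaN` in floating-point arithmetic, and a single `NaN` term poisons the whole sum.

```julia
# Unguarded per-element term, as on the removed line of the diff.
chisq_term_old(ai, bi) = abs2(ai - bi) / (ai + bi)

# Guarded term, as on the added line of the diff: the quotient is still
# computed, but it is replaced by zero whenever ai == bi.
chisq_term_new(ai, bi) = (d = abs2(ai - bi) / (ai + bi); ifelse(ai != bi, d, zero(d)))

# Hypothetical helper (not part of the package): sum the per-element terms
# over two equal-length vectors.
chisq_sketch(term, a, b) = sum(term(ai, bi) for (ai, bi) in zip(a, b))

a = [0.0, 1.0, 2.0]
b = [0.0, 3.0, 2.0]

println(chisq_sketch(chisq_term_old, a, b))  # NaN, because the first term is 0.0 / 0.0
println(chisq_sketch(chisq_term_new, a, b))  # 1.0, the only nonzero term is (1 - 3)^2 / (1 + 3)
```

Note that `ifelse` is an ordinary function, so both of its value arguments are evaluated before the selection happens. The `NaN` is therefore still produced and then discarded, which keeps the expression free of branches; whether that was the motivation for choosing `ifelse` over `a ? b : c` here is an assumption, not something the commit states.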

test/test_dists.jl

Lines changed: 2 additions & 0 deletions
@@ -161,6 +161,8 @@ end
 @test wminkowski(x, y, w, 2) ≈ weuclidean(x, y, w)
 end

+# Test ChiSq doesn't give NaN at zero
+@test chisq_dist([0.0], [0.0]) == 0.0

 # Test weighted Hamming distances with even weights
 a = T.([1.0, 2.0, 1.0, 3.0, 2.0, 1.0])
