[ENH] Louvain Clustering: Add cosine similarity#4864

Merged
pavlin-policar merged 1 commit into biolab:master from janezd:louvain-cosine
Jun 20, 2020

Conversation

@janezd (Contributor) commented Jun 19, 2020

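As suggested by @pavlin-policar in a comment on #4855, nearest neighbors under cosine distance can be found with Euclidean distance on row-normalized data, because we care only about ranking: for unit-length rows, the squared Euclidean distance equals twice the cosine distance. This allows NearestNeighbors to use ball trees, which do not support the cosine metric directly.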
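
The ranking argument can be checked numerically. The following is a minimal sketch in plain Python, not code from this PR; the helper names (`unit`, `cosine_distance`, `euclidean`, `ranking`) and the toy rows are made up for illustration. It relies on the identity that, for unit-length vectors u and v, ||u - v||^2 = 2 (1 - cos(u, v)), so sorting by Euclidean distance on normalized rows gives the same order as sorting by cosine distance.

```python
import math

def unit(v):
    """Scale a vector to unit Euclidean length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def euclidean(u, v):
    """Plain Euclidean distance."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def ranking(query, rows, dist):
    """Indices of the other rows, sorted by increasing distance from rows[query]."""
    others = [i for i in range(len(rows)) if i != query]
    return sorted(others, key=lambda i: dist(rows[query], rows[i]))

# Toy data: rows 0 and 1 point in nearly the same direction but differ in length.
rows = [[1.0, 0.0], [10.0, 1.0], [0.0, 1.0], [1.0, 10.0]]
unit_rows = [unit(r) for r in rows]

by_cosine = ranking(0, rows, cosine_distance)        # cosine on raw rows
by_euclid_unit = ranking(0, unit_rows, euclidean)    # Euclidean on normalized rows
by_euclid_raw = ranking(0, rows, euclidean)          # Euclidean on raw rows
```

On this toy data, `by_cosine` and `by_euclid_unit` are both `[1, 3, 2]`, while `by_euclid_raw` is `[2, 1, 3]`: without normalization, the difference in row lengths dominates the direction.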

Includes
  • Code changes
  • Tests
  • Documentation

@janezd (Contributor, Author) commented Jun 19, 2020

@pavlin-policar, I didn't read the entire widget's code, but it seems that it would suffice to normalize the data and change the metric in matrix_to_knn_graph. As far as I can see, the absolute distances are not needed at all (the neighbors are queried with return_distance=False, and I see no other mention of metric).

Am I correct?
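
The change described above might look roughly like the sketch below. It is not the actual body of `matrix_to_knn_graph`; the function name `knn_indices` and its parameters are hypothetical, and it assumes scikit-learn's `NearestNeighbors` and `normalize` are what the widget would use. The only point is that for `metric="cosine"` the rows are normalized and the query is then run with the Euclidean metric, returning indices only.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.preprocessing import normalize

def knn_indices(data, k_neighbors, metric="euclidean"):
    """Hypothetical helper: indices of the k nearest neighbors of each row.

    For metric="cosine", rows are scaled to unit L2 norm and the search is
    done with the Euclidean metric, which yields the same neighbor ranking
    and can use a ball tree.
    """
    data = np.asarray(data, dtype=float)
    if metric == "cosine":
        data = normalize(data, norm="l2", axis=1)
        metric = "euclidean"
    nn = NearestNeighbors(
        n_neighbors=k_neighbors, metric=metric, algorithm="ball_tree"
    ).fit(data)
    # Calling kneighbors() without a query returns the neighbors of the
    # fitted points, excluding each point itself; distances are not needed.
    return nn.kneighbors(return_distance=False)

# Toy data: rows 0/1 and rows 2/3 share a direction but differ in length.
X = [[1.0, 0.0], [10.0, 1.0], [0.0, 1.0], [1.0, 10.0]]
cosine_neighbors = knn_indices(X, 1, metric="cosine")
euclidean_neighbors = knn_indices(X, 1, metric="euclidean")
```

With this toy input, the cosine variant pairs rows by direction (0 with 1, 2 with 3), whereas the plain Euclidean variant pairs row 0 with row 2 because they are close in absolute position.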

@codecov bot commented Jun 19, 2020

Codecov Report

Merging #4864 into master will decrease coverage by 0.00%.
The diff coverage is 91.07%.

@@            Coverage Diff             @@
##           master    #4864      +/-   ##
==========================================
- Coverage   84.17%   84.17%   -0.01%     
==========================================
  Files         277      277              
  Lines       56541    56503      -38     
==========================================
- Hits        47596    47564      -32     
+ Misses       8945     8939       -6     

@pavlin-policar (Collaborator)

> As I see, the absolute distances are not needed at all

Yes, that's right. We build a graph, and the edges are weighted with the Jaccard coefficient of shared nearest neighbors between points. Distances are not used anywhere.
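
To illustrate why only neighbor indices matter, here is a small sketch of Jaccard-weighted edges in plain Python. It is an illustration under assumptions, not the widget's code: the function name `jaccard_knn_edges` is made up, edges are treated as undirected, and each point's neighbor set is assumed not to include the point itself.

```python
def jaccard_knn_edges(knn):
    """Weight each k-NN edge by the Jaccard coefficient of the neighbor sets.

    knn[i] is the list of neighbor indices of point i. Returns a dict that
    maps an undirected edge (i, j), with i < j, to |N(i) & N(j)| / |N(i) | N(j)|.
    Edges whose endpoints share no neighbors are kept with weight 0.0.
    """
    neighbor_sets = [set(neighbors) for neighbors in knn]
    edges = {}
    for i, neighbors in enumerate(knn):
        for j in neighbors:
            if i == j:
                continue  # ignore self-loops
            a, b = neighbor_sets[i], neighbor_sets[j]
            union = len(a | b)
            weight = len(a & b) / union if union else 0.0
            edges[(min(i, j), max(i, j))] = weight
    return edges

# Toy neighbor lists: points 0, 1, 2 form a tight group; 3 and 4 hang off point 2.
knn = [[1, 2], [0, 2], [0, 1], [2, 4], [2, 3]]
edges = jaccard_knn_edges(knn)
```

The weights depend only on which points are neighbors, never on how far apart they are, which is why any distance that preserves the neighbor ranking gives the same graph.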

This codecov is doing strange things again...

@pavlin-policar pavlin-policar merged commit 56c12a5 into biolab:master Jun 20, 2020