Commit b0cb670

Clarify inputs to HDBSCAN must be numeric
1 parent 64b055e commit b0cb670

1 file changed: +6 -0 lines changed

docs/basic_hdbscan.rst

Lines changed: 6 additions & 0 deletions
@@ -286,6 +286,12 @@ relationships as long as there exists a path between two points that
 contains defined distances (i.e. if there are too many distances
 missing, the clustering is going to fail).
 
+NOTE: The input vector _must_ contain numerical data. If you have a
+distance matrix for non-numerical vectors, you will need to map your
+input vectors to numerical vectors. (e.g use map ['A', 'G', 'C', 'T']->
+[ 1, 2, 3, 4] to replace input vector ['A', 'A', 'A', 'C', 'G'] with
+[ 1, 1, 1, 3, 2])
+
 .. code:: python
 
     from sklearn.metrics.pairwise import pairwise_distances
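
The mapping described in the added note can be sketched in a few lines of Python. This is only an illustration and not part of the commit: the names `SYMBOL_TO_INT` and `encode` are hypothetical, and the integer codes are the ones from the note's own example.

```python
# Hypothetical symbol-to-integer map, copied from the example in the note:
# ['A', 'G', 'C', 'T'] -> [1, 2, 3, 4]
SYMBOL_TO_INT = {'A': 1, 'G': 2, 'C': 3, 'T': 4}


def encode(sequence):
    """Replace each symbol in `sequence` with its integer code.

    Raises KeyError if a symbol is not in SYMBOL_TO_INT, so unexpected
    input fails loudly instead of being silently dropped.
    """
    return [SYMBOL_TO_INT[symbol] for symbol in sequence]


encoded = encode(['A', 'A', 'A', 'C', 'G'])
print(encoded)  # [1, 1, 1, 3, 2]
```

The integer codes here are arbitrary labels, so a distance that only compares positions for equality is likely a safer fit than one that treats the codes as magnitudes.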
