docs: update DBSCAN docs

alyst · web-flow · commit 523a28066844 · 2023-03-24T09:21:20.000-07:00
update w.r.t. #248 changes
diff --git a/docs/source/dbscan.md b/docs/source/dbscan.md
@@ -19,27 +19,27 @@ points. Then, ``q`` is considered to be *density reachable* by ``p`` if
 there exists a sequence ``p_1, p_2, \ldots, p_n`` such that ``p_1 = p``
 and ``p_{i+1}`` is directly density reachable from ``p_i``.
 
-A cluster, which is a subset of the given set of points, satisfies two
-properties:
- 1. All points within the cluster are mutually *density-connected*,
+The points within DBSCAN clusters are categorized into *core* (or *seed*)
+and *boundary* points:
+ 1. All points of the cluster *core* are mutually *density-connected*,
     meaning that for any two distinct points ``p`` and ``q`` in a
-    cluster, there exists a point ``o`` such that both ``p`` and ``q``
-    are density reachable from ``o``.
- 2. If a point is density-connected to any point of a cluster, it is
-    also part of that cluster.
+    core, there exists a point ``o`` such that both ``p`` and ``q``
+    are *density reachable* from ``o``.
+ 2. If a point is *density-connected* to any point of a cluster core, it is
+    also part of the core.
+ 3. All points within the ``\epsilon``-neighborhood of any core point, but
+    not belonging to that core (i.e. having too few neighbors to be core
+    points themselves), are considered the cluster *boundary*.
 
 ## Interface
 
-There are two implementations of *DBSCAN* algorithm in this package
-(both provided by [`dbscan`](@ref) function):
- - Distance (adjacency) matrix-based. It requires ``O(N^2)`` memory to run.
-   Boundary points cannot be shared between the clusters.
- - Adjacency list-based. The input is the ``d \times n`` matrix of point
-   coordinates. The adjacency list is built on the fly. The performance is much
-   better both in terms of running time and memory usage. Returns a vector of
-   [`DbscanCluster`](@ref) objects that contain the indices of the *core* and
-   *boundary* points, making it possible to share the boundary points between
-   multiple clusters.
+The implementation of the *DBSCAN* algorithm provided by the [`dbscan`](@ref) function
+supports two ways of specifying the clustering data:
+ - The ``d \times n`` matrix of point coordinates. This is the preferred method,
+   as it uses memory- and time-efficient neighbor queries via the
+   [NearestNeighbors.jl](https://github.com/KristofferC/NearestNeighbors.jl) package.
+ - The ``n \times n`` matrix of precalculated pairwise point distances.
+   This method requires ``O(n^2)`` memory and time to run.
 
 ```@docs
 dbscan