Commit 523a280

docs: update DBSCAN docs
update w.r.t. #248 changes
1 parent 7e7ab35 commit 523a280

1 file changed: +17 -17 lines changed

docs/source/dbscan.md

Lines changed: 17 additions & 17 deletions
@@ -19,27 +19,27 @@ points. Then, ``q`` is considered to be *density reachable* by ``p`` if
 there exists a sequence ``p_1, p_2, \ldots, p_n`` such that ``p_1 = p``
 and ``p_{i+1}`` is directly density reachable from ``p_i``.
 
-A cluster, which is a subset of the given set of points, satisfies two
-properties:
-1. All points within the cluster are mutually *density-connected*,
+The points within DBSCAN clusters are categorized into *core* (or *seeds*)
+and *boundary*:
+1. All points of the cluster *core* are mutually *density-connected*,
 meaning that for any two distinct points ``p`` and ``q`` in a
-cluster, there exists a point ``o`` such that both ``p`` and ``q``
-are density reachable from ``o``.
-2. If a point is density-connected to any point of a cluster, it is
-also part of that cluster.
+core, there exists a point ``o`` such that both ``p`` and ``q``
+are *density reachable* from ``o``.
+2. If a point is *density-connected* to any point of a cluster core, it is
+also part of the core.
+3. All points within the ``\epsilon``-neighborhood of any core point, but
+not belonging to that core (i.e. not *density reachable* from the core),
+are considered cluster *boundary*.
 
 ## Interface
 
-There are two implementations of *DBSCAN* algorithm in this package
-(both provided by [`dbscan`](@ref) function):
-- Distance (adjacency) matrix-based. It requires ``O(N^2)`` memory to run.
-Boundary points cannot be shared between the clusters.
-- Adjacency list-based. The input is the ``d \times n`` matrix of point
-coordinates. The adjacency list is built on the fly. The performance is much
-better both in terms of running time and memory usage. Returns a vector of
-[`DbscanCluster`](@ref) objects that contain the indices of the *core* and
-*boundary* points, making it possible to share the boundary points between
-multiple clusters.
+The implementation of *DBSCAN* algorithm provided by [`dbscan`](@ref) function
+supports the two ways of specifying clustering data:
+- The ``d \times n`` matrix of point coordinates. This is the preferred method
+as it uses memory- and time-efficient neighboring points queries via
+[NearestNeighbors.jl](https://github.com/KristofferC/NearestNeighbors.jl) package.
+- The ``n\times n`` matrix of precalculated pairwise point distances.
+It requires ``O(n^2)`` memory and time to run.
 
 ```@docs
 dbscan
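
The core/boundary distinction introduced by this change can be illustrated with a small, language-neutral sketch. The Python below is not the package's `dbscan` implementation or API: the function names, the `min_neighbors` parameter, the convention that a point counts as its own neighbor, and the extra `"noise"` label are all assumptions made for illustration. It takes a precomputed ``n \times n`` distance matrix, which is why it needs ``O(n^2)`` memory and time, as the second input option in the diff notes.

```python
from math import dist


def pairwise_distances(points):
    """Build the full n x n Euclidean distance matrix (O(n^2) memory)."""
    return [[dist(p, q) for q in points] for p in points]


def classify_points(distances, eps, min_neighbors):
    """Label each point as 'core', 'boundary', or 'noise'.

    Assumption: a point is 'core' when its eps-neighborhood, counting the
    point itself, holds at least `min_neighbors` points. A non-core point
    lying within eps of some core point is 'boundary'; the rest is 'noise'.
    """
    n = len(distances)
    # eps-neighborhood of every point, found by scanning its matrix row
    neighbors = [
        [j for j in range(n) if distances[i][j] <= eps] for i in range(n)
    ]
    core = {i for i in range(n) if len(neighbors[i]) >= min_neighbors}

    labels = []
    for i in range(n):
        if i in core:
            labels.append("core")
        elif any(j in core for j in neighbors[i]):
            labels.append("boundary")
        else:
            labels.append("noise")
    return labels


# Four corners of a unit square, one point hanging off a corner, one outlier.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 1.0), (5.0, 5.0)]
labels = classify_points(pairwise_distances(points), eps=1.0, min_neighbors=3)
print(labels)
```

With these inputs the four square corners come out as core points, `(2.0, 1.0)` is a boundary point because it lies within `eps` of the core point `(1.0, 1.0)` but has too few neighbors itself, and `(5.0, 5.0)` is left unassigned. A coordinate-based variant would replace the row scan with a spatial index range query, which is the role the diff attributes to NearestNeighbors.jl.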
