Commit 989fb13

Merge pull request #271 from jaksle/docs
Small corrections in docs validate.md
2 parents acfdaa8 + 567b0b8 commit 989fb13

1 file changed: +39 -35 lines changed

docs/source/validate.md

Lines changed: 39 additions & 35 deletions
@@ -85,7 +85,7 @@ mutualinfo

 ## Clustering quality indices

-[`clustering_quality()`][@ref clustering_quality] methods allow computing *intrinsic* clustering quality indices,
+[`clustering_quality()`](@ref clustering_quality) methods allow computing *intrinsic* clustering quality indices,
 i.e. the metrics that depend only on the clustering itself and do not use the external knowledge.
 These metrics can be used to compare different clustering algorithms or choose the optimal number of clusters.

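The prose in this hunk describes `clustering_quality()` as computing intrinsic indices from the data and the clustering alone. A minimal sketch of such a call is given here, assuming Clustering.jl is installed and using the same call form that appears later in this diff (`clustering_quality(X, clustering, quality_index = ...)`). The toy data, the seed, and the variable name `sil` are illustrative assumptions, not part of the commit.

```julia
using Clustering, Random

Random.seed!(42)  # fixed seed, only so the sketch is repeatable

# Toy data (an assumption for illustration): two well-separated 2-D blobs,
# stored one point per column.
X = hcat([0.0, 0.0] .+ 0.3 .* randn(2, 15),
         [5.0, 5.0] .+ 0.3 .* randn(2, 15))

# Hard clustering with K-means into 2 clusters.
clustering = kmeans(X, 2)

# Intrinsic index: uses only the data and the clustering, no ground-truth labels.
sil = clustering_quality(X, clustering, quality_index = :silhouettes)
println("mean silhouette = ", sil)
```

The mean silhouette lies between -1 and 1, and per the documentation text in this diff, higher values indicate better quality.
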
@@ -180,7 +180,7 @@ Higher values indicate better separation of clusters w.r.t. point distances.
 silhouettes
 ```

-[`clustering_quality(..., quality_index=:silhouettes)`][@ref clustering_quality]
+[`clustering_quality(..., quality_index=:silhouettes)`](@ref clustering_quality)
 provides mean silhouette metric for the datapoints. Higher values indicate better quality.

 ## References
@@ -191,51 +191,52 @@ provides mean silhouette metric for the datapoints. Higher values indicate bette
 ### Examples

 Exemplary data with 3 real clusters.
-```@example
-using Plots, Clustering
-X = hcat([4., 5.] .+ 0.4 * randn(2, 10),
-[9., -5.] .+ 0.4 * randn(2, 5),
-[-4., -9.] .+ 1 * randn(2, 5))
-
+```@example clu_quality
+using Plots, Plots.PlotMeasures, Clustering
+X_clusters = [(center = [4., 5.], std = 0.4, n = 10),
+(center = [9., -5.], std = 0.4, n = 5),
+(center = [-4., -9.], std = 1, n = 5)]
+X = mapreduce(hcat, X_clusters) do (center, std, n)
+center .+ std .* randn(length(center), n)
+end
+X_assignments = mapreduce(vcat, enumerate(X_clusters)) do (i, (_, _, n))
+fill(i, n)
+end

 scatter(view(X, 1, :), view(X, 2, :),
-label = "data points",
-xlabel = "x",
-ylabel = "y",
-legend = :right,
-)
+markercolor = X_assignments,
+plot_title = "Data", label = nothing,
+xlabel = "x", ylabel = "y",
+legend = :outerright,
+size = (600, 500)
+);
+savefig("clu_quality_data.svg"); nothing # hide
 ```
+![](clu_quality_data.svg)

-Hard clustering quality for K-means method with 2 to 5 clusters:
+Hard clustering quality for [K-means](@ref) method with 2 to 5 clusters:

-```@example
-using Plots, Clustering
-X = hcat([4., 5.] .+ 0.4 * randn(2, 10),
-[9., -5.] .+ 0.4 * randn(2, 5),
-[-4., -9.] .+ 1 * randn(2, 5))
-
-nclusters = 2:5
-clusterings = kmeans.(Ref(X), nclusters)
+```@example clu_quality
+hard_nclusters = 2:5
+clusterings = kmeans.(Ref(X), hard_nclusters)

 plot((
-plot(nclusters,
+plot(hard_nclusters,
 clustering_quality.(Ref(X), clusterings, quality_index = qidx),
 marker = :circle,
 title = ":$qidx", label = nothing,
 ) for qidx in [:silhouettes, :calinski_harabasz, :xie_beni, :davies_bouldin, :dunn])...,
-layout = (3, 2),
-xaxis = "N clusters",
-plot_title = "\"Hard\" clustering quality indices"
+layout = (2, 3),
+xaxis = "N clusters", yaxis = "Quality",
+plot_title = "\"Hard\" clustering quality indices",
+size = (1000, 600), left_margin = 10pt
 )
+savefig("clu_quality_hard.svg"); nothing # hide
 ```
+![](clu_quality_hard.svg)

 Fuzzy clustering quality for fuzzy C-means method with 2 to 5 clusters:
-```@example
-using Plots, Clustering
-X = hcat([4., 5.] .+ 0.4 * randn(2, 10),
-[9., -5.] .+ 0.4 * randn(2, 5),
-[-4., -9.] .+ 1 * randn(2, 5))
-
+```@example clu_quality
 fuzziness = 2
 fuzzy_nclusters = 2:5
 fuzzy_clusterings = fuzzy_cmeans.(Ref(X), fuzzy_nclusters, fuzziness)
@@ -247,11 +248,14 @@ plot((
 marker = :circle,
 title = ":$qidx", label = nothing,
 ) for qidx in [:calinski_harabasz, :xie_beni])...,
-layout = (2, 1),
-xaxis = "N clusters",
-plot_title = "\"Soft\" clustering quality indices"
+layout = (1, 2),
+xaxis = "N clusters", yaxis = "Quality",
+plot_title = "\"Soft\" clustering quality indices",
+size = (700, 350), left_margin = 10pt
 )
+savefig("clu_quality_soft.svg"); nothing # hide
 ```
+![](clu_quality_soft.svg)


 ## Other packages

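The diff above replaces three copies of the `hcat(...)` data setup with a single `X_clusters` specification, built once in the named `@example clu_quality` block and reused by the later blocks of the same name. The data-generation step can be tried on its own with only the standard library. The sketch below copies the cluster specification from the diff. The seed and the `println` calls are additions for illustration only.

```julia
using Random

Random.seed!(1)  # fixed seed, only so the sketch is repeatable

# Cluster specifications copied from the new example in the diff.
X_clusters = [(center = [4., 5.],   std = 0.4, n = 10),
              (center = [9., -5.],  std = 0.4, n = 5),
              (center = [-4., -9.], std = 1,   n = 5)]

# Each NamedTuple is destructured positionally into (center, std, n);
# the per-cluster 2×n blocks are concatenated column-wise by hcat.
X = mapreduce(hcat, X_clusters) do (center, std, n)
    center .+ std .* randn(length(center), n)
end

# Ground-truth label i repeated n times for the i-th cluster.
X_assignments = mapreduce(vcat, enumerate(X_clusters)) do (i, (_, _, n))
    fill(i, n)
end

println(size(X))                 # 2 rows, 10 + 5 + 5 = 20 columns
println(length(X_assignments))   # one label per column of X
```

Keeping the centers, spreads, and counts in one list means the data matrix and the true assignments are derived from the same source, which is what lets the scatter plot color points by `X_assignments`.
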