@@ -85,7 +85,7 @@ mutualinfo
## Clustering quality indices

- [`clustering_quality()`][@ref clustering_quality] methods allow computing *intrinsic* clustering quality indices,
+ [`clustering_quality()`](@ref clustering_quality) methods allow computing *intrinsic* clustering quality indices,
i.e. the metrics that depend only on the clustering itself and do not use the external knowledge.
These metrics can be used to compare different clustering algorithms or choose the optimal number of clusters.
@@ -180,7 +180,7 @@ Higher values indicate better separation of clusters w.r.t. point distances.
silhouettes
```

- [`clustering_quality(..., quality_index=:silhouettes)`][@ref clustering_quality]
+ [`clustering_quality(..., quality_index=:silhouettes)`](@ref clustering_quality)
provides mean silhouette metric for the datapoints. Higher values indicate better quality.

## References
@@ -191,51 +191,52 @@ provides mean silhouette metric for the datapoints. Higher values indicate bette
### Examples

Exemplary data with 3 real clusters.
- ```@example
- using Plots, Clustering
- X = hcat([4., 5.] .+ 0.4 * randn(2, 10),
- [9., -5.] .+ 0.4 * randn(2, 5),
- [-4., -9.] .+ 1 * randn(2, 5))
-
+ ```@example clu_quality
+ using Plots, Plots.PlotMeasures, Clustering
+ X_clusters = [(center = [4., 5.], std = 0.4, n = 10),
+ (center = [9., -5.], std = 0.4, n = 5),
+ (center = [-4., -9.], std = 1, n = 5)]
+ X = mapreduce(hcat, X_clusters) do (center, std, n)
+ center .+ std .* randn(length(center), n)
+ end
+ X_assignments = mapreduce(vcat, enumerate(X_clusters)) do (i, (_, _, n))
+ fill(i, n)
+ end

scatter(view(X, 1, :), view(X, 2, :),
- label = "data points",
- xlabel = "x",
- ylabel = "y",
- legend = :right,
- )
+ markercolor = X_assignments,
+ plot_title = "Data", label = nothing,
+ xlabel = "x", ylabel = "y",
+ legend = :outerright,
+ size = (600, 500)
+ );
+ savefig("clu_quality_data.svg"); nothing # hide
```
+ ![](clu_quality_data.svg)

- Hard clustering quality for K-means method with 2 to 5 clusters:
+ Hard clustering quality for [K-means](@ref) method with 2 to 5 clusters:

- ```@example
- using Plots, Clustering
- X = hcat([4., 5.] .+ 0.4 * randn(2, 10),
- [9., -5.] .+ 0.4 * randn(2, 5),
- [-4., -9.] .+ 1 * randn(2, 5))
-
- nclusters = 2:5
- clusterings = kmeans.(Ref(X), nclusters)
+ ```@example clu_quality
+ hard_nclusters = 2:5
+ clusterings = kmeans.(Ref(X), hard_nclusters)

plot((
- plot(nclusters,
+ plot(hard_nclusters,
clustering_quality.(Ref(X), clusterings, quality_index = qidx),
marker = :circle,
title = ":$qidx", label = nothing,
) for qidx in [:silhouettes, :calinski_harabasz, :xie_beni, :davies_bouldin, :dunn])...,
- layout = (3, 2),
- xaxis = "N clusters",
- plot_title = "\"Hard\" clustering quality indices"
+ layout = (2, 3),
+ xaxis = "N clusters", yaxis = "Quality",
+ plot_title = "\"Hard\" clustering quality indices",
+ size = (1000, 600), left_margin = 10pt
)
+ savefig("clu_quality_hard.svg"); nothing # hide
```
+ ![](clu_quality_hard.svg)

Fuzzy clustering quality for fuzzy C-means method with 2 to 5 clusters:
- ```@example
- using Plots, Clustering
- X = hcat([4., 5.] .+ 0.4 * randn(2, 10),
- [9., -5.] .+ 0.4 * randn(2, 5),
- [-4., -9.] .+ 1 * randn(2, 5))
-
+ ```@example clu_quality
fuzziness = 2
fuzzy_nclusters = 2:5
fuzzy_clusterings = fuzzy_cmeans.(Ref(X), fuzzy_nclusters, fuzziness)
@@ -247,11 +248,14 @@ plot((
marker = :circle,
title = ":$qidx", label = nothing,
) for qidx in [:calinski_harabasz, :xie_beni])...,
- layout = (2, 1),
- xaxis = "N clusters",
- plot_title = "\"Soft\" clustering quality indices"
+ layout = (1, 2),
+ xaxis = "N clusters", yaxis = "Quality",
+ plot_title = "\"Soft\" clustering quality indices",
+ size = (700, 350), left_margin = 10pt
)
+ savefig("clu_quality_soft.svg"); nothing # hide
```
+ ![](clu_quality_soft.svg)


## Other packages