
Commit 76da511

derrickburns and claude committed
feat: Add MultiViewKMeans estimator for multi-feature clustering
Implements MultiViewKMeans estimator with:
- Per-view divergences (different distance measures for each view)
- Per-view weights (importance weighting)
- Combine strategies: "weighted" (default), "max", "min"
- ViewSpec case class for view configuration
- Full persistence support (save/load)
- 21 comprehensive tests

Use cases:
- Document clustering (content + metadata + citations)
- Image clustering (pixels + captions + metadata)
- Multi-modal data (text + audio + video features)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
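To make the combine strategies in the message above concrete, here is a minimal, self-contained Scala sketch of how per-view distances could be merged into a single assignment score. It is not taken from the commit. The `ViewSpec` fields, the divergence names `"squaredEuclidean"` and `"kl"`, the helpers `combine` and `assign`, and the choice to apply weights before taking the max or min are all assumptions made for illustration, and the real estimator may differ.

```scala
// Hypothetical sketch: names, fields, and combine semantics are assumptions,
// not the API introduced by this commit.
object MultiViewCombineSketch {

  // Illustrative stand-in for the ViewSpec case class named in the commit message.
  final case class ViewSpec(name: String, divergence: String, weight: Double = 1.0)

  // Squared Euclidean distance between two dense vectors of equal length.
  def squaredEuclidean(x: Array[Double], c: Array[Double]): Double =
    x.zip(c).map { case (a, b) => (a - b) * (a - b) }.sum

  // Generalized KL divergence; assumes all entries are strictly positive.
  def generalizedKL(x: Array[Double], c: Array[Double]): Double =
    x.zip(c).map { case (a, b) => a * math.log(a / b) - a + b }.sum

  // Look up a divergence by an (assumed) name.
  def divergence(name: String): (Array[Double], Array[Double]) => Double =
    name match {
      case "squaredEuclidean" => (x, c) => squaredEuclidean(x, c)
      case "kl"               => (x, c) => generalizedKL(x, c)
      case other              => throw new IllegalArgumentException(s"Unknown divergence: $other")
    }

  // Merge per-view distances into one score.
  // Assumption: weights are applied first, then the strategy reduces the weighted values.
  def combine(strategy: String, specs: Seq[ViewSpec], perView: Seq[Double]): Double = {
    require(specs.length == perView.length, "one distance per view is required")
    val weighted = specs.zip(perView).map { case (s, d) => s.weight * d }
    strategy match {
      case "weighted" => weighted.sum
      case "max"      => weighted.max
      case "min"      => weighted.min
      case other      => throw new IllegalArgumentException(s"Unknown strategy: $other")
    }
  }

  // Index of the center with the smallest combined distance.
  // point(v) and centers(k)(v) hold the feature vector for view v.
  def assign(
      strategy: String,
      specs: Seq[ViewSpec],
      point: Seq[Array[Double]],
      centers: Seq[Seq[Array[Double]]]): Int =
    centers.indices.minBy { k =>
      val perView = specs.indices.map { v =>
        divergence(specs(v).divergence)(point(v), centers(k)(v))
      }
      combine(strategy, specs, perView)
    }
}
```

Under these assumptions, "weighted" rewards a center that is reasonably close in every view, "max" penalizes the worst view, and "min" accepts a center that matches well in any single view, so the same point can land in different clusters depending on the strategy.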
1 parent 8877c87 commit 76da511

4 files changed: +1489 -3 lines changed
CHANGELOG.md

Lines changed: 8 additions & 1 deletion
@@ -28,13 +28,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Single entry point for all 8 Bregman divergences
 - Auto-selection based on data sparsity with `forSparsity()` method
 - Canonical divergence name constants in `KernelFactory.Divergence`
-- **Test suites for new components** (129 new tests, 737 total):
+- **MultiViewKMeans estimator** for clustering data with multiple feature representations (21 tests)
+- Per-view divergences (different distance measures for each view)
+- Per-view weights (importance weighting)
+- Combine strategies: "weighted" (default), "max", "min"
+- `ViewSpec` case class for view configuration
+- Full persistence support (save/load)
+- **Test suites for new components** (150 new tests, 758 total):
 - OutlierDetectionSuite: 16 tests for distance-based and trimmed outlier detection
 - SparseBregmanKernelSuite: 28 tests for sparse-optimized SE, KL, L1, Spherical kernels
 - ConstraintsSuite: 30 tests for must-link/cannot-link constraints and penalty computation
 - ConstrainedKMeansSuite: 17 tests for semi-supervised clustering with soft/hard constraints
 - RobustKMeansSuite: 17 tests for robust clustering with outlier handling and persistence
 - SparseKMeansSuite: 21 tests for sparse clustering with auto-detection and persistence
+- MultiViewKMeansSuite: 21 tests for multi-view clustering with persistence
 
 ### Architecture
 - Moved AcceleratedSEAssignment to `strategies/impl/` subpackage for better organization

ROADMAP.md

Lines changed: 3 additions & 2 deletions
@@ -25,7 +25,7 @@ Goal: land the highest-demand capabilities and supporting docs.
 
 - ~~**Robust Bregman clustering + outlier handling** (3.11 / 5.8)~~**DONE**: `RobustKMeans` with trim/noise_cluster/m_estimator modes, outlier scoring, persistence.
 - ~~**Sparse Bregman clustering** (3.12)~~**DONE**: `SparseKMeans` estimator with auto-sparsity detection, `KernelFactory` for unified kernel creation.
-- **Multi-view clustering** (3.13 / 5.9) — implement `MultiViewKMeans` with shared `MultiViewAssignment`, per-view weights/divergences.
+- ~~**Multi-view clustering** (3.13 / 5.9)~~**DONE**: `MultiViewKMeans` estimator with per-view weights/divergences, combine strategies (weighted/max/min), `ViewSpec` configuration.
 - **Docs & notebooks** (6.1) — quick-start notebook, divergence selection guide, X-Means auto-k demo, soft-clustering interpretation examples.
 
 ---
@@ -57,7 +57,7 @@ These frameworks unblock multiple roadmap items; prefer delivering them before d
 | Component | Priority | Enables | Notes |
 |-----------|----------|---------|-------|
 | ~~Outlier Detection (5.8)~~ | ~~P1~~ | ~~Robust Bregman clustering (3.11)~~ | **DONE**: Trim/noise-cluster strategies, scoring column |
-| Multi-View (5.9) | P1 | Multi-view clustering (3.13) | View specs, weights, divergences |
+| ~~Multi-View (5.9)~~ | ~~P1~~ | ~~Multi-view clustering (3.13)~~ | **DONE**: ViewSpec, per-view weights/divergences, combine strategies |
 | Sequence Kernels (5.10) | P2 | Time-series clustering (3.15) | DTW/shape kernels, barycenters |
 | Consensus (5.11) | P2 | Ensemble clustering (3.16) | Base generator + co-association |
 | Federated (5.12) | P2 | Federated Bregman clustering (3.17) | Secure aggregation, optional DP |
@@ -87,6 +87,7 @@ These frameworks unblock multiple roadmap items; prefer delivering them before d
 | 2025-12-15 | Use phased delivery for accelerations and new iterators | Keep CI stable while iterating |
 | 2025-12-16 | Created `KernelFactory` for unified kernel creation | Single API for dense/sparse kernels, reduces duplication |
 | 2025-12-16 | Moved assignment strategies to `impl/` subpackage | Better organization, backward-compatible via type aliases |
+| 2025-12-16 | Implemented `MultiViewKMeans` with ViewSpec configuration | Per-view divergences/weights, weighted/max/min combine strategies |
 
 ---
 