Commit 4bb21d5

derrickburns and claude committed
chore: Remove obsolete documentation files and update references

- Remove docs/guides/ (replaced by Diátaxis _tutorials/_howto/_reference/_explanation)
- Remove docs/README.md (referenced removed Scaladoc API)
- Remove release-notes/ (consolidated into CHANGELOG.md)
- Remove DATAFRAME_API_EXAMPLES.md (covered by tutorials)
- Remove RELEASE_NOTES_0.6.0.md (consolidated into CHANGELOG.md)
- Remove HACKERNEWS_ANNOUNCEMENT.md (promotional material)
- Update AGENTS.md to reference new Diátaxis structure
- Update CHANGELOG.md with documentation restructure and removed files
- Update ROADMAP.md decision log with documentation changes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

1 parent f0f84bc commit 4bb21d5

12 files changed: +22 -3745 lines changed

AGENTS.md

Lines changed: 2 additions & 2 deletions
@@ -3,9 +3,9 @@
 Use this guide to make concise, high-signal contributions to the generalized k-means clustering library.
 
 ## Project Structure & Module Organization
-- Scala sources live in `src/main/scala` (DataFrame/ML API under `com.massivedatascience.clusterer.ml`), with version-specific shims in `src/main/scala-2.12` and `src/main/scala-2.13`. Legacy RDD code remains in `com.massivedatascience.clusterer`.
+- Scala sources live in `src/main/scala` (DataFrame/ML API under `com.massivedatascience.clusterer.ml`), with version-specific shims in `src/main/scala-2.12` and `src/main/scala-2.13`.
 - Tests use ScalaTest under `src/test/scala` with Spark-local fixtures; shared data is in `src/test/resources`. Executable examples sit in `src/main/scala/examples`.
-- Python wrapper lives in `python/` (`massivedatascience` package, examples, and tests). Docs and release notes are in `docs/`, `release-notes/`, `ARCHITECTURE.md`, and `DATAFRAME_API_EXAMPLES.md`.
+- Python wrapper lives in `python/` (`massivedatascience` package, examples, and tests). Documentation follows Diátaxis structure in `docs/` (`_tutorials/`, `_howto/`, `_reference/`, `_explanation/`). Architecture and changelog are in `ARCHITECTURE.md` and `CHANGELOG.md`.
 
 ## Build, Test, and Development Commands
 - `sbt compile` — compile against the default Scala/Spark matrix; use `sbt ++2.13.14` or `sbt ++2.12.18` to pin versions.

CHANGELOG.md

Lines changed: 15 additions & 6 deletions
@@ -9,7 +9,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Added
 - Comprehensive CI validation DAG with cross-version testing
-- Production quality blockers documented in ACTION_ITEMS.md
 - SECURITY.md with vulnerability reporting guidelines
 - CONTRIBUTING.md with development guidelines
 - Test suite fixes for Scala 2.12/2.13 compatibility
@@ -63,11 +62,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Inertia (WCSS) computation
 - Cluster sizes and balance metrics
 - Elbow curve helper for finding optimal k
-- **Documentation guides** in `docs/guides/`:
-- Quick Start Guide - get running in 5 minutes
-- Divergence Selection Guide - comprehensive decision flowchart and examples
-- X-Means Auto-K Demo - automatic cluster count selection with BIC/AIC
-- Soft Clustering Guide - interpreting probabilistic memberships
+- **Documentation restructured** using Diátaxis framework:
+- `docs/_tutorials/` - Learning-oriented guides (first clustering, PySpark, algorithm selection)
+- `docs/_howto/` - Task-oriented guides (installation, choosing divergence, finding optimal k, outliers)
+- `docs/_reference/` - Information-oriented (parameters, algorithms, divergences)
+- `docs/_explanation/` - Understanding-oriented (Bregman divergences, Lloyd's algorithm, performance)
+- Custom domain: https://generalized-kmeans-clustering.massivedatascience.com
+- Jekyll-based site with automatic deployment
+- **SPECIFICATION.md** - Compressed specification enabling AI reconstruction of the codebase
 - **Test suites for new components** (215 new tests, 854 total):
 - OutlierDetectionSuite: 16 tests for distance-based and trimmed outlier detection
 - SparseBregmanKernelSuite: 28 tests for sparse-optimized SE, KL, L1, Spherical kernels
@@ -106,6 +108,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Removed
 - Legacy RDD API and associated coreset/transform modules (DataFrame/ML API is now the sole surface)
+- Obsolete documentation files consolidated into Diátaxis structure:
+- `ACTION_ITEMS.md` (superseded by ROADMAP.md)
+- `DATAFRAME_API_EXAMPLES.md` (covered by tutorials)
+- `RELEASE_NOTES_0.6.0.md` (consolidated into CHANGELOG.md)
+- `HACKERNEWS_ANNOUNCEMENT.md` (promotional material)
+- `docs/guides/` (replaced by Diátaxis categories)
+- `release-notes/` directory (consolidated into CHANGELOG.md)
 
 ## [0.6.0] - 2025-10-18
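
Because this commit both deletes files and updates references to them, a quick scan for leftover mentions can help a reviewer confirm nothing was missed. The following is a minimal sketch, not part of the commit: the function name `find_stale_references` and the `ignore` default are hypothetical, and the path list is copied from the commit message above. It assumes stale references appear as literal path strings in Markdown files.

```python
from pathlib import Path

# Paths removed by this commit, as listed in the commit message.
REMOVED_PATHS = [
    "docs/guides/",
    "docs/README.md",
    "release-notes/",
    "DATAFRAME_API_EXAMPLES.md",
    "RELEASE_NOTES_0.6.0.md",
    "HACKERNEWS_ANNOUNCEMENT.md",
]


def find_stale_references(root, removed=REMOVED_PATHS, ignore=("CHANGELOG.md",)):
    """Return (file, line_number, removed_path) for each mention in *.md files under root.

    Files whose name is in `ignore` are skipped (hypothetical default: the
    changelog, which names the removed files on purpose).
    """
    root = Path(root)
    hits = []
    for md_file in sorted(root.rglob("*.md")):
        if md_file.name in ignore:
            continue
        text = md_file.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            for path in removed:
                if path in line:  # plain substring match, no Markdown parsing
                    hits.append((str(md_file.relative_to(root)), number, path))
    return hits
```

Calling `find_stale_references(".")` from the repository root would list any remaining mentions. Files that record the removals deliberately, such as the ROADMAP.md decision log, may need to be added to `ignore`.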
