2026-03-04 Progressive Disclosure API + project identity documentation

eddmpython · eddmpython · commit 6fdd32fcc7f4 · 2026-03-04T14:22:30.000+09:00
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -5,6 +5,36 @@ All notable changes to Vectrix will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [0.0.11] - 2026-03-04
+
+Progressive Disclosure release — Easy API now supports Level 2 guided control with model selection, ensemble strategy, and confidence interval parameters, while maintaining full backward compatibility with Level 1 zero-config usage.
+
+### Added
+
+**Easy API Progressive Disclosure (Level 2 Parameters)**
+- `forecast()`: `models=` (select specific model IDs), `ensemble=` ('mean', 'weighted', 'median', 'best'), `confidence=` (CI level from 0.80 to 0.99)
+- `analyze()`: `features=` (toggle feature extraction), `changepoints=` (toggle detection), `anomalies=` (toggle detection), `anomaly_threshold=` (z-score sensitivity)
+- `regress()`: `alpha=` (regularization strength for ridge/lasso), `diagnostics=` (auto-run diagnostics)
+- `compare()`: `models=` (compare specific model subset)
+
+**Vectrix Class Level 2 Parameters**
+- `Vectrix.forecast()`: `models=`, `ensembleMethod=`, `confidenceLevel=` parameters with full validation
+- Ensemble methods: 'mean' (simple average), 'weighted' (inverse-MAPE), 'median', 'best' (single model)
+- Confidence interval rescaling from 0.95 default to any level using scipy.stats.norm
+
+**Documentation**
+- README.md / README_KR.md: "Philosophy & Roadmap" section with identity, API layers (Level 1-3), roadmap, expansion principles
+- CLAUDE.md: Expansion/maintenance principles (API design, engine, speed, accuracy, docs/marketing)
+- Updated Easy API examples showing Level 1 and Level 2 usage side by side
+
+### Changed
+
+- `easy.py`: All functions now accept Level 2 parameters with sensible defaults preserving Level 1 behavior
+- `vectrix.py`: `forecast()` accepts `models`, `ensembleMethod`, `confidenceLevel` with validation and error messages
+- Identity principles updated: added Progressive Disclosure and benchmark-based honesty
+
+[0.0.11]: https://github.com/eddmpython/vectrix/compare/v0.0.10...v0.0.11
+
 ## [0.0.10] - 2026-03-04
 
 Research & stability release — 16 DOT improvement experiments (E020~E030, E013~E015), DOT-Hybrid engine OWA 0.885 re-verified, CES combination approach tested and rolled back.
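
The 0.0.11 changelog entry above names four ensemble methods and a confidence-interval rescaling step based on `scipy.stats.norm`. The following is a minimal sketch of how such logic could look, not the code in `vectrix.py` (which this diff does not show). The function names `combine_forecasts` and `rescale_interval` are hypothetical, and the sketch assumes intervals are symmetric and normal around the point forecast.

```python
import numpy as np
from scipy.stats import norm


def combine_forecasts(predictions, mapes, method="mean"):
    """Combine per-model forecasts (rows = models, columns = horizon steps).

    Hypothetical helper: `mapes` holds one validation MAPE per model.
    """
    preds = np.asarray(predictions, dtype=float)
    errors = np.asarray(mapes, dtype=float)
    if method == "mean":
        return preds.mean(axis=0)
    if method == "median":
        return np.median(preds, axis=0)
    if method == "best":
        # Single model with the lowest error.
        return preds[np.argmin(errors)]
    if method == "weighted":
        # Inverse-MAPE weights: lower error gives a larger weight. The small
        # floor guards against division by zero for a perfect fit.
        weights = 1.0 / np.maximum(errors, 1e-8)
        weights /= weights.sum()
        return weights @ preds
    raise ValueError(f"unknown ensemble method: {method!r}")


def rescale_interval(point, lower95, upper95, confidence):
    """Rescale a symmetric 95% normal interval to another confidence level."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    point = np.asarray(point, dtype=float)
    half_width = (np.asarray(upper95, dtype=float)
                  - np.asarray(lower95, dtype=float)) / 2.0
    # Ratio of two-sided normal quantiles, e.g. z(0.90) / z(0.95).
    ratio = norm.ppf(0.5 + confidence / 2.0) / norm.ppf(0.975)
    return point - half_width * ratio, point + half_width * ratio
```

Rescaling by a quantile ratio keeps the point forecast fixed and only changes the band width, so a lower confidence level always yields a narrower interval under the normality assumption.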
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -347,6 +347,35 @@ src/vectrix/
 - Implementation: numpy + scipy.optimize, medium difficulty
 - Expected benefit: the "least surprising" forecast = overfitting prevention
 
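+## Expansion/Maintenance Principles (finalized 2026-03-04)
+
+### API Design Principles
+1. **Progressive Disclosure**: Level 1 (zero config) → Level 2 (guided control) → Level 3 (engine direct)
+2. **Always provide a default when adding a new parameter**: `forecast(data, steps=12)` must work forever
+3. **Easy API parameters pass through to Vectrix class functionality**: easy.py wraps vectrix.py; duplicate implementations are forbidden
+4. **Parameter naming**: the Easy API may use snake_case (models, ensemble, confidence); internals use camelCase
+
+### Engine Expansion Principles
+1. **When adding a new model**: a standalone file under engine/, conforming to the `fit(data)` + `predict(steps)` interface
+2. **Passing the M4 100K benchmark is mandatory**: integrate only after confirming OWA < 1.0 (relative to Naive2)
+3. **Register in the NATIVE_MODELS dict** + export from `__init__.py` + add tests
+4. **Prioritize residual diversity**: a model whose residual correlation with existing models is < 0.5 adds value to the ensemble
+
+### Speed Expansion Principles
+1. **Identify hot loops → move to Rust**: confirm the bottleneck by profiling, then add it to rust/src/lib.rs
+2. **Minimize Python overhead**: remove unnecessary copies/conversions in the model selection and CV logic
+3. **Benchmark measurement is mandatory**: compare end-to-end `forecast()` latency before and after the change
+
+### Accuracy Expansion Principles
+1. **Ensemble strategy matters more than any single model**: use DNA-based weights and residual diversity
+2. **Separate strategies by frequency**: Theta-family models for Yearly/Quarterly; multi-seasonality specialization for Hourly/Daily
+3. **Experiment → validate → integrate pipeline**: experiment in experiments/, validate on M4, integrate into engine/
+
+### Documentation/Marketing Principles
+1. **Attach benchmark numbers to every claim**: not "fast" but "5.6x faster (295ms → 52ms)"
+2. **Blog posts are education-first**: explain from the basics and weave in Vectrix promotion naturally
+3. **Comparison tables must be fair**: acknowledge competitors' strengths and disclose our weak areas transparently
+
 ## Weaknesses and Needed Improvements (updated 2026-03-03)
 
 ### [Urgent] Accuracy: the biggest problem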
@@ -389,12 +418,14 @@ src/vectrix/
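 - AI agent integration (llms.txt, MCP, Skills): industry-leading
 - M4 Hourly VX-Ensemble OWA 0.696: world-class
 
-## Identity Principles (finalized 2026-03-03)
+## Identity Principles (finalized 2026-03-04)
 
 - **Do not use the phrase "Pure Python"**: the package ships with a built-in Rust engine
 - **30+ models are a baseline capability**: not promoted as a differentiator
 - **Rust is transparent**: like Polars, it is fast without the user having to think about it
 - **Python syntax, Rust speed**: this is the identity
+- **Progressive Disclosure**: zero config for beginners, full control for experts. Same function, same package
+- **Benchmark-based honesty**: weak areas (Daily/Hourly) are also disclosed transparently. Prove it with numbers
 
 ## Improvement Roadmap (2026-03-03 onward)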
 
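
The engine expansion principles in the CLAUDE.md hunk above ask for a `fit(data)` + `predict(steps)` interface and a residual correlation below 0.5 against existing models. The sketch below illustrates both ideas under stated assumptions: `NaiveDrift`, `residual_correlation`, and `adds_diversity` are hypothetical names, not part of the package, and the correlation is taken as the signed Pearson coefficient because the document does not say whether an absolute value is intended.

```python
import numpy as np


class NaiveDrift:
    """Hypothetical example model: last value plus the average historical slope."""

    def fit(self, data):
        y = np.asarray(data, dtype=float)
        if y.size < 2:
            raise ValueError("need at least two observations")
        self.last_ = y[-1]
        self.slope_ = (y[-1] - y[0]) / (y.size - 1)
        # In-sample one-step-ahead residuals: y[t] - (y[t-1] + slope).
        self.residuals_ = y[1:] - (y[:-1] + self.slope_)
        return self

    def predict(self, steps):
        # Extrapolate the fitted slope from the last observation.
        return self.last_ + self.slope_ * np.arange(1, steps + 1)


def residual_correlation(res_a, res_b):
    """Pearson correlation of two residual series, aligned on their tails."""
    a = np.asarray(res_a, dtype=float)
    b = np.asarray(res_b, dtype=float)
    n = min(a.size, b.size)
    return float(np.corrcoef(a[-n:], b[-n:])[0, 1])


def adds_diversity(candidate_residuals, existing_residuals, threshold=0.5):
    """True if the candidate correlates below `threshold` with every existing model."""
    return all(
        residual_correlation(candidate_residuals, res) < threshold
        for res in existing_residuals
    )
```

A candidate whose residuals move together with an existing model's residuals makes the same mistakes at the same time, so averaging the two cancels little error; that is the intuition behind the threshold.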
diff --git a/README.md b/README.md
@@ -273,7 +273,15 @@ pip install "vectrix[all]"         # Everything
 ```python
 from vectrix import forecast, analyze, regress, compare
 
+# Level 1 — Zero Config
 result = forecast([100, 120, 115, 130, 125, 140], steps=5)
+
+# Level 2 — Guided Control
+result = forecast(df, date="date", value="sales", steps=12,
+                  models=["dot", "auto_ets", "auto_ces"],
+                  ensemble="mean",
+                  confidence=0.90)
+
 print(result.compare())          # All model rankings
 print(result.all_forecasts())    # Every model's predictions
 
@@ -369,11 +377,13 @@ Full results with sMAPE/MASE breakdown: [benchmarks](https://eddmpython.github.i
 
 | Function | Description |
 |:---------|:------------|
-| `forecast(data, steps=30)` | Auto model selection forecasting |
-| `analyze(data)` | DNA profiling, changepoints, anomalies |
+| `forecast(data, steps, models, ensemble, confidence)` | Auto or guided forecasting |
+| `analyze(data, period, features)` | DNA profiling, changepoints, anomalies |
 | `regress(y, X)` / `regress(data=df, formula="y ~ x")` | Regression with diagnostics |
-| `compare(data, steps=30)` | All model comparison (DataFrame) |
-| `quick_report(data, steps=30)` | Combined analysis + forecast |
+| `compare(data, steps, models)` | Model comparison (DataFrame) |
+| `quick_report(data, steps)` | Combined analysis + forecast |
+
+All parameters beyond `data` are optional with sensible defaults. See [Progressive Disclosure](#api-layers) for the Level 1 → 2 → 3 design.
 
 ### Classic API
 
@@ -488,6 +498,47 @@ Skills are auto-loaded when working in the Vectrix project directory.
 
 <br>
 
+## ◈ Philosophy & Roadmap
+
+### Identity
+
+Vectrix is a **zero-config forecasting engine with built-in Rust acceleration**. The design philosophy:
+
+- **Python syntax, Rust speed** — Like Polars, the Rust engine is invisible. Users write Python; hot loops run in Rust automatically.
+- **Progressive disclosure** — Beginners call `forecast(data, steps=12)` with zero configuration. Experts pass `models=`, `ensemble=`, `confidence=` to control every aspect. Engine-level access (`AutoETS`, `AutoARIMA`) is always available for full control.
+- **3 dependencies, no compiler** — NumPy, SciPy, Pandas. No system packages, no Numba JIT warmup, no CmdStan. `pip install vectrix` and you're done.
+- **Correctness over features** — We'd rather have 15 models that beat Naive2 on every frequency than 50 models that fail on Daily and Hourly.
+
+### API Layers
+
+| Layer | Target | Example |
+|:------|:-------|:--------|
+| **Level 1 — Zero Config** | Beginners, quick prototypes | `forecast(data, steps=12)` |
+| **Level 2 — Guided Control** | Data scientists, production | `forecast(data, steps=12, models=["dot", "auto_ets"], ensemble="mean", confidence=0.90)` |
+| **Level 3 — Engine Direct** | Researchers, custom pipelines | `AutoETS(period=7).fit(data).predict(30)` |
+
+Every parameter at Level 2 has a sensible default that reproduces Level 1 behavior. No Level 2 parameter is ever required.
+
+### Roadmap
+
+| Priority | Area | Current | Target | Status |
+|:---------|:-----|:--------|:-------|:-------|
+| **P0** | M4 Accuracy | OWA 0.885 | OWA < 0.850 | In progress |
+| **P1** | Easy API Progressive Disclosure | Level 1 only | Levels 1-3 | In progress |
+| **P2** | Pipeline Speed | 48ms forecast() | < 10ms | Planned |
+| **P3** | Foundation Model Depth | Basic wrappers | Full integration | Planned |
+| **P4** | Community Growth | Early stage | Blog, Reddit, Kaggle | In progress |
+
+### Expansion Principles
+
+1. **Accuracy first, speed second** — A wrong answer delivered fast is still wrong. Improve M4 OWA before optimizing latency.
+2. **Never break zero-config** — Every new parameter must have a default. `forecast(data, steps=12)` must always work.
+3. **One identity** — "Python syntax, Rust speed, zero config." Every feature, doc, and marketing message aligns with this.
+4. **Benchmark-driven** — Every engine change is validated against M4 100K series. No "it seems better" — show the OWA.
+5. **Minimal dependencies** — Adding a dependency requires strong justification. If it can be implemented in numpy/scipy, it should be.
+
+<br>
+
 ## ◈ Contributing
 
 ```bash
diff --git a/README_KR.md b/README_KR.md
@@ -270,7 +270,15 @@ pip install "vectrix[all]"           # Everything
 ```python
 from vectrix import forecast, analyze, regress, compare
 
+# Level 1: Zero Config
 result = forecast([100, 120, 115, 130, 125, 140], steps=5)
+
+# Level 2: Guided Control
+result = forecast(df, date="date", value="sales", steps=12,
+                  models=["dot", "auto_ets", "auto_ces"],
+                  ensemble="mean",
+                  confidence=0.90)
+
 print(result.compare())          # All model rankings
 print(result.all_forecasts())    # Every model's predictions
 
@@ -366,11 +374,13 @@ sMAPE/MASE detailed results: [benchmark details](https://eddmpython.github.io/vec
 
 | Function | Description |
 |:-----|:-----|
-| `forecast(data, steps=30)` | Auto model selection forecasting |
-| `analyze(data)` | DNA profiling, changepoints, anomalies |
+| `forecast(data, steps, models, ensemble, confidence)` | Auto or guided forecasting |
+| `analyze(data, period, features)` | DNA profiling, changepoints, anomalies |
 | `regress(y, X)` / `regress(data=df, formula="y ~ x")` | Regression with diagnostics |
-| `compare(data, steps=30)` | All model comparison (DataFrame) |
-| `quick_report(data, steps=30)` | Combined analysis + forecast |
+| `compare(data, steps, models)` | Model comparison (DataFrame) |
+| `quick_report(data, steps)` | Combined analysis + forecast |
+
+All parameters other than `data` are optional with sensible defaults. See the [Progressive Disclosure](#api-레이어) design.
 
 ### Classic API
 
@@ -485,6 +495,47 @@ Loaded automatically when working in the Vectrix project directory
 
 <br>
 
+## ◈ Philosophy & Roadmap
+
+### Identity
+
+Vectrix is a **zero-config forecasting engine with built-in Rust acceleration**. Design philosophy:
+
+- **Python syntax, Rust speed**: Like Polars, the Rust engine is transparent. Users write Python, and hot loops run in Rust automatically.
+- **Progressive Disclosure**: Beginners call `forecast(data, steps=12)` with no configuration. Experts pass `models=`, `ensemble=`, `confidence=` to control every aspect. Direct engine access (`AutoETS`, `AutoARIMA`) is always available.
+- **3 dependencies, no compiler required**: NumPy, SciPy, Pandas. No system packages, no Numba JIT warmup, no CmdStan. `pip install vectrix` and you're done.
+- **Correctness over features**: 15 models that beat Naive2 on every frequency are better than 50 models that fail on Daily/Hourly.
+
+### API Layers
+
+| Layer | Target | Example |
+|:-------|:-----|:-----|
+| **Level 1: Zero Config** | Beginners, quick prototypes | `forecast(data, steps=12)` |
+| **Level 2: Guided Control** | Data scientists, production | `forecast(data, steps=12, models=["dot", "auto_ets"], ensemble="mean", confidence=0.90)` |
+| **Level 3: Engine Direct** | Researchers, custom pipelines | `AutoETS(period=7).fit(data).predict(30)` |
+
+Every Level 2 parameter has a sensible default that reproduces Level 1 behavior. No parameter is required.
+
+### Roadmap
+
+| Priority | Area | Current | Target | Status |
+|:---------|:-----|:-----|:-----|:-----|
+| **P0** | M4 Accuracy | OWA 0.885 | OWA < 0.850 | In progress |
+| **P1** | Easy API Progressive Disclosure | Level 1 only | Levels 1-3 | In progress |
+| **P2** | Pipeline Speed | 48ms forecast() | < 10ms | Planned |
+| **P3** | Foundation Model Depth | Basic wrappers | Full integration | Planned |
+| **P4** | Community Growth | Early stage | Blog, Reddit, Kaggle | In progress |
+
+### Expansion Principles
+
+1. **Accuracy first, speed second**: A wrong answer delivered fast is still wrong. Improving M4 OWA comes before latency optimization.
+2. **Never break zero-config**: Every new parameter must have a default. `forecast(data, steps=12)` must always work.
+3. **One identity**: "Python syntax, Rust speed, zero config." Every feature, doc, and marketing message aligns with this.
+4. **Benchmark-driven**: Every engine change is validated against the M4 100K series. Not "it seems better"; show the OWA.
+5. **Minimal dependencies**: Adding a dependency requires strong justification. If it can be implemented in numpy/scipy, it should be.
+
+<br>
+
 ## ◈ Contributing
 
 ```bash
diff --git a/pyproject.toml b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "vectrix"
-version = "0.0.10"
+version = "0.0.11"
 description = "Zero-config time series forecasting & analysis library. 30+ models with built-in Rust engine for blazing-fast performance."
 readme = "README.md"
 license = {file = "LICENSE"}
diff --git a/src/vectrix/__init__.py b/src/vectrix/__init__.py
@@ -82,7 +82,7 @@
 )
 from .vectrix import Vectrix
 
-__version__ = "0.0.10"
+__version__ = "0.0.11"
 __all__ = [
     "Vectrix",
     "ForecastResult",
diff --git a/src/vectrix/easy.py b/src/vectrix/easy.py
diff --git a/src/vectrix/vectrix.py b/src/vectrix/vectrix.py
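
The bodies of the `easy.py` and `vectrix.py` changes are not shown in this listing. As a rough illustration of the pattern the changelog describes (snake_case Easy API keywords passed through to camelCase class keywords, with validation and defaults that preserve the zero-config call), here is a sketch under heavy assumptions. `ForecasterSketch`, `forecast_sketch`, the two stand-in models, and the default ensemble value are all hypothetical; only the keyword names, the ensemble option names, and the 0.80 to 0.99 confidence range come from the changelog.

```python
import numpy as np

# Hypothetical stand-in models; the real registry and model IDs are not shown.
SKETCH_MODELS = {
    "last_value": lambda y, steps: np.full(steps, y[-1]),
    "overall_mean": lambda y, steps: np.full(steps, y.mean()),
}

# Only two of the four documented ensemble options are implemented here;
# 'weighted' and 'best' need per-model error scores, omitted for brevity.
SKETCH_COMBINERS = {
    "mean": lambda preds: preds.mean(axis=0),
    "median": lambda preds: np.median(preds, axis=0),
}


class ForecasterSketch:
    """Stand-in for the wrapped class; keyword names follow the changelog."""

    def forecast(self, data, steps, models=None,
                 ensembleMethod="mean", confidenceLevel=0.95):
        y = np.asarray(data, dtype=float)
        chosen = list(SKETCH_MODELS) if models is None else list(models)
        unknown = [m for m in chosen if m not in SKETCH_MODELS]
        if unknown:
            raise ValueError(f"unknown model IDs: {unknown}")
        if ensembleMethod not in SKETCH_COMBINERS:
            raise ValueError(f"unsupported ensemble method: {ensembleMethod!r}")
        if not 0.80 <= confidenceLevel <= 0.99:
            raise ValueError("confidence level must be within [0.80, 0.99]")
        preds = np.vstack([SKETCH_MODELS[m](y, steps) for m in chosen])
        return {
            "models": chosen,
            "confidence": confidenceLevel,
            "predictions": SKETCH_COMBINERS[ensembleMethod](preds),
        }


def forecast_sketch(data, steps=30, models=None, ensemble="mean", confidence=0.95):
    """Easy-API-style wrapper: snake_case in, camelCase through, no logic duplicated."""
    return ForecasterSketch().forecast(
        data, steps,
        models=models, ensembleMethod=ensemble, confidenceLevel=confidence,
    )
```

With this shape, `forecast_sketch(data, steps=12)` and a fully specified Level 2 call share one code path, so adding a parameter cannot silently change the zero-config behavior as long as its default is left alone.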
