Alph4d0g · May 20, 2026 · May 17, 2026 · May 19, 2026
diff --git a/.gitignore b/.gitignore
@@ -43,3 +43,6 @@ temp/
 
 # Worktrees
 .worktrees/
+
+# Local planning artifacts
+docs/superpowers/
diff --git a/.opencode/config.json b/.opencode/config.json
diff --git a/.opencode/opencode.json b/.opencode/opencode.json
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -2,6 +2,62 @@
 
 All notable changes to this project are documented in this file.
 
+## [1.2.1] - 2026-05-19
+
+### Added
+
+- **models.dev Reliability Pipeline** — Complete rewrite of `fetchModelsDevData()` with production-grade resilience:
+  - Bounded retry loop (max 3 attempts) with exponential backoff (250ms, 500ms).
+  - Structured failure classification into 6 categories: `timeout`, `network`, `http_retryable`, `http_non_retryable`, `parse`, `invalid_structure`.
+  - Stale in-memory cache fallback: if live refresh fails, previously cached enrichment data is returned instead of skipping enrichment entirely.
+  - Fail-open cold-start behavior: returns `null` only when no cache exists and all attempts fail, preserving plugin functionality.
+  - Per-attempt structured logging: attempt number, failure class, HTTP status (when applicable), and elapsed duration.
+  - Success logging: total elapsed duration and provider count for observability.
+  - Timeout increase: default per-attempt timeout raised from 1000ms to 5000ms.
+- **9 New Test Cases** (`test/models-dev.test.mjs`) covering:
+  - Fresh cache hit (no redundant network call)
+  - Timeout recovery on retry
+  - 503 retryable HTTP failure recovery
+  - Stale cache fallback when all refresh attempts fail
+  - Null return on cold-start total failure
+  - 404 fail-fast behavior (no unnecessary retries)
+  - Invalid response structure with stale cache fallback
+  - Malformed provider entry rejection before cache update
+  - End-to-end integration with `getModelsDevIndex()`
+- **Model Variant Support Fix** — Comprehensive fix for variant-suffixed models (e.g., `codex/gpt-5.5-xhigh`, `codex/gpt-5.5-high`):
+  - Added `groupVariantModels()` in `src/models.ts` — pure two-pass algorithm that merges variant-suffixed models under their base model ID
+  - Added `variants?: Record<string, OmniRouteModelVariant>` to `OmniRouteModel` interface in `src/types.ts`
+  - Extended `OmniRouteModelVariant.reasoningEffort` to include `'xhigh'` (was `'low' | 'medium' | 'high'`)
+  - Synthetic base model creation: when only variants are returned (no explicit base), creates a synthetic base from the first variant with merged metadata (max `contextWindow`, max `maxTokens`, unioned capability flags)
+  - Pipeline integration: `fetchModels()` now flows `normalizeModel` → `deduplicateModels` → `groupVariantModels` → `enrichModelMetadata` → `toProviderModels`
+  - `toProviderModel()` in `src/plugin.ts` now prioritizes pre-populated `model.variants` over generated `{low, medium, high}` defaults
+- **Test Cache Isolation** (`test/plugin.test.mjs`) — Added `clearModelCache()` and `clearModelsDevCache()` to `afterEach` to prevent cross-test contamination from mutable in-memory caches
+- **2 New Regression Tests** (`test/plugin.test.mjs`) covering variant grouping and synthetic base model creation
+- **1 New Regression Test** (`test/models.test.mjs`) covering capability union across grouped variants
+
+### Fixed
+
+- **Default Context Limit** — `DEFAULT_CONTEXT_LIMIT` corrected from `4096` to `128000` to match actual OmniRoute API defaults.
+- **getModelFamily() for Provider-Prefixed Models** — Fixed incorrect family extraction for versioned models with provider prefixes. Before: `getModelFamily('codex/gpt-5.5-xhigh')` → `'codex/gpt'`. After: → `'gpt'`. Implementation now strips provider prefix before splitting on `-`.
+
+### Changed
+
+- **Internal Helpers** — Extracted `fetchModelsDevOnce()`, `shouldRetryModelsDevFailure()`, and `sleep()` helpers in `src/models-dev.ts` to keep retry logic isolated from lookup/index logic.
+
+### Removed
+
+- **Test Config Artifacts** — Removed `.opencode/config.json` and `.opencode/opencode.json` files that were committed accidentally.
+
+### Fixed (Code Review)
+
+- **Cache Isolation** — `modelsDevCache` is now keyed by URL (`Map<string, ModelsDevCache>`) instead of a single global variable. Prevents cross-config data leakage when different configs specify different `modelsDev.url` values. (`src/models-dev.ts`)
+- **JSDoc Accuracy** — Updated `OmniRouteModelsDevConfig.timeoutMs` JSDoc comment from `(default: 1000ms)` to `(default: 5000ms)` to match the actual constant. (`src/types.ts`)
+- **Lockfile Version Sync** — Updated `package-lock.json` version from `1.2.0` to `1.2.1` to match `package.json`. (`package-lock.json`)
+- **Test Suite Speed** — Eliminated real `setTimeout` sleeps from `test/models-dev.test.mjs` by using `cacheTtl: 0` to mark cache immediately stale instead of waiting for TTL expiry. Reduces test runtime and improves scalability.
+- **Latency Documentation** — Added explicit JSDoc on `fetchModelsDevData()` documenting worst-case cold-start latency (~15.75s) as an accepted reliability trade-off per design spec. (`src/models-dev.ts`)
+- **models.dev Structural Validation** — Added runtime validation for provider entries and nested `models` records before accepting fetched models.dev data. Prevents malformed upstream objects from being cached or indexed. (`src/models-dev.ts`)
+- **Variant Capability Union** — Grouped variant models now merge `supportsVision`, `supportsTools`, `supportsStreaming`, `supportsTemperature`, and `supportsAttachment` into the base model when any variant supports them. (`src/models.ts`)
+
 ## [1.2.0] - 2026-05-17
 
 ### Added

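The retry behaviour described in the 1.2.1 changelog entry above can be sketched roughly as follows. This is a minimal illustration, not the plugin's actual code: every name here (`isRetryable`, `classifyHttpStatus`, `backoffDelayMs`, `worstCaseLatencyMs`, `fetchWithRetry`) is hypothetical, and treating `parse` failures as non-retryable is an assumption, since the changelog does not say how they are handled.

```typescript
// Hypothetical sketch of a bounded retry loop with stale-cache fallback.
// Constants mirror the values quoted in the changelog; names are assumptions.
type FailureClass =
  | "timeout"
  | "network"
  | "http_retryable"
  | "http_non_retryable"
  | "parse"
  | "invalid_structure";

const MAX_ATTEMPTS = 3;
const BASE_BACKOFF_MS = 250;
const TIMEOUT_MS = 5000;

// Only transient failures get another attempt.
// Assumption: `parse` is treated as non-retryable; the source does not specify.
function isRetryable(failure: FailureClass): boolean {
  return failure === "timeout" || failure === "network" || failure === "http_retryable";
}

// HTTP 429 and 5xx are retryable; any other non-2xx status fails fast.
function classifyHttpStatus(status: number): FailureClass {
  return status === 429 || status >= 500 ? "http_retryable" : "http_non_retryable";
}

// Delay after a failed attempt: 250ms after attempt 1, 500ms after attempt 2.
function backoffDelayMs(attempt: number): number {
  return BASE_BACKOFF_MS * 2 ** (attempt - 1);
}

// Worst case: every attempt runs into the timeout, plus both backoff sleeps.
function worstCaseLatencyMs(): number {
  let total = 0;
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    total += TIMEOUT_MS;
    if (attempt < MAX_ATTEMPTS) total += backoffDelayMs(attempt);
  }
  return total;
}

type AttemptResult<T> = { ok: true; data: T } | { ok: false; failure: FailureClass };

// Runs up to MAX_ATTEMPTS attempts; on total failure returns the stale value,
// which is null on a cold start (the fail-open case).
async function fetchWithRetry<T>(
  attemptOnce: () => Promise<AttemptResult<T>>,
  stale: T | null,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T | null> {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    const result = await attemptOnce();
    if (result.ok) return result.data;
    if (!isRetryable(result.failure) || attempt === MAX_ATTEMPTS) break;
    await sleep(backoffDelayMs(attempt));
  }
  return stale;
}
```

Under these assumptions `worstCaseLatencyMs()` evaluates to 3 × 5000 + 250 + 500 = 15750 ms, which is consistent with the ~15.75s worst-case cold-start latency documented on `fetchModelsDevData()`.
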
diff --git a/docs/release-notes-v1.2.1.md b/docs/release-notes-v1.2.1.md
@@ -0,0 +1,84 @@
+# Release v1.2.1
+
+## Highlights
+
+- **Model Variant Support Fix** — Variant-suffixed models (e.g., `codex/gpt-5.5-xhigh`, `codex/gpt-5.5-high`) are now grouped under their base model ID instead of appearing as duplicate top-level entries. Includes synthetic base model creation, `xhigh` variant support, and `getModelFamily()` fix for provider-prefixed versioned models.
+- **models.dev enrichment no longer fails on transient network slowness.** The fetch pipeline now retries up to 3 times with exponential backoff and falls back to stale cached data if live refresh fails.
+- **Default context limit corrected** from 4096 to 128000 tokens to match OmniRoute API behavior.
+- **Structured observability** for enrichment failures with per-attempt diagnostics and fallback decisions.
+
+## What Changed
+
+### Reliability
+
+- `fetchModelsDevData()` now uses a bounded retry loop:
+  - Maximum 3 attempts with 250ms / 500ms backoff.
+  - Retries on: timeouts (`AbortError`), network errors, HTTP 429, and HTTP 5xx.
+  - Fails fast on: HTTP 4xx (non-429) and structurally invalid responses.
+- Stale in-memory cache fallback:
+  - If cached data exists but its TTL has expired, a live refresh is attempted first.
+  - If all refresh attempts fail, the stale cached data is returned instead of `null`.
+  - If no cache exists and all attempts fail, returns `null` (safe fail-open).
+- Timeout budget increased from 1000ms to 5000ms per attempt.
+- Failure classification: `timeout`, `network`, `http_retryable`, `http_non_retryable`, `parse`, `invalid_structure`.
+
+### Model Variant Support Fix
+
+**Problem:** When OmniRoute listed variant-suffixed models separately (e.g., `codex/gpt-5.5-xhigh`, `codex/gpt-5.5-high`), each variant appeared as an independent top-level entry with incorrectly generated `{low, medium, high}` variants, producing duplicate, confusing entries in OpenCode's model picker.
+
+**Solution:**
+- Added `groupVariantModels()` in `src/models.ts`, a pure two-pass algorithm:
+  1. **Categorize**: split models into real bases and variants using `stripVariantSuffix()`.
+  2. **Build result**: real bases without variants pass through unchanged; each base with variants receives a `variants` Record holding all of its variants.
+  3. **Synthetic bases**: when only variants are returned (no explicit base), a synthetic base is created from the first variant by copying all fields and setting `id`/`name` to the stripped base ID.
+  4. **Metadata merging**: the base inherits the **max** `contextWindow`, the **max** `maxTokens`, and the union of supported capability flags across all variants.
+- Integrated into `fetchModels()` pipeline: `normalizeModel` → `deduplicateModels` → `groupVariantModels` → `enrichModelMetadata` → `toProviderModels`
+- Fixed `toProviderModel()` in `src/plugin.ts` to prioritize pre-populated `model.variants` over generated `{low, medium, high}` defaults
+- Added `'xhigh'` to `OmniRouteModelVariant.reasoningEffort` type and generated variants
+
+**Edge Cases Handled:**
+
+| Scenario | Behavior |
+|----------|----------|
+| Only variants returned, no base model | Creates synthetic base from first variant |
+| Base model + variants both returned | Uses real base; merges variant metadata (max limits, unioned capability flags) |
+| Non-reasoning suffix (e.g., `-preview`) | `stripVariantSuffix()` ignores it; no grouping |
+| Mixed provider prefixes post-dedup | Grouping operates on canonical IDs |
+| Variant without `supportsReasoning=true` | Still grouped; the base's `supportsReasoning` becomes `true` if any variant sets it |
+
+### Fixes
+
+- `DEFAULT_CONTEXT_LIMIT` corrected from `4096` to `128000`.
+- **`getModelFamily()` for Provider-Prefixed Models** — Fixed incorrect family extraction for versioned models with provider prefixes. `getModelFamily('codex/gpt-5.5-xhigh')` now correctly returns `'gpt'` (was `'codex/gpt'`).
+
+### Code Review Fixes
+
+- **Cache Isolation** — `modelsDevCache` is now keyed by URL (`Map<string, ModelsDevCache>`) to prevent cross-config data leakage when different configs specify different `modelsDev.url` values.
+- **JSDoc Accuracy** — `OmniRouteModelsDevConfig.timeoutMs` JSDoc updated to reflect the new `5000ms` default.
+- **Lockfile Sync** — `package-lock.json` version aligned with `package.json` (`1.2.1`).
+- **Test Suite Speed** — Eliminated real `setTimeout` sleeps from `test/models-dev.test.mjs` by using `cacheTtl: 0` for stale-cache tests. Reduces test runtime and improves scalability.
+- **Latency Documentation** — Explicit JSDoc added on `fetchModelsDevData()` documenting worst-case cold-start latency (~15.75s) as an accepted reliability trade-off.
+- **models.dev Structural Validation** — Fetched models.dev payloads now validate provider entries and nested `models` records before accepting data, preventing malformed upstream objects from entering cache/index paths.
+- **Variant Capability Union** — Grouped variants now merge `supportsVision`, `supportsTools`, `supportsStreaming`, `supportsTemperature`, and `supportsAttachment` into the base model when any variant advertises those capabilities.
+
+### Testing
+
+- Added 9 focused tests in `test/models-dev.test.mjs` covering all retry, cache, fallback, and malformed-provider validation paths.
+- Added 1 unit test in `test/models.test.mjs` covering capability union across grouped variants.
+- Added 2 regression tests in `test/plugin.test.mjs` for variant grouping and synthetic base model creation.
+- Added cache isolation (`clearModelCache()`, `clearModelsDevCache()`) to `test/plugin.test.mjs` `afterEach` to prevent cross-test contamination.
+- Full regression suite: 54/54 tests pass (0 failures).
+
+### Documentation
+
+- Internal `docs/superpowers/` planning/spec artifacts are kept local only and are excluded from the GitHub repository.
+
+## Verification
+
+- `npm run prepublishOnly` passes (`clean`, `build`, `check:exports`).
+- `npm test` passes: 54 tests, 0 failures.
+- TypeScript strict mode compiles cleanly.
+
+## Upgrade Notes
+
+- No breaking changes. Plugin behavior remains safe when `models.dev` is fully unavailable.
+- Existing `modelsDev.timeoutMs` and `modelsDev.cacheTtl` config options continue to work as before.
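
The grouping steps described in the release notes can be sketched roughly as below. This is an illustrative approximation under stated assumptions, not the code in `src/models.ts`: the types (`SketchModel`, `SketchVariant`), the helpers (`splitVariantSuffix`, `mergeVariants`, `groupVariants`), the suffix set, and the choice to key `variants` by effort name are all hypothetical, and only two capability flags are modelled for brevity.

```typescript
// Hypothetical sketch of two-pass variant grouping; names and shapes are assumptions.
type Effort = "low" | "medium" | "high" | "xhigh";

interface SketchVariant {
  reasoningEffort: Effort;
}

interface SketchModel {
  id: string;
  name: string;
  contextWindow?: number;
  maxTokens?: number;
  supportsTools?: boolean;
  supportsVision?: boolean;
  variants?: Record<string, SketchVariant>;
}

interface VariantEntry {
  effort: Effort;
  model: SketchModel;
}

// Assumption: only these reasoning suffixes mark a variant; `-preview` etc. do not.
const EFFORT_SUFFIX = /-(low|medium|high|xhigh)$/;

// Returns the base ID and effort, or null when the ID has no reasoning suffix.
function splitVariantSuffix(id: string): { baseId: string; effort: Effort } | null {
  const match = EFFORT_SUFFIX.exec(id);
  if (match === null) return null;
  return { baseId: id.slice(0, match.index), effort: match[1] as Effort };
}

// Copies the base, attaches variants, takes max limits, and unions capability flags.
function mergeVariants(base: SketchModel, entries: VariantEntry[]): SketchModel {
  const merged: SketchModel = { ...base };
  const variants: Record<string, SketchVariant> = {};
  for (const { effort, model } of entries) {
    variants[effort] = { reasoningEffort: effort };
    if (model.contextWindow !== undefined) {
      merged.contextWindow = Math.max(merged.contextWindow ?? 0, model.contextWindow);
    }
    if (model.maxTokens !== undefined) {
      merged.maxTokens = Math.max(merged.maxTokens ?? 0, model.maxTokens);
    }
    if (model.supportsTools) merged.supportsTools = true;
    if (model.supportsVision) merged.supportsVision = true;
  }
  merged.variants = variants;
  return merged;
}

function groupVariants(models: SketchModel[]): SketchModel[] {
  // Pass 1: separate real bases from variant-suffixed models.
  const bases = new Map<string, SketchModel>();
  const variantsByBase = new Map<string, VariantEntry[]>();
  for (const model of models) {
    const split = splitVariantSuffix(model.id);
    if (split === null) {
      bases.set(model.id, model);
      continue;
    }
    const entries = variantsByBase.get(split.baseId) ?? [];
    entries.push({ effort: split.effort, model });
    variantsByBase.set(split.baseId, entries);
  }

  // Pass 2: emit real bases (merged when they have variants), then synthetic bases.
  const result: SketchModel[] = [];
  bases.forEach((base, id) => {
    const entries = variantsByBase.get(id);
    result.push(entries ? mergeVariants(base, entries) : base);
  });
  variantsByBase.forEach((entries, baseId) => {
    if (bases.has(baseId)) return;
    const first = entries[0].model;
    result.push(mergeVariants({ ...first, id: baseId, name: baseId }, entries));
  });
  return result;
}
```

In this sketch, passing `codex/gpt-5.5-xhigh` and `codex/gpt-5.5-high` without an explicit base yields a single synthetic `codex/gpt-5.5` entry carrying both variants, while an ID ending in `-preview` passes through ungrouped. The input array and its objects are never mutated, which is one way to keep the function pure.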