Commit 5cfbdf1

Merge pull request #22 from TextureHQ/feat/provider-capabilities-fallback
RFC 0001: Provider capability model, health scoring, and fallback policy (Phase 1)
2 parents 90f8e29 + 5fdc2a7 commit 5cfbdf1

20 files changed (+830 -3 lines changed)

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -79,6 +79,9 @@ web_modules/
 .env.production.local
 .env.local
 
+# Crush agent metadata
+.crush
+
 # parcel-bundler cache (https://parceljs.org/)
 .cache
 .parcel-cache

CRUSH.md

Lines changed: 35 additions & 0 deletions
@@ -0,0 +1,35 @@
+CRUSH.md — Quick guide for agentic contributors
+
+Build / test / lint
+- Install: npm install
+- Build all: npm run build (runs build:cjs and build:esm)
+- Build CJS only: npm run build:cjs
+- Build ESM only: npm run build:esm
+- Clean before build: handled by prebuild (rimraf ./dist)
+- Run tests: npm test
+- Watch tests: npm run test:watch
+- Coverage: npm run coverage (outputs coverage/lcov.info)
+- Run a single test file: npx jest path/to/file.test.ts
+- Run tests by name/pattern: npx jest -t "pattern"
+
+Project conventions
+- Language: TypeScript (strict true). ESM and CJS builds emitted to dist.
+- Testing: Jest with ts-jest preset; test files use .test.ts/.spec.ts naming.
+- Imports: Use module paths as in source; ES import syntax. Prefer named exports; default exports only when necessary.
+- Formatting: Follow existing style; no formatter config present. Keep two-space indentation, single quotes, semicolons consistent with repo.
+- Types: Prefer explicit return types on public functions; enable strict null checks; use interfaces for shapes and zod for runtime validation when present.
+- Naming: camelCase for variables/functions, PascalCase for types/classes, UPPER_SNAKE_CASE for constants.
+- Error handling: Throw typed errors from src/errors.ts where applicable; wrap external I/O (axios, redis) and surface meaningful messages without leaking secrets.
+- Caching: See src/cache.ts for simple Redis-based cache helpers; ensure keys are namespaced and TTL respected.
+- Providers: Implement IWeatherProvider in src/providers/IWeatherProvider.ts; keep provider-specific logic in their subfolders (nws, openweather). Add tests alongside provider code.
+- Env/config: No .env checked in; do not commit secrets. Use environment variables at runtime; update README if adding new vars.
+
+Tooling details
+- Node 20 in CI; ensure compatibility.
+- Jest config: jest.config.ts (ts-jest preset, node environment, ignores dist/*).
+- TypeScript configs: tsconfig.cjs.json and tsconfig.esm.json drive dual builds; tsconfig.json is repo default for tooling.
+
+Assistant notes
+- No Cursor or Copilot rules files detected.
+- Add any new scripts to package.json and mirror in this CRUSH.md.
+- Prefer small, testable changes and keep providers decoupled via interfaces.
Lines changed: 126 additions & 0 deletions
@@ -0,0 +1,126 @@
+# RFC 0001: Provider Capability Model, Health Scoring, and Fallback Policies
+
+Status: Proposed
+Authors: TextureHQ
+Created: 2025-08-15
+Target: Minor release (backward-compatible)
+
+## Motivation
+We currently use a fixed provider precedence and implicit fallback. This RFC formalizes:
+- Provider capability metadata (what each provider supports)
+- Provider health scoring and circuit breakers
+- Pluggable fallback policies (priority, priority-then-health, weighted)
+- Normalized error taxonomy
+All defaults preserve existing behavior.
+
+## Goals & Non-Goals
+Goals
+- Additive types and options; no breaking API changes
+- Deterministic default behavior identical to today
+- Clear extension path for new providers
+Non-Goals
+- Changing existing return shapes by default
+- Mandatory logging/metrics dependencies
+
+## High-level Design
+We introduce a ProviderRegistry that knows provider capabilities and health, and selects providers per request via a policy engine. WeatherService uses the registry; if no config is provided, it registers built-in providers in current order and uses the priority policy (preserves behavior).
+
+### Capability Model
+Each provider publishes static metadata:
+- id: string (e.g., "nws", "openweather")
+- supports: { current: boolean; hourly?: boolean; daily?: boolean; alerts?: boolean }
+- regions?: string[] | GeoJSON region hint (optional)
+- units?: ("standard"|"metric"|"imperial")[] (optional)
+- locales?: string[] (optional)
+
### Health Model
37+
We keep a rolling in-memory snapshot per provider:
38+
- successRate: EMA or sliding window over last N calls
39+
- p95LatencyMs: recent percentile
40+
- lastFailureAt?: number (epoch)
41+
- circuit: "closed" | "open" | "half-open"
42+
Outcomes are recorded on every provider call: { ok: boolean, latencyMs, errorCode? }.
43+
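Reviewer sketch: a minimal in-memory tracker for the health fields above, assuming an exponential moving average for `successRate` and a nearest-rank percentile over a bounded latency window for `p95LatencyMs`. The class name `HealthTracker`, the smoothing factor, and the window size are illustrative assumptions; the RFC leaves the choice between EMA and sliding window open. Circuit state is omitted here.

```typescript
// Sketch only: `HealthTracker`, `alpha`, and `windowSize` are hypothetical names and values.
interface HealthSnapshot {
  successRate: number;
  p95LatencyMs?: number;
  lastFailureAt?: number;
}

class HealthTracker {
  private successRate = 1; // optimistic start: no failures observed yet
  private lastFailureAt: number | undefined = undefined;
  private readonly latencies: number[] = [];
  private readonly alpha: number;
  private readonly windowSize: number;

  constructor(alpha = 0.2, windowSize = 50) {
    this.alpha = alpha;
    this.windowSize = windowSize;
  }

  record(outcome: { ok: boolean; latencyMs: number }, now: number = Date.now()): void {
    // EMA update: new = alpha * sample + (1 - alpha) * old, with sample 1 for success, 0 for failure.
    this.successRate = this.alpha * (outcome.ok ? 1 : 0) + (1 - this.alpha) * this.successRate;
    if (!outcome.ok) this.lastFailureAt = now;
    this.latencies.push(outcome.latencyMs);
    if (this.latencies.length > this.windowSize) this.latencies.shift(); // drop the oldest sample
  }

  snapshot(): HealthSnapshot {
    const sorted = [...this.latencies].sort((a, b) => a - b);
    // Nearest-rank percentile: the value at position ceil(0.95 * n), 1-based.
    const p95 = sorted.length > 0 ? sorted[Math.ceil(0.95 * sorted.length) - 1] : undefined;
    return { successRate: this.successRate, p95LatencyMs: p95, lastFailureAt: this.lastFailureAt };
  }
}

const tracker = new HealthTracker(0.5, 3);
tracker.record({ ok: false, latencyMs: 100 }, 1000);
tracker.record({ ok: true, latencyMs: 200 }, 2000);
console.log(tracker.snapshot());
```

An EMA needs only one number per provider, while the latency percentile still requires a bounded sample window; that trade-off is one reason the two metrics are tracked differently in this sketch.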
+### Error Taxonomy
+Normalize provider errors to:
+- NetworkError, RateLimitError, NotFoundError, ValidationError, ParseError, UpstreamError, UnavailableError
+Attach: { provider, status?, retryAfterMs?, endpoint? }. Do not log/propagate secrets.
+When all providers fail, return CompositeProviderError with per-provider normalized entries.
+
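Reviewer sketch: how the taxonomy above might be represented, with a status-to-code mapping and a composite error. The mapping in `codeFromStatus` is an assumption (the RFC does not specify which HTTP statuses map to which code), and the `CompositeProviderError` shape shown here is a guess at the structure, not the class added in src/errors.ts.

```typescript
// Sketch only: `codeFromStatus` and the fields of `CompositeProviderError` are hypothetical.
type ProviderErrorCode =
  | 'NetworkError'
  | 'RateLimitError'
  | 'NotFoundError'
  | 'ValidationError'
  | 'ParseError'
  | 'UpstreamError'
  | 'UnavailableError';

interface NormalizedProviderError {
  provider: string;
  code: ProviderErrorCode;
  status?: number;
  retryAfterMs?: number;
  endpoint?: string;
}

// Assumed mapping from HTTP status to taxonomy code.
function codeFromStatus(status?: number): ProviderErrorCode {
  if (status === undefined) return 'NetworkError'; // no HTTP response was received
  if (status === 429) return 'RateLimitError';
  if (status === 404) return 'NotFoundError';
  if (status === 400 || status === 422) return 'ValidationError';
  if (status === 503) return 'UnavailableError';
  return 'UpstreamError';
}

class CompositeProviderError extends Error {
  readonly errors: NormalizedProviderError[];

  constructor(errors: NormalizedProviderError[]) {
    super(`All providers failed: ${errors.map((e) => `${e.provider}=${e.code}`).join(', ')}`);
    Object.setPrototypeOf(this, CompositeProviderError.prototype); // keeps instanceof working on ES5 targets
    this.name = 'CompositeProviderError';
    this.errors = errors;
  }
}

const failure = new CompositeProviderError([
  { provider: 'nws', code: codeFromStatus(404), status: 404 },
  { provider: 'openweather', code: codeFromStatus(429), status: 429, retryAfterMs: 60_000 },
]);
console.log(failure.message);
```

Only the provider id and normalized code go into the message, which keeps URLs and query parameters (where API keys often live) out of logs.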
+### Policies
+- priority (default): try in configured order
+- priority-then-health: priority order, but skip providers that are open-circuit or below health thresholds; probe half-open last
+- weighted: choose initial provider by weights among healthy providers; fallback to next-best healthy
+
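Reviewer sketch: the two non-default policies above expressed as pure functions. Function names, the skip-reason strings, and several behaviors are assumptions: providers with no health data are treated as healthy, half-open providers bypass the thresholds so they can still be probed, a missing weight counts as 1, and "next-best" is read as "remaining providers by descending weight".

```typescript
// Sketch only: names and tie-breaking rules here are hypothetical, not the library's policy engine.
type Circuit = 'closed' | 'open' | 'half-open';

interface Health {
  successRate: number;
  p95LatencyMs?: number;
  circuit: Circuit;
}

interface Thresholds {
  minSuccessRate?: number;
  maxP95Ms?: number;
}

interface Selection {
  candidates: string[];
  skipped: Record<string, string>; // provider id -> reason it was skipped
}

// priority-then-health: keep the configured order, drop unhealthy providers,
// and move half-open providers to the end so they are probed last.
function priorityThenHealth(
  order: string[],
  health: Record<string, Health | undefined>,
  thresholds: Thresholds = {},
): Selection {
  const healthy: string[] = [];
  const probes: string[] = [];
  const skipped: Record<string, string> = {};
  for (const id of order) {
    const h = health[id];
    if (!h) { healthy.push(id); continue; } // no data yet: assume healthy
    if (h.circuit === 'open') { skipped[id] = 'circuit-open'; continue; }
    if (h.circuit === 'half-open') { probes.push(id); continue; }
    if (thresholds.minSuccessRate !== undefined && h.successRate < thresholds.minSuccessRate) {
      skipped[id] = 'low-success-rate';
      continue;
    }
    if (thresholds.maxP95Ms !== undefined && h.p95LatencyMs !== undefined && h.p95LatencyMs > thresholds.maxP95Ms) {
      skipped[id] = 'high-latency';
      continue;
    }
    healthy.push(id);
  }
  return { candidates: [...healthy, ...probes], skipped };
}

// weighted: draw the first provider with probability proportional to its weight,
// then order the rest by descending weight.
function weightedOrder(
  ids: string[],
  weights: Record<string, number>,
  random: () => number = Math.random,
): string[] {
  if (ids.length === 0) return [];
  const weightOf = (id: string): number => Math.max(0, weights[id] ?? 1);
  const total = ids.reduce((sum, id) => sum + weightOf(id), 0);
  const target = random() * total;
  let first = ids[ids.length - 1]; // fallback guards against floating-point edge cases
  let cumulative = 0;
  for (const id of ids) {
    cumulative += weightOf(id);
    if (target < cumulative) { first = id; break; }
  }
  const rest = ids.filter((id) => id !== first).sort((a, b) => weightOf(b) - weightOf(a));
  return [first, ...rest];
}

const selection = priorityThenHealth(
  ['nws', 'openweather'],
  { nws: { successRate: 0.2, circuit: 'closed' }, openweather: { successRate: 0.99, circuit: 'closed' } },
  { minSuccessRate: 0.5 },
);
console.log(selection.candidates, selection.skipped);
```

Passing `random` as a parameter keeps the weighted policy deterministic under test, which fits the RFC's plan to unit-test policy selection logic.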
+### Configuration (all optional)
+WeatherService options additions:
+- providerPolicy?: "priority" | "priority-then-health" | "weighted"
+- providerWeights?: Record<string, number>
+- healthThresholds?: { minSuccessRate?: number; maxP95Ms?: number }
+- circuit?: { failureCountToOpen?: number; halfOpenAfterMs?: number; successToClose?: number }
+- logger?: { trace/debug/info/warn/error(fielded) }
+- metrics?: hooks for counters/histograms (noop by default)
+
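Reviewer sketch: how the optional settings above could be resolved so that an empty options object reproduces today's behavior. The helper `resolveFallbackOptions` is hypothetical, and the numeric circuit defaults are placeholders only; the RFC lists default thresholds as an open question. The logger and metrics options are left out for brevity.

```typescript
// Sketch only: `resolveFallbackOptions` and ASSUMED_CIRCUIT_DEFAULTS are hypothetical.
interface CircuitOptions {
  failureCountToOpen?: number;
  halfOpenAfterMs?: number;
  successToClose?: number;
}

interface FallbackOptions {
  providerPolicy?: 'priority' | 'priority-then-health' | 'weighted';
  providerWeights?: Record<string, number>;
  healthThresholds?: { minSuccessRate?: number; maxP95Ms?: number };
  circuit?: CircuitOptions;
}

// Placeholder values for illustration; the RFC has not settled on defaults.
const ASSUMED_CIRCUIT_DEFAULTS: Required<CircuitOptions> = {
  failureCountToOpen: 5,
  halfOpenAfterMs: 30_000,
  successToClose: 2,
};

function resolveFallbackOptions(options: FallbackOptions = {}) {
  return {
    providerPolicy: options.providerPolicy ?? 'priority', // default preserves existing behavior
    providerWeights: options.providerWeights ?? {},
    healthThresholds: options.healthThresholds ?? {},
    // Circuit breaking stays off unless a `circuit` object is supplied (opt-in).
    circuit: options.circuit ? { ...ASSUMED_CIRCUIT_DEFAULTS, ...options.circuit } : undefined,
  };
}

const defaults = resolveFallbackOptions();
const tuned = resolveFallbackOptions({
  providerPolicy: 'priority-then-health',
  healthThresholds: { minSuccessRate: 0.8 },
  circuit: { failureCountToOpen: 3 },
});
console.log(defaults, tuned);
```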
+## Detailed Design
+
+### New/updated modules
+- src/providers/providerRegistry.ts
+  - register(providerId, adapter, capability)
+  - recordOutcome(providerId, outcome)
+  - getHealth(providerId)
+  - listProviders(intent): filters by capabilities
+- src/providers/policy.ts
+  - selectCandidates(intent, registry, config): ProviderId[] with reasons for skips
+- src/errors.ts (additions)
+  - new error classes + CompositeProviderError
+- src/providers/* adapters
+  - export capability metadata
+  - report outcomes via registry hook
+- src/weatherService.ts
+  - wire registry + policy; defaults keep current order and behavior when no options provided
+
+### Data types (sketch)
+- ProviderId = "nws" | "openweather" | string
+- Capability
+- HealthSnapshot
+- Outcome: { ok: true, latencyMs } | { ok: false, latencyMs, code, status?, retryAfterMs? }
+- PolicyConfig (see above)
+
+### Circuit Breaker
+- Open after N consecutive failures
+- Half-open after halfOpenAfterMs, allow limited probes
+- Close after successToClose consecutive successes
+
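Reviewer sketch: the three circuit transitions above as a small state machine. The class `CircuitBreaker` and its method names are hypothetical. Two simplifications are assumed: a failed probe in the half-open state re-opens the circuit immediately, and the "limited probes" cap is not modelled.

```typescript
// Sketch only: `CircuitBreaker`, `getState`, and `record` are hypothetical names.
type CircuitState = 'closed' | 'open' | 'half-open';

interface CircuitConfig {
  failureCountToOpen: number;
  halfOpenAfterMs: number;
  successToClose: number;
}

class CircuitBreaker {
  private state: CircuitState = 'closed';
  private consecutiveFailures = 0;
  private consecutiveSuccesses = 0;
  private openedAt = 0;
  private readonly config: CircuitConfig;

  constructor(config: CircuitConfig) {
    this.config = config;
  }

  // Returns the current state; an open circuit becomes half-open once the cool-down has elapsed.
  getState(now: number = Date.now()): CircuitState {
    if (this.state === 'open' && now - this.openedAt >= this.config.halfOpenAfterMs) {
      this.state = 'half-open';
      this.consecutiveSuccesses = 0;
    }
    return this.state;
  }

  record(ok: boolean, now: number = Date.now()): void {
    const state = this.getState(now);
    if (ok) {
      this.consecutiveFailures = 0;
      if (state === 'half-open') {
        this.consecutiveSuccesses += 1;
        if (this.consecutiveSuccesses >= this.config.successToClose) this.state = 'closed';
      }
      return;
    }
    this.consecutiveSuccesses = 0;
    this.consecutiveFailures += 1;
    // A failed probe re-opens immediately; otherwise open after N consecutive failures.
    if (state === 'half-open' || this.consecutiveFailures >= this.config.failureCountToOpen) {
      this.state = 'open';
      this.openedAt = now;
    }
  }
}

const breaker = new CircuitBreaker({ failureCountToOpen: 2, halfOpenAfterMs: 1000, successToClose: 2 });
breaker.record(false, 0);
breaker.record(false, 1);
console.log(breaker.getState(500)); // still within the cool-down window
```

Taking `now` as a parameter makes the transitions testable without fake timers, although the RFC's testing strategy proposes fake timers for the real implementation.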
+## Backward Compatibility
+- Defaults: priority policy, all providers registered in existing precedence, no health filtering, no circuit breaker unless configured with safe defaults (or enabled with conservative thresholds)
+- Existing method signatures unchanged
+
+## Observability
+- Optional logger interface with structured fields
+- Optional metrics hooks (provider_success/failure, latency histograms, circuit_state)
+- CorrelationId is accepted/propagated when provided
+
+## Testing Strategy
+- Unit: capability validation, policy selection logic, circuit transitions (fake timers), error normalization
+- Integration: simulated provider failures and latency distributions; ensure default path equals current behavior
+- Snapshot: CompositeProviderError structure
+
+## Rollout Plan
+1. Land types/interfaces, provider capability exports, no behavior change
+2. Implement registry + outcome recording; keep disabled by default
+3. Implement policy engine; enable via options, default unchanged
+4. Add circuit + thresholds with conservative defaults (opt-in)
+5. Docs + examples; minor release
+
+## Alternatives Considered
+- Single global retry layer only (insufficient insight/control)
+- Hard-coding health logic inside WeatherService (less modular/extendable)
+
+## Security & Privacy
+- Never include tokens or credentials in errors/logs/metrics
+- Ensure PII isn’t logged; redact query params as needed
+
+## Open Questions
+- Default thresholds for health and circuit when enabled?
+- How granular should regions be (country list vs polygons)?
+- Should we persist health across process restarts?

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "weather-plus",
-  "version": "1.1.0",
+  "version": "1.2.0",
   "description": "Weather Plus is a powerful wrapper around various Weather APIs that simplifies adding weather data to your application",
   "main": "./dist/cjs/index.js",
   "module": "./dist/esm/index.js",

src/providers/capabilities.test.ts

Lines changed: 41 additions & 0 deletions
@@ -0,0 +1,41 @@
+import { OPENWEATHER_CAPABILITY } from './openweather/client';
+import { NWS_CAPABILITY } from './nws/client';
+import { ProviderCapability } from './capabilities';
+
+describe('provider capabilities', () => {
+  it('NWS_CAPABILITY matches expected shape and values', () => {
+    const cap: ProviderCapability = NWS_CAPABILITY;
+    expect(cap.supports.current).toBe(true);
+    expect(cap.supports.hourly).toBeFalsy();
+    expect(cap.supports.daily).toBeFalsy();
+    expect(cap.supports.alerts).toBeFalsy();
+    expect(cap.regions).toContain('US');
+  });
+
+  it('OPENWEATHER_CAPABILITY matches expected shape and values', () => {
+    const cap: ProviderCapability = OPENWEATHER_CAPABILITY;
+    expect(cap.supports.current).toBe(true);
+    expect(cap.supports.hourly).toBe(true);
+    expect(cap.supports.daily).toBe(true);
+    expect(cap.supports.alerts).toBe(true);
+    expect(cap.units).toEqual(expect.arrayContaining(['standard', 'metric', 'imperial']));
+  });
+
+  it('validates the built-in capabilities map explicitly', () => {
+    const map = {
+      nws: NWS_CAPABILITY,
+      openweather: OPENWEATHER_CAPABILITY,
+    } as const;
+    expect(map).toEqual({
+      nws: {
+        supports: { current: true, hourly: false, daily: false, alerts: false },
+        regions: ['US'],
+      },
+      openweather: {
+        supports: { current: true, hourly: true, daily: true, alerts: true },
+        units: ['standard', 'metric', 'imperial'],
+        locales: [],
+      },
+    });
+  });
+});

src/providers/capabilities.ts

Lines changed: 42 additions & 0 deletions
@@ -0,0 +1,42 @@
+export type ProviderId = 'nws' | 'openweather' | (string & {});
+
+export interface ProviderCapability {
+  supports: {
+    current: boolean;
+    hourly?: boolean;
+    daily?: boolean;
+    alerts?: boolean;
+  };
+  regions?: string[];
+  units?: Array<'standard' | 'metric' | 'imperial'>;
+  locales?: string[];
+}
+
+export type ProviderCircuitState = 'closed' | 'open' | 'half-open';
+
+export interface ProviderHealthSnapshot {
+  successRate: number;
+  p95LatencyMs?: number;
+  lastFailureAt?: number;
+  circuit: ProviderCircuitState;
+}
+
+export type ProviderErrorCode =
+  | 'NetworkError'
+  | 'RateLimitError'
+  | 'NotFoundError'
+  | 'ValidationError'
+  | 'ParseError'
+  | 'UpstreamError'
+  | 'UnavailableError';
+
+export type ProviderCallOutcome =
+  | { ok: true; latencyMs: number }
+  | { ok: false; latencyMs: number; code: ProviderErrorCode; status?: number; retryAfterMs?: number };
+
+export interface FallbackPolicyConfig {
+  providerPolicy?: 'priority' | 'priority-then-health' | 'weighted';
+  providerWeights?: Record<string, number>;
+  healthThresholds?: { minSuccessRate?: number; maxP95Ms?: number };
+  circuit?: { failureCountToOpen?: number; halfOpenAfterMs?: number; successToClose?: number };
+}

src/providers/capabilitiesMap.ts

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+import { OPENWEATHER_CAPABILITY } from './openweather/client';
+import { NWS_CAPABILITY } from './nws/client';
+import { ProviderCapability } from './capabilities';
+
+export function getBuiltInCapabilities(): Record<string, ProviderCapability> {
+  return {
+    nws: NWS_CAPABILITY,
+    openweather: OPENWEATHER_CAPABILITY,
+  } as const;
+}

src/providers/index.ts

Lines changed: 2 additions & 0 deletions
@@ -1,3 +1,5 @@
 export { IWeatherProvider } from './IWeatherProvider';
 export { NWSProvider } from './nws/client';
 export { OpenWeatherProvider } from './openweather/client';
+export * from './capabilities';
+export { defaultOutcomeReporter, NoopProviderOutcomeReporter, ProviderOutcomeReporter } from './outcomeReporter';

src/providers/nws/client.ts

Lines changed: 16 additions & 0 deletions
@@ -12,11 +12,18 @@ import { InvalidProviderLocationError } from '../../errors'; // Import the error
 import { isLocationInUS } from '../../utils/locationUtils';
 import { standardizeCondition } from './condition';
 import { getCloudinessFromCloudLayers } from './cloudiness';
+import { ProviderCapability } from '../capabilities';
+import { defaultOutcomeReporter } from '../outcomeReporter';
 
 const log = debug('weather-plus:nws:client');
 
 export const WEATHER_KEYS = Object.values(IWeatherKey);
 
+export const NWS_CAPABILITY: ProviderCapability = Object.freeze({
+  supports: { current: true, hourly: false, daily: false, alerts: false },
+  regions: ['US'],
+});
+
 export class NWSProvider implements IWeatherProvider {
   name = 'nws';
 
@@ -31,6 +38,7 @@ export class NWSProvider implements IWeatherProvider {
     const data: Partial<IWeatherProviderWeatherData> = {};
     const weatherData: Partial<IWeatherProviderWeatherData>[] = [];
 
+    const start = Date.now();
     try {
       const observationStations = await fetchObservationStationUrl(lat, lng);
       const stations = await fetchNearbyStations(observationStations);
@@ -78,9 +86,17 @@ export class NWSProvider implements IWeatherProvider {
         throw new Error('Invalid observation data');
       }
 
+      defaultOutcomeReporter.record('nws', { ok: true, latencyMs: Date.now() - start });
       return data;
     } catch (error) {
       log('Error in getWeather:', error);
+      try {
+        defaultOutcomeReporter.record('nws', {
+          ok: false,
+          latencyMs: Date.now() - start,
+          code: 'UpstreamError',
+        });
+      } catch {}
       throw error;
     }
   }

src/providers/openweather/client.ts

Lines changed: 24 additions & 2 deletions
@@ -4,9 +4,17 @@ import { IWeatherUnits, IWeatherProviderWeatherData } from '../../interfaces';
 import { IOpenWeatherResponse } from './interfaces';
 import { IWeatherProvider } from '../IWeatherProvider';
 import { standardizeCondition} from './condition';
+import { ProviderCapability } from '../capabilities';
+import { defaultOutcomeReporter } from '../outcomeReporter';
 
 const log = debug('weather-plus:openweather:client');
 
+export const OPENWEATHER_CAPABILITY: ProviderCapability = Object.freeze({
+  supports: { current: true, hourly: true, daily: true, alerts: true },
+  units: ['standard', 'metric', 'imperial'] as Array<'standard' | 'metric' | 'imperial'>,
+  locales: [] as string[],
+});
+
 export class OpenWeatherProvider implements IWeatherProvider {
   private apiKey: string;
   name = 'openweather';
@@ -19,6 +27,7 @@ export class OpenWeatherProvider implements IWeatherProvider {
   }
 
   public async getWeather(lat: number, lng: number): Promise<Partial<IWeatherProviderWeatherData>> {
+    const start = Date.now();
     const url = `https://api.openweathermap.org/data/3.0/onecall`;
 
     const params = {
@@ -32,9 +41,22 @@ export class OpenWeatherProvider implements IWeatherProvider {
 
     try {
       const response = await axios.get<IOpenWeatherResponse>(url, { params });
-      return convertToWeatherData(response.data);
-    } catch (error) {
+      const result = convertToWeatherData(response.data);
+      defaultOutcomeReporter.record('openweather', { ok: true, latencyMs: Date.now() - start });
+      return result;
+    } catch (error: any) {
       log('Error in getWeather:', error);
+      try {
+        defaultOutcomeReporter.record('openweather', {
+          ok: false,
+          latencyMs: Date.now() - start,
+          code: 'UpstreamError',
+          status: error?.response?.status,
+          retryAfterMs: error?.response?.headers?.['retry-after']
+            ? Number(error.response.headers['retry-after']) * 1000
+            : undefined,
+        });
+      } catch {}
       throw error;
     }
   }
