Commit 8360d8c

feat(quota): ✨ add Firmware.ai quota tracking with 5-hour rolling window
Implement quota tracking for Firmware.ai provider using their /api/v1/quota endpoint. The provider tracks a 5-hour rolling window quota where `used` is already a 0-1 ratio from the API.

- Add FirmwareQuotaTracker mixin with configurable api_base
- Add FirmwareProvider with background job for periodic quota refresh
- Parse ISO 8601 reset timestamps with proper Z suffix handling
- Validate API response types and clamp remaining fraction to 0.0-1.0
- Support FIRMWARE_QUOTA_REFRESH_INTERVAL env var (default: 300s)

1 parent b7c5dad commit 8360d8c

5 files changed: +697 -5 lines changed

PLAN.md

Lines changed: 194 additions & 0 deletions
@@ -0,0 +1,194 @@
# Firmware.ai Quota Tracking Implementation Plan

## Overview

Implement quota tracking for the Firmware.ai provider based on their `/api/v1/quota` API endpoint. This follows the established mixin pattern used by Chutes and NanoGPT quota tracking.

## API Specification

**Endpoint:** `GET https://app.firmware.ai/api/v1/quota`
**Authentication:** `Authorization: Bearer <api_key>`

**Response:**
```json
{
  "used": 0.75,                    // Ratio from 0 to 1 (quota utilization)
  "reset": "2024-01-15T14:30:00Z"  // ISO UTC timestamp, or null when no active window
}
```

**Key Characteristics:**
- **5-hour rolling window** (unlike Chutes daily / NanoGPT daily+monthly)
- `used` is already a ratio (no calculation needed, unlike Chutes which returns absolute values)
- `reset` can be `null` when no credits have been spent recently
- Simpler response than other providers

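Because `used` arrives as a ratio, turning a response into a remaining fraction is a subtraction plus validation. The following is a minimal sketch under stated assumptions, not the tracker's actual code: the helper name `remaining_fraction_from_response` is hypothetical, and raising `ValueError` on a non-numeric `used` is only one plausible way to validate the response type.

```python
from typing import Any, Dict


def remaining_fraction_from_response(payload: Dict[str, Any]) -> float:
    """Convert a quota payload into a remaining fraction clamped to [0.0, 1.0].

    Hypothetical helper for illustration; the payload shape follows the
    response documented above ({"used": <ratio>, "reset": <ISO string or null>}).
    """
    used = payload.get("used")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(used, bool) or not isinstance(used, (int, float)):
        raise ValueError(f"'used' must be a number, got {type(used).__name__}")
    # Clamp so that out-of-range values from the API cannot leak through.
    return max(0.0, min(1.0, 1.0 - float(used)))
```

For the sample response above (`"used": 0.75`), this yields a remaining fraction of `0.25`.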
## Implementation Approach

### Pattern Analysis

| Aspect | Chutes | NanoGPT | Firmware.ai (Proposed) |
|--------|--------|---------|------------------------|
| Quota Window | Daily (00:00 UTC) | Daily + Monthly | 5-hour rolling |
| API Response | `{quota: int, used: float}` | `{daily: {...}, monthly: {...}}` | `{used: 0-1, reset: ISO\|null}` |
| Calculation | `remaining = quota - used` | `remaining / limit` | `remaining = 1 - used` |
| Reset Handling | Calculate next midnight | Parse from API | Parse ISO string |
| Tier Detection | From quota value | From state field | N/A (no tiers) |

### Files to Create

#### 1. `src/rotator_library/providers/utilities/firmware_quota_tracker.py`

Mixin class providing quota tracking functionality:

```python
from typing import Any, Dict


class FirmwareQuotaTracker:
    """
    Mixin class providing quota tracking for Firmware.ai provider.

    Required provider attributes:
        self._quota_cache: Dict[str, Dict[str, Any]] = {}
        self._quota_refresh_interval: int = 300
    """

    async def fetch_quota_usage(self, api_key, client=None) -> Dict[str, Any]:
        """
        Returns:
            {
                "status": "success" | "error",
                "error": str | None,
                "used": float,                # 0.0 to 1.0
                "remaining_fraction": float,  # 1.0 - used
                "reset_at": float | None,     # Unix timestamp
                "has_active_window": bool,    # True if reset is not null
                "fetched_at": float,
            }
        """
```

**Key Implementation Details:**
- Parse ISO 8601 timestamp to Unix timestamp
- Handle `null` reset gracefully (no active spend window)
- Since `used` is already a ratio, `remaining_fraction = 1.0 - used`
- No tier detection needed (Firmware.ai doesn't expose tier info)

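The ISO 8601 parsing step listed above could look roughly like the sketch below. It is an illustration rather than the committed implementation: `parse_reset_timestamp` is a hypothetical name, and treating a timestamp without an offset as UTC is an assumption. The `Z` suffix is normalised by hand because `datetime.fromisoformat()` only accepts it from Python 3.11 onward.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_reset_timestamp(reset: Optional[str]) -> Optional[float]:
    """Convert an ISO 8601 string such as "2024-01-15T14:30:00Z" to Unix seconds.

    Returns None when the API reports no active window (reset is null).
    """
    if reset is None:
        return None
    text = reset.strip()
    # Older Python versions reject a trailing "Z", so map it to an explicit offset.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is meant as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()
```

A malformed string raises `ValueError` here; a real tracker would probably catch that and report `status: "error"` instead.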
#### 2. `src/rotator_library/providers/firmware_provider.py` (Modifications)

Integrate the mixin into the existing Firmware provider:

```python
import os
from typing import Any, Dict, Optional

from .utilities.firmware_quota_tracker import FirmwareQuotaTracker


class FirmwareProvider(FirmwareQuotaTracker, ProviderInterface):
    def __init__(self, ...):
        # Add quota tracking state
        self._quota_cache: Dict[str, Dict[str, Any]] = {}
        self._quota_refresh_interval = int(
            os.getenv("FIRMWARE_QUOTA_REFRESH_INTERVAL", "300")
        )

    def get_background_job_config(self) -> Optional[Dict[str, Any]]:
        return {
            "name": "firmware_quota_refresh",
            "interval": self._quota_refresh_interval,
        }

    async def run_background_job(self, job_name: str, usage_manager) -> None:
        # Refresh quota for all credentials
        # Update usage_manager baselines
```

### Files to Modify

#### 3. `src/rotator_library/usage_manager.py`

May need modifications if not already generic enough:
- Ensure `update_quota_baseline()` handles null reset timestamps
- Verify quota group handling works with rolling windows

#### 4. Configuration / Environment

Add environment variable:
- `FIRMWARE_QUOTA_REFRESH_INTERVAL`: Refresh interval in seconds (default: 300)

Given the 5-hour window (18,000 seconds), a 5-minute refresh is reasonable: 18,000 / 300 = 60 checks per window.

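The refresh interval and the resulting number of checks per window can be sketched as follows. This is only an illustration: `read_refresh_interval` and `checks_per_window` are hypothetical helpers, and falling back to the default on an invalid or non-positive value is an assumption (the plan itself calls `int(os.getenv(...))` directly, which would raise on bad input).

```python
import os
from typing import Mapping, Optional

WINDOW_SECONDS = 5 * 60 * 60      # 5-hour rolling window
DEFAULT_REFRESH_INTERVAL = 300    # seconds, the documented default


def read_refresh_interval(env: Optional[Mapping[str, str]] = None) -> int:
    """Read FIRMWARE_QUOTA_REFRESH_INTERVAL, falling back to the default.

    `env` defaults to os.environ; passing a dict makes the helper testable.
    """
    source = os.environ if env is None else env
    raw = source.get("FIRMWARE_QUOTA_REFRESH_INTERVAL", str(DEFAULT_REFRESH_INTERVAL))
    try:
        interval = int(raw)
    except ValueError:
        # Assumption: an unparsable value silently falls back to the default.
        return DEFAULT_REFRESH_INTERVAL
    return interval if interval > 0 else DEFAULT_REFRESH_INTERVAL


def checks_per_window(interval: int) -> int:
    """Number of whole refreshes that fit into one 5-hour window."""
    return WINDOW_SECONDS // interval
```

With the default interval, `checks_per_window(300)` gives the 60 checks per window mentioned above.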
## Implementation Steps

### Phase 1: Core Quota Tracker
1. [ ] Create `firmware_quota_tracker.py` mixin
2. [ ] Implement `fetch_quota_usage()` with ISO timestamp parsing
3. [ ] Implement `get_remaining_fraction()` and `get_reset_timestamp()`
4. [ ] Handle edge cases (null reset, API errors)

### Phase 2: Provider Integration
5. [ ] Add `FirmwareQuotaTracker` mixin to `FirmwareProvider`
6. [ ] Initialize quota cache and refresh interval in `__init__`
7. [ ] Implement `get_background_job_config()`
8. [ ] Implement `run_background_job()` for quota refresh

### Phase 3: Usage Manager Integration
9. [ ] Verify `UsageManager` compatibility with rolling windows
10. [ ] Add virtual model `firmware/_quota` for tracking
11. [ ] Configure quota group `firmware_global` for shared credential tracking

### Phase 4: Testing & Validation
12. [ ] Unit tests for quota tracker (mock API responses)
13. [ ] Integration test with real API (if credentials available)
14. [ ] Verify background refresh works correctly
15. [ ] Test cooldown behavior when quota exhausted

## Edge Cases to Handle

### 1. Null Reset Timestamp
When `reset: null`, there's no active spending window:
```python
if reset is None:
    return {
        "remaining_fraction": 1.0,  # Full quota available
        "reset_at": None,
        "has_active_window": False,
    }
```

### 2. Rolling Window Behavior
Unlike daily resets, the 5-hour window starts when spending begins:
- Don't calculate "next reset" - use API-provided timestamp
- If `reset` is in the past, treat as full quota available

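The two rolling-window rules above, combined with the null-reset case, can be expressed in one small function. This is a sketch under assumptions: `effective_remaining` is a hypothetical helper, `reset_at` is taken to be a Unix timestamp already parsed from the API, and the `now` parameter exists only to make the behaviour testable.

```python
import time
from typing import Optional


def effective_remaining(
    used: float,
    reset_at: Optional[float],
    now: Optional[float] = None,
) -> float:
    """Return the remaining quota fraction, honouring the rolling window.

    - No active window (reset_at is None): full quota available.
    - Window already expired (reset_at <= now): full quota available.
    - Otherwise: 1 - used, clamped to [0.0, 1.0].
    """
    current = time.time() if now is None else now
    if reset_at is None or reset_at <= current:
        return 1.0
    return max(0.0, min(1.0, 1.0 - used))
```

The function never computes a "next reset" itself; it only compares the API-provided timestamp with the current time.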
### 3. API Unavailability
Follow established pattern:
- Return `status: "error"` with error message
- Keep using cached data if available
- Log warning but don't crash background job

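One way to combine these three rules is to update the cache only on success and otherwise serve whatever was cached last. The sketch below is illustrative: `resolve_quota` is a hypothetical helper, and the cache layout (API key mapped to the last successful result) is an assumption modelled on the `_quota_cache` attribute described earlier.

```python
import logging
from typing import Any, Dict, Optional

logger = logging.getLogger(__name__)


def resolve_quota(
    cache: Dict[str, Dict[str, Any]],
    api_key: str,
    fresh: Dict[str, Any],
) -> Optional[Dict[str, Any]]:
    """Store a successful fetch in the cache, or fall back to cached data.

    `fresh` is a result dict with a "status" of "success" or "error".
    Returns None when the fetch failed and nothing is cached for the key.
    """
    if fresh.get("status") == "success":
        cache[api_key] = fresh
        return fresh
    # Log the failure (never the key itself) and keep the job running.
    logger.warning("Firmware quota fetch failed: %s", fresh.get("error"))
    return cache.get(api_key)
```

A background job built this way degrades to stale data during an outage instead of raising.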
## Configuration Summary

| Variable | Default | Description |
|----------|---------|-------------|
| `FIRMWARE_QUOTA_REFRESH_INTERVAL` | 300 | Seconds between quota API checks |
| `FIRMWARE_API_BASE` | `https://app.firmware.ai` | API base URL |

## Testing Checklist

- [ ] Quota fetch returns correct remaining_fraction
- [ ] ISO timestamp parsed correctly to Unix timestamp
- [ ] Null reset handled (remaining_fraction = 1.0)
- [ ] HTTP errors return structured error response
- [ ] Background job refreshes all credentials
- [ ] UsageManager baseline updated correctly
- [ ] Cooldown set when remaining_fraction = 0.0
- [ ] Cooldown cleared when reset timestamp passes

## Notes

### Differences from Chutes/NanoGPT

1. **Simpler API**: Already provides ratio, no tier info
2. **Rolling window**: 5 hours from first spend, not fixed daily/monthly
3. **No tier detection**: Firmware.ai doesn't expose subscription tiers
4. **ISO timestamps**: Need to parse ISO 8601, not epoch milliseconds

### API Base URL Assumption

The plan uses `https://app.firmware.ai` as the base URL with endpoint path `/api/v1/quota`, verified against official documentation at docs.firmware.ai.

src/rotator_library/client.py

Lines changed: 14 additions & 4 deletions
@@ -3122,7 +3122,9 @@ async def get_quota_stats(
                 )
             else:
                 group_stats["total_requests_remaining"] = 0
-                group_stats["total_remaining_pct"] = None
+                # Fallback to avg_remaining_pct when max_requests unavailable
+                # This handles providers like Firmware that only provide percentage
+                group_stats["total_remaining_pct"] = group_stats.get("avg_remaining_pct")

             prov_stats["quota_groups"][group_name] = group_stats

@@ -3188,14 +3190,22 @@ async def get_quota_stats(
                     requests_remaining = (
                         max(0, max_req - req_count) if max_req else 0
                     )
+
+                    # Determine display format
+                    # Priority: requests (if max known) > percentage (if baseline available) > unknown
+                    if max_req:
+                        display = f"{requests_remaining}/{max_req}"
+                    elif remaining_pct is not None:
+                        display = f"{remaining_pct}%"
+                    else:
+                        display = "?/?"
+
                     cred["model_groups"][group_name] = {
                         "remaining_pct": remaining_pct,
                         "requests_used": req_count,
                         "requests_remaining": requests_remaining,
                         "requests_max": max_req,
-                        "display": f"{requests_remaining}/{max_req}"
-                        if max_req
-                        else f"?/?",
+                        "display": display,
                         "is_exhausted": is_exhausted,
                         "reset_time_iso": reset_iso,
                         "models": group_models,
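
The display priority added in this hunk can be read in isolation as a small pure function. The sketch below restates the same branching outside `get_quota_stats`; the function name `format_quota_display` is hypothetical and does not exist in `client.py`.

```python
from typing import Optional


def format_quota_display(
    requests_remaining: int,
    max_req: Optional[int],
    remaining_pct: Optional[float],
) -> str:
    """Pick a display string: requests > percentage > unknown.

    Mirrors the branching in the hunk above: a known (truthy) max_req wins,
    then a percentage baseline, and "?/?" is the last resort.
    """
    if max_req:
        return f"{requests_remaining}/{max_req}"
    if remaining_pct is not None:
        return f"{remaining_pct}%"
    return "?/?"
```

Note that the percentage branch tests `is not None` rather than truthiness, so an exhausted percentage-only provider such as Firmware renders as `0%` instead of `?/?`.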
