| 1 | +# Firmware.ai Quota Tracking Implementation Plan |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +Implement quota tracking for the Firmware.ai provider using its `/api/v1/quota` API endpoint. This follows the established mixin pattern used by the Chutes and NanoGPT quota trackers. |
| 6 | + |
| 7 | +## API Specification |
| 8 | + |
| 9 | +**Endpoint:** `GET https://app.firmware.ai/api/v1/quota` |
| 10 | +**Authentication:** `Authorization: Bearer <api_key>` |
| 11 | + |
| 12 | +**Response:** |
| 13 | +```json |
| 14 | +{ |
| 15 | + "used": 0.75, // Ratio from 0 to 1 (quota utilization) |
| 16 | + "reset": "2024-01-15T14:30:00Z" // ISO UTC timestamp, or null when no active window |
| 17 | +} |
| 18 | +``` |
| 19 | + |
| 20 | +**Key Characteristics:** |
| 21 | +- **5-hour rolling window** (unlike Chutes daily / NanoGPT daily+monthly) |
| 22 | +- `used` is already a ratio (no calculation needed, unlike Chutes which returns absolute values) |
| 23 | +- `reset` can be `null` when no credits have been spent recently |
| 24 | +- Simpler response than other providers: two fields and no tier information |
| 25 | + |
| 26 | +## Implementation Approach |
| 27 | + |
| 28 | +### Pattern Analysis |
| 29 | + |
| 30 | +| Aspect | Chutes | NanoGPT | Firmware.ai (Proposed) | |
| 31 | +|--------|--------|---------|------------------------| |
| 32 | +| Quota Window | Daily (00:00 UTC) | Daily + Monthly | 5-hour rolling | |
| 33 | +| API Response | `{quota: int, used: float}` | `{daily: {...}, monthly: {...}}` | `{used: 0-1, reset: ISO\|null}` | |
| 34 | +| Calculation | `remaining = quota - used` | `remaining / limit` | `remaining = 1 - used` | |
| 35 | +| Reset Handling | Calculate next midnight | Parse from API | Parse ISO string | |
| 36 | +| Tier Detection | From quota value | From state field | N/A (no tiers) | |
| 37 | + |
| 38 | +### Files to Create |
| 39 | + |
| 40 | +#### 1. `src/rotator_library/providers/utilities/firmware_quota_tracker.py` |
| 41 | + |
| 42 | +Mixin class providing quota tracking functionality: |
| 43 | + |
| 44 | +```python |
| 45 | +class FirmwareQuotaTracker: |
| 46 | +    """ |
| 47 | +    Mixin class providing quota tracking for the Firmware.ai provider. |
| 48 | + |
| 49 | +    Required provider attributes: |
| 50 | +        self._quota_cache: Dict[str, Dict[str, Any]] = {} |
| 51 | +        self._quota_refresh_interval: int = 300 |
| 52 | +    """ |
| 53 | + |
| 54 | +    async def fetch_quota_usage(self, api_key, client=None) -> Dict[str, Any]: |
| 55 | +        """ |
| 56 | +        Returns: |
| 57 | +            { |
| 58 | +                "status": "success" | "error", |
| 59 | +                "error": str | None, |
| 60 | +                "used": float,  # 0.0 to 1.0 |
| 61 | +                "remaining_fraction": float,  # 1.0 - used |
| 62 | +                "reset_at": float | None,  # Unix timestamp |
| 63 | +                "has_active_window": bool,  # True if reset is not null |
| 64 | +                "fetched_at": float, |
| 65 | +            } |
| 66 | +        """ |
| 67 | +``` |
| 68 | + |
| 69 | +**Key Implementation Details:** |
| 70 | +- Parse ISO 8601 timestamp to Unix timestamp |
| 71 | +- Handle `null` reset gracefully (no active spend window) |
| 72 | +- Since `used` is already a ratio, `remaining_fraction = 1.0 - used` |
| 73 | +- No tier detection needed (Firmware.ai doesn't expose tier info) |
| 74 | + |
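The parsing rules listed above could be sketched roughly as follows. The helper names `parse_iso_utc` and `parse_quota_response` are hypothetical, and clamping `used` to the 0 to 1 range is an assumption, not something the API documentation is known to require.

```python
import time
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def parse_iso_utc(value: str) -> float:
    """Convert an ISO 8601 string such as '2024-01-15T14:30:00Z' to a Unix timestamp."""
    # datetime.fromisoformat() accepts a trailing 'Z' only on newer Python
    # versions, so normalise it to an explicit UTC offset first.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: a timestamp without an offset is UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.timestamp()


def parse_quota_response(
    payload: Dict[str, Any], now: Optional[float] = None
) -> Dict[str, Any]:
    """Map the raw {"used", "reset"} payload onto the planned result dict."""
    now = time.time() if now is None else now
    # Assumption: clamp defensively in case the API ever reports a value
    # slightly outside the documented 0..1 range.
    used = min(max(float(payload["used"]), 0.0), 1.0)
    reset = payload.get("reset")
    reset_at = parse_iso_utc(reset) if reset is not None else None
    return {
        "status": "success",
        "error": None,
        "used": used,
        "remaining_fraction": 1.0 - used,
        "reset_at": reset_at,
        "has_active_window": reset_at is not None,
        "fetched_at": now,
    }
```

Passing `now` explicitly keeps the function deterministic in unit tests; production code would normally leave it unset.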
| 75 | +#### 2. `src/rotator_library/providers/firmware_provider.py` (Modifications) |
| 76 | + |
| 77 | +Integrate the mixin into the existing Firmware provider: |
| 78 | + |
| 79 | +```python |
| 80 | +from .utilities.firmware_quota_tracker import FirmwareQuotaTracker |
| 81 | + |
| 82 | +class FirmwareProvider(FirmwareQuotaTracker, ProviderInterface): |
| 83 | +    def __init__(self, ...): |
| 84 | +        # Add quota tracking state |
| 85 | +        self._quota_cache: Dict[str, Dict[str, Any]] = {} |
| 86 | +        self._quota_refresh_interval = int( |
| 87 | +            os.getenv("FIRMWARE_QUOTA_REFRESH_INTERVAL", "300") |
| 88 | +        ) |
| 89 | + |
| 90 | +    def get_background_job_config(self) -> Optional[Dict[str, Any]]: |
| 91 | +        return { |
| 92 | +            "name": "firmware_quota_refresh", |
| 93 | +            "interval": self._quota_refresh_interval, |
| 94 | +        } |
| 95 | + |
| 96 | +    async def run_background_job(self, job_name: str, usage_manager) -> None: |
| 97 | +        # Refresh quota for all credentials |
| 98 | +        # Update usage_manager baselines |
| 99 | +``` |
| 100 | + |
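One possible shape for the refresh step inside `run_background_job()` is sketched below. The function `refresh_all_credentials` and its `fetch` callback are hypothetical stand-ins (the callback plays the role of `fetch_quota_usage`), and the sketch deliberately leaves out the `usage_manager` update because that interface is not specified in this plan.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

# Hypothetical signature for a per-credential fetcher such as fetch_quota_usage.
QuotaFetcher = Callable[[str], Awaitable[Dict[str, Any]]]


async def refresh_all_credentials(
    api_keys: List[str], fetch: QuotaFetcher
) -> Dict[str, Dict[str, Any]]:
    """Fetch quota for every credential concurrently.

    A failure for one credential is converted into a structured error
    result instead of aborting the whole background job.
    """
    results = await asyncio.gather(
        *(fetch(key) for key in api_keys), return_exceptions=True
    )
    refreshed: Dict[str, Dict[str, Any]] = {}
    for key, result in zip(api_keys, results):
        if isinstance(result, BaseException):
            result = {"status": "error", "error": str(result)}
        refreshed[key] = result
    return refreshed
```

Using `return_exceptions=True` is a design choice for this sketch: it lets the job report per-credential errors while the remaining credentials still refresh.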
| 101 | +### Files to Modify |
| 102 | + |
| 103 | +#### 3. `src/rotator_library/usage_manager.py` |
| 104 | + |
| 105 | +Modify only if the existing logic is not already generic enough: |
| 106 | +- Ensure `update_quota_baseline()` handles null reset timestamps |
| 107 | +- Verify quota group handling works with rolling windows |
| 108 | + |
| 109 | +#### 4. Configuration / Environment |
| 110 | + |
| 111 | +Add environment variable: |
| 112 | +- `FIRMWARE_QUOTA_REFRESH_INTERVAL`: Refresh interval in seconds (default: 300) |
| 113 | + |
| 114 | +Given the 5-hour window, a 5-minute refresh interval is reasonable: it yields 60 checks per window. |
| 115 | + |
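As a rough illustration of the interval setting and the arithmetic behind it, the sketch below reads the variable and computes checks per window. The helper names are hypothetical, and falling back to the default on invalid or non-positive values is an assumption rather than planned behavior.

```python
import os
from typing import Mapping, Optional

WINDOW_SECONDS = 5 * 60 * 60  # 5-hour rolling window


def load_refresh_interval(
    env: Optional[Mapping[str, str]] = None, default: int = 300
) -> int:
    """Read FIRMWARE_QUOTA_REFRESH_INTERVAL, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("FIRMWARE_QUOTA_REFRESH_INTERVAL", str(default))
    try:
        interval = int(raw)
    except ValueError:
        # Assumption: an unparseable value falls back to the default.
        return default
    return interval if interval > 0 else default


def checks_per_window(interval_seconds: int) -> int:
    """Number of whole refreshes that fit into one 5-hour window."""
    return WINDOW_SECONDS // interval_seconds
```

With the default of 300 seconds, `checks_per_window` gives the 60 checks mentioned above.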
| 116 | +## Implementation Steps |
| 117 | + |
| 118 | +### Phase 1: Core Quota Tracker |
| 119 | +1. [ ] Create `firmware_quota_tracker.py` mixin |
| 120 | +2. [ ] Implement `fetch_quota_usage()` with ISO timestamp parsing |
| 121 | +3. [ ] Implement `get_remaining_fraction()` and `get_reset_timestamp()` |
| 122 | +4. [ ] Handle edge cases (null reset, API errors) |
| 123 | + |
| 124 | +### Phase 2: Provider Integration |
| 125 | +5. [ ] Add `FirmwareQuotaTracker` mixin to `FirmwareProvider` |
| 126 | +6. [ ] Initialize quota cache and refresh interval in `__init__` |
| 127 | +7. [ ] Implement `get_background_job_config()` |
| 128 | +8. [ ] Implement `run_background_job()` for quota refresh |
| 129 | + |
| 130 | +### Phase 3: Usage Manager Integration |
| 131 | +9. [ ] Verify `UsageManager` compatibility with rolling windows |
| 132 | +10. [ ] Add virtual model `firmware/_quota` for tracking |
| 133 | +11. [ ] Configure quota group `firmware_global` for shared credential tracking |
| 134 | + |
| 135 | +### Phase 4: Testing & Validation |
| 136 | +12. [ ] Unit tests for quota tracker (mock API responses) |
| 137 | +13. [ ] Integration test with real API (if credentials available) |
| 138 | +14. [ ] Verify background refresh works correctly |
| 139 | +15. [ ] Test cooldown behavior when quota exhausted |
| 140 | + |
| 141 | +## Edge Cases to Handle |
| 142 | + |
| 143 | +### 1. Null Reset Timestamp |
| 144 | +When `reset: null`, there's no active spending window: |
| 145 | +```python |
| 146 | +if reset is None: |
| 147 | +    return { |
| 148 | +        "remaining_fraction": 1.0,  # Full quota available |
| 149 | +        "reset_at": None, |
| 150 | +        "has_active_window": False, |
| 151 | +    } |
| 152 | +``` |
| 153 | + |
| 154 | +### 2. Rolling Window Behavior |
| 155 | +Unlike daily resets, the 5-hour window starts when spending begins: |
| 156 | +- Don't calculate a "next reset"; use the API-provided timestamp |
| 157 | +- If `reset` is in the past, treat the quota as fully available |
| 158 | + |
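The rolling-window rule could be expressed as a small pure function. This is a sketch only; `effective_remaining` is a hypothetical name, and treating a reset time equal to the current time as already expired is an assumption.

```python
import time
from typing import Optional


def effective_remaining(
    used: float, reset_at: Optional[float], now: Optional[float] = None
) -> float:
    """Return the usable quota fraction under the rolling-window rules."""
    now = time.time() if now is None else now
    if reset_at is None or reset_at <= now:
        # No active window, or the reported window has already ended:
        # the full quota is available again.
        return 1.0
    return 1.0 - used
```

Keeping the function free of I/O means the stale-window case can be tested by passing a fixed `now`.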
| 159 | +### 3. API Unavailability |
| 160 | +Follow the established pattern: |
| 161 | +- Return `status: "error"` with an error message |
| 162 | +- Keep using cached data if available |
| 163 | +- Log a warning, but don't crash the background job |
| 164 | + |
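A minimal sketch of the cache fallback described above, assuming the result dicts carry the `status` and `error` keys from the planned return shape. The function name `quota_or_cached` is hypothetical.

```python
import logging
from typing import Any, Dict

logger = logging.getLogger(__name__)


def quota_or_cached(
    key: str, fresh: Dict[str, Any], cache: Dict[str, Dict[str, Any]]
) -> Dict[str, Any]:
    """Return the fresh result on success, otherwise the last cached success.

    If nothing is cached for the key, the error result itself is returned
    so the caller can still see why the refresh failed.
    """
    if fresh.get("status") == "success":
        cache[key] = fresh
        return fresh
    # Log and continue: a failed refresh should not stop the background job.
    logger.warning("Firmware.ai quota refresh failed: %s", fresh.get("error"))
    return cache.get(key, fresh)
```

Only successful results overwrite the cache, so a transient outage never replaces good data with an error entry.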
| 165 | +## Configuration Summary |
| 166 | + |
| 167 | +| Variable | Default | Description | |
| 168 | +|----------|---------|-------------| |
| 169 | +| `FIRMWARE_QUOTA_REFRESH_INTERVAL` | 300 | Seconds between quota API checks | |
| 170 | +| `FIRMWARE_API_BASE` | `https://app.firmware.ai` | API base URL | |
| 171 | + |
| 172 | +## Testing Checklist |
| 173 | + |
| 174 | +- [ ] Quota fetch returns correct remaining_fraction |
| 175 | +- [ ] ISO timestamp parsed correctly to Unix timestamp |
| 176 | +- [ ] Null reset handled (remaining_fraction = 1.0) |
| 177 | +- [ ] HTTP errors return structured error response |
| 178 | +- [ ] Background job refreshes all credentials |
| 179 | +- [ ] UsageManager baseline updated correctly |
| 180 | +- [ ] Cooldown set when remaining_fraction = 0.0 |
| 181 | +- [ ] Cooldown cleared when reset timestamp passes |
| 182 | + |
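The two cooldown items in the checklist could be checked against a helper along these lines. `cooldown_until` is a hypothetical name, and how the result is handed to `UsageManager` is not specified in this plan.

```python
import time
from typing import Optional


def cooldown_until(
    remaining_fraction: float,
    reset_at: Optional[float],
    now: Optional[float] = None,
) -> Optional[float]:
    """Return the Unix time until which a credential should cool down.

    None means no cooldown applies: either quota remains, or the reset
    timestamp is missing or has already passed.
    """
    now = time.time() if now is None else now
    if remaining_fraction > 0.0:
        return None
    if reset_at is None or reset_at <= now:
        return None
    return reset_at
```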
| 183 | +## Notes |
| 184 | + |
| 185 | +### Differences from Chutes/NanoGPT |
| 186 | + |
| 187 | +1. **Simpler API**: Already provides ratio, no tier info |
| 188 | +2. **Rolling window**: 5 hours from first spend, not fixed daily/monthly |
| 189 | +3. **No tier detection**: Firmware.ai doesn't expose subscription tiers |
| 190 | +4. **ISO timestamps**: Need to parse ISO 8601, not epoch milliseconds |
| 191 | + |
| 192 | +### API Base URL Assumption |
| 193 | + |
| 194 | +The plan uses `https://app.firmware.ai` as the base URL with endpoint path `/api/v1/quota`, verified against official documentation at docs.firmware.ai. |