Skip to content

Commit 51e4c74

Browse files
JacobCoffeeclaude
andauthored
feat: Phase 3.2 - Add retry logic and circuit breakers to API client (#133)
Co-authored-by: Claude <[email protected]>
1 parent 1a71df1 commit 51e4c74

File tree

5 files changed

+736
-64
lines changed

5 files changed

+736
-64
lines changed

services/bot/README.md

Lines changed: 68 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -198,9 +198,69 @@ The bot service calls the Byte API for database operations. See `src/byte_bot/ap
198198
- `create_guild(guild_data: dict)` - Create new guild
199199
- `update_guild(guild_id: str, guild_data: dict)` - Update guild
200200
- `delete_guild(guild_id: str)` - Delete guild
201+
- `health_check()` - Check API service health
201202

202203
All database operations go through the API - the bot does not directly access the database.
203204

205+
### API Client Retry Logic
206+
207+
The bot's API client implements automatic retry logic with exponential backoff to improve reliability when communicating with the API service.
208+
209+
#### Configuration
210+
211+
- **Max Retries**: 3 attempts per request
212+
- **Backoff**: Exponential (1s, 2s, 4s, max 10s between retries)
213+
- **Retryable Errors**: HTTP 5xx (server errors), connection errors
214+
- **Non-Retryable**: HTTP 4xx (client errors like 400, 401, 404)
215+
- **Timeout**: 10s request timeout
216+
217+
#### Retry Behavior
218+
219+
**Retryable Errors (Automatic Retry):**
220+
221+
The client automatically retries on:
222+
223+
- HTTP 5xx errors: 500 Internal Server Error, 502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout
224+
- Connection errors: `ConnectError`, `ConnectTimeout`, `ReadTimeout`
225+
226+
**Non-Retryable Errors (Immediate Failure):**
227+
228+
The client does **not** retry on:
229+
230+
- HTTP 4xx errors: 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found
231+
- These errors indicate client-side issues that won't be resolved by retrying
232+
233+
#### Retry Statistics
234+
235+
Access retry statistics via `api_client.retry_stats`:
236+
237+
```python
238+
client = ByteAPIClient("http://api:8000")
239+
240+
# Make some requests...
241+
await client.create_guild(...)
242+
await client.get_guild(...)
243+
244+
# Check stats
245+
print(client.retry_stats)
246+
# {
247+
# "total_retries": 5, # Total retry attempts across all methods
248+
# "failed_requests": 2, # Failed requests (4xx errors)
249+
# "retried_methods": { # Retries per method
250+
# "create_guild": 3,
251+
# "get_guild": 2,
252+
# }
253+
# }
254+
```
255+
256+
#### Logging
257+
258+
The client logs retry attempts at **WARNING** level before each retry:
259+
260+
```
261+
WARNING Retrying byte_bot.api_client.ByteAPIClient.create_guild in 1 seconds as it raised HTTPStatusError: Server error '500 Internal Server Error'
262+
```
263+
204264
## Testing
205265

206266
```bash
@@ -210,23 +270,25 @@ uv run pytest services/bot/tests
210270

211271
## Status
212272

213-
**Phase 1.2**: ✅ Complete
273+
**Phase 3.2**: ✅ Complete
214274

215275
- [x] Bot service extracted from monolith
216276
- [x] All plugins migrated to `services/bot/src/byte_bot/plugins/`
217277
- [x] All views migrated to `services/bot/src/byte_bot/views/`
218-
- [x] API client implemented
278+
- [x] API client implemented with retry logic
219279
- [x] Configuration system set up
220280
- [x] Entry points configured
221281
- [x] Import errors fixed
222282
- [x] Dockerfile created
223-
- [ ] Tests implemented (TODO: Phase 1.3)
283+
- [x] Comprehensive test suite (25+ retry tests)
284+
- [x] Retry logic with exponential backoff
285+
- [x] Retry statistics tracking
224286

225287
## Next Steps
226288

227-
- **Phase 1.3**: Implement comprehensive test suite
228-
- **Phase 2**: Remove old monolith code (`byte_bot/byte/`)
229-
- **Phase 3**: Deploy bot and API services independently
289+
- **Phase 3.3**: Implement circuit breaker pattern (optional)
290+
- **Phase 4**: Add caching layer to API client
291+
- **Phase 5**: Implement health checks and monitoring
230292

231293
## License
232294

services/bot/pyproject.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14,6 +14,7 @@ dependencies = [
1414
"python-dateutil>=2.9.0.post0",
1515
"python-dotenv>=1.0.0",
1616
"anyio>=4.1.0",
17+
"tenacity>=8.2.0",
1718
]
1819
requires-python = ">=3.13,<4.0"
1920
license = { text = "MIT" }

0 commit comments

Comments
 (0)