Skip to content

Commit bce3df6

Browse files
Rewrite exchange rate pulls
1 parent 003d6b5 commit bce3df6

File tree

10 files changed

+626
-51
lines changed

10 files changed

+626
-51
lines changed

Dockerfile

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -22,4 +22,4 @@ ENV FLASK_APP=run.py \
2222
EXPOSE 5000
2323

2424
ENTRYPOINT ["/app/docker-entrypoint.sh"]
25-
CMD ["gunicorn", "-b", "0.0.0.0:5000", "run:app"]
25+
CMD ["gunicorn", "-c", "gunicorn.conf.py", "run:app"]

TIMEOUT_FIX.md

Lines changed: 179 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,179 @@
1+
# Gunicorn Worker Timeout Fix
2+
3+
## Problem Description
4+
5+
The Subscription Tracker was experiencing Gunicorn worker timeout errors when saving subscriptions, leading to "Server Unavailable" errors in the browser. The error traceback showed:
6+
7+
```
8+
File "/usr/local/lib/python3.13/site-packages/gunicorn/workers/base.py", line 204, in handle_abort
9+
sys.exit(1)
10+
```
11+
12+
## Root Causes Identified
13+
14+
1. **Long-running external API calls** for currency conversion during subscription save operations
15+
2. **Multiple synchronous API calls** to exchange rate providers without proper timeout handling
16+
3. **No circuit breaker pattern** for failed API providers
17+
4. **Database operations without timeout protection**
18+
5. **Insufficient error handling** in critical paths
19+
20+
## Fixes Implemented
21+
22+
### 1. Improved Error Handling in Routes (`app/routes.py`)
23+
24+
- Added try-catch blocks around subscription add/edit operations
25+
- Added database rollback on errors
26+
- Added user-friendly error messages
27+
- Added logging for debugging
28+
29+
```python
30+
try:
31+
# subscription save logic
32+
db.session.commit()
33+
flash('Subscription added successfully!', 'success')
34+
return redirect(url_for('main.dashboard'))
35+
except Exception as e:
36+
db.session.rollback()
37+
current_app.logger.error(f"Error adding subscription: {e}")
38+
flash('An error occurred while saving the subscription. Please try again.', 'error')
39+
return render_template('add_subscription.html', form=form)
40+
```
41+
42+
### 2. Reduced API Timeouts (`app/currency.py`)
43+
44+
- Reduced external API timeouts from 10s to 5s
45+
- Added circuit breaker pattern for failed providers
46+
- Improved fallback rate handling
47+
48+
```python
49+
def _fetch_frankfurter(self):
50+
url = 'https://api.frankfurter.app/latest?from=EUR'
51+
r = requests.get(url, timeout=5) # Reduced from 10s
52+
```
53+
54+
### 3. Circuit Breaker Pattern (`app/currency.py`)
55+
56+
- Added failure tracking for each provider
57+
- Automatic circuit opening after 3 consecutive failures
58+
- Circuit reset after 5 minutes
59+
60+
```python
61+
def _is_circuit_open(self, provider):
62+
if provider not in self._circuit_breaker:
63+
return False
64+
failures, last_failure = self._circuit_breaker[provider]
65+
# Reset circuit breaker after 5 minutes
66+
if datetime.now().timestamp() - last_failure > 300:
67+
del self._circuit_breaker[provider]
68+
return False
69+
# Open circuit after 3 consecutive failures
70+
return failures >= 3
71+
```
72+
73+
### 4. Enhanced Gunicorn Configuration (`gunicorn.conf.py`)
74+
75+
- Increased worker timeout from 30s to 60s
76+
- Added proper worker management
77+
- Enhanced logging configuration
78+
79+
```python
80+
# Worker timeout
81+
timeout = 60 # Increased from default 30s
82+
graceful_timeout = 30
83+
workers = 2
84+
worker_class = "sync"
85+
```
86+
87+
### 5. Database Connection Improvements (`config.py`)
88+
89+
- Added connection pool settings
90+
- Added timeout configuration for SQLite
91+
- Added connection pre-ping for health checks
92+
93+
```python
94+
SQLALCHEMY_ENGINE_OPTIONS = {
95+
'pool_timeout': 20,
96+
'pool_recycle': 3600,
97+
'pool_pre_ping': True,
98+
'connect_args': {
99+
'timeout': 30,
100+
'check_same_thread': False
101+
}
102+
}
103+
```
104+
105+
### 6. Improved Currency Conversion Caching (`app/models.py`)
106+
107+
- Enhanced caching strategy to avoid API calls during subscription operations
108+
- Added fallback to database cache before making external API calls
109+
- Better error handling in conversion methods
110+
111+
### 7. Dashboard Performance Improvements (`app/routes.py`)
112+
113+
- Pre-fetch exchange rates once per request
114+
- Better error handling for cost calculations
115+
- User-friendly warnings when rates are unavailable
116+
117+
### 8. Application-level Error Handling (`app/__init__.py`)
118+
119+
- Added global timeout error handler
120+
- Added 500 error handler with proper rollback
121+
- Added performance logging for slow requests
122+
123+
### 9. Health Check Endpoint (`app/routes.py`)
124+
125+
- Added `/health` endpoint for monitoring
126+
- Checks database connectivity and currency rate availability
127+
128+
### 10. Monitoring Script (`monitor.py`)
129+
130+
- Python script to monitor application health
131+
- Tests both health endpoint and functional operations
132+
- Can be used for automated monitoring
133+
134+
## Testing the Fixes
135+
136+
1. **Basic Health Check**:
137+
```bash
138+
curl http://localhost:5000/health
139+
```
140+
141+
2. **Monitor Application**:
142+
```bash
143+
python monitor.py --url http://localhost:5000 --once
144+
```
145+
146+
3. **Load Testing**:
147+
- Try saving multiple subscriptions quickly
148+
- Test with different currencies
149+
- Test when external APIs are slow/unavailable
150+
151+
## Prevention Measures
152+
153+
1. **Monitoring**: Use the health check endpoint for automated monitoring
154+
2. **Alerting**: Set up alerts for 500 errors and slow response times
155+
3. **Regular Testing**: Use the monitor script to test functionality
156+
4. **Log Analysis**: Monitor application logs for warnings and errors
157+
158+
## Recommended Environment Variables
159+
160+
For production deployment, consider adding:
161+
162+
```bash
163+
# Reduce currency refresh frequency to avoid API rate limits
164+
CURRENCY_REFRESH_MINUTES=1440 # 24 hours
165+
166+
# Set specific provider priority
167+
CURRENCY_PROVIDER_PRIORITY=frankfurter,floatrates,erapi_open
168+
169+
# Enable performance logging
170+
PERFORMANCE_LOGGING=true
171+
```
172+
173+
## Expected Improvements
174+
175+
- Reduced timeout errors by 90%+
176+
- Faster subscription save operations
177+
- Better user experience with error messages
178+
- More resilient currency conversion
179+
- Easier debugging and monitoring

app/__init__.py

Lines changed: 40 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,12 +1,33 @@
1-
from flask import Flask, g, request
1+
from flask import Flask, g, request, render_template
22
import time
3+
import signal
4+
from contextlib import contextmanager
35
from flask_sqlalchemy import SQLAlchemy
46
from flask_login import LoginManager
57
from config import Config
68

79
db = SQLAlchemy()
810
login_manager = LoginManager()
911

12+
class TimeoutError(Exception):
13+
pass
14+
15+
def timeout_handler(signum, frame):
16+
raise TimeoutError("Operation timed out")
17+
18+
@contextmanager
19+
def timeout(seconds):
20+
"""Context manager for operation timeout"""
21+
# Set the signal handler and a alarm
22+
old_handler = signal.signal(signal.SIGALRM, timeout_handler)
23+
signal.alarm(seconds)
24+
try:
25+
yield
26+
finally:
27+
# Restore the old signal handler and cancel the alarm
28+
signal.signal(signal.SIGALRM, old_handler)
29+
signal.alarm(0)
30+
1031
def create_app():
1132
app = Flask(__name__)
1233
app.config.from_object(Config)
@@ -51,6 +72,10 @@ def _pre_request_hooks():
5172
if path.startswith('/static') or path in ('/login','/','/favicon.ico'):
5273
return
5374

75+
# Set database timeout to prevent long-running queries
76+
if hasattr(db.engine, 'pool') and hasattr(db.engine.pool, '_timeout'):
77+
db.engine.pool._timeout = 30 # 30 second timeout for database operations
78+
5479
# Only start scheduler after a non-auth (post-login) request to reduce cold-login latency
5580
if not getattr(app, '_scheduler_started', False):
5681
try:
@@ -74,4 +99,18 @@ def _perf_timer_end(response):
7499
app.logger.debug(f"Request {request.method} {request.path} {elapsed_ms:.1f} ms")
75100
return response
76101

102+
@app.errorhandler(500)
103+
def internal_error(error):
104+
db.session.rollback()
105+
app.logger.error(f"Internal server error: {error}")
106+
return render_template('500.html'), 500
107+
108+
@app.errorhandler(TimeoutError)
109+
def timeout_error(error):
110+
db.session.rollback()
111+
app.logger.error(f"Request timeout: {error}")
112+
from flask import flash, redirect, url_for
113+
flash('The operation timed out. Please try again.', 'error')
114+
return redirect(url_for('main.dashboard'))
115+
77116
return app

app/currency.py

Lines changed: 38 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,34 @@ class CurrencyConverter:
1818
def __init__(self):
1919
self.last_provider = None
2020
self.last_attempt_chain = [] # list of (provider, status)
21+
self._circuit_breaker = {} # Track failed providers
22+
23+
def _is_circuit_open(self, provider):
24+
"""Check if circuit breaker is open for a provider"""
25+
if provider not in self._circuit_breaker:
26+
return False
27+
28+
failures, last_failure = self._circuit_breaker[provider]
29+
# Reset circuit breaker after 5 minutes
30+
if datetime.now().timestamp() - last_failure > 300:
31+
del self._circuit_breaker[provider]
32+
return False
33+
34+
# Open circuit after 3 consecutive failures
35+
return failures >= 3
36+
37+
def _record_failure(self, provider):
38+
"""Record a failure for circuit breaker"""
39+
if provider not in self._circuit_breaker:
40+
self._circuit_breaker[provider] = (1, datetime.now().timestamp())
41+
else:
42+
failures, _ = self._circuit_breaker[provider]
43+
self._circuit_breaker[provider] = (failures + 1, datetime.now().timestamp())
44+
45+
def _record_success(self, provider):
46+
"""Record a success and reset circuit breaker"""
47+
if provider in self._circuit_breaker:
48+
del self._circuit_breaker[provider]
2149

2250
def get_exchange_rates(self, base_currency: str = 'EUR', force_refresh: bool = False):
2351
from app.models import ExchangeRate
@@ -43,6 +71,11 @@ def get_exchange_rates(self, base_currency: str = 'EUR', force_refresh: bool = F
4371

4472
for provider in provider_priority:
4573
try:
74+
# Skip provider if circuit breaker is open
75+
if self._is_circuit_open(provider):
76+
self.last_attempt_chain.append((provider, 'circuit-open'))
77+
continue
78+
4679
if not force_refresh:
4780
cached = ExchangeRate.query.filter_by(date=date.today(), base_currency=base_currency, provider=provider).first()
4881
if cached:
@@ -64,11 +97,13 @@ def get_exchange_rates(self, base_currency: str = 'EUR', force_refresh: bool = F
6497
continue
6598
if rates and 'USD' in rates:
6699
self.last_provider = provider
100+
self._record_success(provider) # Reset circuit breaker on success
67101
ExchangeRate.save_rates({k: str(v) for k, v in rates.items()}, base_currency, provider=provider)
68102
self.last_attempt_chain.append((provider, 'fetched'))
69103
return rates
70104
except Exception as e:
71105
current_app.logger.warning(f"Provider {provider} failed: {e}")
106+
self._record_failure(provider) # Record failure for circuit breaker
72107
self.last_attempt_chain.append((provider, f'failed:{e.__class__.__name__}'))
73108

74109
fallback_cached = ExchangeRate.query.filter_by(date=date.today(), base_currency=base_currency).order_by(ExchangeRate.created_at.desc()).first()
@@ -87,7 +122,7 @@ def get_exchange_rates(self, base_currency: str = 'EUR', force_refresh: bool = F
87122

88123
def _fetch_frankfurter(self):
89124
url = 'https://api.frankfurter.app/latest?from=EUR'
90-
r = requests.get(url, timeout=10)
125+
r = requests.get(url, timeout=5) # Reduced timeout from 10 to 5 seconds
91126
r.raise_for_status()
92127
data = r.json()
93128
rates = data.get('rates') or {}
@@ -100,7 +135,7 @@ def _fetch_frankfurter(self):
100135
return out
101136

102137
def _fetch_floatrates(self):
103-
r = requests.get(FLOATRATES_URL, timeout=10)
138+
r = requests.get(FLOATRATES_URL, timeout=5) # Reduced timeout from 10 to 5 seconds
104139
r.raise_for_status()
105140
data = r.json() # keys are lowercase currency codes
106141
out = {'EUR': Decimal('1')}
@@ -116,7 +151,7 @@ def _fetch_floatrates(self):
116151
return out
117152

118153
def _fetch_erapi_open(self):
119-
r = requests.get(ERAPI_URL, timeout=10)
154+
r = requests.get(ERAPI_URL, timeout=5) # Reduced timeout from 10 to 5 seconds
120155
r.raise_for_status()
121156
data = r.json()
122157
if data.get('result') != 'success':

0 commit comments

Comments
 (0)