Commit 055ac41

feat: update readme to include jupyterhub fix doc
1 parent 5bf516c commit 055ac41

2 files changed, 6 additions and 284 deletions

JUPYTERHUB_INTEGRATION_FIX.md

Lines changed: 2 additions & 284 deletions
@@ -25,7 +25,7 @@ jhsingle-native-proxy (port 8888) ← JupyterHub spawns this
2525
↓ handles OAuth authentication
2626
↓ Request: /superset/welcome/ (prefix stripped)
2727
28-
Gunicorn (dynamic port) ← jhsingle manages this
28+
Flask dev server (port 8088) ← jhsingle manages this
2929
↓ Flask serves from / (no APPLICATION_ROOT)
3030
↓ receives /superset/welcome/, matches route ✓
3131
↓ generates URLs and redirects
@@ -211,286 +211,4 @@ href="/tablemodelview/list/"
211211
href={SupersetClient.getUrl('/tablemodelview/list/')}
212212
```
213213
214-
**Why this is needed:** Ensures dataset creation UI links include the JupyterHub prefix.
215-
216-
---
217-
218-
## Deployment Configuration
219-
220-
### Environment Variables (Set by JupyterHub)
221-
222-
| Variable | Example | Description |
223-
|----------|---------|-------------|
224-
| `JUPYTERHUB_SERVICE_PREFIX` | `/user/polus@example.com/` | User-specific URL prefix |
225-
| `JUPYTERHUB_USER` | `polus@example.com` | Current JupyterHub user |
226-
| `JUPYTERHUB_API_TOKEN` | `<token>` | OAuth token for JupyterHub API |
227-
| `JUPYTERHUB_BASE_URL` | `/` | JupyterHub base URL |
228-
229-
### Superset Configuration
230-
231-
**File:** `deploy/Docker/app-stacks/superset_test/superset_config.py`
232-
233-
**Key configurations:**
234-
235-
```python
236-
# Get JupyterHub prefix
237-
JUPYTERHUB_SERVICE_PREFIX = os.environ.get('JUPYTERHUB_SERVICE_PREFIX', '').rstrip('/')
238-
239-
# Logo target should be relative (Flask adds prefix via url_for)
240-
LOGO_TARGET_PATH = "/superset/welcome/"
241-
242-
# Set static assets prefix to JupyterHub prefix only
243-
if JUPYTERHUB_SERVICE_PREFIX:
244-
STATIC_ASSETS_PREFIX = JUPYTERHUB_SERVICE_PREFIX
245-
246-
# Custom middleware to add prefix to Location headers in redirects
247-
class PrefixRedirectMiddleware:
248-
def __init__(self, app, prefix):
249-
self.app = app
250-
self.prefix = prefix.rstrip('/')
251-
252-
def __call__(self, environ, start_response):
253-
def custom_start_response(status, headers, exc_info=None):
254-
if status.startswith(('301', '302', '303', '307', '308')):
255-
modified_headers = []
256-
for name, value in headers:
257-
if name.lower() == 'location':
258-
if value.startswith('/') and not value.startswith(self.prefix):
259-
value = self.prefix + value
260-
modified_headers.append((name, value))
261-
headers = modified_headers
262-
return start_response(status, headers, exc_info)
263-
return self.app(environ, custom_start_response)
264-
265-
# Flask app mutator
266-
def FLASK_APP_MUTATOR(app):
267-
# ProxyFix for X-Forwarded headers (x_prefix=0 because jhsingle doesn't send it)
268-
app.wsgi_app = ProxyFix(
269-
app.wsgi_app, x_for=1, x_proto=1, x_host=1, x_port=1, x_prefix=0
270-
)
271-
272-
jupyterhub_prefix = os.environ.get('JUPYTERHUB_SERVICE_PREFIX', '').rstrip('/')
273-
if jupyterhub_prefix:
274-
# Add redirect middleware
275-
app.wsgi_app = PrefixRedirectMiddleware(app.wsgi_app, jupyterhub_prefix)
276-
277-
# Patch Flask's url_for
278-
from flask import url_for as flask_url_for
279-
def prefixed_flask_url_for(endpoint, **values):
280-
url = flask_url_for(endpoint, **values)
281-
if url.startswith('/') and not url.startswith(jupyterhub_prefix):
282-
url = jupyterhub_prefix + url
283-
return url
284-
import flask
285-
flask.url_for = prefixed_flask_url_for
286-
287-
# Patch Jinja's url_for
288-
original_jinja_url_for = app.jinja_env.globals['url_for']
289-
def prefixed_jinja_url_for(endpoint, **values):
290-
url = original_jinja_url_for(endpoint, **values)
291-
if url.startswith('/') and not url.startswith(jupyterhub_prefix):
292-
url = jupyterhub_prefix + url
293-
return url
294-
app.jinja_env.globals['url_for'] = prefixed_jinja_url_for
295-
```
296-
297-
**Why this approach:**
298-
- **No `APPLICATION_ROOT`**: Flask serves from `/` because jhsingle strips the prefix from requests
299-
- **Custom middleware**: Adds prefix to `Location` headers in redirect responses
300-
- **Patched `url_for()`**: Ensures all Flask-generated URLs include the prefix
301-
- **`STATIC_ASSETS_PREFIX`**: Set to prefix so static assets load from correct path
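
The redirect handling described above can be exercised without Superset or Flask. The following is a minimal sketch, assuming a bare WSGI callable stands in for the application: `redirecting_app` and `location_for` are hypothetical helpers written for this illustration, and the middleware is a condensed copy of the `PrefixRedirectMiddleware` logic shown in the configuration.

```python
class PrefixRedirectMiddleware:
    """Condensed copy of the middleware above, kept here so the sketch is self-contained."""

    def __init__(self, app, prefix):
        self.app = app
        self.prefix = prefix.rstrip("/")

    def __call__(self, environ, start_response):
        def custom_start_response(status, headers, exc_info=None):
            # Only redirect responses carry a Location header worth rewriting.
            if status.startswith(("301", "302", "303", "307", "308")):
                headers = [
                    (name, self.prefix + value)
                    if name.lower() == "location"
                    and value.startswith("/")
                    and not value.startswith(self.prefix)
                    else (name, value)
                    for name, value in headers
                ]
            return start_response(status, headers, exc_info)

        return self.app(environ, custom_start_response)


def redirecting_app(environ, start_response):
    # Hypothetical stand-in for Superset: always redirects to the welcome page.
    start_response("302 FOUND", [("Location", "/superset/welcome/")])
    return [b""]


def location_for(app):
    # Hypothetical helper: call a WSGI app once and return its Location header.
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured.update(headers)

    list(app({}, start_response))
    return captured.get("Location")


wrapped = PrefixRedirectMiddleware(redirecting_app, "/user/username/")
print(location_for(wrapped))  # /user/username/superset/welcome/
```

Because the check is a plain `startswith`, a `Location` that already begins with the prefix passes through unchanged, which is what keeps the middleware from double-prefixing URLs produced by the patched `url_for()`.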
302-
303-
### Launch Command
304-
305-
**File:** `deploy/Docker/app-stacks/superset_test/start-dashboard.sh`
306-
307-
```bash
308-
# Use standard jhsingle command pattern
309-
$JHSINGLE_COMMAND \
310-
--destport 0 \
311-
--ready-check-path /health \
312-
--ready-timeout 60 \
313-
-- \
314-
gunicorn \
315-
--bind "127.0.0.1:{port}" \
316-
--workers "$GUNICORN_WORKERS" \
317-
--threads "$GUNICORN_THREADS" \
318-
--timeout "$GUNICORN_TIMEOUT" \
319-
"superset.app:create_app()"
320-
```
321-
322-
**Why Gunicorn:**
323-
- Flask's dev server is single-threaded and not production-ready
324-
- Gunicorn provides multiple workers/threads for better performance
325-
- Required for proper WSGI application serving
326-
327-
## How It Works Together
328-
329-
1. **JupyterHub** spawns container with `JUPYTERHUB_SERVICE_PREFIX=/user/username/`
330-
2. **jhsingle-native-proxy** starts, strips prefix from incoming requests, handles OAuth
331-
3. **Backend** (via `superset_config.py`):
332-
- Reads `JUPYTERHUB_SERVICE_PREFIX` environment variable
333-
- Configures `STATIC_ASSETS_PREFIX` for frontend asset loading
334-
- Installs `PrefixRedirectMiddleware` to add prefix to redirect responses
335-
- Patches `url_for()` to add prefix to all generated URLs
336-
4. **Backend** (via `superset/views/base.py`):
337-
- Injects prefix into bootstrap data as `application_root` and `static_assets_prefix`
338-
- Manually adds prefix to user menu URLs (Info, Logout)
339-
5. **Frontend** (via `setupClient.ts`):
340-
- Reads `application_root` from bootstrap data
341-
- Configures `SupersetClient` to prepend prefix to all API calls
342-
6. **Frontend** (via `public-path.ts`):
343-
- Uses `static_assets_prefix` from bootstrap data for webpack public path
344-
- All static assets (JS, CSS, images) load from correct prefixed path
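
The two directions in the steps above (prefix stripped on the way in, prefix added on the way out) can be sketched as a pair of pure functions. This is an illustration only: `strip_prefix` and `add_prefix` are hypothetical names, and `strip_prefix` merely models the behaviour attributed to jhsingle-native-proxy here, not its actual implementation.

```python
def strip_prefix(path: str, prefix: str) -> str:
    """Inbound: model of the proxy removing the service prefix before Flask sees the path."""
    base = prefix.rstrip("/")
    if base and (path == base or path.startswith(base + "/")):
        return path[len(base):] or "/"
    return path


def add_prefix(url: str, prefix: str) -> str:
    """Outbound: same rule as the patched url_for() and the redirect middleware."""
    base = prefix.rstrip("/")
    if url.startswith("/") and not url.startswith(base):
        return base + url
    return url


prefix = "/user/username/"
inbound = strip_prefix("/user/username/superset/welcome/", prefix)
outbound = add_prefix(inbound, prefix)
print(inbound)   # /superset/welcome/
print(outbound)  # /user/username/superset/welcome/
```

Flask only ever routes on the `inbound` form, which is why no `APPLICATION_ROOT` is set, while every URL that leaves the backend must be in the `outbound` form for the browser to reach it through JupyterHub.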
345-
346-
## Testing the Integration
347-
348-
### 1. Verify Bootstrap Data
349-
350-
Open browser console after logging in:
351-
```javascript
352-
window.bootstrapData.common.application_root
353-
// Should show: "/user/username/"
354-
355-
window.bootstrapData.common.static_assets_prefix
356-
// Should show: "/user/username/"
357-
```
358-
359-
### 2. Verify API Calls
360-
361-
Check Network tab in browser DevTools:
362-
- ✅ Should show: `/user/username/api/v1/dashboard/`
363-
- ❌ NOT: `/api/v1/dashboard/` or `/hub/api/v1/dashboard/`
364-
365-
### 3. Verify Static Assets
366-
367-
Check loaded resources in Network tab:
368-
- ✅ Should load from: `/user/username/static/assets/...`
369-
- ❌ NOT: `/static/assets/...` or `/hub/static/assets/...`
370-
371-
### 4. Verify Redirects
372-
373-
After login:
374-
- ✅ Should redirect to: `/user/username/superset/welcome/`
375-
- ❌ NOT: `/superset/welcome/` or `/hub/superset/welcome/`
376-
377-
### 5. Verify User Menu
378-
379-
Click on user menu in top right:
380-
- Info link: ✅ `/user/username/user_info/`
381-
- Logout link: ✅ `/user/username/logout/`
382-
383-
## Debugging
384-
385-
### Check Environment Variables
386-
387-
```bash
388-
docker exec <container-id> env | grep JUPYTERHUB
389-
```
390-
391-
Expected output:
392-
```
393-
JUPYTERHUB_SERVICE_PREFIX=/user/username/
394-
JUPYTERHUB_USER=username
395-
JUPYTERHUB_API_TOKEN=<token>
396-
JUPYTERHUB_BASE_URL=/
397-
```
398-
399-
### Check Startup Logs
400-
401-
```bash
402-
docker logs <container-id> | grep -A 5 "SUPERSET CONFIG"
403-
```
404-
405-
Expected output:
406-
```
407-
SUPERSET CONFIG DEBUG:
408-
JUPYTERHUB_SERVICE_PREFIX (env): /user/username/
409-
STATIC_ASSETS_PREFIX (Flask config): /user/username/
410-
LOGO_TARGET_PATH: /superset/welcome/
411-
```
412-
413-
### Check Gunicorn Configuration
414-
415-
```bash
416-
docker logs <container-id> | grep "Gunicorn config"
417-
```
418-
419-
Expected output:
420-
```
421-
Gunicorn config: workers=4, threads=4, timeout=60
422-
```
423-
424-
## Files Modified Summary
425-
426-
### Backend (Python)
427-
1. **`superset/views/base.py`**
428-
- Modified `cached_common_bootstrap_data()` to inject prefix into bootstrap data
429-
- Modified `menu_data()` to add prefix to user menu URLs
430-
431-
2. **`superset/views/auth.py`**
432-
- Created `SupersetAuthView` class to override login behavior
433-
- Added proper handling of `next` parameter with JupyterHub prefix
434-
435-
### Frontend (TypeScript/TSX)
436-
3. **`superset-frontend/src/setup/setupClient.ts`**
437-
- Added `appRoot: bootstrapData.common.application_root` to SupersetClient config
438-
439-
4. **`superset-frontend/src/features/datasets/AddDataset/Footer/index.tsx`**
440-
- Changed hardcoded paths to use `SupersetClient.getUrl()`
441-
442-
5. **`superset-frontend/src/features/datasets/AddDataset/LeftPanel/index.tsx`**
443-
- Changed hardcoded paths to use `SupersetClient.getUrl()`
444-
445-
### Configuration (Deployment)
446-
6. **`deploy/Docker/app-stacks/superset_test/superset_config.py`** (not in Superset repo)
447-
- Custom `PrefixRedirectMiddleware` class
448-
- `FLASK_APP_MUTATOR` with patched `url_for()` functions
449-
- `STATIC_ASSETS_PREFIX` configuration
450-
451-
7. **`deploy/Docker/app-stacks/superset_test/start-dashboard.sh`** (not in Superset repo)
452-
- Launches via `jhsingle-native-proxy` with Gunicorn
453-
454-
## Advantages of This Approach
455-
456-
✅ **Minimal source code changes** - Only 5 files in Superset repository
457-
✅ **Standard jhsingle pattern** - Matches other dashboard deployments
458-
✅ **No nginx configuration** - Simpler deployment stack
459-
✅ **Configurable performance** - Gunicorn workers/threads via environment variables
460-
✅ **Built-in OAuth** - jhsingle handles JupyterHub authentication
461-
✅ **Maintainable** - Clear separation between Superset source and deployment config
462-
✅ **Upstream-able** - Source changes could potentially be contributed to Apache Superset
463-
464-
## Repository
465-
466-
Forked Superset with JupyterHub integration: https://github.com/liuji1031/superset.git
467-
468-
**Key commits:**
469-
- `a9bff3803`: Added bootstrap data injection (`setupClient.ts`)
470-
- `378a0f1d8`: Modified bootstrap data generation (`views/base.py`)
471-
- `50aae5b2c`: Fixed hardcoded URLs in dataset UI
472-
- `a6a294088`: Login respects `next` parameter
473-
- `ce8470815`: Login redirect fixes
474-
- `41b330220`: User menu URL fixes
475-
476-
## Performance Tuning
477-
478-
### Environment Variables
479-
480-
Configure via JupyterHub spawner:
481-
482-
```python
483-
c.KubeSpawner.environment = {
484-
'SUPERSET_GUNICORN_WORKERS': '8', # More workers for high load
485-
'SUPERSET_GUNICORN_THREADS': '4', # Threads per worker
486-
'SUPERSET_GUNICORN_TIMEOUT': '120', # Timeout for slow queries
487-
}
488-
```
489-
490-
### Sizing Guidelines
491-
492-
| Users | Workers | Threads | CPU | Memory |
493-
|-------|---------|---------|-----|--------|
494-
| < 10 | 2-4 | 4 | 1-2 | 2-4 GB |
495-
| 10-50 | 4-8 | 4 | 2-4 | 4-8 GB |
496-
| 50+ | 8-16 | 4 | 4-8 | 8-16 GB |
214+
**Why this is needed:** Ensures dataset creation UI links include the JupyterHub prefix.

README.md

Lines changed: 4 additions & 0 deletions
@@ -17,6 +17,10 @@ specific language governing permissions and limitations
1717
under the License.
1818
-->
1919

20+
# JupyterHub Integration for Apache Superset
21+
22+
see [JUPYTERHUB_INTEGRATION_FIX.md](JUPYTERHUB_INTEGRATION_FIX.md) for more details.
23+
2024
# Superset
2125

2226
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/license/apache-2-0)
