implement K8sDatabaseManager

andrii-i · andrii-i · commit 346496e1e2d7 · 2025-09-02T11:25:25.000-07:00
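
A minimal sketch of how the `db_url` handling in this commit maps a `k8s://` URL to a namespace, assuming the slicing logic shown in `database_manager.py` below. The names `K8S_SCHEME` and `namespace_from_db_url` are hypothetical and used only for illustration; they are not part of the package.

```python
K8S_SCHEME = "k8s://"  # hypothetical constant; the commit uses the literal string and a slice at index 6


def namespace_from_db_url(db_url: str, default: str = "default") -> str:
    """Return the namespace encoded in a k8s:// URL (hypothetical helper)."""
    if not db_url.startswith(K8S_SCHEME):
        raise ValueError(f"expected a k8s:// URL, got: {db_url}")
    # Everything after the scheme is treated as the namespace;
    # an empty remainder falls back to the default namespace.
    return db_url[len(K8S_SCHEME):] or default


if __name__ == "__main__":
    for url in ("k8s://default", "k8s://", "k8s://notebooks"):
        print(url, "->", namespace_from_db_url(url))
```

In the diff, `create_session` raises on a non-`k8s://` URL while `create_tables` returns silently. A shared helper along these lines might make the two paths consistent, if that is the intended behavior.
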
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -6,7 +6,7 @@ This document contains development notes, architecture decisions, and lessons le
 
 ## Project Structure
 
-- `src/jupyter_scheduler_k8s/` - Main Python package with K8sExecutionManager
+- `src/jupyter_scheduler_k8s/` - Main Python package with K8sExecutionManager and K8sDatabaseManager
 - `image/` - Docker image with Pixi-based Python environment and notebook executor
 - `local-dev/` - Local development configuration (Kind cluster)
 - `Makefile` - Build and development automation with auto-detection
@@ -29,7 +29,7 @@ This document contains development notes, architecture decisions, and lessons le
 
 ## Key Design Principles
 
-1. **Minimal Extension**: Only override ExecutionManager, reuse everything else from jupyter-scheduler
+1. **Minimal Extension**: Only override ExecutionManager and DatabaseManager, reuse everything else from jupyter-scheduler
 2. **Container Simplicity**: Container just executes notebooks, unaware of K8s or scheduler
 3. **No Circular Dependencies**: Container doesn't depend on jupyter-scheduler package
 4. **Staging Compatibility**: Work with jupyter-scheduler's existing file staging mechanism
@@ -56,6 +56,13 @@ This document contains development notes, architecture decisions, and lessons le
 
 ### Phase 2: K8s Backend Implementation ✅
 
+#### K8s Database Backend (Production-Ready)
+- **Storage**: K8s Jobs as database records using labels and annotations
+- **SQLAlchemy Interface**: K8sSession and K8sQuery mimic SQLAlchemy patterns
+- **DatabaseManager Plugin**: Clean integration via jupyter-scheduler's Type system
+- **Zero SQL Dependencies**: Complete replacement for SQLite/PostgreSQL
+- **Usage**: `jupyter lab --SchedulerApp.db_url="k8s://default" --SchedulerApp.database_manager_class="jupyter_scheduler_k8s.K8sDatabaseManager"`
+
 ### Pre-Populated PVC Architecture (Production-Ready)
 - **Storage**: PVC (PersistentVolumeClaim) for production-ready file handling
   - Works with all standard K8s clusters (Kind, minikube, EKS, GKE, AKS)
@@ -187,11 +194,12 @@ jupyter lab --Scheduler.execution_manager_class="jupyter_scheduler_k8s.K8sExecut
 
 ## Current Implementation Status
 
-### Latest Architecture: S3 Storage (Production Ready ✅)
-1. **Upload inputs** - AWS CLI sync to S3 bucket
-2. **Container execution** - Job downloads from S3, executes notebook, uploads outputs  
-3. **Download outputs** - AWS CLI sync from S3 to staging directory
-4. **Durability** - Files survive cluster failures, can be retrieved later
+### Latest Architecture: K8s Database + S3 Storage (Production Ready ✅)
+1. **Database operations** - Job metadata stored in K8s Jobs using labels/annotations
+2. **Upload inputs** - AWS CLI sync to S3 bucket
+3. **Container execution** - Job downloads from S3, executes notebook, uploads outputs
+4. **Download outputs** - AWS CLI sync from S3 to staging directory
+5. **Durability** - Files in S3 survive cluster failures; job metadata persists as long as the K8s Job objects (and the cluster storing them) do
 
 **Key Implementation Details:**
 - **AWS credentials passed at runtime**: K8sExecutionManager passes host AWS credentials to containers via environment variables
diff --git a/README.md b/README.md
@@ -10,7 +10,8 @@ Kubernetes backend for [jupyter-scheduler](https://github.com/jupyter-server/jup
 4. Results uploaded back to S3, then downloaded to JupyterLab and accessible through the UI
 
 **Key features:**
-- **S3 storage** - files survive Kubernetes cluster or Jupyter Server failures. Supports any S3-compatible storage like AWS S3, MinIO, GCS with S3 API, and so on
+- **K8s database** - store job metadata in Kubernetes Jobs (zero SQL dependencies)
+- **S3 storage** - files survive Kubernetes cluster or Jupyter Server failures
 - Parameter injection for notebook customization
 - Multiple output formats (HTML, PDF, etc.)
 - Works with any Kubernetes cluster (Kind, minikube, EKS, GKE, AKS)
@@ -146,9 +147,12 @@ export S3_BUCKET="<your-test-bucket>"
 export AWS_ACCESS_KEY_ID="<your-access-key>"
 export AWS_SECRET_ACCESS_KEY="<your-secret-key>"
 
-# Launch and test through JupyterLab UI
+# Launch with K8s execution only
 jupyter lab --Scheduler.execution_manager_class="jupyter_scheduler_k8s.K8sExecutionManager"
 
+# Launch with K8s database + K8s execution
+jupyter lab --SchedulerApp.db_url="k8s://default" --SchedulerApp.database_manager_class="jupyter_scheduler_k8s.K8sDatabaseManager" --Scheduler.execution_manager_class="jupyter_scheduler_k8s.K8sExecutionManager"
+
 # Cleanup
 make clean
 ```
@@ -197,12 +201,13 @@ make clean          # Remove cluster and cleanup
 ## Implementation Status
 
 ### Working Features ✅
-- Custom `K8sExecutionManager` that extends `jupyter-scheduler.ExecutionManager` and runs notebook jobs in Kubernetes pods
+- **K8s Database**: `K8sDatabaseManager` stores job metadata in K8s Jobs (zero SQL dependencies)
+- **K8s Execution**: `K8sExecutionManager` runs notebook jobs in Kubernetes pods
 - Parameter injection and multiple output formats
 - File handling for any notebook size with proven S3 operations
 - Configurable CPU/memory limits
 - Event-driven job monitoring with Watch API
-- S3 storage: Files persist beyond kubernetes cluster or jupyter server failures using AWS CLI for reliable transfers
+- S3 storage: Files persist beyond Kubernetes cluster or Jupyter Server failures
 
 ### Planned 🚧
 - GPU resource configuration for k8s jobs from UI
diff --git a/src/jupyter_scheduler_k8s/__init__.py b/src/jupyter_scheduler_k8s/__init__.py
@@ -1,6 +1,7 @@
 """Kubernetes backend for jupyter-scheduler."""
 
 from .executors import K8sExecutionManager
+from .database_manager import K8sDatabaseManager
 
 __version__ = "0.1.0"
-__all__ = ["K8sExecutionManager"]
+__all__ = ["K8sExecutionManager", "K8sDatabaseManager"]
diff --git a/src/jupyter_scheduler_k8s/database_manager.py b/src/jupyter_scheduler_k8s/database_manager.py
@@ -0,0 +1,42 @@
+from kubernetes import client, config
+from jupyter_scheduler.managers import DatabaseManager
+
+from .k8s_orm import K8sSession
+
+
+class K8sDatabaseManager(DatabaseManager):
+    """Database manager that uses Kubernetes Jobs for storage."""
+    
+    def create_session(self, db_url: str):
+        """Create K8s session factory."""
+        if not db_url.startswith("k8s://"):
+            raise ValueError(f"K8sDatabaseManager only supports k8s:// URLs, got: {db_url}")
+            
+        namespace = db_url[6:] or "default"
+        
+        def session_factory():
+            return K8sSession(namespace=namespace)
+        return session_factory
+    
+    def create_tables(self, db_url: str, drop_tables: bool = False):
+        """Ensure K8s namespace exists."""
+        if not db_url.startswith("k8s://"):
+            return
+            
+        namespace = db_url[6:] or "default"
+        
+        try:
+            config.load_incluster_config()
+        except config.ConfigException:
+            config.load_kube_config()
+        
+        v1 = client.CoreV1Api()
+        
+        try:
+            v1.read_namespace(name=namespace)
+        except client.ApiException as e:
+            if e.status != 404:
+                # Surface auth or connectivity errors instead of silently ignoring them.
+                raise
+            body = client.V1Namespace(metadata=client.V1ObjectMeta(name=namespace))
+            v1.create_namespace(body=body)
diff --git a/src/jupyter_scheduler_k8s/k8s_orm.py b/src/jupyter_scheduler_k8s/k8s_orm.py