diff --git a/.gitignore b/.gitignore
index d489ec2..577c34e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -23,9 +23,10 @@ wheels/
 # .env
 .env
 
-# Deployment-generated URLs
+# Deployment-generated URLs and backups
 .onboarding-status-url
 .token-service-url
+url-map-backup*.yaml
 
 terraform.tfvars
 *.tfstate
diff --git a/docs/developer_guide.md b/docs/developer_guide.md
new file mode 100644
index 0000000..fd1c039
--- /dev/null
+++ b/docs/developer_guide.md
@@ -0,0 +1,196 @@
+# Developer Guide
+
+This guide provides comprehensive information for developers and administrators working with the AI Engineering Platform infrastructure.
+
+## Overview
+
+The AI Engineering Platform consists of multiple components that work together to provide secure, isolated development environments and automated participant management. This guide covers deployment, configuration, and maintenance procedures.
+
+---
+
+## Platform Components
+
+### 1. Coder Server
+- **Purpose**: Provides containerized development environments
+- **Deployment**: GCP VM with Terraform
+- **Documentation**: See [Coder Deployment](index.md#1-coder-deployment-for-gcp)
+
+### 2. Participant Onboarding System
+- **Purpose**: Automated participant authentication and API key distribution
+- **Components**: Firebase Authentication, Firestore, Cloud Functions
+- **Documentation**: See [Participant Onboarding](index.md#2-participant-onboarding-system)
+
+### 3. Onboarding Status Dashboard
+- **Purpose**: Real-time monitoring of participant onboarding status
+- **Deployment**: Next.js on Cloud Run with Load Balancer path-based routing
+- **Access**: `https://platform.vectorinstitute.ai/onboarding`
+
+---
+
+## Infrastructure Deployment
+
+### Coder Server Deployment
+
+Follow the comprehensive deployment guide in the `coder/deploy/` directory.
+
+**Quick Start:**
+```bash
+cd coder/deploy
+terraform init
+terraform plan
+terraform apply
+```
+
+For detailed instructions, see [`coder/deploy/README.md`](../coder/deploy/README.md).
+
+### Onboarding Status Web Dashboard
+
+The onboarding status dashboard is deployed on Cloud Run and integrated with the main platform load balancer using path-based routing.
+
+**Setup Guide:** [Onboarding Status Web - Load Balancer Setup](onboarding-status-web-load-balancer-setup.md)
+
+This guide covers:
+
+- Configuring Next.js with basePath for path-based routing
+- Creating serverless Network Endpoint Groups (NEG)
+- Setting up backend services for Cloud Run
+- Configuring load balancer path matchers
+- Deployment and verification procedures
+- Troubleshooting common issues
+
+**Deployment Command:**
+```bash
+./scripts/admin/deploy_onboarding_status_web.sh
+```
+
+**Access URL:**
+```
+https://platform.vectorinstitute.ai/onboarding
+```
+
+---
+
+## Service Architecture
+
+### Load Balancer Configuration
+
+The platform uses a single Google Cloud Load Balancer to route traffic to multiple backend services:
+
+```
+platform.vectorinstitute.ai/
+├── / → Coder Server (VM: coder-entrypoint)
+├── /onboarding → Cloud Run (onboarding-status-web)
+└── /onboarding/* → Cloud Run (onboarding-status-web)
+```
+
+**Key Resources:**
+
+| Resource | Name | Purpose |
+|----------|------|---------|
+| External IP | `coderd-https-lb-ip` | Static IP for load balancer |
+| HTTPS Forwarding Rule | `coderd-https-forwarding-rule` | Routes HTTPS traffic |
+| HTTPS Proxy | `coderd-https-proxy` | SSL termination |
+| URL Map | `https-url-map` | Path-based routing configuration |
+| Backend Service (Coder) | `coderd-backend` | Routes to Coder VM |
+| Backend Service (Onboarding) | `onboarding-backend` | Routes to Cloud Run |
+
+### Firebase Services
+
+The platform uses Firebase for authentication and data storage:
+
+- **Firebase Authentication**: Custom token generation for participants
+- **Firestore**: Participant data, team assignments, and API keys
+- **Firebase Security Rules**: Enforce team-level data isolation
+
+---
+
+## Administration
+
+### Participant Management
+
+#### Adding Participants
+
+Use the admin scripts to add new participants:
+
+```bash
+python scripts/admin/setup_participants.py
+```
+
+**Requirements:**
+- CSV file with participant information
+- Firebase admin credentials
+- Team assignments
+
+#### Viewing Onboarding Status
+
+**Command Line:**
+```bash
+onboard --admin-status-report --gcp-project coderd
+```
+
+**Web Dashboard:**
+```
+https://platform.vectorinstitute.ai/onboarding
+```
+
+The dashboard provides:
+- Real-time participant status
+- Onboarding completion rates
+- Filtering by status
+- CSV export functionality
+
+---
+
+## Monitoring and Maintenance
+
+### Health Checks
+
+**Coder Server:**
+```bash
+curl -I https://platform.vectorinstitute.ai/
+```
+
+**Onboarding Dashboard:**
+```bash
+curl -I https://platform.vectorinstitute.ai/onboarding
+```
+
+**Onboarding API:**
+```bash
+curl https://platform.vectorinstitute.ai/onboarding/api/participants
+```
+
+### Log Access
+
+**Cloud Run Logs:**
+```bash
+gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=onboarding-status-web" \
+  --project=coderd \
+  --limit=50 \
+  --format=json
+```
+
+**Coder Server Logs:**
+```bash
+# SSH into VM
+gcloud compute ssh coder-entrypoint --project=coderd --zone=us-central1-a
+
+# View logs
+sudo journalctl -u coder -f
+```
+
+### Resource Management
+
+**List Active Services:**
+```bash
+# Cloud Run services
+gcloud run services list --project=coderd
+
+# Compute instances
+gcloud compute instances list --project=coderd
+
+# Backend services
+gcloud compute backend-services list --project=coderd
+```
+
+---
diff --git a/docs/onboarding-status-web-load-balancer-setup.md b/docs/onboarding-status-web-load-balancer-setup.md
new file mode 100644
index 0000000..98d4f4f
--- /dev/null
+++ b/docs/onboarding-status-web-load-balancer-setup.md
@@ -0,0 +1,309 @@
+# Onboarding Status Web - Load Balancer Path-Based Routing Setup
+
+This document provides step-by-step instructions for configuring path-based routing to serve the Onboarding Status Web dashboard at `https://platform.vectorinstitute.ai/onboarding` using Google Cloud Load Balancer.
+
+## Overview
+
+The setup routes traffic from `platform.vectorinstitute.ai/onboarding` to a Cloud Run service while keeping all other traffic (including the root path) routed to the Coder server VM.
+
+**Architecture:**
+- `platform.vectorinstitute.ai/` → Coder Server (VM: `coder-entrypoint`)
+- `platform.vectorinstitute.ai/onboarding` → Cloud Run (`onboarding-status-web`)
+
+## Prerequisites
+
+- GCP project: `coderd`
+- Existing load balancer with:
+  - External IP: `coderd-https-lb-ip`
+  - HTTPS forwarding rule: `coderd-https-forwarding-rule`
+  - HTTPS proxy: `coderd-https-proxy`
+  - URL map: `https-url-map`
+  - Backend service: `coderd-backend` (pointing to Coder VM)
+- Cloud Run service: `onboarding-status-web` already deployed
+
+## Step 1: Configure Next.js Base Path
+
+Update the Next.js configuration to serve the app under the `/onboarding` path.
+
+**File:** `services/onboarding-status-web/next.config.js`
+
+```javascript
+/** @type {import('next').NextConfig} */
+const nextConfig = {
+  basePath: '/onboarding',
+  output: 'standalone',
+  eslint: {
+    ignoreDuringBuilds: true,
+  },
+  typescript: {
+    ignoreBuildErrors: false,
+  },
+}
+
+module.exports = nextConfig
+```
+
+**Important:** Update the API fetch path in the client code to use the absolute path:
+
+**File:** `services/onboarding-status-web/app/page.tsx`
+
+Change:
+```javascript
+const response = await fetch('/api/participants', {
+```
+
+To:
+```javascript
+const response = await fetch('/onboarding/api/participants', {
+```
+
+## Step 2: Create Serverless Network Endpoint Group (NEG)
+
+Create a serverless NEG that points to the Cloud Run service.
+
+```bash
+gcloud compute network-endpoint-groups create onboarding-status-neg \
+  --project=coderd \
+  --region=us-central1 \
+  --network-endpoint-type=serverless \
+  --cloud-run-service=onboarding-status-web
+```
+
+**Verification:**
+```bash
+gcloud compute network-endpoint-groups describe onboarding-status-neg \
+  --project=coderd \
+  --region=us-central1
+```
+
+## Step 3: Create Backend Service
+
+Create a backend service for the onboarding dashboard.
+
+```bash
+gcloud compute backend-services create onboarding-backend \
+  --project=coderd \
+  --global \
+  --load-balancing-scheme=EXTERNAL_MANAGED
+```
+
+**Verification:**
+```bash
+gcloud compute backend-services describe onboarding-backend \
+  --project=coderd \
+  --global
+```
+
+## Step 4: Add NEG to Backend Service
+
+Connect the serverless NEG to the backend service.
+
+```bash
+gcloud compute backend-services add-backend onboarding-backend \
+  --project=coderd \
+  --global \
+  --network-endpoint-group=onboarding-status-neg \
+  --network-endpoint-group-region=us-central1
+```
+
+**Verification:**
+```bash
+gcloud compute backend-services describe onboarding-backend \
+  --project=coderd \
+  --global \
+  --format="yaml(name,backends)"
+```
+
+## Step 5: Backup Current URL Map
+
+Before making changes, backup the existing URL map configuration.
+
+```bash
+gcloud compute url-maps export https-url-map \
+  --project=coderd \
+  --destination=url-map-backup-$(date +%Y%m%d).yaml
+```
+
+## Step 6: Update URL Map with Path-Based Routing
+
+Add path matcher rules to route `/onboarding` traffic to the Cloud Run backend.
+
+```bash
+gcloud compute url-maps add-path-matcher https-url-map \
+  --project=coderd \
+  --path-matcher-name=onboarding-matcher \
+  --default-service=coderd-backend \
+  --path-rules="/onboarding=onboarding-backend,/onboarding/*=onboarding-backend"
+```
+
+**Verification:**
+```bash
+gcloud compute url-maps describe https-url-map \
+  --project=coderd
+```
+
+Expected output should show:
+```yaml
+defaultService: .../coderd-backend
+hostRules:
+- hosts:
+  - '*'
+  pathMatcher: onboarding-matcher
+pathMatchers:
+- defaultService: .../coderd-backend
+  name: onboarding-matcher
+  pathRules:
+  - paths:
+    - /onboarding
+    - /onboarding/*
+    service: .../onboarding-backend
+```
+
+## Step 7: Deploy Updated Next.js Application
+
+Deploy the updated application with the basePath configuration.
+
+```bash
+./scripts/admin/deploy_onboarding_status_web.sh
+```
+
+## Step 8: Verify and Test
+
+Wait 5-10 minutes for the load balancer configuration to propagate across Google's network.
+
+### Test Coder Platform (unchanged)
+```bash
+curl -I https://platform.vectorinstitute.ai/
+```
+Expected: HTTP 200 with Coder server headers
+
+### Test Onboarding Dashboard
+```bash
+curl -I https://platform.vectorinstitute.ai/onboarding
+```
+Expected: HTTP 200 with Next.js headers
+
+### Test Onboarding API
+```bash
+curl -s https://platform.vectorinstitute.ai/onboarding/api/participants | jq '.summary'
+```
+Expected: JSON response with participant summary
+
+### Browser Test
+Open `https://platform.vectorinstitute.ai/onboarding` in a browser. You should see:
+- Dashboard loads without errors
+- Participant data appears in the table
+- No JSON parsing errors in browser console
+
+## Troubleshooting
+
+### Issue: 404 on Cloud Run Direct URL
+
+**Symptom:** `https://onboarding-status-web-736624225747.us-central1.run.app/` returns 404
+
+**Status:** This is expected behavior! The app is configured with `basePath: '/onboarding'`, so:
+- ❌ `https://[cloud-run-url]/` → 404 (no root route)
+- ✅ `https://[cloud-run-url]/onboarding` → Works
+- ✅ `https://platform.vectorinstitute.ai/onboarding` → Works (intended access)
+
+Users should always access via the load balancer URL, not the direct Cloud Run URL.
+
+### Issue: API Returns HTML Instead of JSON
+
+**Symptom:** Browser console shows `Unexpected token '<', " {
     try {
       setError(null);
-      const response = await fetch('/api/participants', {
+      const response = await fetch('/onboarding/api/participants', {
         cache: 'no-store'
       });
diff --git a/services/onboarding-status-web/next.config.js b/services/onboarding-status-web/next.config.js
index 7552ed9..9e48ec9 100644
--- a/services/onboarding-status-web/next.config.js
+++ b/services/onboarding-status-web/next.config.js
@@ -1,5 +1,6 @@
 /** @type {import('next').NextConfig} */
 const nextConfig = {
+  basePath: '/onboarding',
   output: 'standalone',
   eslint: {
     ignoreDuringBuilds: true,