diff --git a/assets/images/logo/dark.png b/assets/images/logo/dark.png new file mode 100644 index 0000000..bdb54fc Binary files /dev/null and b/assets/images/logo/dark.png differ diff --git a/assets/images/logo/dark.svg b/assets/images/logo/dark.svg index f147a2c..2f7b7a4 100644 --- a/assets/images/logo/dark.svg +++ b/assets/images/logo/dark.svg @@ -1,8 +1,8 @@ - - + + - DEPLOYSTACK - + DEPLOYSTACK + diff --git a/assets/images/logo/dark.webp b/assets/images/logo/dark.webp index 667b56f..9f4ff91 100644 Binary files a/assets/images/logo/dark.webp and b/assets/images/logo/dark.webp differ diff --git a/assets/images/logo/light.png b/assets/images/logo/light.png new file mode 100644 index 0000000..2799f16 Binary files /dev/null and b/assets/images/logo/light.png differ diff --git a/assets/images/logo/light.svg b/assets/images/logo/light.svg index 8be2af0..3e1a77c 100644 --- a/assets/images/logo/light.svg +++ b/assets/images/logo/light.svg @@ -1,8 +1,8 @@ - - + + - DEPLOYSTACK - + DEPLOYSTACK + diff --git a/assets/images/logo/light.webp b/assets/images/logo/light.webp index 24c1a8a..fe4988e 100644 Binary files a/assets/images/logo/light.webp and b/assets/images/logo/light.webp differ diff --git a/development/backend/mcp-configuration-architecture.mdx b/development/backend/mcp-configuration-architecture.mdx index d9f68a8..a1867dd 100644 --- a/development/backend/mcp-configuration-architecture.mdx +++ b/development/backend/mcp-configuration-architecture.mdx @@ -82,88 +82,42 @@ The three-tier system addresses this by: ## Database Schema +The three-tier configuration system uses three main tables: + ### Tier 1: MCP Catalog (`mcpServers`) -The catalog defines the configuration structure for each MCP server type: - -**Core Configuration Fields:** -```sql --- Template Level (with lock controls) -template_args: text('template_args') -- [{value, locked, description}] -template_env: text('template_env') -- Fixed environment variables -template_headers: text('template_headers') -- Fixed HTTP headers (for HTTP/SSE) -template_url_query_params: text('template_url_query_params') -- Fixed URL query params (for HTTP/SSE) - --- Team Schema (with lock/visibility controls) -team_args_schema: text('team_args_schema') -- Schema with lock controls -team_env_schema: text('team_env_schema') -- [{name, type, required, default_team_locked, visible_to_users}] -team_headers_schema: text('team_headers_schema') -- HTTP headers schema (for HTTP/SSE) -team_url_query_params_schema: text('team_url_query_params_schema') -- URL query params schema (for HTTP/SSE) - --- User Schema -user_args_schema: text('user_args_schema') -- User-configurable argument schema -user_env_schema: text('user_env_schema') -- User-configurable environment schema -user_headers_schema: text('user_headers_schema') -- User HTTP headers schema (for HTTP/SSE) -user_url_query_params_schema: text('user_url_query_params_schema') -- User URL query params schema (for HTTP/SSE) -``` +Defines configuration structure for each MCP server type including template-level config, team/user schemas, transport configuration, and registry tracking. 
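To make the schema-driven approach concrete, here is a minimal illustration of how a single `team_env_schema` entry might be shaped, using the field names noted above (`name`, `type`, `required`, `default_team_locked`, `visible_to_users`). Treat this as a sketch, not the authoritative definition, which lives in [Database Schema](/development/backend/database/).

```typescript
// Illustrative sketch only — field names assumed from the schema notes above.
interface TeamEnvSchemaEntry {
  name: string;                  // e.g. "API_BASE_URL"
  type: 'string' | 'number' | 'boolean';
  required: boolean;
  default_team_locked: boolean;  // team admin locks the value for all members
  visible_to_users: boolean;     // controls whether members can see the value
}

const exampleEntry: TeamEnvSchemaEntry = {
  name: 'API_BASE_URL',
  type: 'string',
  required: true,
  default_team_locked: false,
  visible_to_users: true,
};
```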
-**Transport Configuration:** -```sql -transport_type: text('transport_type') -- 'stdio' | 'http' | 'sse' -packages: text('packages') -- JSON: npm/pip/docker packages (stdio) -remotes: text('remotes') -- JSON: HTTP/SSE endpoints -``` +### Tier 2: Team Installation (`mcpServerInstallations`) -**Official Registry Tracking Fields:** -```sql -official_name: text('official_name') -- Reverse-DNS name from registry -synced_from_official_registry: boolean -- True if synced from registry -official_registry_server_id: text -- Registry's server identifier -official_registry_version_id: text -- Registry's version identifier -official_registry_published_at: timestamp -- Original publication date -official_registry_updated_at: timestamp -- Last update in registry -``` +Manages shared team configurations including installation name, team args, environment variables, headers, and URL query params. -**GitHub Enhancement Fields:** -```sql -repository_source: text -- 'github' | 'gitlab' | 'bitbucket' -repository_id: text -- Platform-specific repo ID -repository_subfolder: text -- Monorepo subfolder path -github_account_id: text -- For avatar URLs -github_readme_base64: text -- Cached README content -github_stars: integer -- Star count for social proof -``` +### Per-User Instances (`mcpServerInstances`) -### Tier 2: Team Installation (`mcpServerInstallations`) +While not strictly a "configuration tier", the instance table enables per-user isolation: -Team installations manage shared configurations: +**Key Characteristics**: +- One instance per user per installation (UNIQUE constraint) +- Independent status tracking per user +- CASCADE deletes when installation removed +- Enables parallel status states (User A online, User B offline) -```sql -installation_name: text('installation_name') -- Team-friendly name -team_args: text('team_args') -- Team-level arguments (JSON array) -team_env: text('team_env') -- Team environment variables (JSON object) -team_headers: text('team_headers') -- Team HTTP headers (JSON object, for HTTP/SSE) -team_url_query_params: text('team_url_query_params') -- Team URL query params (JSON object, for HTTP/SSE) -``` +For complete database schema, see [Database Schema](/development/backend/database/). -### Tier 3: User Configuration (`mcpUserConfigurations`) +For complete instance lifecycle, see [Instance Lifecycle](/development/satellite/instance-lifecycle). -Individual user configurations: +### Tier 3: User Configuration (`mcpUserConfigurations`) -```sql -installation_id: text('installation_id') -- References team installation -user_id: text('user_id') -- User who owns this config +Stores individual user configurations including personal args, environment variables, headers, and URL query params. -user_args: text('user_args') -- User arguments (JSON array) -user_env: text('user_env') -- User environment variables (JSON object) -user_headers: text('user_headers') -- User HTTP headers (JSON object, for HTTP/SSE) -user_url_query_params: text('user_url_query_params') -- User URL query params (JSON object, for HTTP/SSE) -``` +**For complete database schema details, see [Database Schema](/development/backend/database/).** ## Configuration Flow ### Runtime Assembly +**Per-User Instance Isolation**: Configurations are assembled **per user** at runtime. Each team member's instance receives their merged config (template + team + user). 
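As a sketch of the assembly order (function and type names here are illustrative, not the actual backend implementation), the merge applies template values first, then team values, then user values, so later tiers override earlier ones:

```typescript
// Illustrative sketch — the real assembly happens in the backend at runtime.
interface ConfigLayer {
  args: string[];
  env: Record<string, string>;
  headers: Record<string, string>;
}

// Later tiers win: user overrides team, team overrides template.
function assembleInstanceConfig(
  template: ConfigLayer,
  team: ConfigLayer,
  user: ConfigLayer
): ConfigLayer {
  return {
    // How arguments combine depends on the lock controls defined in the catalog;
    // simple concatenation is shown here for illustration.
    args: [...template.args, ...team.args, ...user.args],
    env: { ...template.env, ...team.env, ...user.env },
    headers: { ...template.headers, ...team.headers, ...user.headers },
  };
}
```

Each team member's instance receives its own result of this merge, which is why two users on the same installation can run with different credentials.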
+ ### Configuration Schema Step Global administrators categorize configuration elements through the Configuration Schema Step: diff --git a/development/backend/mcp-server-oauth.mdx b/development/backend/mcp-server-oauth.mdx index 5321dc0..a13516a 100644 --- a/development/backend/mcp-server-oauth.mdx +++ b/development/backend/mcp-server-oauth.mdx @@ -42,6 +42,7 @@ The OAuth implementation includes: - **OAuth Discovery Service** - Detects OAuth requirement and discovers endpoints using RFC 8414/9728 - **Authorization Endpoint** - Initiates OAuth flow with PKCE, state parameter, and resource parameter - **Callback Endpoint** - Exchanges authorization code for tokens +- **Re-Authentication Endpoint** - User-initiated token refresh when automatic refresh fails - **Token Service** - Handles token exchange and refresh operations - **Client Registration Service** - Implements RFC 7591 Dynamic Client Registration (DCR) - **Encryption Service** - AES-256-GCM encryption for tokens at rest @@ -375,6 +376,13 @@ if (provider) { status_message: 'Authenticated successfully, waiting for satellite to connect' }); ``` + +**Per-User Instance Creation**: OAuth callback creates the user's instance with status='connecting'. For multi-user teams: +- Installing user's instance created with their OAuth credentials +- Other team members' instances created with status='awaiting_user_config' (they must authenticate separately) +- Each user authenticates independently with their own OAuth account + +For instance lifecycle details, see [Instance Lifecycle](/development/satellite/instance-lifecycle). @@ -824,6 +832,8 @@ CREATE TABLE mcpOauthTokens ( }) .where(eq(mcpServerInstallations.id, installation.id)); ``` + +**Per-User Instance Impact**: Token refresh failures only affect the specific user's instance. Other team members' instances remain unaffected even if one user's OAuth token expires. @@ -837,16 +847,43 @@ INFO: OAuth token refresh job completed (totalTokens: 3, successCount: 3, failur ### Token Expiration Handling -**During token refresh cron job**: +When automatic token refresh fails (server offline, invalid refresh token, token revoked), the installation enters `requires_reauth` status. Users can recover through self-service re-authentication. + +**Automatic Refresh Failure**: +- Token refresh job attempts refresh +- Refresh fails (network error, invalid_grant, server offline) - Installation status → `requires_reauth` -- User sees "Reconnect" button in frontend -- User must re-authorize to get new tokens - -**During satellite token retrieval**: -- Satellite checks `expires_at` timestamp -- If expired and no refresh possible → Return error to MCP client -- MCP client receives authentication error -- User must re-authenticate +- Status message explains why re-auth is needed + +**User-Initiated Re-Authentication**: + +Users can re-authenticate existing installations without reinstalling: + +**Endpoint**: `POST /api/teams/:teamId/mcp/installations/:installationId/reauth` + +**Permission**: `mcp.installations.view` (both team_admin and team_user) + +**Why Both Roles**: OAuth tokens are per-user credentials, not team-level configuration. Each user authenticates with their own account. + +**Flow**: +1. Frontend detects `requires_reauth` status and shows "Re-authenticate" button +2. User clicks button → Backend starts OAuth flow (same as initial authorization) +3. OAuth provider redirects to authorization page +4. User authorizes → Callback exchanges code for new tokens +5. 
Backend **updates** existing token record (doesn't create new installation) +6. Installation status → `connecting` → `online` +7. Satellite receives updated tokens via configuration sync + +**Key Difference from Initial Authorization**: +- Initial: Creates new installation + new token record +- Re-auth: Updates existing installation + existing token record + +**Database Impact**: +- Pending flow created with `installation_id` reference (links to existing installation) +- Callback detects `installation_id` and performs UPDATE instead of INSERT +- Preserves team configuration (env vars, args, headers) + +**Security**: Same PKCE flow, state validation, and token encryption as initial authorization ### Token Revocation diff --git a/development/backend/satellite/commands.mdx b/development/backend/satellite/commands.mdx index c8cdeaf..ec490a7 100644 --- a/development/backend/satellite/commands.mdx +++ b/development/backend/satellite/commands.mdx @@ -76,23 +76,28 @@ interface CommandPayload { ## Status Changes Triggered by Commands -Commands trigger installation status changes through satellite event emission: +Commands trigger **per-user instance** status changes through satellite event emission: -| Command | Status Before | Status After | When | -|---------|--------------|--------------|------| -| `configure` (install) | N/A | `provisioning` → `command_received` → `connecting` | Installation creation flow | +| Command | Instance Status Before | Instance Status After | When | +|---------|----------------------|---------------------|------| +| `configure` (install) | N/A | `provisioning` or `awaiting_user_config` | New installation - depends on required user config | | `configure` (update) | `online` | `restarting` → `online` | Configuration change applied | -| `configure` (delete) | Any | Process terminated | Installation removal | -| `health_check` (credential) | `online` | `requires_reauth` | OAuth token invalid | +| `configure` (delete) | Any | Instance deleted (CASCADE) | Installation removal | +| `health_check` (credential) | `online` | `requires_reauth` | OAuth token invalid for this user | | `restart` | `online` | `restarting` → `online` | Manual restart requested | -**Status Lifecycle on Installation**: -1. Backend creates installation → status=`provisioning` -2. Backend sends `configure` command → status=`command_received` -3. Satellite connects to server → status=`connecting` -4. Satellite discovers tools → status=`discovering_tools` -5. Satellite syncs tools to backend → status=`syncing_tools` -6. Process complete → status=`online` +**Per-User Status Tracking**: Each team member's instance has independent status. User A's config change only affects User A's instance status. + +**Status Lifecycle on Installation** (per user): +1. Backend creates installation → creates instances for all team members +2. If user has required config → status=`provisioning` +3. If user missing required config → status=`awaiting_user_config` (not sent to satellite) +4. User configures settings → status changes to `provisioning` → satellite spawns +5. Satellite connects → status=`connecting` +6. Satellite discovers tools → status=`discovering_tools` +7. Process complete → status=`online` + +For complete instance lifecycle documentation, see [Instance Lifecycle](/development/satellite/instance-lifecycle). For complete status transition documentation, see [Backend Events - Status Values](/development/backend/satellite/events#mcp-server-status_changed). 
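A small sketch of the branch described in steps 2–3 above (helper names are hypothetical; the real logic lives in the backend installation flow):

```typescript
// Hypothetical helper: decide a new instance's initial status for one user.
type InitialInstanceStatus = 'awaiting_user_config' | 'provisioning';

function initialInstanceStatus(
  requiredUserFields: string[],
  userConfig: Record<string, string | undefined>
): InitialInstanceStatus {
  const missing = requiredUserFields.filter((field) => !userConfig[field]);
  // Instances missing required user-level config are NOT sent to the satellite,
  // so no process is spawned until the user completes configuration.
  return missing.length > 0 ? 'awaiting_user_config' : 'provisioning';
}
```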
@@ -118,6 +123,11 @@ All `configure` commands include an `event` field in the payload for tracking an 3. Identifies added/removed/modified servers 4. Takes appropriate action +**Per-User Instance Impact**: Configure commands trigger per-user instance creation/updates: +- Installation created → All team members get instances (status depends on required user config) +- Installation updated → Only affects users with existing configs (others remain `awaiting_user_config`) +- Installation deleted → CASCADE deletes all user instances + **Purpose of Event Field**: - Database record keeping - Structured logging diff --git a/development/backend/satellite/communication.mdx b/development/backend/satellite/communication.mdx index a6a9dde..3ead12d 100644 --- a/development/backend/satellite/communication.mdx +++ b/development/backend/satellite/communication.mdx @@ -25,9 +25,9 @@ The satellite communication system includes: **Team-Aware Configuration Distribution**: - Global satellites receive ALL team MCP server installations -- Each team installation becomes a separate process with unique identifier -- Process ID format: `{server_slug}-{team_slug}-{installation_id}` -- Team-specific configurations (args, environment, headers) merged per installation +- Each team installation becomes separate per-user processes with unique identifiers +- Process ID format: `{server_slug}-{team_slug}-{user_slug}-{installation_id}` +- User-specific configurations (args, environment, headers) merged per user instance **Configuration Merging Process**: 1. Template-level configuration (from MCP server definition) @@ -268,7 +268,9 @@ The backend provides REST and SSE endpoints for frontend access to installation ### Status & Monitoring Endpoints **GET `/api/teams/{teamId}/mcp/installations/{installationId}/status`** -- Returns current installation status, status message, and last update timestamp +- Returns authenticated user's instance status only (not installation-level) +- Each user sees ONLY their own instance status, not other team members' statuses +- Response: `{ status, status_message, status_updated_at, last_health_check_at }` - Used by frontend for real-time status badges and progress indicators **GET `/api/teams/{teamId}/mcp/installations/{installationId}/logs`** @@ -445,36 +447,21 @@ The satellite system integrates with existing DeployStack schema through 5 speci - Alert generation and notification triggers - Historical health trend analysis -### New Columns Added (Status & Health Tracking System) - -**mcpServerInstallations** table: -- `status` (text) - Current installation status (11 possible values) -- `status_message` (text, nullable) - Human-readable status context or error details -- `status_updated_at` (timestamp) - Last status change timestamp -- `last_health_check_at` (timestamp, nullable) - Last health check execution time -- `last_credential_check_at` (timestamp, nullable) - Last credential validation time -- `settings` (jsonb, nullable) - Generic settings object (e.g., `request_logging_enabled`) - -**mcpServers** table: -- `health_status` (text, nullable) - Template-level aggregated health status -- `last_health_check_at` (timestamp, nullable) - Last template health check time -- `health_check_error` (text, nullable) - Last health check error message - -**mcpServerLogs** table: -- Stores batched stderr logs from satellites -- 100-line limit per installation (enforced by cleanup cron job) -- Fields: `installation_id`, `team_id`, `log_level`, `message`, `timestamp` - -**mcpRequestLogs** table: -- Stores 
batched tool execution logs -- `tool_response` (jsonb, nullable) - MCP server response data -- Privacy control: Only captured when `request_logging_enabled=true` -- Fields: `installation_id`, `team_id`, `tool_name`, `request_params`, `tool_response`, `duration_ms`, `success`, `error_message`, `timestamp` - -**mcpToolMetadata** table: -- Stores discovered tools with token counts -- Used for hierarchical router token savings calculations -- Fields: `installation_id`, `server_slug`, `tool_name`, `description`, `input_schema`, `token_count`, `discovered_at` +### Database Integration + +**Per-User Instance Model**: Each team member gets their own instance with independent status tracking. Status exists ONLY at the instance level in `mcpServerInstances` table. + +**Key Tables**: +- `mcpServerInstances` - Per-user instance status tracking +- `satellites` - Satellite registry +- `satelliteCommands` - Command queue +- `satelliteProcesses` - Process tracking +- `satelliteUsageLogs` - Usage analytics +- `satelliteHeartbeats` - Health monitoring + +For complete database schema details, see [Database Schema](/development/backend/database/). + +For complete instance lifecycle documentation, see [Instance Lifecycle](/development/satellite/instance-lifecycle). ### Team Isolation in Data Model @@ -602,15 +589,11 @@ server.get('/satellites/:satelliteId/commands', { ### Database Integration -The satellite system extends the existing database schema with 5 specialized tables: +The satellite system uses specialized tables for registry, commands, processes, usage logs, and heartbeats. -**Schema Location**: `services/backend/src/db/schema.ts` +**Per-User Instance Model**: `mcpServerInstances` table tracks status for each user's instance independently, with CASCADE delete when installation is removed. -**Table Relationships**: -- `satellites` table links to existing `teams` and `authUser` tables -- `satelliteProcesses` table references `mcpServerInstallations` for team context -- `satelliteCommands` table includes team context for command execution -- All tables use existing foreign key relationships for data integrity +For complete database schema and relationships, see [Database Schema](/development/backend/database/). ### Configuration Query Implementation diff --git a/development/backend/satellite/events.mdx b/development/backend/satellite/events.mdx index d77b0e4..200ce3e 100644 --- a/development/backend/satellite/events.mdx +++ b/development/backend/satellite/events.mdx @@ -51,16 +51,9 @@ services/backend/src/events/satellite/ ### Convention-Based Handler Discovery -The dispatcher automatically discovers and registers handlers from the `handlerModules` array in `index.ts`: - -```typescript -const handlerModules = [ - () => import('./mcp-server-started'), - () => import('./mcp-tool-executed'), - () => import('./mcp-server-crashed'), - // Add new handlers here - they will be automatically registered -]; -``` +The dispatcher automatically discovers and registers handlers from the `handlerModules` array. 
+ +**Source:** [services/backend/src/events/satellite/index.ts](services/backend/src/events/satellite/index.ts) Each handler must export three components: @@ -70,20 +63,9 @@ Each handler must export three components: ### Handler Interface -All event handlers must implement this interface: - -```typescript -export interface EventHandler { - EVENT_TYPE: string; - SCHEMA: Record; - handle: ( - satelliteId: string, - eventData: Record, - db: PostgresJsDatabase, - eventTimestamp: Date - ) => Promise; -} -``` +All event handlers must implement the `EventHandler` interface. + +**Source:** [services/backend/src/events/satellite/index.ts](services/backend/src/events/satellite/index.ts) ## Event Processing @@ -101,7 +83,7 @@ export interface EventHandler { "type": "mcp.server.started", "timestamp": "2025-01-10T10:30:45.123Z", "data": { - "process_id": "proc-123", + "process_id": "filesystem-devteam-alice-U3hCfHenbK5", "server_id": "filesystem-team-xyz", "server_slug": "filesystem", "team_id": "team-xyz", @@ -189,16 +171,18 @@ Updates `satelliteProcesses` table when server exits unexpectedly. **Optional Fields**: None (all fields required for proper crash tracking) #### mcp.server.status_changed -Updates `mcpServerInstallations` table when server status changes during installation, discovery, or health checks. +Updates `mcpServerInstances` table when server status changes for a specific user's instance during installation, discovery, or health checks. + +**Per-User Instance Status**: Each user has an independent instance with independent status tracking. Status exists ONLY at the instance level in `mcpServerInstances` table, not at the installation level. -**Business Logic**: Tracks installation lifecycle from provisioning through discovery to online/error states. Enables frontend progress indicators and error visibility. +**Business Logic**: Tracks per-user instance lifecycle from provisioning through discovery to online/error states. Enables frontend progress indicators and error visibility for each user's instance independently. -**Required Fields** (snake_case): `installation_id`, `team_id`, `status`, `timestamp` +**Required Fields** (snake_case): `installation_id`, `team_id`, `user_id`, `status`, `timestamp` **Optional Fields**: `status_message` (string, human-readable context or error details) -**Status Values** (11 total): -- `provisioning` - Installation created, waiting for satellite +**Status Values** (12 total): +- `provisioning` - Instance created, waiting for satellite - `command_received` - Satellite acknowledged install command - `connecting` - Satellite connecting to MCP server - `discovering_tools` - Tool discovery in progress @@ -209,11 +193,14 @@ Updates `mcpServerInstallations` table when server status changes during install - `error` - Connection failed with specific error - `requires_reauth` - OAuth token expired/revoked - `permanently_failed` - Process crashed 3+ times in 5 minutes +- `awaiting_user_config` - User hasn't configured required personal credentials yet (backend filters these instances from satellite config) **Handler Implementation**: `services/backend/src/events/handlers/mcp/status-changed.handler.ts` For satellite-side status detection logic and lifecycle flows, see [Satellite Status Tracking](/development/satellite/status-tracking). +For complete instance lifecycle and per-user isolation details, see [Instance Lifecycle](/development/satellite/instance-lifecycle). 
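As an illustration of the per-user update this handler performs, here is a sketch using Drizzle; the table import path and column names are assumptions, so refer to the handler source above for the actual implementation:

```typescript
import { and, eq } from 'drizzle-orm';
import type { PostgresJsDatabase } from 'drizzle-orm/postgres-js';
// Assumed import path for illustration purposes.
import { mcpServerInstances } from '../../db/schema';

interface StatusChangedData {
  installation_id: string;
  user_id: string;
  status: string;
  status_message?: string;
}

// Sketch: status is written to the specific user's instance row, never to the installation.
async function applyStatusChange(
  db: PostgresJsDatabase,
  data: StatusChangedData,
  eventTimestamp: Date
): Promise<void> {
  await db.update(mcpServerInstances)
    .set({
      status: data.status,
      status_message: data.status_message ?? null,
      status_updated_at: eventTimestamp,
    })
    .where(and(
      eq(mcpServerInstances.installation_id, data.installation_id),
      eq(mcpServerInstances.user_id, data.user_id)
    ));
}
```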
+ **Emission Points**: - Success path: After successful tool discovery → status='online' - Failure path: On connection errors → status='offline', 'error', or 'requires_reauth' @@ -274,89 +261,31 @@ For satellite-side tool discovery implementation, see [Satellite Tool Discovery] ## Creating New Event Handlers -### Handler Template - **CRITICAL**: All event data fields MUST use **snake_case** naming convention to match satellite event emission and backend API standards. -Create a new file in `services/backend/src/events/satellite/`: - -```typescript -import type { PostgresJsDatabase } from 'drizzle-orm/postgres-js'; -import { yourTable } from '../../db/schema.ts'; -import { eq } from 'drizzle-orm'; +### Example Handler Files -export const EVENT_TYPE = 'your.event.type'; - -export const SCHEMA = { - type: 'object', - properties: { - required_field: { - type: 'string', - minLength: 1, - description: 'Description of this field' - }, - optional_field: { - type: 'number', - description: 'Optional numeric field' - } - }, - required: ['required_field'], - additionalProperties: true -} as const; - -// TypeScript interface can use camelCase internally -interface YourEventData { - required_field: string; - optional_field?: number; -} +See existing event handlers for implementation patterns: -export async function handle( - satelliteId: string, - eventData: Record, - db: PostgresJsDatabase, - eventTimestamp: Date -): Promise { - const data = eventData as unknown as YourEventData; - - // Update existing business table - await db.update(yourTable) - .set({ - status: 'updated', - updated_at: eventTimestamp - }) - .where(eq(yourTable.id, data.required_field)); -} -``` +- **Server lifecycle**: [mcp-server-started.ts](services/backend/src/events/satellite/mcp-server-started.ts) +- **Status updates**: [mcp-server-status-changed.ts](services/backend/src/events/satellite/mcp-server-status-changed.ts) +- **Tool execution**: [mcp-tool-executed.ts](services/backend/src/events/satellite/mcp-tool-executed.ts) +- **Tool discovery**: [mcp-tools-discovered.ts](services/backend/src/events/satellite/mcp-tools-discovered.ts) ### Registration Steps 1. Create handler file in `services/backend/src/events/satellite/` 2. Export `EVENT_TYPE`, `SCHEMA`, and `handle()` function -3. Add import to `handlerModules` array in `index.ts`: -```typescript -const handlerModules = [ - () => import('./mcp-server-started'), - () => import('./mcp-tool-executed'), - () => import('./mcp-server-crashed'), - () => import('./your-new-handler'), // Add here -]; -``` +3. Add import to `handlerModules` array in [index.ts](services/backend/src/events/satellite/index.ts) 4. Handler is automatically registered and ready to process events ## Schema Validation ### AJV Configuration -The dispatcher uses AJV with specific configuration for compatibility: +The dispatcher uses AJV with specific configuration for compatibility. -```typescript -const ajv = new Ajv({ - allErrors: true, // Report all validation errors - strict: false, // Allow unknown keywords - strictTypes: false // Disable strict type checking -}); -addFormats(ajv); // Add format validators (email, date-time, etc.) 
-``` +**Source:** [services/backend/src/events/satellite/index.ts](services/backend/src/events/satellite/index.ts) ### Validation Process @@ -384,7 +313,7 @@ Events route to existing business tables based on their purpose: |-----------|----------------|--------| | `mcp.server.started` | `satelliteProcesses` | Update status='running', set start time | | `mcp.server.crashed` | `satelliteProcesses` | Update status='failed', log error details | -| `mcp.server.status_changed` | `mcpServerInstallations` | Update status, status_message, status_updated_at | +| `mcp.server.status_changed` | `mcpServerInstances` | Update status, status_message, status_updated_at for specific user's instance | | `mcp.tool.executed` | `satelliteUsageLogs` | Insert usage record with metrics | | `mcp.server.logs` | `mcpServerLogs` | Insert batched stderr logs (100-line limit) | | `mcp.request.logs` | `mcpRequestLogs` | Insert tool execution logs with request/response | @@ -458,17 +387,9 @@ Each event is processed in a separate database transaction: ### Unit Testing -Test individual event handlers in isolation: +Test individual event handlers in isolation. -```typescript -// Test handler validation -const validData = { processId: 'proc-123', serverId: 'server-xyz', ... }; -await handler.handle('satellite-id', validData, mockDb, new Date()); - -// Test schema validation -const validate = ajv.compile(handler.SCHEMA); -expect(validate(validData)).toBe(true); -``` +**Example tests:** See existing handler test files in [services/backend/src/events/satellite/](services/backend/src/events/satellite/) (`.test.ts` files) ### Integration Testing @@ -527,16 +448,9 @@ All event operations are logged with structured data: ### Debug Queries -Check registered event types: - -```typescript -import { getRegisteredEventTypes } from '../events/satellite'; - -const types = await getRegisteredEventTypes(); -console.log('Registered event types:', types); -``` +**Check registered event types:** Use `getRegisteredEventTypes()` from [services/backend/src/events/satellite/index.ts](services/backend/src/events/satellite/index.ts) -Verify database updates: +**Verify database updates:** ```sql -- Check process status after mcp.server.started diff --git a/development/satellite/architecture.mdx b/development/satellite/architecture.mdx index 7df1c60..7a043fc 100644 --- a/development/satellite/architecture.mdx +++ b/development/satellite/architecture.mdx @@ -1,10 +1,10 @@ --- title: Satellite Architecture Design -description: Complete architectural overview of DeployStack Satellite - from current MCP transport implementation to full enterprise MCP management platform. +description: Complete architectural overview of DeployStack Satellite - per-user MCP instance management with dual deployment support. --- -DeployStack Satellite is an edge worker service that manages MCP servers with dual deployment support: HTTP proxy for external endpoints and stdio subprocess for local MCP servers. This document covers both the current MCP transport implementation and the planned full architecture. +DeployStack Satellite is an edge worker service that manages **per-user MCP server instances** with dual deployment support: HTTP proxy for external endpoints and stdio subprocess for local MCP servers. Each team member gets their own isolated instance with merged configuration (Template + Team + User). 
## Technical Overview @@ -13,12 +13,13 @@ DeployStack Satellite is an edge worker service that manages MCP servers with du Satellites operate as edge workers similar to GitHub Actions runners, providing: - **MCP Transport Protocols**: SSE, Streamable HTTP, Direct HTTP communication -- **Dual MCP Server Management**: HTTP proxy + stdio subprocess support (ready for implementation) -- **Team Isolation**: nsjail sandboxing with built-in resource limits (ready for implementation) -- **OAuth 2.1 Resource Server**: Token introspection with Backend +- **Per-User Instance Management**: Each team member has their own MCP server instance (implemented) +- **Dual MCP Server Management**: HTTP proxy + stdio subprocess support (implemented) +- **Team and User Isolation**: Per-user process isolation with independent status tracking (implemented) +- **OAuth 2.1 Resource Server**: Token introspection with Backend for team and user context (implemented) - **Backend Polling Communication**: Outbound-only, firewall-friendly - **Real-Time Event System**: Immediate satellite → backend event emission with automatic batching -- **Process Lifecycle Management**: Spawn, monitor, terminate MCP servers (ready for implementation) +- **Process Lifecycle Management**: Per-user spawn, monitor, terminate with independent lifecycles (implemented) - **Background Jobs System**: Cron-like recurring tasks with automatic error handling ## Current Implementation Architecture @@ -294,20 +295,29 @@ See [Event Emission](/development/satellite/event-emission) for complete event t ### Status Tracking System -The satellite tracks MCP server installation health through an 11-state status system that drives tool availability and automatic recovery. +The satellite tracks per-user MCP server **instance** health through a 12-state status system that drives tool availability and automatic recovery. + +**Per-User Status Tracking:** +- **Status Location**: `mcpServerInstances` table (per user) +- **No Installation Status**: Status fields completely removed from `mcpServerInstallations` +- **Independent Tracking**: Each team member has independent status for each MCP server +- **User-Specific Filtering**: Users see only tools from their OWN instances that are online **Status Values:** -- Installation lifecycle: `provisioning`, `command_received`, `connecting`, `discovering_tools`, `syncing_tools` +- User configuration: `awaiting_user_config` (new - user hasn't configured required user-level fields) +- Instance lifecycle: `provisioning`, `command_received`, `connecting`, `discovering_tools`, `syncing_tools` - Healthy state: `online` (tools available) - Configuration changes: `restarting` - Failure states: `offline`, `error`, `requires_reauth`, `permanently_failed` **Status Integration:** -- **Tool Filtering**: Tools from non-online servers hidden from discovery -- **Auto-Recovery**: Offline servers auto-recover when responsive -- **Event Emission**: Status changes emitted immediately to backend +- **Tool Filtering**: Tools from user's non-online instances hidden from discovery +- **Auto-Recovery**: Offline instances auto-recover when responsive +- **Event Emission**: Status changes emitted immediately to backend with `user_id` field +- **Backend Filtering**: Instances with `awaiting_user_config` NOT sent to satellite (prevents spawn crashes) See [Status Tracking](/development/satellite/status-tracking) for complete status lifecycle and transitions. 
+See [Instance Lifecycle](/development/satellite/instance-lifecycle) for per-user instance creation and management. ### Log Capture System @@ -325,34 +335,44 @@ See [Log Capture](/development/satellite/log-capture) for buffering implementati - **Activity Tracking**: Prevents session hijacking - **Error Handling**: Secure error responses -### Planned Security Features +### Security Features (Implemented) + +**Per-User Instance Isolation:** +- **Process Isolation**: Each user's instance runs in isolated process +- **Independent Lifecycle**: Terminating one user's process doesn't affect teammates +- **User-Specific Config**: Merged Template + Team + User configuration per instance +- **Status Isolation**: Each user's instance has independent status tracking -**Team Isolation:** -- **Linux Namespaces**: PID, network, filesystem isolation -- **Process Groups**: Separate process trees per team -- **User Isolation**: Dedicated system users per team +**Team and User Separation:** +- **OAuth Token Context**: Team ID AND User ID extracted from tokens +- **Instance Resolution**: Tools route to user's specific instance (not teammates) +- **Database Separation**: `mcpServerInstances` table tracks per-user instances -**Resource Management:** -- **cgroups v2**: CPU and memory limits -- **Resource Quotas**: 0.1 CPU cores, 100MB RAM per process -- **Automatic Cleanup**: 5-minute idle timeout +**Resource Management (stdio processes):** +- **nsjail Isolation**: PID, network, filesystem isolation in production +- **Resource Quotas**: 100MB RAM, 60s CPU time per process +- **Development Mode**: Direct spawn() without isolation for cross-platform development **Authentication & Authorization:** -- **OAuth 2.1 Resource Server**: Backend token validation -- **Scope-Based Access**: Fine-grained permissions -- **Team Context**: Automatic team resolution from tokens +- **OAuth 2.1 Resource Server**: Backend token validation with 5-minute caching +- **User Context**: Automatic user and team resolution from tokens +- **Per-User Access Control**: Users only access their OWN instances + +See [Team Isolation](/development/satellite/team-isolation) for complete implementation details. 
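A minimal sketch of how that per-user routing works (the processId format is the documented one; the helper and its types are assumptions for illustration):

```typescript
// Team and user context extracted from the introspected OAuth token.
interface TokenContext {
  teamSlug: string;
  userSlug: string;
}

// Documented format: {server_slug}-{team_slug}-{user_slug}-{installation_id}
function resolveProcessId(
  serverSlug: string,
  installationId: string,
  ctx: TokenContext
): string {
  return `${serverSlug}-${ctx.teamSlug}-${ctx.userSlug}-${installationId}`;
}

// resolveProcessId('filesystem', 'abc123', { teamSlug: 'acme', userSlug: 'alice' })
// → 'filesystem-acme-alice-abc123'
```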
## MCP Server Management ### Dual MCP Server Support **stdio Subprocess Servers:** +- **Per-User Instances**: Each team member has their own process for each MCP server - **Local Execution**: MCP servers as Node.js child processes - **JSON-RPC Communication**: Full MCP protocol 2025-11-05 over stdin/stdout -- **Process Lifecycle**: Spawn, monitor, auto-restart (max 3 attempts), terminate -- **Team Isolation**: Processes tracked by team_id with environment-based security -- **Tool Discovery**: Automatic tool caching with namespacing -- **Resource Limits**: nsjail in production (100MB RAM, 60s CPU, 50 processes) +- **Process Lifecycle**: Per-user spawn, monitor, auto-restart (max 3 attempts), terminate +- **Instance Isolation**: Processes tracked by `team_id` AND `user_id` with independent lifecycles +- **ProcessId Format**: `{server_slug}-{team_slug}-{user_slug}-{installation_id}` +- **Tool Discovery**: Automatic tool caching with per-user namespacing +- **Resource Limits**: nsjail in production (100MB RAM, 60s CPU, 50 processes per user) - **Development Mode**: Plain spawn() on all platforms for easy debugging **HTTP Proxy Servers:** @@ -395,9 +415,10 @@ Configuration → Spawn → Monitor → Health Check → Restart/Terminate - **Isolation Method**: Linux namespaces + cgroups v2 ### Technology Stack -- **HTTP Framework**: Fastify with @fastify/http-proxy (planned) -- **Process Communication**: stdio JSON-RPC for local MCP servers (planned) -- **Authentication**: OAuth 2.1 Resource Server with token introspection (planned) +- **HTTP Framework**: Fastify with @fastify/http-proxy (implemented) +- **Process Communication**: stdio JSON-RPC for local MCP servers (implemented) +- **Authentication**: OAuth 2.1 Resource Server with token introspection (implemented) +- **Per-User Instance Management**: ProcessManager with team and user tracking (implemented) - **Logging**: Pino structured logging - **Build System**: TypeScript + Webpack diff --git a/development/satellite/event-emission.mdx b/development/satellite/event-emission.mdx index 46ae8e9..f29f8e3 100644 --- a/development/satellite/event-emission.mdx +++ b/development/satellite/event-emission.mdx @@ -10,7 +10,7 @@ The satellite communicates with the backend through a centralized EventBus that ## Overview The satellite emits events for: -- **Status Changes**: Real-time installation status updates +- **Status Changes**: Real-time instance status updates (per-user) - **Server Logs**: Batched stderr output from MCP servers - **Request Logs**: Batched tool execution logs with request/response data - **Tool Metadata**: Tool discovery results with token counts @@ -18,6 +18,10 @@ The satellite emits events for: All events are processed by the backend's event handler system and trigger database updates, SSE broadcasts to frontend, and health monitoring actions. + +**Per-User Instance Events:** Status change events now include `user_id` field to target the correct user's instance. Each user has independent status tracking in the `mcpServerInstances` table. 
+ + ## Event System Architecture ``` @@ -36,7 +40,7 @@ Frontend SSE Streams (real-time updates to users) ### mcp.server.status_changed -**Purpose:** Update installation status in real-time +**Purpose:** Update instance status in real-time (per-user) **Emitted by:** - ProcessManager (connecting, online, crashed, permanently_failed) @@ -50,9 +54,10 @@ For complete status transition triggers and lifecycle flows, see [Status Trackin { installation_id: string; team_id: string; - status: 'provisioning' | 'command_received' | 'connecting' | 'discovering_tools' - | 'syncing_tools' | 'online' | 'restarting' | 'offline' | 'error' - | 'requires_reauth' | 'permanently_failed'; + user_id: string; // NEW: Identifies which user's instance + status: 'awaiting_user_config' | 'provisioning' | 'command_received' | 'connecting' + | 'discovering_tools' | 'syncing_tools' | 'online' | 'restarting' | 'offline' + | 'error' | 'requires_reauth' | 'permanently_failed'; status_message?: string; timestamp: string; // ISO 8601 } @@ -63,13 +68,18 @@ For complete status transition triggers and lifecycle flows, see [Status Trackin eventBus.emit('mcp.server.status_changed', { installation_id: 'inst_abc123', team_id: 'team_xyz', + user_id: 'user_alice', // Alice's instance status: 'online', status_message: 'Server connected successfully', timestamp: '2025-01-15T10:30:00.000Z' }); ``` -**Backend Action:** Updates `mcpServerInstallations.status` and broadcasts via SSE +**Backend Action:** Updates `mcpServerInstances.status` for the specific user's instance and broadcasts via SSE + + +**Per-User Status:** The `user_id` field ensures status updates are applied to the correct user's instance. Status exists ONLY in `mcpServerInstances` table (removed from `mcpServerInstallations`). + --- @@ -83,13 +93,14 @@ eventBus.emit('mcp.server.status_changed', { **Batching Strategy:** - **Interval**: 3 seconds after first log entry - **Max Size**: 20 logs per batch (forces immediate flush) -- **Grouping**: By `installation_id + team_id` +- **Grouping**: By `installation_id + team_id + user_id` (per-user instance) **Payload:** ```typescript { installation_id: string; team_id: string; + user_id: string; // Per-user instance logs logs: Array<{ level: 'info' | 'warn' | 'error' | 'debug'; message: string; @@ -104,6 +115,7 @@ eventBus.emit('mcp.server.status_changed', { eventBus.emit('mcp.server.logs', { installation_id: 'inst_abc123', team_id: 'team_xyz', + user_id: 'user_alice', // Alice's instance logs logs: [ { level: 'error', @@ -120,7 +132,7 @@ eventBus.emit('mcp.server.logs', { }); ``` -**Backend Action:** Inserts logs into `mcpServerLogs` table, enforces 100-line limit per installation +**Backend Action:** Inserts logs into `mcpServerLogs` table, enforces 100-line limit per user instance --- @@ -134,7 +146,7 @@ eventBus.emit('mcp.server.logs', { **Batching Strategy:** - **Interval**: 3 seconds after first request - **Max Size**: 20 requests per batch -- **Grouping**: By `installation_id + team_id` +- **Grouping**: By `installation_id + team_id + user_id` (per-user instance) **Payload:** ```typescript @@ -173,7 +185,7 @@ eventBus.emit('mcp.request.logs', { }); ``` -**Backend Action:** Inserts requests into `mcpRequestLogs` table, enforces 100-line limit +**Backend Action:** Inserts requests into `mcpRequestLogs` table, enforces 100-line limit per user instance **Privacy Note:** Only emitted if `settings.request_logging_enabled !== false` @@ -228,6 +240,12 @@ eventBus.emit('mcp.tools.discovered', { These events track stdio MCP server process 
state: + +**Per-User Process Context:** The `process_id` field uniquely identifies each user's process instance using the format `{server_slug}-{team_slug}-{user_slug}-{installation_id}`. This ensures process lifecycle events target the correct user's instance. + +Example: `filesystem-acme-alice-abc123` + + #### mcp.server.started **Emitted when:** Stdio process successfully spawned @@ -290,7 +308,7 @@ These events track stdio MCP server process state: } ``` -**Backend Action:** Sets installation status to `permanently_failed`, requires manual restart +**Backend Action:** Sets instance status to `permanently_failed` for the user's specific instance, requires manual restart --- @@ -310,7 +328,7 @@ Batching reduces: |-----------|-------|--------| | Batch Interval | 3 seconds | Balance between real-time feel and efficiency | | Max Batch Size | 20 entries | Prevent large payloads, force timely emission | -| Grouping Key | `installation_id + team_id` | Separate batches per installation | +| Grouping Key | `installation_id + team_id + user_id` | Separate batches per user instance | ### Batching Implementation @@ -330,6 +348,7 @@ const eventBus = EventBus.getInstance(); eventBus.emit('mcp.server.status_changed', { installation_id: 'inst_123', team_id: 'team_456', + user_id: 'user_alice', // Per-user instance status: 'online', timestamp: new Date().toISOString() }); @@ -421,7 +440,8 @@ The event emission system consists of several integrated components: ## Related Documentation -- [Status Tracking](/development/satellite/status-tracking) - Status values and lifecycle +- [Status Tracking](/development/satellite/status-tracking) - Per-user instance status values and lifecycle +- [Instance Lifecycle](/development/satellite/instance-lifecycle) - Four lifecycle processes for instance management - [Log Capture](/development/satellite/log-capture) - Logging system details -- [Process Management](/development/satellite/process-management) - Lifecycle events +- [Process Management](/development/satellite/process-management) - Per-user process lifecycle events - [Tool Discovery](/development/satellite/tool-discovery) - Tool metadata events diff --git a/development/satellite/hierarchical-router.mdx b/development/satellite/hierarchical-router.mdx index 9a11617..8c32438 100644 --- a/development/satellite/hierarchical-router.mdx +++ b/development/satellite/hierarchical-router.mdx @@ -361,9 +361,9 @@ MCP Client → POST /mcp Satellite Internal Process: 1. Parses tool_path: "github:create_issue" → "github-create_issue" -2. Looks up in cache: transport=stdio, serverName="github-team123-abc456" -3. Routes to ProcessManager for stdio execution -4. Sends JSON-RPC to actual GitHub MCP server process +2. Looks up in cache: transport=stdio, serverName="github-team123-alice-abc456" (user's instance) +3. Routes to ProcessManager for stdio execution (user's specific process) +4. Sends JSON-RPC to user's GitHub MCP server process 5. 
Returns result Satellite → Client @@ -398,7 +398,7 @@ const fuseOptions = { keys: [ { name: 'toolName', weight: 0.4 }, // 40% weight { name: 'description', weight: 0.35 }, // 35% weight - { name: 'serverSlug', weight: 0.25 } // 25% weight + { name: 'serverName', weight: 0.25 } // 25% weight ], includeScore: true, minMatchCharLength: 2, @@ -531,14 +531,19 @@ Both meta-tools are implemented and production-ready: ## Status-Based Tool Filtering -The hierarchical router integrates with status tracking to hide tools from unavailable servers and provide clear error messages when unavailable tools are executed. +The hierarchical router integrates with status tracking to hide tools from the user's unavailable instances and provide clear error messages when unavailable tools are executed. + + +**Per-User Filtering:** Each user sees only tools from their OWN instances that are online. Other team members' instance status doesn't affect your tool availability. + See [Status Tracking - Tool Filtering](/development/satellite/status-tracking#tool-filtering-by-status) for complete filtering logic, execution blocking rules, and status values. ## Related Documentation - [Tool Discovery Implementation](/development/satellite/tool-discovery) - Internal tool caching and discovery -- [Status Tracking](/development/satellite/status-tracking) - Tool filtering by server status +- [Status Tracking](/development/satellite/status-tracking) - Tool filtering by per-user instance status +- [Instance Lifecycle](/development/satellite/instance-lifecycle) - Per-user instance management - [Recovery System](/development/satellite/recovery-system) - How offline servers auto-recover - [MCP Transport Protocols](/development/satellite/mcp-transport) - How clients connect - [Process Management](/development/satellite/process-management) - stdio server lifecycle diff --git a/development/satellite/instance-lifecycle.mdx b/development/satellite/instance-lifecycle.mdx new file mode 100644 index 0000000..b6c7ebf --- /dev/null +++ b/development/satellite/instance-lifecycle.mdx @@ -0,0 +1,454 @@ +--- +title: Instance Lifecycle Management +description: Per-user MCP server instance lifecycle covering creation, deletion, and team membership changes in DeployStack Satellite. +--- + +DeployStack manages MCP server instances on a per-user basis, ensuring each team member has their own isolated process with merged configuration. This document covers the four key lifecycle processes that create, maintain, and clean up instances across team operations. 
+ +## Architecture Overview + +### Per-User Instance Model + +DeployStack follows a **per-user instance architecture**: + +``` +1 Installation × N Users = N Instances + +Example: +- Team "Acme Corp" installs Filesystem MCP +- Team has 3 members: Alice, Bob, Charlie +- Result: 3 separate instances (processes), one per user +``` + +**Key Concepts:** + +- **Installation**: MCP server installed for a team (row in `mcpServerInstallations`) +- **Instance**: Per-user running process with merged config (row in `mcpServerInstances`) +- **ProcessId**: Unique identifier for each instance + +### ProcessId Format + +Each instance has a unique ProcessId that includes the user identifier: + +``` +Format: {server_slug}-{team_slug}-{user_slug}-{installation_id} + +Example: filesystem-acme-alice-abc123 +``` + +This format enables: +- Unique process identification across all users and teams +- User-specific process routing via OAuth token +- Independent lifecycle management per user + +### Independent Status Tracking + +Each user's instance has **independent status tracking**: + +- Status exists ONLY in `mcpServerInstances` table +- No installation-level status aggregation across users +- Each user sees only their own instance status +- Other team members' status doesn't affect your tools + +## Lifecycle Process A: MCP Server Installation + +**Trigger:** Team admin installs MCP server for the team + +### Backend Operations + + + + Create `mcpServerInstallations` row (team-level installation record) + + + + Create FIRST `mcpServerInstances` row for the installing admin: + - `installation_id` → installation.id + - `user_id` → admin.id + - `status` → 'provisioning' (or 'awaiting_user_config' if admin didn't provide required user fields) + + + + For EACH existing team member (besides admin), create `mcpServerInstances` row: + - `installation_id` → installation.id + - `user_id` → member.id + - `status` → 'provisioning' (or 'awaiting_user_config' if server requires user-level config) + + + + Send `configure` command to all global satellites (priority: immediate): + ```json + { + "event": "mcp_installation_created", + "installation_id": "uuid", + "team_id": "uuid" + } + ``` + + + +### Satellite Operations + + + + Receive configure command via command polling service + + + + Fetch per-user configs from backend (includes all team members' processIds) + + + + Spawn per-user MCP processes for each team member (excluding those with `awaiting_user_config` status) + + + + Emit status events with `user_id` field: + ```json + { + "event": "mcp.server.status_changed", + "installation_id": "uuid", + "team_id": "uuid", + "user_id": "uuid", + "status": "provisioning", + "status_message": "Spawning MCP server process..." + } + ``` + + + + Status progression: `provisioning` → `connecting` → `discovering_tools` → `syncing_tools` → `online` + + + +### Result + +- Each team member gets their own instance with independent status +- Members who provided config can use MCP server immediately +- Members without required user-level config remain in `awaiting_user_config` status until they configure + + +**Special Case: awaiting_user_config Status** + +If an MCP server has required user-level configuration fields (e.g., personal API keys) and a user hasn't configured them, their instance is created with `status='awaiting_user_config'`. The satellite does NOT spawn processes for these instances until the user completes their configuration. See [Status Tracking](/development/satellite/status-tracking) for details. 
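To make the per-member step above concrete, here is a rough sketch of how the backend might build the instance rows during installation (table shape and helper names are assumptions; the JSON command payloads above are the documented contract):

```typescript
// Illustrative sketch only — one mcpServerInstances row per team member.
interface NewInstanceRow {
  installation_id: string;
  user_id: string;
  status: 'provisioning' | 'awaiting_user_config';
}

function buildInstanceRows(
  installationId: string,
  memberIds: string[],
  hasRequiredUserConfig: (userId: string) => boolean
): NewInstanceRow[] {
  return memberIds.map((userId) => ({
    installation_id: installationId,
    user_id: userId,
    // Members missing required user-level fields start in awaiting_user_config
    // and are not sent to the satellite until they finish configuration.
    status: hasRequiredUserConfig(userId) ? 'provisioning' : 'awaiting_user_config',
  }));
}
```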
+ + +--- + +## Lifecycle Process B: MCP Server Deletion + +**Trigger:** Team admin deletes MCP installation + +### Backend Operations + + + + Delete `mcpServerInstallations` row + + + + CASCADE automatically deletes ALL `mcpServerInstances` rows: + ```sql + -- Foreign key constraint ensures automatic cleanup + installation_id REFERENCES mcpServerInstallations(id) ON DELETE CASCADE + ``` + + + + Send `configure` command to all global satellites: + ```json + { + "event": "mcp_installation_deleted", + "installation_id": "uuid", + "team_id": "uuid" + } + ``` + + + +### Satellite Operations + + + + Receive configure command via command polling service + + + + Terminate ALL per-user processes for that installation (across all team members) + + + + Clean up process metadata and runtime state + + + + Remove installation from dynamic config cache + + + +### Result + +- All instances deleted from database +- All processes terminated on satellites +- No orphaned processes or database rows +- Complete cleanup across all team members + +--- + +## Lifecycle Process C: Team Member Added + +**Trigger:** Team admin adds new member to team + +### Backend Operations + + + + Create team membership record + + + + Query ALL existing MCP installations for this team: + ```sql + SELECT * FROM mcpServerInstallations WHERE team_id = :teamId + ``` + + + + For EACH installation, create `mcpServerInstances` row: + - `installation_id` → installation.id + - `user_id` → new_member.id + - `status` → 'provisioning' (or 'awaiting_user_config' if server requires user-level config) + + + + Send `configure` command to all global satellites (one per installation): + ```json + { + "event": "mcp_installation_created", + "installation_id": "uuid", + "team_id": "uuid", + "user_id": "uuid" + } + ``` + + + +### Satellite Operations + + + + Receive configure commands (one per team installation) + + + + Fetch updated per-user configs from backend (includes new member's processIds) + + + + Spawn per-user processes for new member (dormant pattern - excluding `awaiting_user_config` instances) + + + + Emit status events with new member's `user_id` + + + + Processes remain dormant until first client connection (OAuth token) + + + +### Result + +- New member has instances for ALL team MCP servers +- Processes spawn on demand when member makes first request +- Each instance has independent status (no aggregation) +- Member must configure required user-level fields before instances become online + +--- + +## Lifecycle Process D: Team Member Removed + +**Trigger:** Team admin removes member from team + +### Backend Operations + + + + Delete ALL `mcpServerInstances` rows for that user in this team: + ```sql + DELETE FROM mcpServerInstances + WHERE user_id = :userId + AND installation_id IN ( + SELECT id FROM mcpServerInstallations WHERE team_id = :teamId + ) + ``` + + + + Send `configure` command to all global satellites: + ```json + { + "event": "team_member_removed", + "team_id": "uuid", + "user_id": "uuid" + } + ``` + + + + Emit backend event: `TEAM_MEMBER_REMOVED` (audit trail and notifications) + + + +### Satellite Operations + + + + Receive configure command via command polling service + + + + Terminate ALL processes owned by that user_id in this team + + + + Clean up process metadata and cached OAuth tokens + + + + Remove user from runtime state + + + +### Result + +- All member's instances deleted from database +- All member's processes terminated on satellites +- No status recalculation needed (status only exists per-instance) +- Other 
team members' instances remain unaffected + +--- + +## Status Tracking Design + +### Per-User Status Only + +Status fields have been **completely removed** from `mcpServerInstallations` table. Status exists ONLY in `mcpServerInstances`: + +```sql +-- Query user's own instance status +SELECT status, status_message, status_updated_at, last_health_check_at +FROM mcpServerInstances +WHERE installation_id = :installationId + AND user_id = :authenticatedUserId +``` + +### API Behavior + +**Status Endpoints:** +- `GET /teams/:teamId/mcp/installations/:installationId/status` - Returns authenticated user's instance status only +- `GET /teams/:teamId/mcp/installations/:installationId/status-stream` - SSE stream of user's instance status changes +- No installation-level status aggregation across users + +### Why No Aggregation? + +- Each user has independent instance with independent status +- Admin seeing "online" doesn't mean other users' instances are online +- User's config changes only affect their own instance status +- Simpler architecture - single source of truth per user + +### Database Schema + +**Status Location:** +- `mcpServerInstances`: Has status fields (per user) ✅ +- `mcpServerInstallations`: NO status fields (removed) ❌ + +--- + +## Error Handling and Edge Cases + +### Scenario: Satellite sends status for non-existent instance + +**Behavior:** +- Backend logs error: "Instance not found for status update" +- No auto-creation (strict validation) +- Requires manual investigation and instance creation + +**Why This Happens:** +- Database instance deleted but satellite still has process running +- Timing issue between deletion and process termination +- Network delay in command delivery + +### Scenario: Member removed while instance is online + +**Behavior:** +- Backend deletes instance row first +- Satellite terminates process on next configure command poll +- Brief window where process runs without database record (acceptable) + +**Impact:** +- Process terminated within polling interval (2-60 seconds depending on priority) +- No data loss or security issue +- Graceful shutdown when command received + +### Scenario: Installation deleted with online instances + +**Behavior:** +- CASCADE delete removes all instances immediately +- Satellite terminates all processes on next poll +- Status events ignored (instances already deleted) + +**Impact:** +- Clean database state (no orphaned instances) +- Processes cleaned up automatically +- All team members' access revoked simultaneously + +### Scenario: Team member added but instance creation fails + +**Behavior:** +- Log error, continue with other installations +- Member addition succeeds (instances can be created manually later) +- No rollback - partial instance creation is acceptable + +**Why:** +- Team membership is independent of MCP instances +- Failed instance creation shouldn't block member from joining +- Manual retry available via admin interface + +### Scenario: Satellite offline during member add + +**Behavior:** +- Instance rows created with status 'provisioning' +- Satellite picks up on next heartbeat/command poll +- Eventually spawns processes for new member + +**Timeline:** +- Satellite comes online → polls backend +- Receives configure commands for new member +- Processes spawn as normal +- Status progresses to online + +--- + +## Related Documentation + + + + Complete status system with 12 states including awaiting_user_config + + + Per-user process spawning, termination, and lifecycle + + + OAuth-based team and user context 
resolution + + + Satellite command system for instance management + + + +--- + +## Summary + +The instance lifecycle system ensures each team member has their own isolated MCP server instance with independent status tracking. The four lifecycle processes (Installation, Deletion, Member Added, Member Removed) handle instance creation and cleanup across team operations. Status exists only at the instance level, providing clear per-user feedback without cross-user status aggregation. diff --git a/development/satellite/process-management.mdx b/development/satellite/process-management.mdx index 6a10fc3..5e34f62 100644 --- a/development/satellite/process-management.mdx +++ b/development/satellite/process-management.mdx @@ -1,19 +1,38 @@ --- title: Process Management -description: Technical implementation of stdio subprocess management for local MCP servers in DeployStack Satellite. +description: Technical implementation of per-user stdio subprocess management for local MCP servers in DeployStack Satellite. --- -DeployStack Satellite implements stdio subprocess management for local MCP servers through the ProcessManager component. This system handles spawning, monitoring, and lifecycle management of MCP server processes with dual-mode operation for development and production environments. +DeployStack Satellite implements per-user stdio subprocess management for local MCP servers through the ProcessManager component. Each team member gets their own isolated process for each MCP server installation, with dual-mode operation for development and production environments. ## Overview +### Per-User Process Architecture + +DeployStack manages MCP server processes on a **per-user basis**: + +- **1 Installation × N Users = N Processes**: Each team member has their own process for each MCP server +- **Independent Lifecycle**: Terminating one user's process doesn't affect other users' processes +- **User-Specific Config**: Each process runs with merged 3-tier config (Template + Team + User) +- **ProcessId Format**: `{server_slug}-{team_slug}-{user_slug}-{installation_id}` + +**Example:** +``` +Team "Acme Corp" installs Filesystem MCP with 3 members: +- Process 1: filesystem-acme-alice-abc123 (Alice's instance) +- Process 2: filesystem-acme-bob-abc123 (Bob's instance) +- Process 3: filesystem-acme-charlie-abc123 (Charlie's instance) + +Each process runs independently with user-specific configuration. 
+``` + **Core Components:** -- **ProcessManager**: Handles spawning, communication, and lifecycle of stdio-based MCP servers -- **RuntimeState**: Maintains in-memory state of all processes with team-grouped tracking -- **TeamIsolationService**: Validates team-based access control for process operations +- **ProcessManager**: Handles spawning, communication, and lifecycle of per-user stdio processes +- **RuntimeState**: Maintains in-memory state of all processes with team AND user tracking +- **TeamIsolationService**: Validates team and user access control for process operations **Deployment Modes:** - **Development**: Direct spawn without isolation (cross-platform) @@ -46,12 +65,47 @@ The system automatically selects the appropriate spawning mode based on environm ### Process Configuration Processes are spawned using MCPServerConfig containing: -- `installation_name`: Unique identifier in format `{server_slug}-{team_slug}-{installation_id}` +- `installation_name`: Unique per-user identifier in format `{server_slug}-{team_slug}-{user_slug}-{installation_id}` - `installation_id`: Database UUID for the installation - `team_id`: Team owning the process +- `user_id`: User owning this specific instance - `command`: Executable command (e.g., `npx`, `node`) -- `args`: Command arguments -- `env`: Environment variables (credentials, configuration) +- `args`: Command arguments (merged Template + Team + User args) +- `env`: Environment variables (merged Template + Team + User env vars, plus credentials) + + +**ProcessId Includes User:** The `installation_name` (also called `processId`) now includes the `user_slug` to uniquely identify each user's instance. This enables per-user process isolation and independent lifecycle management. + +Example: `filesystem-acme-alice-abc123` + + +### Backend Filtering (awaiting_user_config) + +The satellite does NOT receive configurations for instances with `awaiting_user_config` status: + +**Why:** +- MCP servers with required user-level fields (e.g., personal API keys) cannot spawn without complete configuration +- Backend filters out these instances in the config endpoint +- Satellite never attempts to spawn incomplete configurations +- Prevents process crashes from missing required arguments/environment variables + +**When it applies:** +- New team member joins but hasn't configured their personal credentials +- Admin installs MCP server but doesn't provide required user-level config during installation +- User's instance remains in `awaiting_user_config` until they complete configuration + +**Status transition:** +``` +awaiting_user_config + ↓ +[User configures personal settings via dashboard] + ↓ +provisioning (backend updates status, sends satellite command) + ↓ +Satellite receives config and spawns process normally +``` + +See [Instance Lifecycle](/development/satellite/instance-lifecycle) and [Status Tracking](/development/satellite/status-tracking) for complete details. 
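+
+To make the per-user spawning model concrete, the sketch below shows how a satellite could derive the per-user `processId` and spawn one process per delivered instance config. It is a minimal illustration with assumed names (`UserInstanceConfig`, `buildProcessId`, `spawnUserInstance`), not the actual ProcessManager API; because the backend already filters out `awaiting_user_config` instances, no additional guard is needed on the satellite side.
+
+```typescript
+import { spawn, ChildProcess } from 'node:child_process';
+
+// Assumed shape of one per-user instance config delivered by the backend.
+// Instances in 'awaiting_user_config' are never included in this payload.
+interface UserInstanceConfig {
+  server_slug: string;          // e.g. "filesystem"
+  team_slug: string;            // e.g. "acme"
+  user_slug: string;            // e.g. "alice"
+  installation_id: string;      // e.g. "abc123"
+  command: string;              // e.g. "npx"
+  args: string[];               // merged Template + Team + User args
+  env: Record<string, string>;  // merged Template + Team + User env vars
+}
+
+// processId format: {server_slug}-{team_slug}-{user_slug}-{installation_id}
+function buildProcessId(c: UserInstanceConfig): string {
+  return `${c.server_slug}-${c.team_slug}-${c.user_slug}-${c.installation_id}`;
+}
+
+const processes = new Map<string, ChildProcess>();
+
+// One process per user per installation; terminating one user's process
+// leaves every other team member's process untouched.
+function spawnUserInstance(c: UserInstanceConfig): void {
+  const processId = buildProcessId(c);
+  if (processes.has(processId)) return; // this user's instance is already running
+
+  const child = spawn(c.command, c.args, {
+    env: { ...process.env, ...c.env },
+    stdio: ['pipe', 'pipe', 'pipe'], // stdio transport to the MCP server
+  });
+  processes.set(processId, child);
+}
+```
+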
## MCP Handshake Protocol diff --git a/development/satellite/recovery-system.mdx b/development/satellite/recovery-system.mdx index bf8123d..00f95cc 100644 --- a/development/satellite/recovery-system.mdx +++ b/development/satellite/recovery-system.mdx @@ -347,12 +347,18 @@ class McpServerWrapper { ## Manual Recovery (Requires User Action) -Some failures cannot auto-recover: - -| Status | Reason | User Action | -|--------|--------|-------------| -| `requires_reauth` | OAuth token expired/revoked | Re-authenticate in dashboard | -| `permanently_failed` | 3+ crashes in 5 minutes (stdio) | Check logs, fix issue, manual restart | +Some failures cannot auto-recover and require user intervention: + +| Status | Reason | User Action | Recovery Type | +|--------|--------|-------------|---------------| +| `requires_reauth` | OAuth token expired/revoked | Click "Re-authenticate" button in installation details | Self-service (all team members) | +| `permanently_failed` | 3+ crashes in 5 minutes (stdio) | Check logs, fix issue, manual restart | Admin intervention | + +**Re-Authentication Details**: +- Available to all team members (OAuth is per-user) +- Preserves installation configuration +- Updates tokens in-place (no reinstall needed) +- See [MCP Server OAuth - Token Expiration Handling](/development/backend/mcp-server-oauth#token-expiration-handling) for technical details See [Process Management - Auto-Restart System](/development/satellite/process-management#auto-restart-system) for complete stdio restart policy details (3 crashes in 5-minute window, backoff delays). diff --git a/development/satellite/status-tracking.mdx b/development/satellite/status-tracking.mdx index 710408d..dc5e430 100644 --- a/development/satellite/status-tracking.mdx +++ b/development/satellite/status-tracking.mdx @@ -1,32 +1,55 @@ --- title: Status Tracking -description: MCP server installation status tracking system in the satellite +description: Per-user MCP server instance status tracking system in the satellite --- # Status Tracking -The satellite tracks the health and availability of each MCP server installation through an 11-state status system. This enables real-time monitoring, automatic recovery, and tool availability filtering. +The satellite tracks the health and availability of each MCP server **instance** (per-user) through a 12-state status system. This enables real-time monitoring, automatic recovery, and tool availability filtering on a per-user basis. ## Overview +### Per-User Instance Status + +DeployStack tracks status at the **instance level**, not the installation level: + +- **Status Location**: `mcpServerInstances` table (per user) +- **No Installation Status**: Status fields completely removed from `mcpServerInstallations` table +- **Independent Tracking**: Each team member has independent status for each MCP server +- **User-Specific Filtering**: Users see only tools from their OWN instances that are online + +**Example:** +``` +Team "Acme Corp" installs Filesystem MCP with 3 members: +- Alice's instance: status = 'online' (tools available to Alice) +- Bob's instance: status = 'awaiting_user_config' (tools NOT available to Bob) +- Charlie's instance: status = 'offline' (tools NOT available to Charlie) + +Each member sees different tool availability based on their instance status. +``` + +### Status Tracking Purposes + Status tracking serves three primary purposes: -1. **User Visibility**: Users see current server state in real-time via the frontend -2. 
**Tool Availability**: Tools from unavailable servers are filtered from discovery -3. **Automatic Recovery**: System detects and recovers from failures automatically +1. **User Visibility**: Users see their OWN instance status in real-time via the frontend +2. **Tool Availability**: Tools from user's unavailable instances are filtered from discovery +3. **Automatic Recovery**: System detects and recovers from failures per instance The status system is managed by `UnifiedToolDiscoveryManager` and updated through: -- Installation lifecycle events (provisioning → online) +- Instance lifecycle events (provisioning → online) - Health check results (online → offline) - Tool execution failures (online → offline/error/requires_reauth) - Configuration changes (online → restarting) - Recovery detection (offline → connecting → online) +- User configuration completion (awaiting_user_config → provisioning) ## Status Values | Status | Description | Tools Available? | User Action Required | |--------|-------------|------------------|---------------------| -| `provisioning` | Initial state after installation created | No | Wait | +| `awaiting_user_config` | User hasn't configured required user-level fields | No | **Configure personal settings** | +| `provisioning` | Initial state after instance created | No | Wait | | `command_received` | Satellite received configuration command | No | Wait | | `connecting` | Connecting to MCP server | No | Wait | | `discovering_tools` | Running tool discovery | No | Wait | @@ -38,9 +61,17 @@ The status system is managed by `UnifiedToolDiscoveryManager` and updated throug | `requires_reauth` | OAuth token expired/revoked | No | Re-authenticate | | `permanently_failed` | 3+ crashes in 5 minutes (stdio only) | No | Manual restart required | + +**New Status: awaiting_user_config** + +This status indicates that an MCP server has required user-level configuration fields (e.g., personal API keys) and the user hasn't configured them yet. The satellite does NOT spawn processes for instances with this status. Once the user completes their configuration via the dashboard, the status automatically transitions to `provisioning` and the instance spawns normally. + +See [Instance Lifecycle - Process A](/development/satellite/instance-lifecycle#lifecycle-process-a-mcp-server-installation) for details. + + ## Status Lifecycle -### Initial Installation Flow +### Initial Installation Flow (User Has Complete Config) ``` provisioning @@ -56,6 +87,32 @@ syncing_tools (sending tools to backend) online (ready for use) ``` +### Initial Installation Flow (User Missing Required Config) + +When an MCP server has required user-level configuration fields and the user hasn't configured them: + +``` +awaiting_user_config + ↓ +[User configures personal settings via dashboard] + ↓ +provisioning (backend updates status, sends satellite command) + ↓ +command_received + ↓ +connecting + ↓ +discovering_tools + ↓ +syncing_tools + ↓ +online +``` + + +**Backend Filtering:** The satellite does NOT receive configurations for instances with `awaiting_user_config` status. Backend filters these instances out in the config endpoint, preventing spawn attempts for incomplete configurations. 
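+
+To illustrate the transition above, a backend handler for the moment a user saves their required personal settings might look roughly like the following. The store and command-queue interfaces are assumed for the sketch and are not the actual backend APIs; the point is the single status flip from `awaiting_user_config` to `provisioning` followed by a satellite `configure` command.
+
+```typescript
+// Assumed dependency shapes for illustration; the real backend uses its own
+// database layer and satellite command queue.
+interface InstanceStore {
+  getInstance(installationId: string, userId: string): Promise<{ status: string; team_id: string }>;
+  updateInstanceStatus(installationId: string, userId: string, status: string): Promise<void>;
+}
+interface SatelliteCommands {
+  sendToGlobalSatellites(command: Record<string, string>): Promise<void>;
+}
+
+// awaiting_user_config -> provisioning, then tell satellites to re-poll config.
+async function onUserConfigCompleted(
+  db: InstanceStore,
+  commands: SatelliteCommands,
+  installationId: string,
+  userId: string,
+): Promise<void> {
+  const instance = await db.getInstance(installationId, userId);
+  if (instance.status !== 'awaiting_user_config') return; // nothing to do
+
+  await db.updateInstanceStatus(installationId, userId, 'provisioning');
+
+  // Exact command payload is not shown in this document; the satellite only
+  // needs to know which user's instance to pick up on its next config poll.
+  await commands.sendToGlobalSatellites({
+    installation_id: installationId,
+    team_id: instance.team_id,
+    user_id: userId,
+  });
+}
+```
+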
+ + ### Configuration Update Flow ``` @@ -203,20 +260,43 @@ this.statusCallback?.(processId, 'error', errorMessage); ## Tool Filtering by Status +### Per-User Instance Filtering + +Tool availability is filtered based on the authenticated user's OWN instance status: + +**Key Principles:** +- Each user sees only tools from their own instances that are `online` +- Other team members' instance status does NOT affect your tool availability +- If your instance is `awaiting_user_config`, you see NO tools from that server +- If your instance is `online`, you see all tools (even if teammates' instances are offline) + +**Example:** +``` +Team "Acme Corp" - Context7 MCP Server: +- Alice's instance: status = 'online' + → Alice sees Context7 tools in discover_mcp_tools + +- Bob's instance: status = 'awaiting_user_config' + → Bob sees NO Context7 tools (must configure API key first) + +- Charlie's instance: status = 'offline' + → Charlie sees NO Context7 tools (server unreachable) +``` + ### Discovery Filtering -When LLMs call `discover_mcp_tools`, only tools from available servers are returned: +When LLMs call `discover_mcp_tools`, only tools from the user's available instances are returned: ```typescript -// UnifiedToolDiscoveryManager.getAllTools() filters by status -const tools = toolDiscoveryManager.getAllTools(); // Only 'online' servers +// UnifiedToolDiscoveryManager.getAllTools() filters by user's instance status +const tools = toolDiscoveryManager.getAllTools(); // Only user's 'online' instances -// Tools from offline/error/requires_reauth servers are hidden +// Tools from user's offline/error/requires_reauth/awaiting_user_config instances are hidden ``` ### Execution Blocking -When LLMs attempt to execute tools from unavailable servers: +When LLMs attempt to execute tools from the user's unavailable instances: ```typescript // services/satellite/src/core/mcp-server-wrapper.ts @@ -227,10 +307,10 @@ const statusEntry = this.toolDiscoveryManager?.getServerStatus(serverSlug); // Block execution for non-recoverable states if (statusEntry?.status === 'requires_reauth') { return { - error: `Tool cannot be executed - server requires re-authentication. + error: `Tool cannot be executed - your instance requires re-authentication. Status: ${statusEntry.status} -The server requires re-authentication. Please re-authorize in the dashboard. +Your instance requires re-authentication. Please re-authorize in the dashboard. 
Unavailable server: ${serverSlug}` }; @@ -243,19 +323,24 @@ Unavailable server: ${serverSlug}` ### Backend-Triggered (Database Updates) -**Source:** Backend API routes +**Source:** Backend API routes update `mcpServerInstances` table (per user) | Trigger | New Status | When | |---------|-----------|------| -| Installation created | `provisioning` | User installs MCP server | -| Config updated | `restarting` | User modifies environment vars/args/headers | +| Instance created | `provisioning` or `awaiting_user_config` | Admin installs MCP server or member added to team | +| User config updated | `provisioning` (if was `awaiting_user_config`) or `restarting` (if already online) | User modifies their personal config | +| Config updated | `restarting` | User modifies environment vars/args/headers/query params | | OAuth callback success | `connecting` | User re-authenticates | | Health check fails | `offline` | Server unreachable (3-min interval) | | Credential validation fails | `requires_reauth` | OAuth token invalid | + +**Status Target:** All backend status updates target the `mcpServerInstances` table. Status fields have been **completely removed** from `mcpServerInstallations`. Each user's instance has independent status tracking. + + ### Satellite-Triggered (Event Emission) -**Source:** Satellite emits `mcp.server.status_changed` events to backend +**Source:** Satellite emits `mcp.server.status_changed` events to backend (includes `user_id` field) | Trigger | New Status | When | |---------|-----------|------| @@ -267,6 +352,20 @@ Unavailable server: ${serverSlug}` | Server recovery detected | `connecting` | Previously offline server responds | | Stdio crashes 3 times | `permanently_failed` | 3 crashes within 5 minutes | +**Event Payload:** +```json +{ + "event": "mcp.server.status_changed", + "installation_id": "uuid", + "team_id": "uuid", + "user_id": "uuid", + "status": "online", + "status_message": "Server is online and ready" +} +``` + +The `user_id` field ensures status updates are applied to the correct user's instance. + ## Implementation Components The status tracking system consists of several integrated components: @@ -279,7 +378,17 @@ The status tracking system consists of several integrated components: ## Related Documentation -- [Event Emission](/development/satellite/event-emission) - Status change event details -- [Recovery System](/development/satellite/recovery-system) - Automatic recovery logic -- [Tool Discovery](/development/satellite/tool-discovery) - How status affects tool discovery -- [Hierarchical Router](/development/satellite/hierarchical-router) - Status-based tool filtering + + + Four lifecycle processes for instance creation, deletion, and team membership changes + + + Status change event details with user_id field + + + Automatic recovery logic for failed instances + + + How status affects per-user tool discovery + + diff --git a/development/satellite/team-isolation.mdx b/development/satellite/team-isolation.mdx index 22a3627..161f6a4 100644 --- a/development/satellite/team-isolation.mdx +++ b/development/satellite/team-isolation.mdx @@ -1,85 +1,96 @@ --- title: Team Isolation Implementation -description: Technical implementation of OAuth-based team separation in DeployStack Satellite for multi-tenant MCP server access control. +description: Technical implementation of OAuth-based team and user separation in DeployStack Satellite for per-user MCP server instance access control. 
--- -DeployStack Satellite implements OAuth 2.1 Resource Server-based team isolation to provide secure multi-tenant access to MCP servers. This system ensures complete separation of team resources while maintaining a unified MCP client interface. +DeployStack Satellite implements OAuth 2.1 Resource Server-based team and user isolation to provide secure multi-tenant access to per-user MCP server instances. This system ensures complete separation of team resources AND per-user instances while maintaining a unified MCP client interface. -For OAuth authentication details, see [OAuth Authentication Implementation](/development/satellite/oauth-authentication). For tool discovery mechanics, see [Tool Discovery Implementation](/development/satellite/tool-discovery). +For OAuth authentication details, see [OAuth Authentication Implementation](/development/satellite/oauth-authentication). For tool discovery mechanics, see [Tool Discovery Implementation](/development/satellite/tool-discovery). For per-user instance lifecycle, see [Instance Lifecycle Management](/development/satellite/instance-lifecycle). ## Technical Architecture -### Team Context Resolution +### Team and User Context Resolution -Team isolation operates through OAuth token introspection that extracts team context from validated Bearer tokens: +Isolation operates through OAuth token introspection that extracts both team AND user context from validated Bearer tokens: ``` -MCP Client Request → OAuth Token → Token Introspection → Team Context → Resource Filtering - │ │ │ │ │ - Bearer Token Satellite API Backend Validation Team ID Allowed Servers - (team-scoped) Key Required 5-minute Cache Extraction Database Query +MCP Client Request → OAuth Token → Token Introspection → Team + User Context → Resource Filtering + │ │ │ │ │ + Bearer Token Satellite API Backend Validation Team ID + User ID User's Instances + (user-scoped) Key Required 5-minute Cache Extraction Database Query ``` **Core Components:** - **TokenIntrospectionService**: Validates tokens via Backend introspection endpoint -- **TeamAwareMcpHandler**: Filters MCP resources based on team permissions -- **DynamicConfigManager**: Provides team-server mappings from Backend polling -- **RemoteToolDiscoveryManager**: Caches tools with server association metadata +- **TeamAwareMcpHandler**: Filters MCP resources based on team AND user permissions +- **DynamicConfigManager**: Provides per-user instance mappings from Backend polling +- **RemoteToolDiscoveryManager**: Caches tools with per-user server association metadata -### Team-Server Mapping Architecture +### Per-User Instance Mapping Architecture -Team isolation relies on database-backed server instance mappings: +Isolation relies on database-backed per-user instance mappings: ``` -Team "john" → Server Instance "context7-john-R36no6FGoMFEZO9nWJJLT" -Team "alice" → Server Instance "context7-alice-S47mp8GHpNGFZP0oWKKMU" +Team "acme" + User "alice" → Instance "context7-acme-alice-R36no6FGoMFEZO9nWJJLT" +Team "acme" + User "bob" → Instance "context7-acme-bob-R36no6FGoMFEZO9nWJJLT" +Team "beta" + User "charlie" → Instance "context7-beta-charlie-S47mp8GHpNGFZP0oWKKMU" ``` **Database Integration:** -- **mcpServerInstallations Table**: Links teams to specific MCP server instances -- **Dynamic Configuration**: Backend polling delivers team-server mappings -- **Server Instance Naming**: Format `{server_slug}-{team_slug}-{installation_id}` -- **Complete Isolation**: Teams cannot access other teams' server instances +- **mcpServerInstances 
Table**: Links users to their specific MCP server instances (per user) +- **mcpServerInstallations Table**: Links teams to MCP server installations (team-level) +- **Dynamic Configuration**: Backend polling delivers per-user instance mappings +- **Instance Naming**: Format `{server_slug}-{team_slug}-{user_slug}-{installation_id}` +- **Complete Isolation**: Users only access their OWN instances, not teammates' instances + + +**Per-User Instances:** Each team member has their own independent instance for each MCP server. This enables user-specific configuration (Template + Team + User merged config) and independent status tracking per user. + ## Tool Discovery Integration ### Friendly Tool Naming -Tool discovery uses `server_slug` for user-friendly tool names while maintaining internal server routing: +Tool discovery uses `server_slug` for user-friendly tool names while maintaining internal per-user instance routing: **User-Facing Names:** - `context7-resolve-library-id` - `context7-get-library-docs` -**Internal Server Resolution:** -- Team "john": Routes to `context7-john-R36no6FGoMFEZO9nWJJLT` -- Team "alice": Routes to `context7-alice-S47mp8GHpNGFZP0oWKKMU` +**Internal Per-User Instance Resolution:** +- Team "acme" + User "alice": Routes to `context7-acme-alice-R36no6FGoMFEZO9nWJJLT` +- Team "acme" + User "bob": Routes to `context7-acme-bob-R36no6FGoMFEZO9nWJJLT` +- Team "beta" + User "charlie": Routes to `context7-beta-charlie-S47mp8GHpNGFZP0oWKKMU` **Implementation Details:** - **RemoteToolDiscoveryManager**: Creates friendly names using `config.server_slug` - **CachedTool Interface**: Stores both `namespacedName` and `serverName` for routing -- **TeamAwareMcpHandler**: Resolves team context to actual server instances +- **TeamAwareMcpHandler**: Resolves team AND user context to actual per-user instances ### Tool Filtering Process -Team-aware tool filtering operates at the MCP protocol level: +Per-user tool filtering operates at the MCP protocol level: ``` -tools/list Request → OAuth Team Context → Filter by Team Servers → Return Filtered Tools - │ │ │ │ - Bearer Token Team ID Extraction Database Lookup Team-Specific List - Validation from Token Cache Allowed Servers JSON-RPC Response +tools/list Request → OAuth User Context → Filter by User's Instances → Return Filtered Tools + │ │ │ │ + Bearer Token Team + User ID Extraction Database Lookup User-Specific List + Validation from Token Cache User's Instances JSON-RPC Response ``` **Filtering Logic:** -1. **Token Validation**: Extract team ID from OAuth token introspection -2. **Server Resolution**: Query team's allowed MCP server instances -3. **Tool Filtering**: Include only tools from team's server instances +1. **Token Validation**: Extract team ID AND user ID from OAuth token introspection +2. **Instance Resolution**: Query user's allowed MCP server instances (only THEIR instances) +3. **Tool Filtering**: Include only tools from user's OWN instances that are `online` 4. **Response Generation**: Return filtered tool list to MCP client + +**User-Specific Filtering:** Each user sees only tools from their OWN instances. Other team members' instance status does NOT affect your tool availability. If your instance is `awaiting_user_config`, you see NO tools from that server until you complete configuration. 
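+
+The sketch below shows the core of that filtering step under assumed type and function names (`CachedTool`, `InstanceRecord`, `filterToolsForUser`); the real `TeamAwareMcpHandler` does more, but the decision is the same: keep only tools whose backing instance belongs to the calling user and is `online`.
+
+```typescript
+// Assumed shapes for illustration only.
+interface CachedTool {
+  namespacedName: string;  // e.g. "context7-resolve-library-id"
+  serverName: string;      // e.g. "context7-acme-alice-R36no6FGoMFEZO9nWJJLT"
+}
+interface InstanceRecord {
+  processId: string;       // {server_slug}-{team_slug}-{user_slug}-{installation_id}
+  team_id: string;
+  user_id: string;
+  status: string;          // 'online', 'awaiting_user_config', 'offline', ...
+}
+
+// Return only tools backed by the calling user's own online instances.
+function filterToolsForUser(
+  tools: CachedTool[],
+  instances: InstanceRecord[],
+  teamId: string,
+  userId: string,
+): CachedTool[] {
+  const allowed = new Set(
+    instances
+      .filter(i => i.team_id === teamId && i.user_id === userId && i.status === 'online')
+      .map(i => i.processId),
+  );
+  return tools.filter(t => allowed.has(t.serverName));
+}
+```
+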
+ + ## OAuth Integration Points ### Authentication Middleware Integration @@ -99,26 +110,33 @@ Team isolation integrates with existing OAuth authentication middleware: ### Token Introspection Response -Backend token introspection provides team context: +Backend token introspection provides team AND user context: ``` Introspection Response: { "active": true, - "sub": "user_id", + "sub": "user_uuid", + "user_id": "user_uuid", "team_id": "team_uuid", - "team_name": "john", + "team_name": "acme", "team_role": "admin", "scope": "mcp:read mcp:tools:execute" } ``` -**Team Context Fields:** +**Context Fields:** +- **sub**: User identifier (OAuth standard field) +- **user_id**: Database UUID for user identification - **team_id**: Database UUID for team identification - **team_name**: Human-readable team identifier (slug) - **team_role**: User's role within the team - **scope**: OAuth scopes for permission validation + +**User Context Required:** The `user_id` field is critical for per-user instance resolution. It enables the satellite to route tool calls to the correct user's process, ensuring complete isolation between team members. + + ## Server Instance Resolution ### Dynamic Server Mapping @@ -133,23 +151,23 @@ Team-server mappings are delivered via Backend polling system: ### Server Resolution Algorithm -Tool execution resolves team context to specific server instances: +Tool execution resolves team AND user context to specific per-user instances: ``` -Tool Call "context7-resolve-library-id" + Team "john" +Tool Call "context7-resolve-library-id" + Team "acme" + User "alice" ↓ -Find Server: server_slug="context7" AND team_id="john_uuid" +Find Instance: server_slug="context7" AND team_id="acme_uuid" AND user_id="alice_uuid" ↓ -Resolve to: "context7-john-R36no6FGoMFEZO9nWJJLT" +Resolve to: "context7-acme-alice-R36no6FGoMFEZO9nWJJLT" ↓ -Route Request: HTTP proxy to team's server instance +Route Request: HTTP proxy OR stdio process to user's specific instance ``` **Resolution Process:** 1. **Parse Tool Name**: Extract `server_slug` from namespaced tool name -2. **Team Context**: Get team ID from OAuth token validation -3. **Server Lookup**: Find server instance matching team + server_slug -4. **Request Routing**: Proxy to resolved server instance +2. **Team + User Context**: Get team ID AND user ID from OAuth token validation +3. **Instance Lookup**: Find user's instance: `server_slug` + `team_id` + `user_id` +4. **Route Execution**: Execute tool on user's specific instance (not teammates' instances) ## Security Implementation diff --git a/development/satellite/tool-discovery.mdx b/development/satellite/tool-discovery.mdx index 00abbbe..101838d 100644 --- a/development/satellite/tool-discovery.mdx +++ b/development/satellite/tool-discovery.mdx @@ -351,9 +351,13 @@ curl http://localhost:3001/api/status/debug ## Status Integration -Tool discovery integrates with the status tracking system to filter tools and enable automatic recovery. Discovery managers call status callbacks on success/failure to update installation status in real-time. +Tool discovery integrates with the status tracking system to filter tools and enable automatic recovery. Discovery managers call status callbacks on success/failure to update **instance status** in real-time (per-user). -See [Status Tracking - Tool Filtering](/development/satellite/status-tracking#tool-filtering-by-status) for complete details on status-based tool filtering and execution blocking. 
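+
+As a rough sketch of that callback wiring (assumed names, not the actual discovery manager code), a discovery step can report the per-user instance status around each attempt; the intermediate `syncing_tools` step is omitted for brevity:
+
+```typescript
+type StatusCallback = (processId: string, status: string, message?: string) => void;
+
+// The processId includes the user_slug, so each report lands on that user's instance.
+async function discoverToolsWithStatus(
+  processId: string,
+  listTools: () => Promise<string[]>,
+  statusCallback?: StatusCallback,
+): Promise<string[]> {
+  try {
+    statusCallback?.(processId, 'discovering_tools');
+    const tools = await listTools();
+    statusCallback?.(processId, 'online', `Discovered ${tools.length} tools`);
+    return tools;
+  } catch (err) {
+    statusCallback?.(processId, 'error', err instanceof Error ? err.message : String(err));
+    return []; // tools from this instance stay filtered out until recovery
+  }
+}
+```
+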
+ +**Per-User Status:** Each user's instance has independent status tracking. Tool filtering is based on the authenticated user's OWN instance status, not other team members' statuses. + + +See [Status Tracking - Tool Filtering](/development/satellite/status-tracking#tool-filtering-by-status) for complete details on per-user status-based tool filtering and execution blocking. ## Recovery System @@ -399,13 +403,14 @@ Tool execution is logged with full request/response data for debugging. - User attribution (who called the tool) **Privacy Control:** -Request logging can be disabled per-installation via `settings.request_logging_enabled = false`. +Request logging can be disabled per-instance via `settings.request_logging_enabled = false` in the instance configuration. See [Log Capture](/development/satellite/log-capture) for buffering and storage details. ## Related Documentation -- [Status Tracking](/development/satellite/status-tracking) - Tool filtering by server status +- [Status Tracking](/development/satellite/status-tracking) - Tool filtering by per-user instance status +- [Instance Lifecycle](/development/satellite/instance-lifecycle) - Per-user instance creation and management - [Recovery System](/development/satellite/recovery-system) - Automatic re-discovery on recovery - [Event Emission](/development/satellite/event-emission) - Tool metadata events - [Log Capture](/development/satellite/log-capture) - Request logging system diff --git a/docs.json b/docs.json index 25fd8ef..68d8031 100644 --- a/docs.json +++ b/docs.json @@ -200,6 +200,7 @@ "pages": [ "/development/satellite/mcp-transport", "/development/satellite/hierarchical-router", + "/development/satellite/instance-lifecycle", "/development/satellite/process-management", "/development/satellite/team-isolation", "/development/satellite/tool-discovery", diff --git a/favicon.ico b/favicon.ico index c947817..3d0a97a 100644 Binary files a/favicon.ico and b/favicon.ico differ diff --git a/favicon.png b/favicon.png index a07ee7e..109ed8a 100644 Binary files a/favicon.png and b/favicon.png differ