# PDD: Maintenance Mode Toggle (Feature 4.3)

**Parent PRD:** [001-infra-rollout.md](../prd/001-infra-rollout.md)
**Tracking Issue:** [CLC-1273](https://linear.app/gitpod/issue/CLC-1273/admin-maintenance-mode-toggle)

## 1. Overview

This document outlines the technical design and implementation plan for the "Maintenance Mode Toggle" feature. This feature allows administrators to manually enable or disable a maintenance mode for their Gitpod organization/instance. When enabled, new workspace starts are prevented, a notification is shown on the dashboard, and the "Stop All Workspaces" button becomes active.

## 2. Requirements

As per PRD `001-infra-rollout.md` (Section 4.3):
* **R2.1:** Administrators must be able to manually enable or disable a "Maintenance Mode".
* **R2.2:** When Maintenance Mode is enabled:
    * Users must be prevented from starting new workspaces (failure reason: "maintenanceMode").
    * A clear warning/notification must be displayed on the dashboard.
    * The "Stop All Running Workspaces" button must be enabled; otherwise, it must be disabled.
* **R2.3:** The toggle allows control over the system state during updates.
* The `maintenanceMode` flag should be stored in the `DBTeam` (Organization) table.
* The UI toggle should be in a new section on the Admin page, above "running workspaces".

## 3. Technical Design & Implementation Plan

### I. Backend Changes (Server & Database)

#### 1. Database Schema Update (`gitpod-db`)
* **Target Entity:** `DBTeam` (in `components/gitpod-db/src/typeorm/entity/db-team.ts`).
    * This table represents an "Organization".
* **Action:** Add a new column `maintenanceMode` to the `DBTeam` entity.
    * Type: `boolean`
    * Default Value: `false`
* **Migration:** A TypeORM migration script will be generated and applied to add this column.
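
The migration described above could look roughly like the following minimal sketch. It assumes a MySQL-compatible database and that `DBTeam` maps to a table named `d_b_team`. The class name and the `QueryRunnerLike` interface are hypothetical stand-ins: a real migration would implement TypeORM's `MigrationInterface` and receive a TypeORM `QueryRunner`.

```typescript
// Structural stand-in for the single QueryRunner method this sketch uses.
interface QueryRunnerLike {
    query(sql: string): Promise<unknown>;
}

// Assumption: table name for the DBTeam entity; verify against the actual schema.
const TABLE = "d_b_team";
const COLUMN = "maintenanceMode";

// Hypothetical migration class; a generated migration would carry a timestamp suffix.
class AddOrgMaintenanceMode {
    // Adds the flag with a default of false, so existing organizations are unaffected.
    async up(queryRunner: QueryRunnerLike): Promise<void> {
        await queryRunner.query(
            `ALTER TABLE ${TABLE} ADD COLUMN ${COLUMN} BOOLEAN NOT NULL DEFAULT FALSE`,
        );
    }

    // Reverts the change by dropping the column again.
    async down(queryRunner: QueryRunnerLike): Promise<void> {
        await queryRunner.query(`ALTER TABLE ${TABLE} DROP COLUMN ${COLUMN}`);
    }
}
```

Declaring the column `NOT NULL` with a default keeps reads simple: callers never have to distinguish "unset" from "disabled".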

#### 2. Permissions Definition (SpiceDB)
* **Target File:** `components/spicedb/schema/schema.yaml`
* **Action:** In the `definition organization` block, add a new permission:
    ```yaml
    permission maintenance = owner + installation->admin
    ```
* This grants the `maintenance` permission to users who are an `owner` of the organization or an `admin` of the installation.
* The SpiceDB schema will be updated and reloaded/re-validated.

#### 3. API Definition (gRPC - Public API)
* **Target File:** `components/public-api/gitpod/v1/organization.proto`
* **Action:** Add new RPC methods to the `OrganizationService`:
    ```protobuf
    service OrganizationService {
        // ... existing RPCs ...

        // GetOrganizationMaintenanceMode retrieves the maintenance mode status for an organization.
        rpc GetOrganizationMaintenanceMode(GetOrganizationMaintenanceModeRequest) returns (GetOrganizationMaintenanceModeResponse) {}

        // SetOrganizationMaintenanceMode sets the maintenance mode status for an organization.
        rpc SetOrganizationMaintenanceMode(SetOrganizationMaintenanceModeRequest) returns (SetOrganizationMaintenanceModeResponse) {}
    }

    message GetOrganizationMaintenanceModeRequest {
        string organization_id = 1; // ID of the DBTeam
    }

    message GetOrganizationMaintenanceModeResponse {
        bool enabled = 1;
    }

    message SetOrganizationMaintenanceModeRequest {
        string organization_id = 1; // ID of the DBTeam
        bool enabled = 2;
    }

    message SetOrganizationMaintenanceModeResponse {
        bool enabled = 1; // The new state of maintenance mode
    }
    ```
* Relevant code generation scripts (for Go, TypeScript clients/servers) will be run after this change.

#### 4. API Implementation (`server` component - TypeScript)
* The `server` component will implement the server-side logic for the new `gitpod.v1.OrganizationService` RPCs.
* This will involve updating or creating a service class (e.g., `OrganizationServiceImpl.ts` in `components/server/src/services/` or `components/server/src/orgs/`).
* **`GetOrganizationMaintenanceMode` Implementation:**
    * Input: `GetOrganizationMaintenanceModeRequest`.
    * Authorization: Verify the caller has the `maintenance` permission on the `organization:{organization_id}` resource using SpiceDB.
    * Logic: Fetch `DBTeam` by `organization_id` and return its `maintenanceMode` status.
* **`SetOrganizationMaintenanceMode` Implementation:**
    * Input: `SetOrganizationMaintenanceModeRequest`.
    * Authorization: Verify `maintenance` permission on `organization:{organization_id}` via SpiceDB.
    * Logic: Update `maintenanceMode` for the `DBTeam` and return the new status.
* The `server` component directly hosts and implements the `gitpod.v1` gRPC services, so it will handle these incoming gRPC requests.
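
The authorize-then-read/write flow of the two handlers can be sketched as follows. This is an illustration only: `Authorizer`, `OrgStore`, `HandlerError`, and `MaintenanceModeHandlers` are hypothetical names, not types from the `server` component, and the calls are synchronous for brevity where the real SpiceDB and database calls would be asynchronous.

```typescript
// Hypothetical abstraction over the SpiceDB permission check.
interface Authorizer {
    // True if `userId` holds `permission` on `organization:{orgId}`.
    hasPermission(userId: string, permission: "maintenance", orgId: string): boolean;
}

// Hypothetical abstraction over DBTeam persistence.
interface OrgStore {
    getMaintenanceMode(orgId: string): boolean | undefined; // undefined: unknown organization
    setMaintenanceMode(orgId: string, enabled: boolean): void;
}

class HandlerError extends Error {
    readonly code: "PERMISSION_DENIED" | "NOT_FOUND";
    constructor(code: "PERMISSION_DENIED" | "NOT_FOUND", message: string) {
        super(message);
        this.code = code;
    }
}

class MaintenanceModeHandlers {
    private readonly auth: Authorizer;
    private readonly store: OrgStore;

    constructor(auth: Authorizer, store: OrgStore) {
        this.auth = auth;
        this.store = store;
    }

    // Mirrors GetOrganizationMaintenanceMode: authorize, then read the flag.
    get(userId: string, req: { organizationId: string }): { enabled: boolean } {
        this.authorize(userId, req.organizationId);
        return { enabled: this.requireOrg(req.organizationId) };
    }

    // Mirrors SetOrganizationMaintenanceMode: authorize, update, return the new state.
    set(userId: string, req: { organizationId: string; enabled: boolean }): { enabled: boolean } {
        this.authorize(userId, req.organizationId);
        this.requireOrg(req.organizationId);
        this.store.setMaintenanceMode(req.organizationId, req.enabled);
        return { enabled: req.enabled };
    }

    private authorize(userId: string, orgId: string): void {
        if (!this.auth.hasPermission(userId, "maintenance", orgId)) {
            throw new HandlerError("PERMISSION_DENIED", "missing 'maintenance' permission");
        }
    }

    private requireOrg(orgId: string): boolean {
        const enabled = this.store.getMaintenanceMode(orgId);
        if (enabled === undefined) {
            throw new HandlerError("NOT_FOUND", `organization ${orgId} not found`);
        }
        return enabled;
    }
}
```

Running the permission check before the lookup means an unauthorized caller cannot use the error type to probe which organization IDs exist.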

#### 5. Workspace Start Logic Modification
* **Target File:** `components/server/src/workspace/workspace-starter.ts`
* **Action:** In the `startWorkspace` method of the `WorkspaceStarter` class:
    * Run the new check after the existing permission checks (e.g., `checkStartPermission`, `checkBlockedRepository`).
    * Retrieve `organizationId` from the `workspace.organizationId` field.
    * Fetch the `DBTeam` entity for this `organizationId` (potentially via `OrganizationService`).
    * If `team?.maintenanceMode` is `true`:
        * Prevent the workspace start.
        * Throw an `ApplicationError` (e.g., with code `SERVICE_UNAVAILABLE` or a new custom code) with a user-friendly message like "Cannot start workspace: The system is currently in maintenance mode."
        * Ensure the error includes the `failureReason: "maintenanceMode"`.
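
A minimal sketch of the guard described above, under stated assumptions: `StartBlockedError` is a local stand-in for the server's `ApplicationError` (whose actual constructor and error codes are not reproduced here), and `OrgLike` and `assertNotInMaintenance` are hypothetical names. The error code shown is one of the candidates still listed as an open question.

```typescript
// Stand-in for ApplicationError; carries the failure reason the requirements call for.
class StartBlockedError extends Error {
    readonly code: string;
    readonly failureReason: string;

    constructor(code: string, failureReason: string, message: string) {
        super(message);
        this.code = code;
        this.failureReason = failureReason;
    }
}

// Only the fields the guard needs from the DBTeam entity.
interface OrgLike {
    id: string;
    maintenanceMode?: boolean;
}

// Throws if the organization is in maintenance mode; otherwise returns normally.
// A missing organization or an unset flag is treated as "not in maintenance".
function assertNotInMaintenance(org: OrgLike | undefined): void {
    if (org?.maintenanceMode === true) {
        throw new StartBlockedError(
            "SERVICE_UNAVAILABLE", // candidate code; the final choice is still open
            "maintenanceMode",
            "Cannot start workspace: The system is currently in maintenance mode.",
        );
    }
}
```

Keeping the check in a small function of this shape makes it straightforward to unit-test independently of the rest of `startWorkspace`.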

### II. Frontend Changes (Dashboard - `components/dashboard/`)

#### 1. API Client Service
* Update or create methods in the dashboard's API client service to call the new gRPC endpoints (e.g., `getMaintenanceMode(orgId: string)`, `setMaintenanceMode(orgId: string, enabled: boolean)`). Depending on what the dashboard already uses for the public API, this means a generated gRPC-web client or a generated Connect client.

#### 2. Admin Page - Maintenance Mode Section & Toggle
* A new UI section will be added to the Admin page, positioned above the "Running Workspaces" card.
* A new React component (e.g., `MaintenanceModeCard.tsx` in `components/dashboard/src/components/admin/`) will be created to:
    * Fetch the current maintenance mode status for the organization using the API client.
    * Display a toggle switch (e.g., from Material UI or a custom component) reflecting the current status.
    * When the toggle is changed, call the `setMaintenanceMode` API endpoint.
    * Provide user feedback (loading indicators, success/error messages).

#### 3. Global Dashboard Notification/Banner
* Implement a global state management solution (e.g., React Context, Redux slice) to make the maintenance status available throughout the dashboard application.
* Create a new banner component (e.g., `MaintenanceModeBanner.tsx`) that:
    * Consumes the global maintenance status.
    * If maintenance mode is active for the user's organization, displays a prominent, non-dismissible warning banner on relevant dashboard pages (e.g., dashboard home, workspace list, new workspace page).
    * The banner text should clearly state: "System is in maintenance mode. Starting new workspaces is currently disabled."
* Integrate this banner into the main application layout component.

#### 4. "Stop All Workspaces" Button Logic
* The existing "Stop All Workspaces" button (implemented as part of feature 4.2, likely in a component like `RunningWorkspacesCard.tsx`) will be modified.
* The button will be **enabled only if `maintenanceMode` is `true`** for the organization. Otherwise, it will be disabled.
* This requires the component rendering the button to access the current maintenance mode status (from the global state or by fetching it).
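
Both the banner and the button derive their state from the same flag, which can be captured in one small pure function. The sketch below is framework-agnostic; `MaintenanceUiState` and `deriveMaintenanceUi` are hypothetical names, and treating a not-yet-loaded status (`undefined`) as "not in maintenance" is an assumption rather than a stated requirement.

```typescript
const BANNER_TEXT =
    "System is in maintenance mode. Starting new workspaces is currently disabled.";

interface MaintenanceUiState {
    bannerText: string | null; // null: render no banner
    stopAllEnabled: boolean; // drives the "Stop All Workspaces" button
}

// Maps the organization's maintenance flag to the UI state of both components.
function deriveMaintenanceUi(maintenanceMode: boolean | undefined): MaintenanceUiState {
    const active = maintenanceMode === true;
    return {
        bannerText: active ? BANNER_TEXT : null,
        stopAllEnabled: active,
    };
}
```

A banner component would render `bannerText` when it is non-null, and the button would set `disabled={!stopAllEnabled}`, so the two cannot drift apart.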

### III. Permissions & Auditing (Backend - `components/server/`)

#### 1. Permissions Implementation
* The gRPC method implementations in the `server` component will use the SpiceDB client to check if the authenticated user has the `maintenance` permission on the `organization:{orgId}` resource before allowing the `GetOrganizationMaintenanceMode` and `SetOrganizationMaintenanceMode` operations.

#### 2. Auditing (Consideration)
* Actions related to enabling/disabling maintenance mode should be logged for auditing purposes. This could involve creating `DbAuditLog` entries.

## 4. Open Questions / Considerations
* Exact naming and location of the new gRPC service implementation class within the `server` component.
* Specific error code to use for `ApplicationError` when maintenance mode blocks workspace start.
* Need to ensure that the `orgId` used by the frontend (dashboard) correctly corresponds to `DBTeam.id`.