|
| 1 | +# PDD: Infrastructure Rollout - Task 4.2: Stop All Running Workspaces |
| 2 | + |
| 3 | +- **Associated PRD**: [prd/001-infra-rollout.md](./001-infra-rollout.md) - Section 4.2 |
| 4 | +- **Date**: May 8, 2025 |
| 5 | +- **Author**: Cline |
| 6 | +- **Version**: 1.0 |
| 7 | +- **Status**: Draft |
| 8 | + |
| 9 | +## 1. Overview |
| 10 | +This document outlines the design for implementing the "Stop All Running Workspaces" feature for administrators. This feature allows an administrator to initiate a stop command for all currently running workspaces within their organization. This implementation will leverage frontend orchestration, using existing APIs. |
| 11 | + |
| 12 | +## 2. Background |
| 13 | +As part of improving the infrastructure update rollout experience (PRD Ref: [CLC-1275](https://linear.app/gitpod/issue/CLC-1275/admin-stop-all-running-workspaces-button-for-infra-update)), administrators need a way to ensure all workspaces are safely stopped (and thus backed up) before an update. This feature provides that capability. |
| 14 | + |
| 15 | +## 3. Proposed Design & Implementation |
| 16 | + |
| 17 | +### 3.1. Approach: Frontend Orchestration |
| 18 | +This feature will be implemented primarily in the frontend (`components/dashboard/`). The frontend will: |
| 19 | +1. Fetch the list of currently running/active workspaces. |
| 20 | +2. Upon admin confirmation, iterate through this list. |
| 21 | +3. For each workspace, call the existing public API `workspaceClient.stopWorkspace()` to request a graceful stop. |
| 22 | + |
| 23 | +This approach is viable because the SpiceDB schema (`components/spicedb/schema/schema.yaml`) confirms that an organization owner (`org->owner`) has the `stop` permission on workspaces belonging to their organization, which is enforced by the backend's `WorkspaceService.stopWorkspace` method. |
| 24 | + |
| 25 | +### 3.2. Affected Code Units (Frontend - `components/dashboard/`) |
| 26 | + |
| 27 | +- **`src/org-admin/AdminPage.tsx`**: |
| 28 | + * This page already hosts the `RunningWorkspacesCard.tsx` (as per PDD `001-infra-rollout-4.1.md`). No direct changes are needed here for this specific feature, as the functionality will be encapsulated within `RunningWorkspacesCard.tsx`. |
| 29 | +- **`src/org-admin/RunningWorkspacesCard.tsx`**: |
| 30 | + * This existing component (detailed in PDD `001-infra-rollout-4.1.md`) already fetches and displays running/active workspace sessions using the `useWorkspaceSessions` hook. |
| 31 | + * It will be **enhanced** to include the "Stop All Running Workspaces" button and its associated logic. |
| 32 | + |
| 33 | +### 3.3. Key Modifications to `RunningWorkspacesCard.tsx` |
| 34 | + |
| 35 | +- **New UI Elements:** |
| 36 | + * **"Stop All Running Workspaces" Button:** |
| 37 | + * To be placed prominently within the card (e.g., in the card header or a dedicated action row). |
| 38 | + * Label: "Stop All Running Workspaces". |
| 39 | + * **Confirmation Dialog:** |
| 40 | + * A modal dialog will appear upon clicking the button. |
| 41 | + * It will clearly explain the action (e.g., "This will attempt to stop all currently running workspaces in your organization. Workspaces are backed up before stopping. Once started, the stop process cannot be undone.") and require explicit confirmation (e.g., a "Confirm Stop All" button). |
| 42 | + |
| 43 | +- **New Logic:** |
| 44 | + 1. **Handle "Stop All" Action (on confirmed dialog):** |
| 45 | + * The `useWorkspaceSessions` hook's data (`data.pages`) is already available within this component. |
| 46 | + * Flatten the session pages: `const allSessions = data.pages.flatMap(page => page);` |
| 47 | + * Filter for "not stopped" workspaces (this filtering logic may already exist for display purposes): |
| 48 | + ```typescript |
| 49 | + const notStoppedSessions = allSessions.filter(session => |
| 50 | + session.workspace?.status?.phase?.name !== WorkspacePhase_Phase.STOPPED |
| 51 | + ); |
| 52 | + ``` |
| 53 | + * Iterate through `notStoppedSessions`. For each `session` where `session.workspace?.id` is valid: |
| 54 | + * Call `workspaceClient.stopWorkspace({ workspaceId: session.workspace.id })`. |
| 55 | + * The `workspaceClient` is imported from `../../service/public-api` (as seen in `list-workspace-sessions-query.ts`). |
| 56 | + * Handle individual API call responses: |
| 57 | + * Track successes and failures. |
| 58 | + * Update the UI to provide feedback (e.g., a progress indicator, a list of workspaces being processed, or a summary toast/notification). |
| 59 | + * Provide overall feedback to the administrator (e.g., "Stop command sent for X workspaces. Successes: Y, Failures: Z."). |
| 60 | + * The list of running workspaces displayed by this card should update automatically as `useWorkspaceSessions` refetches or its cache is updated by `react-query` after the stop actions. A manual refetch can also be triggered if necessary. |
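The logic above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the final component code: `SessionLike`, `StopAllSummary`, `collectStoppableIds`, `stopAll`, and `formatSummary` are hypothetical names, the `stop` parameter stands in for `workspaceClient.stopWorkspace`, and de-duplicating by workspace ID assumes that one workspace may appear in more than one session.

```typescript
// Hypothetical minimal session shape; the real type comes from the public API package.
interface SessionLike<P> {
    workspace?: { id?: string; status?: { phase?: { name?: P } } };
}

interface StopAllSummary {
    succeeded: string[];
    failed: { workspaceId: string; reason: unknown }[];
}

// Collect the IDs of workspaces whose phase is not `stoppedPhase`.
// De-duplicating by ID is an assumption: one workspace may appear in several sessions.
function collectStoppableIds<P>(pages: SessionLike<P>[][], stoppedPhase: P): string[] {
    const ids = new Set<string>();
    for (const page of pages) {
        for (const session of page) {
            const id = session.workspace?.id;
            if (id && session.workspace?.status?.phase?.name !== stoppedPhase) {
                ids.add(id);
            }
        }
    }
    return Array.from(ids);
}

// Send one stop request per workspace and record each outcome.
// `stop` stands in for `workspaceClient.stopWorkspace`; a failure never aborts the batch.
async function stopAll(
    ids: string[],
    stop: (req: { workspaceId: string }) => Promise<unknown>,
): Promise<StopAllSummary> {
    const summary: StopAllSummary = { succeeded: [], failed: [] };
    await Promise.all(
        ids.map(async (workspaceId) => {
            try {
                await stop({ workspaceId });
                summary.succeeded.push(workspaceId);
            } catch (reason) {
                summary.failed.push({ workspaceId, reason });
            }
        }),
    );
    return summary;
}

// Build the overall feedback message shown to the administrator.
function formatSummary(s: StopAllSummary): string {
    const total = s.succeeded.length + s.failed.length;
    return `Stop command sent for ${total} workspaces. Successes: ${s.succeeded.length}, Failures: ${s.failed.length}.`;
}
```

Inside `RunningWorkspacesCard.tsx`, this would presumably be wired up as `collectStoppableIds(data.pages, WorkspacePhase_Phase.STOPPED)` followed by `stopAll(ids, (req) => workspaceClient.stopWorkspace(req))`, with the result of `formatSummary` passed to the toast.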
| 61 | + |
| 62 | +### 3.4. Backend Interaction (`components/server/`) |
| 63 | +- **No new dedicated backend API endpoint** is required for the "stop-all" action itself. |
| 64 | +- The frontend will use the existing `workspaceClient.stopWorkspace()` method, which calls the public `StopWorkspace` RPC. |
| 65 | +- The backend's `WorkspaceService.stopWorkspace` method, along with the `Authorizer` and SpiceDB schema, already handles the necessary permission checks to ensure an organization owner can stop workspaces within their organization. |
| 66 | +- The interlock with Maintenance Mode (i.e., disabling this button if Maintenance Mode is not active) will be handled as part of Feature 4.3's implementation. |
| 67 | + |
| 68 | +### 3.5. Diagram of "Stop All" Flow |
| 69 | + |
| 70 | +```mermaid |
| 71 | +graph TD |
| 72 | + Admin[Admin User] -- Clicks --> Btn["Stop All Workspaces Button (in RunningWorkspacesCard)"] |
| 73 | + Admin -- Confirms --> ConfirmDlg["Confirmation Dialog"] |
| 74 | + ConfirmDlg -- Triggers --> Iterate["Iterate Filtered Workspaces (from useWorkspaceSessions)"] |
| 75 | + Iterate -- For Each Workspace --> CallAPI["Call workspaceClient.stopWorkspace({workspaceId})"] |
| 76 | + CallAPI -- Interacts With --> BackendAPI["Existing Public StopWorkspace RPC (Server)"] |
| 77 | + BackendAPI -- Checks Permissions --> SpiceDB[(SpiceDB: org_owner can stop)] |
| 78 | + BackendAPI -- Executes Stop --> WSStop["Workspace Stop Logic (ws-manager, etc.)"] |
| 79 | + CallAPI -- Updates --> UIFeedback["UI Feedback (Progress, Success/Failure)"] |
| 80 | + WSStop -- Eventually Updates --> WSList["Running Workspaces List (Refreshed)"] |
| 81 | +``` |
| 82 | + |
| 83 | +## 4. Advantages of this Approach |
| 84 | +- **Reduced Backend Complexity:** No need to design, implement, and test a new backend API endpoint specifically for stopping all workspaces. |
| 85 | +- **Leverages Existing Infrastructure:** Utilizes the existing, tested `StopWorkspace` public API and its permission model. |
| 86 | +- **Clear Permission Model:** Relies on the confirmed SpiceDB definition where organization owners can stop workspaces in their org. |
| 87 | + |
| 88 | +## 5. Testing Strategy |
| 89 | +- **Manual Testing:** |
| 90 | + * Verify the "Stop All Running Workspaces" button is present in the `RunningWorkspacesCard`. |
| 91 | + * Verify clicking the button shows a confirmation dialog with appropriate explanatory text. |
| 92 | + * Verify that confirming the dialog triggers calls to `workspaceClient.stopWorkspace()` for all displayed "not stopped" workspaces. |
| 93 | + * Verify appropriate UI feedback during and after the stop operations (e.g., progress, success/error messages for individual stops or overall). |
| 94 | + * Verify the list of running workspaces in `RunningWorkspacesCard` updates correctly after workspaces are stopped. |
| 95 | + * Test with various scenarios: no running workspaces, a few running workspaces, many running workspaces (if feasible in a test environment). |
| 96 | + * Test error handling if individual `stopWorkspace` calls fail. |
| 97 | + |
| 98 | +## 6. Rollout Plan |
| 99 | +- This enhancement to `RunningWorkspacesCard.tsx` will be part of a standard `components/dashboard/` component update, released alongside the other "Infrastructure Rollout" features. |
| 100 | + |
| 101 | +## 7. Open Questions & Risks |
| 102 | +- **UI Feedback for Batch Operation:** Resolved. Stopping is assumed to be quick, so a single toast notification will be shown once `stopWorkspace` has been called for all targeted workspaces. |
| 103 | +- **Rate Limiting/Concurrency:** Resolved. Not considered a problem at this time. |
| 104 | +- **Dependency on Maintenance Mode API:** The requirement to disable this button when Maintenance Mode is not active (R3.4 in the main PRD) has been moved to Feature 4.3 (Maintenance Mode Toggle). This PDD assumes the button is always enabled for an admin; the interlock will be added later. |
| 105 | + |
| 106 | +## 8. Future Considerations |
| 107 | +- If performance issues arise with stopping a very large number of workspaces via individual frontend calls, a backend batch operation could be reconsidered in the future, but the current approach is preferred for its simplicity. |
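Should the number of individual calls ever become a concern, one intermediate step short of a backend batch operation might be to cap how many stop requests are in flight at once. The sketch below is not part of the current design; `runWithLimit` and its `limit` parameter are hypothetical, and the worker is assumed to handle its own errors (for example by recording failures) so that one failed stop does not reject the whole batch.

```typescript
// Run `worker` over `items` with at most `limit` calls in flight at any time.
// Hypothetical helper: each "lane" pulls the next unprocessed item until none remain.
async function runWithLimit<T>(
    items: T[],
    limit: number,
    worker: (item: T) => Promise<void>,
): Promise<void> {
    let next = 0;

    async function lane(): Promise<void> {
        while (next < items.length) {
            const item = items[next];
            next += 1;
            await worker(item);
        }
    }

    // Start at least one lane, and never more lanes than there are items.
    const laneCount = Math.min(Math.max(1, limit), items.length);
    const lanes: Promise<void>[] = [];
    for (let i = 0; i < laneCount; i++) {
        lanes.push(lane());
    }
    await Promise.all(lanes);
}
```

A call such as `runWithLimit(ids, 5, (id) => stopOne(id))` would then replace the unbounded fan-out, where `stopOne` is a hypothetical wrapper that calls `workspaceClient.stopWorkspace` and records the outcome.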