| 1 | +# Server Readiness Probe PRD |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This document outlines the implementation of a readiness probe for the Gitpod server deployment. The readiness probe ensures that the server is only considered ready once it has established connections to both the database and the SpiceDB authorizer. |
| 6 | + |
| 7 | +## Background |
| 8 | + |
| 9 | +Currently, the server deployment has a liveness probe that checks the event loop lag, but it does not have a readiness probe. This means that the server is considered ready to receive traffic as soon as the container starts, even if it hasn't established connections to critical dependencies like the database and SpiceDB. |
| 10 | + |
| 11 | +## Requirements |
| 12 | + |
| 13 | +1. Create a readiness endpoint in the server that checks: |
| 14 | + - Database connectivity |
| 15 | + - SpiceDB authorizer connectivity |
| 16 | +2. Configure the Kubernetes deployment to use this endpoint as a readiness probe |
| 17 | + |
| 18 | +## Implementation Details |
| 19 | + |
| 20 | +### 1. Readiness Controller |
| 21 | + |
| 22 | +We've created a new `ReadinessController` class in `components/server/src/liveness/readiness-controller.ts` that: |
| 23 | +- Checks database connectivity by executing a simple query |
| 24 | +- Checks SpiceDB availability by attempting to get a client (a basic check: it confirms that a client can be obtained, not that SpiceDB answers requests) |
| 25 | +- Returns a 200 status code only if both checks pass, otherwise returns a 503 status code |
| 26 | + |
| 27 | +```typescript |
| 28 | +// components/server/src/liveness/readiness-controller.ts |
| 29 | +import { injectable, inject } from "inversify"; |
| 30 | +import express from "express"; |
| 31 | +import { TypeORM } from "@gitpod/gitpod-db/lib"; |
| 32 | +import { SpiceDBClientProvider } from "../authorization/spicedb"; |
| 33 | +import { log } from "@gitpod/gitpod-protocol/lib/util/logging"; |
| 34 | + |
| 35 | +@injectable() |
| 36 | +export class ReadinessController { |
| 37 | + @inject(TypeORM) protected readonly typeOrm: TypeORM; |
| 38 | + @inject(SpiceDBClientProvider) protected readonly spiceDBClientProvider: SpiceDBClientProvider; |
| 39 | + |
| 40 | + get apiRouter(): express.Router { |
| 41 | + const router = express.Router(); |
| 42 | + this.addReadinessHandler(router); |
| 43 | + return router; |
| 44 | + } |
| 45 | + |
| 46 | + protected addReadinessHandler(router: express.Router) { |
| 47 | + router.get("/", async (_, res) => { |
| 48 | + try { |
| 49 | + // Check database connection |
| 50 | + const dbConnection = await this.checkDatabaseConnection(); |
| 51 | + if (!dbConnection) { |
| 52 | + log.warn("Readiness check failed: Database connection failed"); |
| 53 | + res.status(503).send("Database connection failed"); |
| 54 | + return; |
| 55 | + } |
| 56 | + |
| 57 | + // Check SpiceDB connection |
| 58 | + const spiceDBConnection = await this.checkSpiceDBConnection(); |
| 59 | + if (!spiceDBConnection) { |
| 60 | + log.warn("Readiness check failed: SpiceDB connection failed"); |
| 61 | + res.status(503).send("SpiceDB connection failed"); |
| 62 | + return; |
| 63 | + } |
| 64 | + |
| 65 | + // Both connections are good |
| 66 | + res.status(200).send("Ready"); |
| 67 | + } catch (error) { |
| 68 | + log.error("Readiness check failed", error); |
| 69 | + res.status(503).send("Readiness check failed"); |
| 70 | + } |
| 71 | + }); |
| 72 | + } |
| 73 | + |
| 74 | + private async checkDatabaseConnection(): Promise<boolean> { |
| 75 | + try { |
| 76 | + const connection = await this.typeOrm.getConnection(); |
| 77 | + // Simple query to verify connection is working |
| 78 | + await connection.query("SELECT 1"); |
| 79 | + return true; |
| 80 | + } catch (error) { |
| 81 | + log.error("Database connection check failed", error); |
| 82 | + return false; |
| 83 | + } |
| 84 | + } |
| 85 | + |
| 86 | + private async checkSpiceDBConnection(): Promise<boolean> { |
| 87 | + try { |
| 88 | + // Just getting the client without error is a basic check |
| 89 | + // If the client is not available, getClient() will throw an error |
| 90 | + this.spiceDBClientProvider.getClient(); |
| 91 | + return true; |
| 92 | + } catch (error) { |
| 93 | + log.error("SpiceDB connection check failed", error); |
| 94 | + return false; |
| 95 | + } |
| 96 | + } |
| 97 | +} |
| 98 | +``` |
| 99 | + |
| 100 | +### 2. Server Configuration |
| 101 | + |
| 102 | +We've updated the server's container module to include the `ReadinessController`: |
| 103 | + |
| 104 | +```typescript |
| 105 | +// In container-module.ts |
| 106 | +import { ReadinessController } from "./liveness/readiness-controller"; |
| 107 | + |
| 108 | +// In the productionContainerModule |
| 109 | +bind(ReadinessController).toSelf().inSingletonScope(); |
| 110 | +``` |
| 111 | + |
| 112 | +We've also updated the server's route configuration to register the readiness endpoint: |
| 113 | + |
| 114 | +```typescript |
| 115 | +// In server.ts |
| 116 | +import { ReadinessController } from "./liveness/readiness-controller"; |
| 117 | + |
| 118 | +// In the constructor |
| 119 | +@inject(ReadinessController) private readonly readinessController: ReadinessController, |
| 120 | + |
| 121 | +// In registerRoutes method |
| 122 | +app.use("/ready", this.readinessController.apiRouter); |
| 123 | +``` |
| 124 | + |
| 125 | +### 3. Kubernetes Deployment |
| 126 | + |
| 127 | +We need to update the server deployment in `install/installer/pkg/components/server/deployment.go` to add a readiness probe: |
| 128 | + |
| 129 | +```go |
| 130 | +ReadinessProbe: &corev1.Probe{ |
| 131 | + ProbeHandler: corev1.ProbeHandler{ |
| 132 | + HTTPGet: &corev1.HTTPGetAction{ |
| 133 | + Path: "/ready", |
| 134 | + Port: intstr.IntOrString{ |
| 135 | + Type: intstr.Int, |
| 136 | + IntVal: ContainerPort, |
| 137 | + }, |
| 138 | + }, |
| 139 | + }, |
| 140 | + InitialDelaySeconds: 30, |
| 141 | + PeriodSeconds: 10, |
| 142 | + FailureThreshold: 3, |
| 143 | +}, |
| 144 | +``` |
| 145 | + |
| 146 | +## Testing |
| 147 | + |
| 148 | +The readiness probe should be tested to ensure: |
| 149 | + |
| 150 | +1. The server is only considered ready when both database and SpiceDB connections are established |
| 151 | +2. The server is not considered ready if either connection fails |
| 152 | +3. The server becomes ready again when connections are re-established |
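The three scenarios above can be sketched as a table-driven check against the handler's decision logic. This is a minimal sketch, not the controller itself: `DependencyState`, `ReadinessResult`, and `decideReadiness` are hypothetical names introduced for illustration, and the function only mirrors the status codes and messages shown in the `ReadinessController` listing.

```typescript
// Illustrative only: DependencyState, ReadinessResult and decideReadiness are
// hypothetical names, not part of the Gitpod codebase.
interface DependencyState {
    database: boolean;
    spiceDB: boolean;
}

interface ReadinessResult {
    status: 200 | 503;
    body: string;
}

// Mirrors the handler's decision order: database first, then SpiceDB.
function decideReadiness(state: DependencyState): ReadinessResult {
    if (!state.database) {
        return { status: 503, body: "Database connection failed" };
    }
    if (!state.spiceDB) {
        return { status: 503, body: "SpiceDB connection failed" };
    }
    return { status: 200, body: "Ready" };
}

// The scenarios from the list above, expressed as a sequence of dependency states.
const scenarios: Array<[string, DependencyState]> = [
    ["both up", { database: true, spiceDB: true }],
    ["database down", { database: false, spiceDB: true }],
    ["SpiceDB down", { database: true, spiceDB: false }],
    ["recovered", { database: true, spiceDB: true }],
];

for (const [name, state] of scenarios) {
    const result = decideReadiness(state);
    console.log(`${name}: ${result.status} ${result.body}`);
}
```

Keeping the decision separate from the Express handler is one possible design choice: it lets the status mapping be tested without a running database or SpiceDB instance, while the real connectivity checks are covered by integration tests.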
| 153 | + |
| 154 | +## Deployment Considerations |
| 155 | + |
| 156 | +- The readiness probe has an initial delay of 30 seconds to allow the server time to establish connections |
| 157 | +- The probe runs every 10 seconds |
| 158 | +- The pod is marked as not ready after 3 consecutive probe failures (roughly 30 seconds at a 10-second period) |
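To make the timing implied by these settings concrete, the arithmetic can be sketched as follows. This is a rough estimate under stated assumptions: it ignores the probe's own timeout and any scheduling jitter in the kubelet, and `ProbeTiming`, `minSecondsToNotReady`, and `maxSecondsToNotReady` are hypothetical helper names, not part of the installer.

```typescript
// Illustrative helpers; the names below are hypothetical.
interface ProbeTiming {
    initialDelaySeconds: number;
    periodSeconds: number;
    failureThreshold: number;
}

// Values taken from the readiness probe configuration in this document.
const readinessProbe: ProbeTiming = {
    initialDelaySeconds: 30,
    periodSeconds: 10,
    failureThreshold: 3,
};

// Best case: the dependency fails just before a probe runs, so the first
// failure is observed almost immediately and the remaining failures follow
// one period apart.
function minSecondsToNotReady(p: ProbeTiming): number {
    return (p.failureThreshold - 1) * p.periodSeconds;
}

// Worst case: the dependency fails just after a successful probe, so a full
// period passes before the first failure is observed.
function maxSecondsToNotReady(p: ProbeTiming): number {
    return p.failureThreshold * p.periodSeconds;
}

console.log(
    `not ready after roughly ${minSecondsToNotReady(readinessProbe)} to ` +
        `${maxSecondsToNotReady(readinessProbe)} seconds of outage`,
);
```

With the values above, a pod that loses a dependency keeps receiving traffic for roughly 20 to 30 seconds before it is removed from service endpoints, which is worth weighing when tuning `PeriodSeconds` and `FailureThreshold`.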
| 159 | + |
| 160 | +## Future Improvements |
| 161 | + |
| 162 | +- Add more sophisticated checks for SpiceDB connectivity, such as a simple permission check |
| 163 | +- Add metrics for readiness probe failures |
| 164 | +- Consider adding more dependencies to the readiness check as needed |
| 165 | + |
| 166 | +## Conclusion |
| 167 | + |
| 168 | +This implementation ensures that the server is only considered ready once it has established connections to both the database and the SpiceDB authorizer. This improves the reliability of the deployment by preventing traffic from being sent to instances that are not fully initialized. |