Commit 5a57076

[memory-bank] task-related learnings
Tool: gitpod/catfood.gitpod.cloud
1 parent aedcdd7 commit 5a57076

4 files changed: +198 -0 lines changed

memory-bank/activeContext.md

Lines changed: 16 additions & 0 deletions
@@ -144,5 +144,21 @@ Initial exploration of the Gitpod codebase has revealed:
 - Works by copying relevant sources into a separate file tree
 - Can also be run from inside the workspace
 - Manages complex dependencies between components
+- **Server Health Checks**: The Gitpod server uses two distinct health check mechanisms:
+  - **Liveness Probe**: Checks the event loop lag to determine if the server is functioning properly
+  - **Readiness Probe**: Checks database and SpiceDB connectivity to ensure the server is ready to handle requests
+- **Critical Dependencies**: The server has critical external dependencies that must be operational:
+  - **Database (TypeORM)**: Used for persistent storage
+  - **SpiceDB**: Used for authorization and permission management
+- **Server Architecture Patterns**:
+  - The server uses dependency injection (Inversify) for component management
+  - Components are registered in `container-module.ts` and injected where needed
+  - The server exposes HTTP endpoints for health checks and other functionality
+  - Routes are registered in the `registerRoutes` method in `server.ts`
+  - New functionality typically requires:
+    1. Creating a new controller/service class
+    2. Registering it in the container module
+    3. Injecting it where needed
+    4. Updating any relevant configuration files

 This section will be continuously updated as new insights are gained through working with the system.

memory-bank/progress.md

Lines changed: 12 additions & 0 deletions
@@ -111,6 +111,7 @@ As we begin working with the codebase, we have not yet identified specific issue
 - Documentation of second set of API components (image-builder-api, local-app-api, registry-facade-api)
 - Documentation of third set of API components (supervisor-api, usage-api, ws-daemon-api)
 - Documentation of fourth set of API components (ws-manager-api, ws-manager-bridge-api)
+- First feature implementation: Server readiness probe for database and SpiceDB connectivity

 ### Upcoming Milestones
 - Development environment setup
@@ -204,6 +205,17 @@ No specific blockers or dependencies have been identified yet. This section will
 - ws-manager-api: Interfaces for managing the lifecycle of workspaces in Kubernetes clusters
 - ws-manager-bridge-api: Interfaces for dynamic management of workspace clusters

+- **3/17/2025**:
+  - Implemented server readiness probe:
+    - Created ReadinessController to check database and SpiceDB connectivity
+    - Updated server container module and route configuration
+    - Created PRD document for the readiness probe implementation
+    - Updated Kubernetes deployment configuration to add the readiness probe
+  - Updated memory bank with new learnings:
+    - Added information about server health checks and critical dependencies
+    - Documented server architecture patterns and dependency injection
+    - Added information about Kubernetes deployment configuration
+
 ## Next Evaluation Point

 The next evaluation of progress will occur after:

memory-bank/systemPatterns.md

Lines changed: 2 additions & 0 deletions
@@ -129,3 +129,5 @@ Workspaces are treated as immutable, with changes to configuration resulting in
 6. **gRPC Communication**: Internal services communicate using gRPC for efficient, typed communication.

 7. **Leeway Build System**: Custom build system for managing the complex dependencies between components.
+
+8. **Kubernetes Deployment Configuration**: All code that defines Kubernetes objects for deployable components lives in `install/installer`. This centralized approach ensures consistent deployment patterns across all components.

prd/001-readinessprobe-server.md

Lines changed: 168 additions & 0 deletions
@@ -0,0 +1,168 @@
+# Server Readiness Probe PRD
+
+## Overview
+
+This document outlines the implementation of a readiness probe for the Gitpod server deployment. The readiness probe will ensure that the server is only considered ready when it has established connections to both the database and SpiceDB authorizer.
+
+## Background
+
+Currently, the server deployment has a liveness probe that checks the event loop lag, but it does not have a readiness probe. This means that the server is considered ready to receive traffic as soon as the container starts, even if it hasn't established connections to critical dependencies like the database and SpiceDB.
+
+## Requirements
+
+1. Create a readiness endpoint in the server that checks:
+   - Database connectivity
+   - SpiceDB authorizer connectivity
+2. Configure the Kubernetes deployment to use this endpoint as a readiness probe
+
+## Implementation Details
+
+### 1. Readiness Controller
+
+We've created a new `ReadinessController` class in `components/server/src/liveness/readiness-controller.ts` that:
+- Checks database connectivity by executing a simple query
+- Checks SpiceDB connectivity by attempting to get a client
+- Returns a 200 status code only if both checks pass, otherwise returns a 503 status code
+
+```typescript
+// components/server/src/liveness/readiness-controller.ts
+import { injectable, inject } from "inversify";
+import express from "express";
+import { TypeORM } from "@gitpod/gitpod-db/lib";
+import { SpiceDBClientProvider } from "../authorization/spicedb";
+import { log } from "@gitpod/gitpod-protocol/lib/util/logging";
+
+@injectable()
+export class ReadinessController {
+    @inject(TypeORM) protected readonly typeOrm: TypeORM;
+    @inject(SpiceDBClientProvider) protected readonly spiceDBClientProvider: SpiceDBClientProvider;
+
+    get apiRouter(): express.Router {
+        const router = express.Router();
+        this.addReadinessHandler(router);
+        return router;
+    }
+
+    protected addReadinessHandler(router: express.Router) {
+        router.get("/", async (_, res) => {
+            try {
+                // Check database connection
+                const dbConnection = await this.checkDatabaseConnection();
+                if (!dbConnection) {
+                    log.warn("Readiness check failed: Database connection failed");
+                    res.status(503).send("Database connection failed");
+                    return;
+                }
+
+                // Check SpiceDB connection
+                const spiceDBConnection = await this.checkSpiceDBConnection();
+                if (!spiceDBConnection) {
+                    log.warn("Readiness check failed: SpiceDB connection failed");
+                    res.status(503).send("SpiceDB connection failed");
+                    return;
+                }
+
+                // Both connections are good
+                res.status(200).send("Ready");
+            } catch (error) {
+                log.error("Readiness check failed", error);
+                res.status(503).send("Readiness check failed");
+            }
+        });
+    }
+
+    private async checkDatabaseConnection(): Promise<boolean> {
+        try {
+            const connection = await this.typeOrm.getConnection();
+            // Simple query to verify connection is working
+            await connection.query("SELECT 1");
+            return true;
+        } catch (error) {
+            log.error("Database connection check failed", error);
+            return false;
+        }
+    }
+
+    private async checkSpiceDBConnection(): Promise<boolean> {
+        try {
+            // Just getting the client without error is a basic check
+            // If the client is not available, getClient() will throw an error
+            this.spiceDBClientProvider.getClient();
+            return true;
+        } catch (error) {
+            log.error("SpiceDB connection check failed", error);
+            return false;
+        }
+    }
+}
+```
+
+### 2. Server Configuration
+
+We've updated the server's container module to include the `ReadinessController`:
+
+```typescript
+// In container-module.ts
+import { ReadinessController } from "./liveness/readiness-controller";
+
+// In the productionContainerModule
+bind(ReadinessController).toSelf().inSingletonScope();
+```
+
+We've also updated the server's route configuration to register the readiness endpoint:
+
+```typescript
+// In server.ts
+import { ReadinessController } from "./liveness/readiness-controller";
+
+// In the constructor
+@inject(ReadinessController) private readonly readinessController: ReadinessController,
+
+// In registerRoutes method
+app.use("/ready", this.readinessController.apiRouter);
+```
+
+### 3. Kubernetes Deployment
+
+We need to update the server deployment in `install/installer/pkg/components/server/deployment.go` to add a readiness probe:
+
+```go
+ReadinessProbe: &corev1.Probe{
+    ProbeHandler: corev1.ProbeHandler{
+        HTTPGet: &corev1.HTTPGetAction{
+            Path: "/ready",
+            Port: intstr.IntOrString{
+                Type: intstr.Int,
+                IntVal: ContainerPort,
+            },
+        },
+    },
+    InitialDelaySeconds: 30,
+    PeriodSeconds: 10,
+    FailureThreshold: 3,
+},
+```
+
+## Testing
+
+The readiness probe should be tested to ensure:
+
+1. The server is only considered ready when both database and SpiceDB connections are established
+2. The server is not considered ready if either connection fails
+3. The server becomes ready again when connections are re-established
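
The three properties listed in the Testing section can be exercised without a cluster by isolating the handler's decision logic. The sketch below is an illustration only and is not code from this commit: `ReadinessResult` and `readinessResponse` are hypothetical names, and the function merely mirrors the check order and response bodies of the `ReadinessController` handler (database first, then SpiceDB).

```typescript
// Hypothetical helper (not part of the commit): mirrors the branching in
// ReadinessController.addReadinessHandler as a pure function.
interface ReadinessResult {
    status: number;
    body: string;
}

function readinessResponse(databaseOk: boolean, spiceDbOk: boolean): ReadinessResult {
    // The database is checked first, so its message wins when both are down.
    if (!databaseOk) {
        return { status: 503, body: "Database connection failed" };
    }
    if (!spiceDbOk) {
        return { status: 503, body: "SpiceDB connection failed" };
    }
    return { status: 200, body: "Ready" };
}

// The handler keeps no state between probes, so a sequence of calls
// models "up, database down, SpiceDB down, recovered".
const probeSequence = [
    readinessResponse(true, true),
    readinessResponse(false, true),
    readinessResponse(true, false),
    readinessResponse(true, true),
];
console.log(probeSequence.map((p) => p.status).join(" "));
```

The sequence prints `200 503 503 200`. Because the function is stateless, the final `200` corresponds to the third property: readiness returns as soon as both checks pass again. An end-to-end test would still need to stub the real `TypeORM` and `SpiceDBClientProvider` dependencies.
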
+
+## Deployment Considerations
+
+- The readiness probe has an initial delay of 30 seconds to allow the server time to establish connections
+- The probe runs every 10 seconds
+- The probe allows up to 3 failures before marking the pod as not ready
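
A note for readers tuning these values: Kubernetes counts `failureThreshold` as consecutive failed probes, so a pod that loses a dependency is marked not ready on its third failed probe in a row. The sketch below works through the resulting detection window. `ProbeTiming`, `DetectionWindow`, and `unreadyDetectionWindow` are hypothetical names introduced for illustration, and the arithmetic ignores the probe timeout and response latency.

```typescript
// Hypothetical helper for reasoning about probe settings; the field names
// mirror the Kubernetes Probe fields set in deployment.go.
interface ProbeTiming {
    periodSeconds: number;
    failureThreshold: number;
}

interface DetectionWindow {
    minSeconds: number;
    maxSeconds: number;
}

// Time between a dependency failing and the pod being marked not ready.
// Best case: the outage starts just before a probe fires, so the first
// failure is immediate and the remaining ones follow one period apart.
// Worst case: the outage starts just after a successful probe, so almost
// a full period passes before the first failure is observed.
function unreadyDetectionWindow(probe: ProbeTiming): DetectionWindow {
    return {
        minSeconds: (probe.failureThreshold - 1) * probe.periodSeconds,
        maxSeconds: probe.failureThreshold * probe.periodSeconds,
    };
}

const prdSettings: ProbeTiming = { periodSeconds: 10, failureThreshold: 3 };
const detection = unreadyDetectionWindow(prdSettings);
console.log(`${detection.minSeconds}-${detection.maxSeconds}s`);
```

With the PRD's values this prints `20-30s`: traffic can keep reaching a pod with a broken dependency for roughly 20 to 30 seconds. A lower `PeriodSeconds` or `FailureThreshold` shortens that window, at the cost of more probe traffic or more sensitivity to transient errors.
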
+
+## Future Improvements
+
+- Add more sophisticated checks for SpiceDB connectivity, such as a simple permission check
+- Add metrics for readiness probe failures
+- Consider adding more dependencies to the readiness check as needed
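
As one possible shape for the metrics item in this list, the sketch below counts readiness failures per dependency in memory. It is an assumption, not part of the commit: `ReadinessFailureCounter` and its methods are hypothetical, and a real implementation would more likely register a counter with a Prometheus client library so the values can be scraped.

```typescript
// Hypothetical in-memory counter (not part of the commit).
class ReadinessFailureCounter {
    private readonly counts: Record<string, number> = {};

    // Record one failed readiness check for the named dependency.
    recordFailure(dependency: string): void {
        this.counts[dependency] = (this.counts[dependency] ?? 0) + 1;
    }

    // Number of failures recorded so far for the named dependency.
    failuresFor(dependency: string): number {
        return this.counts[dependency] ?? 0;
    }
}

const failureCounter = new ReadinessFailureCounter();
failureCounter.recordFailure("database");
failureCounter.recordFailure("database");
failureCounter.recordFailure("spicedb");
console.log(failureCounter.failuresFor("database"), failureCounter.failuresFor("spicedb"));
```

Labelling failures by dependency would let an alert distinguish a database outage from a SpiceDB outage. In the handler, `recordFailure` would be called next to the existing `log.warn` lines.
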
+
+## Conclusion
+
+This implementation ensures that the server is only considered ready when it has established connections to both the database and SpiceDB authorizer. This improves the reliability of the deployment by preventing traffic from being sent to instances that are not fully initialized.
