# GEMINI.md
This section describes the tools and commands for local development.

- **Skaffold & Minikube**: Local development is managed by `skaffold`, which deploys services to a local `minikube` Kubernetes cluster.
- **Makefile**: Common development tasks are scripted in the `Makefile`. See below for key commands.

### 2.1. Key Makefile Commands

- **`make start-local`**: Starts the complete local development environment using Skaffold and Minikube. This includes live-reloading for code changes.
- **`make port-forward-manual`**: After starting the environment, run this to expose the services (frontend, backend, etc.) on `localhost`.
- **`make test`**: Runs the Go and TypeScript unit tests. Use `make go-test` to run only Go tests.
- **`make precommit`**: Runs a comprehensive suite of checks, including tests, linting (`golangci-lint`, configured via `.golangci.yaml`), and license header verification. This is the main command to run before submitting a pull request.
- **`make gen`**: Regenerates all auto-generated code (from OpenAPI, JSON Schema, and ANTLR). Use `make openapi` for OpenAPI-only changes.
- **`make dev_workflows`**: Populates the local Spanner database by running the data ingestion jobs against live data sources.
- **`make dev_fake_data`**: Populates the local Spanner database with a consistent set of fake data for testing.
- **`make spanner_new_migration`**: Creates a new Spanner database migration file in `infra/storage/spanner/migrations`.

### 2.2. Living Document & Continuous Improvement

This document is a living guide to the `webstatus.dev` project. As an AI assistant, I must treat it as the source of truth for my operations. However, like any documentation, it can become outdated.

If you, the user, find that I am making mistakes due to outdated or missing information in this document, or if you have to provide significant guidance to correct my course, please instruct me to update this file.

My own process should also include a self-correction loop:

- After a series of changes, especially if they involved trial-and-error or failed tests, I will reflect on the process and identify any gaps in my understanding.
- I will then propose changes to this `GEMINI.md` file to incorporate the new knowledge.

This ensures that the document evolves with the codebase, making my assistance more accurate and efficient over time.

## 3. Codebase Architecture
This section details the main components of the application and how they interact.
Standalone Go applications that populate the Spanner database from external sources.

- **DO** follow the existing pattern for new workflows: a new directory with `main.go` and `manifests/job.yaml`.
- **DO** use consumer-specific `spanneradapters` (e.g., `BCDConsumer`).
- **DON'T** call the `Backend` spanner adapter from a workflow.
- **DO** choose the correct data ingestion pattern (sync vs. upsert) based on the use case. See the "How-To" guide for details.
- **DO** separate the process of ingesting raw data from the process of linking that data to other entities (like web features). This makes the ingestion pipeline more robust.
- **DO** add a new target to the `make dev_workflows` command in the `Makefile` for any new workflow.

### 3.2. Shared Go Libraries (`lib/`)

Shared Go libraries used by the `backend` and `workflows`.

- **DO** define new database table structs in `lib/gcpspanner`.
- **DO** create or extend adapters in `lib/gcpspanner/spanneradapters` to expose new database queries.

### 3.2.1. The Go Mapper Pattern for Spanner

A core architectural pattern in the Go codebase is the **mapper pattern**, used for all interactions with the Spanner database. This pattern, defined in `lib/gcpspanner/client.go`, provides a generic and reusable way to handle database operations, reducing boilerplate and ensuring consistency.

- **Core Concept**: Instead of writing custom query logic for each data type, you use generic helpers like `newEntityReader`, `newEntityWriter`, and `newEntitySynchronizer`. These helpers are configured with a "mapper" struct.
- **Mapper Interfaces**: The mapper struct implements a set of interfaces that define the specific database logic for a data type. The composition of these interfaces determines the mapper's capabilities. Key interfaces include:
  - `baseMapper`: Defines the Spanner table name.
  - `readOneMapper`: Defines how to select a single entity by its key.
  - `mergeMapper`: Defines how to merge an incoming entity with an existing one for updates.
  - `deleteByStructMapper`: Defines how to delete an entity.
- **Implementations**: You can find many examples of mapper implementations throughout the `lib/gcpspanner/` directory (e.g., `webFeatureSpannerMapper`, `baselineStatusMapper`). **DO** look for existing mappers before writing a new one.

### 3.3. End-to-End Data Flow Example
This example illustrates how data is ingested by a workflow and then served by the API, highlighting the different components involved.
This section covers key processes and architectural decisions that apply across the codebase.

- **DO** add E2E tests for critical user journeys.
- **DON'T** write E2E tests for small component-level interactions.
- **DO** use resilient selectors like `data-testid`.
- **Go Unit & Integration Tests**:
  - **DO** use table-driven unit tests with mocks for dependencies at the adapter layer (`spanneradapters`).
  - **DO** write **integration tests using `testcontainers-go`** for any changes to the `lib/gcpspanner` layer. This is especially critical when implementing or modifying a mapper. These tests must spin up a Spanner emulator and verify the mapper's logic against a real database.
  - When a refactoring changes how errors are handled (e.g., from returning an error to logging a warning and continuing), **DO** update the tests to reflect the new expected behavior. Some test cases might become obsolete and should be removed or updated.
- **TypeScript Unit Tests**: Use `npm run test -w frontend`.

### 5.2. CI/CD (`.github/`)

Helper scripts and small CLI tools for local development.

- **DO** place new one-off development scripts here.
- **DON'T** put production application logic in `util/`.

### 5.7. Code Modifications

- **License Headers**: Never modify license headers manually. They are managed by the `make license-fix` command. If you see license header issues, run that command.

## 6. How-To Guides
This section provides step-by-step guides for common development tasks. When working on a specific part of the application, use the corresponding section in this document as your primary guide. For example:
For other tools defined as features in the devcontainer:

1. **Update Devcontainer**: In `.devcontainer/devcontainer.json`, find the feature for the tool you want to update (e.g., `ghcr.io/devcontainers/features/terraform:1`) and change its `version`.
2. **Rebuild Devcontainer**: Rebuild and reopen the project in the devcontainer to use the new version.

### 6.3. How-To: Implement or Refactor a Go Data Ingestion Workflow

This guide outlines the process for implementing or refactoring a Go data ingestion workflow.

**1. Analyze the Data and Goal**

First, analyze the nature of the incoming data and the goal of the ingestion. Ask these questions:

- Is the incoming data a **complete set** that represents the entire desired state of a table? Or is it a **partial update** or a stream of new events?
- Do I need to handle **deletions**? If a record is no longer in the source data, should it be removed from the database?

**2. Choose the Right Ingestion Pattern**

Based on your analysis, choose one of the following patterns:

- **Use Full Synchronization if...** the data is a complete source of truth and you need to handle creates, updates, and deletes to keep a table perfectly in sync.
  - **Example**: Syncing the `WebFeatures` table from the `web-features` git repository.
  - **Implementation**: Use the `newEntitySynchronizer` helper. Your mapper must implement the `syncableEntityMapper` interface.
- **Use Batch Upsert if...** you are adding or updating records in bulk but **not** deleting old records. This is common for append-only or time-series data.
  - **Example**: Storing daily UMA metrics or WPT results for a specific run.
  - **Implementation**: Use the `newEntityWriter` helper, likely in a loop or with a custom batching function. Your mapper only needs to implement the `writeableEntityMapper` interface.
- **Use Simple Insert if...** you are processing and inserting records one by one.
  - **Example**: Ingesting the list of BCD browser releases as they are processed.
  - **Implementation**: Use the `newEntityWriter` helper inside a loop. Your mapper only needs to implement the `writeableEntityMapper` interface.

**3. Implement the Workflow**

1. **Implement the Mapper**: In the `lib/gcpspanner` package, create a new mapper struct and implement the required interfaces for your chosen pattern (e.g., `syncableEntityMapper` or `writeableEntityMapper`).
2. **Implement the Client Method**: In `lib/gcpspanner/client.go`, add a new method that takes the data and uses the appropriate generic helper (e.g., `newEntitySynchronizer`) with your new mapper.
3. **Update the Adapter**: In the `lib/gcpspanner/spanneradapters` package, update the consumer to call the new client method.

**4. Write Tests**

This is a critical step:

- Update the **unit tests** for the adapter to mock the new client methods.
- Write a new **integration test** using `testcontainers-go` for your new logic in the `lib/gcpspanner` package. This test must verify the end-to-end process for the pattern you chose.

**5. Verify**

Run `make precommit` to ensure all linting checks and tests pass.