Commit c7759b1

chore: Update GEMINI.md and add dockerignore (#1743)
* chore: Update GEMINI.md
* fix
1 parent e3c40af commit c7759b1

2 files changed: +156 -5 lines changed

.dockerignore

Lines changed: 73 additions & 0 deletions
@@ -0,0 +1,73 @@
+# Local .terraform directories
+**/.terraform/*
+
+# .tfstate files
+*.tfstate
+*.tfstate.*
+
+# Crash log files
+crash.log
+crash.*.log
+
+# Exclude all .tfvars files, which are likely to contain sensitive data, such as
+# password, private keys, and other secrets. These should not be part of version
+# control as they are data points which are potentially sensitive and subject
+# to change depending on the environment.
+*.tfvars
+*.tfvars.json
+!infra/.envs/*.tfvars
+
+# Ignore override files as they are usually used to override resources locally and so
+# are not checked in
+override.tf
+override.tf.json
+*_override.tf
+*_override.tf.json
+
+# Include override files you do wish to add to version control using negated pattern
+# !example_override.tf
+
+# Include tfplan files to ignore the plan output of command: terraform plan -out=tfplan
+# example: *tfplan*
+
+# Ignore CLI configuration files
+.terraformrc
+terraform.rc
+
+node_modules
+dist
+frontend/build
+frontend/.postinstall
+
+
+# Ignore coverage files
+coverage
+/test-results/
+/e2e/test-results/
+/playwright-report/
+/blob-report/
+/playwright/.cache/
+
+./e2e/test-results/
+
+# Ignore the file per this: https://go.dev/ref/mod#go-work-file
+go.work
+go.work.sum
+
+# ANTLR4
+.antlr
+
+.devcontainer/cache/go/go-build/*
+.devcontainer/cache/go/pkg/*
+.devcontainer/cache/node/.npm/*
+!.devcontainer/cache/go/go-build/.gitkeep
+!.devcontainer/cache/go/pkg/.gitkeep
+!.devcontainer/cache/node/.npm/.gitkeep
+
+# TypeScript
+tsconfig.tsbuildinfo
+
+# Docker build log files
+infra/*.log
+
+**__debug_**

GEMINI.md

Lines changed: 83 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -14,17 +14,31 @@ This section describes the tools and commands for local development.
1414
- **Skaffold & Minikube**: Local development is managed by `skaffold`, which deploys services to a local `minikube` Kubernetes cluster.
1515
- **Makefile**: Common development tasks are scripted in the `Makefile`. See below for key commands.
1616

17-
### Key Makefile Commands
17+
### 2.1. Key Makefile Commands
1818

1919
- **`make start-local`**: Starts the complete local development environment using Skaffold and Minikube. This includes live-reloading for code changes.
2020
- **`make port-forward-manual`**: After starting the environment, run this to expose the services (frontend, backend, etc.) on `localhost`.
2121
- **`make test`**: Runs the Go and TypeScript unit tests. Use `make go-test` to run only Go tests.
22-
- **`make precommit`**: Runs a comprehensive suite of checks including tests, linting, and license header verification. This is the main command to run before submitting a pull request.
22+
- **`make precommit`**: Runs a comprehensive suite of checks including tests, linting (`golangci-lint` configured via `.golangci.yaml`), and license header verification. This is the main command to run before submitting a pull request.
2323
- **`make gen`**: Regenerates all auto-generated code (from OpenAPI, JSON Schema, ANTLR). Use `make openapi` for just OpenAPI changes.
2424
- **`make dev_workflows`**: Populates the local Spanner database by running the data ingestion jobs against live data sources.
2525
- **`make dev_fake_data`**: Populates the local Spanner database with a consistent set of fake data for testing.
2626
- **`make spanner_new_migration`**: Creates a new Spanner database migration file in `infra/storage/spanner/migrations`.
2727

28+
### 2.2. Living Document & Continuous Improvement
29+
30+
This document is a living guide to the `webstatus.dev` project. As an AI assistant, I must treat it as the source of truth for my operations. However, like any documentation, it can become outdated.
31+
32+
If you, the user, find that I am making mistakes due to outdated or missing information in this document, or if you have to provide significant guidance to correct my course, please instruct me to update this file.
33+
34+
My own process should also include a self-correction loop:
35+
36+
- I will do this after a series of changes, especially ones that involved trial and error or failed tests.
37+
- I will reflect on the process and identify any gaps in my understanding.
38+
- I will then propose changes to this `GEMINI.md` file to incorporate the new knowledge.
39+
40+
This ensures that the document evolves with the codebase, making my assistance more accurate and efficient over time.
41+
2842
## 3. Codebase Architecture
2943

3044
This section details the main components of the application and how they interact.
@@ -81,7 +95,8 @@ Standalone Go applications that populate the Spanner database from external sour
8195
- **DO** follow the existing pattern for new workflows: new directory, `main.go`, and `manifests/job.yaml`.
8296
- **DO** use consumer-specific `spanneradapters` (e.g., `BCDConsumer`).
8397
- **DON'T** call the `Backend` spanner adapter from a workflow.
84-
- **DO** use the `entitySynchronizer` for bulk data updates.
98+
- **DO** choose the correct data ingestion pattern (sync vs. upsert) based on the use case. See the how-to guide in section 6.3 ("Implement or Refactor a Go Data Ingestion Workflow") for details.
99+
- **DO** separate the process of ingesting raw data from the process of linking that data to other entities (like web features). This makes the ingestion pipeline more robust.
85100
- **DO** add a new target to the `make dev_workflows` command in `Makefile` for any new workflow.
86101

87102
### 3.2. Shared Go Libraries (`lib/`)
@@ -99,6 +114,18 @@ Shared Go libraries used by the `backend` and `workflows`.
99114
- **DO** define new database table structs in `lib/gcpspanner`.
100115
- **DO** create or extend adapters in `lib/gcpspanner/spanneradapters` to expose new database queries.
101116

117+
### 3.2.1. The Go Mapper Pattern for Spanner
118+
119+
A core architectural pattern in the Go codebase is the **mapper pattern**, used for all interactions with the Spanner database. This pattern, defined in `lib/gcpspanner/client.go`, provides a generic and reusable way to handle database operations, reducing boilerplate and ensuring consistency.
120+
121+
- **Core Concept**: Instead of writing custom query logic for each data type, you use generic helpers like `newEntityReader`, `newEntityWriter`, and `newEntitySynchronizer`. These helpers are configured with a "mapper" struct.
122+
- **Mapper Interfaces**: The mapper struct implements a set of interfaces that define the specific database logic for a data type. The composition of these interfaces determines the mapper's capabilities. Key interfaces include:
123+
- `baseMapper`: Defines the Spanner table name.
124+
- `readOneMapper`: Defines how to select a single entity by its key.
125+
- `mergeMapper`: Defines how to merge an incoming entity with an existing one for updates.
126+
- `deleteByStructMapper`: Defines how to delete an entity.
127+
- **Implementations**: You can find many examples of mapper implementations throughout the `lib/gcpspanner/` directory (e.g., `webFeatureSpannerMapper`, `baselineStatusMapper`). **DO** look for existing mappers before writing a new one.
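
The idea behind the mapper pattern can be sketched in a few lines of Go. This is only an illustration under stated assumptions: the interface names `baseMapper` and `mergeMapper` come from the list above, but the method names (`Table`, `Key`, `Merge`), the `entityWriter` type, and the in-memory `memoryStore` that stands in for Spanner are all hypothetical. The real signatures in `lib/gcpspanner/client.go` will likely differ.

```go
package main

import "fmt"

// baseMapper names the table. The method name is an assumption for this sketch.
type baseMapper interface {
	Table() string
}

// mergeMapper adds the per-type logic a generic writer needs.
// Both methods are hypothetical, not the project's real interface.
type mergeMapper[T any] interface {
	baseMapper
	Key(in T) string
	Merge(in T, existing T) T
}

// memoryStore stands in for Spanner: table name -> key -> row.
type memoryStore[T any] map[string]map[string]T

// entityWriter is a generic helper configured with a mapper.
type entityWriter[T any] struct {
	mapper mergeMapper[T]
	store  memoryStore[T]
}

// upsert inserts the entity, or merges it with the stored row if one exists.
func (w entityWriter[T]) upsert(in T) {
	table := w.mapper.Table()
	if w.store[table] == nil {
		w.store[table] = map[string]T{}
	}
	key := w.mapper.Key(in)
	if existing, ok := w.store[table][key]; ok {
		in = w.mapper.Merge(in, existing)
	}
	w.store[table][key] = in
}

// feature and featureMapper are example types for this sketch only.
type feature struct {
	ID   string
	Name string
}

type featureMapper struct{}

func (featureMapper) Table() string         { return "WebFeatures" }
func (featureMapper) Key(in feature) string { return in.ID }

// Merge keeps the stored name when the incoming name is empty.
func (featureMapper) Merge(in, existing feature) feature {
	if in.Name == "" {
		in.Name = existing.Name
	}
	return in
}

func main() {
	store := memoryStore[feature]{}
	w := entityWriter[feature]{mapper: featureMapper{}, store: store}
	w.upsert(feature{ID: "grid", Name: "CSS Grid"})
	w.upsert(feature{ID: "grid"}) // merged with the existing row
	fmt.Println(store["WebFeatures"]["grid"].Name) // prints "CSS Grid"
}
```

The generic helper contains no type-specific logic. Everything specific to `feature` lives in the mapper, which is why a new data type only needs a new mapper struct.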
128+
102129
### 3.3. End-to-End Data Flow Example
103130

104131
This example illustrates how data is ingested by a workflow and then served by the API, highlighting the different components involved.
@@ -171,8 +198,11 @@ This section covers key processes and architectural decisions that apply across
171198
- **DO** add E2E tests for critical user journeys.
172199
- **DON'T** write E2E tests for small component-level interactions.
173200
- **DO** use resilient selectors like `data-testid`.
174-
- **Unit Tests**:
175-
- **Go**: Use table-driven unit tests with mocks for dependencies.
201+
- **Go Unit & Integration Tests**:
202+
- **DO** use table-driven unit tests with mocks for dependencies at the adapter layer (`spanneradapters`).
203+
- **DO** write **integration tests using `testcontainers-go`** for any changes to the `lib/gcpspanner` layer. This is especially critical when implementing or modifying a mapper. These tests must spin up a Spanner emulator and verify the mapper's logic against a real database.
204+
- When a refactoring changes how errors are handled (e.g., from returning an error to logging a warning and continuing), **DO** update the tests to reflect the new expected behavior. Some test cases might become obsolete and should be removed or updated.
205+
- **TypeScript Unit Tests**:
176206
- **TypeScript**: Use `npm run test -w frontend`.
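
As a rough illustration of the table-driven style with a mocked dependency described for Go above, the sketch below tests a small adapter against canned client responses. All names here (`featureClient`, `adapter`, `DisplayName`, `mockClient`, `errNotFound`) are hypothetical and are not taken from `spanneradapters`. To keep the example runnable as a single file, the case loop lives in `runCases`; in a real `_test.go` file it would sit inside a `TestXxx(t *testing.T)` function and use `t.Run` and `t.Errorf`.

```go
package main

import (
	"errors"
	"fmt"
)

// featureClient is a hypothetical dependency of the adapter under test.
type featureClient interface {
	GetName(id string) (string, error)
}

var errNotFound = errors.New("feature not found")

// adapter is a hypothetical stand-in for a type in spanneradapters.
type adapter struct{ client featureClient }

// DisplayName falls back to the ID when the feature is missing,
// and passes any other error through to the caller.
func (a adapter) DisplayName(id string) (string, error) {
	name, err := a.client.GetName(id)
	if errors.Is(err, errNotFound) {
		return id, nil
	}
	return name, err
}

// mockClient returns canned values instead of querying a database.
type mockClient struct {
	name string
	err  error
}

func (m mockClient) GetName(string) (string, error) { return m.name, m.err }

type testCase struct {
	desc    string
	mock    mockClient
	want    string
	wantErr bool
}

func cases() []testCase {
	return []testCase{
		{desc: "found", mock: mockClient{name: "CSS Grid"}, want: "CSS Grid"},
		{desc: "missing falls back to ID", mock: mockClient{err: errNotFound}, want: "grid"},
		{desc: "other errors propagate", mock: mockClient{err: errors.New("boom")}, wantErr: true},
	}
}

// runCases returns one message per failing case.
func runCases() []string {
	var failures []string
	for _, tc := range cases() {
		got, err := adapter{client: tc.mock}.DisplayName("grid")
		if got != tc.want || (err != nil) != tc.wantErr {
			failures = append(failures, fmt.Sprintf("%s: got (%q, %v)", tc.desc, got, err))
		}
	}
	return failures
}

func main() {
	fmt.Println(len(runCases()), "failing cases") // prints "0 failing cases"
}
```

If a refactoring changed `DisplayName` to swallow all errors, the third case would need to be updated or removed, which is the situation the error-handling bullet above describes.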
177207

178208
### 5.2. CI/CD (`.github/`)
@@ -220,6 +250,10 @@ Helper scripts and small CLI tools for local development.
220250
- **DO** place new one-off development scripts here.
221251
- **DON'T** put production application logic in `util/`.
222252

253+
### 5.7. Code Modifications
254+
255+
- **License Headers**: Never modify license headers manually. They are managed by the `make license-fix` command. If you see license header issues, run that command.
256+
223257
## 6. How-To Guides
224258

225259
This section provides step-by-step guides for common development tasks. When working on a specific part of the application, use the corresponding section in this document as your primary guide. For example:
@@ -283,3 +317,47 @@ For other tools defined as features in the devcontainer:
283317

284318
1. **Update Devcontainer**: In `.devcontainer/devcontainer.json`, find the feature for the tool you want to update (e.g., `ghcr.io/devcontainers/features/terraform:1`) and change its `version`.
285319
2. **Rebuild Devcontainer**: Rebuild and reopen the project in the devcontainer to use the new version.
320+
321+
### 6.3. How-To: Implement or Refactor a Go Data Ingestion Workflow
322+
323+
This guide outlines the process for implementing or refactoring a Go data ingestion workflow.
324+
325+
**1. Analyze the Data and Goal**
326+
327+
First, analyze the nature of the incoming data and the goal of the ingestion. Ask these questions:
328+
329+
- Is the incoming data a **complete set** that represents the entire desired state of a table? Or is it a **partial update** or a stream of new events?
330+
- Do I need to handle **deletions**? If a record is no longer in the source data, should it be removed from the database?
331+
332+
**2. Choose the Right Ingestion Pattern**
333+
334+
Based on your analysis, choose one of the following patterns:
335+
336+
- **Use Full Synchronization if...** the data is a complete source of truth and you need to handle creates, updates, and deletes to keep a table perfectly in sync.
337+
- **Example**: Syncing the `WebFeatures` table from the `web-features` git repository.
338+
- **Implementation**: Use the `newEntitySynchronizer` helper. Your mapper must implement the `syncableEntityMapper` interface.
339+
340+
- **Use Batch Upsert if...** you are adding or updating records in bulk but **not** deleting old records. This is common for append-only or time-series data.
341+
- **Example**: Storing daily UMA metrics or WPT results for a specific run.
342+
- **Implementation**: Use the `newEntityWriter` helper, likely in a loop or with a custom batching function. Your mapper only needs to implement the `writeableEntityMapper` interface.
343+
344+
- **Use Simple Insert if...** you are processing and inserting records one-by-one.
345+
- **Example**: Ingesting the list of BCD browser releases as they are processed.
346+
- **Implementation**: Use the `newEntityWriter` helper inside a loop. Your mapper only needs to implement the `writeableEntityMapper` interface.
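
The difference between full synchronization and upsert can be shown with a minimal in-memory sketch. It assumes a table is just a Go map from key to value; the `table`, `upsert`, and `synchronize` names are hypothetical and do not use the project's `newEntitySynchronizer` or `newEntityWriter` helpers.

```go
package main

import (
	"fmt"
	"sort"
)

// table is an in-memory stand-in for a database table: key -> value.
type table map[string]string

// upsert inserts new rows and updates existing ones. It never deletes.
func upsert(t table, incoming map[string]string) {
	for k, v := range incoming {
		t[k] = v
	}
}

// synchronize makes t match incoming exactly: rows absent from the
// source are deleted, then the remaining rows are inserted or updated.
func synchronize(t table, incoming map[string]string) {
	for k := range t {
		if _, ok := incoming[k]; !ok {
			delete(t, k)
		}
	}
	upsert(t, incoming)
}

// keys returns the table's keys in sorted order for stable output.
func keys(t table) []string {
	out := make([]string, 0, len(t))
	for k := range t {
		out = append(out, k)
	}
	sort.Strings(out)
	return out
}

func main() {
	incoming := map[string]string{"grid": "CSS Grid", "flexbox": "Flexbox"}

	a := table{"grid": "Grid", "float": "Float"}
	upsert(a, incoming)
	fmt.Println(keys(a)) // prints "[flexbox float grid]": the stale row survives

	b := table{"grid": "Grid", "float": "Float"}
	synchronize(b, incoming)
	fmt.Println(keys(b)) // prints "[flexbox grid]": the stale row is removed
}
```

The stale `float` row is the deciding factor: if it must disappear when the source drops it, the data calls for synchronization; if it must be kept, an upsert is enough.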
347+
348+
**3. Implement the Workflow**
349+
350+
1. **Implement the Mapper**: In the `lib/gcpspanner` package, create a new mapper struct and implement the required interfaces for your chosen pattern (e.g., `syncableEntityMapper` or `writeableEntityMapper`).
351+
2. **Implement the Client Method**: In `lib/gcpspanner/client.go`, add a new method that takes the data and uses the appropriate generic helper (e.g., `newEntitySynchronizer`) with your new mapper.
352+
3. **Update the Adapter**: In the `lib/gcpspanner/spanneradapters` package, update the consumer to call the new client method.
353+
354+
**4. Write Tests**
355+
356+
This is a critical step:
357+
358+
- Update the **unit tests** for the adapter to mock the new client methods.
359+
- Write a new **integration test** using `testcontainers-go` for your new logic in the `lib/gcpspanner` package. This test must verify the end-to-end process for the pattern you chose.
360+
361+
**5. Verify**
362+
363+
Run `make precommit` to ensure all linting checks and tests pass.
