Commit 604d7da

docs(gemini): Update knowledge base and maintenance instructions (#1819)
Updates the Gemini assistant's knowledge base (`GEMINI.md`) based on an analysis of recent commits. This captures significant architectural and feature changes, including:

- **Developer Signals:** Documents the new `developer_signals_consumer` workflow, the `LatestFeatureDeveloperSignals` database table, and the associated data adapters and types.
- **Feature Evolution:** Explains how the system now handles "moved" and "split" features, including the new `MovedWebFeatures` and `SplitWebFeatures` tables.
- **Data Migration:** Details the new pre-delete hook (`GetChildDeleteKeyMutations`) for handling large cascade deletes in Spanner and the generic data migrator for feature key renames.
- **Code Refactoring:** Notes the refactoring of HTTP fetching into a common `fetchtypes` module and the use of `ProcessedWebFeaturesData` in the `web_feature_consumer`.

Additionally, this commit improves the process for maintaining the knowledge base itself:

- Consolidates the "Living Document" and "Updating the Knowledge Base" sections into a single, more comprehensive guide.
- Adds a hidden marker with the last analyzed commit SHA to streamline future updates.
- Provides a standardized prompt for triggering knowledge base updates.
1 parent 7672d9b commit 604d7da

File tree

1 file changed: +39 −23 lines


GEMINI.md

Lines changed: 39 additions & 23 deletions
@@ -1,5 +1,7 @@
 # Gemini Code Assist Configuration for webstatus.dev
 
+<!-- Last analyzed commit: 96f9821fd3482b12fac0a787ed273675f3f82655 -->
+
 This document provides context to Gemini Code Assist to help it generate more accurate and project-specific code suggestions.
 
 ## 1. Project Overview
@@ -25,20 +27,6 @@ This section describes the tools and commands for local development.
 - **`make dev_fake_data`**: Populates the local Spanner database with a consistent set of fake data for testing.
 - **`make spanner_new_migration`**: Creates a new Spanner database migration file in `infra/storage/spanner/migrations`.
 
-### 2.2. Living Document & Continuous Improvement
-
-This document is a living guide to the `webstatus.dev` project. As an AI assistant, I must treat it as the source of truth for my operations. However, like any documentation, it can become outdated.
-
-If you, the user, find that I am making mistakes due to outdated or missing information in this document, or if you have to provide significant guidance to correct my course, please instruct me to update this file.
-
-My own process should also include a self-correction loop:
-
-- After a series of changes, especially if they involved trial-and-error or failed tests.
-- I will reflect on the process and identify any gaps in my understanding.
-- I will then propose changes to this `GEMINI.md` file to incorporate the new knowledge.
-
-This ensures that the document evolves with the codebase, making my assistance more accurate and efficient over time.
-
 ## 3. Codebase Architecture
 
 This section details the main components of the application and how they interact.
@@ -87,13 +75,13 @@ The frontend Single Page Application (SPA).
 
 Standalone Go applications that populate the Spanner database from external sources.
 
-- **Overview**: Each workflow corresponds to a specific data source (e.g., `bcd_consumer`, `wpt_consumer`). They run as scheduled Cloud Run Jobs in production.
+- **Overview**: Each workflow corresponds to a specific data source (e.g., `bcd_consumer`, `wpt_consumer`, `developer_signals_consumer`). They run as scheduled Cloud Run Jobs in production. The `web_feature_consumer` is a key workflow that ingests data from the `web-platform-dx/web-features` repository, and it now handles features that have been moved or split.
 - **Development & Execution**:
   - **Local**: Run all main workflows with `make dev_workflows`.
   - **Production**: Deployed as `google_cloud_run_v2_job` resources via Terraform.
 - **"Do's and Don'ts" (Workflows)**:
   - **DO** follow the existing pattern for new workflows: new directory, `main.go`, and `manifests/job.yaml`.
-  - **DO** use consumer-specific `spanneradapters` (e.g., `BCDConsumer`).
+  - **DO** use consumer-specific `spanneradapters` (e.g., `BCDConsumer`, `DeveloperSignalsConsumer`).
   - **DON'T** call the `Backend` spanner adapter from a workflow.
   - **DO** choose the correct data ingestion pattern (sync vs. upsert) based on the use case. See the "How-To" guide for details.
   - **DO** separate the process of ingesting raw data from the process of linking that data to other entities (like web features). This makes the ingestion pipeline more robust.
@@ -108,6 +96,10 @@ Shared Go libraries used by the `backend` and `workflows`.
 - **`lib/gcpspanner`**: The data access layer, containing the Spanner client, data models, and generic helpers.
 - **`lib/gcpspanner/spanneradapters`**: The abstraction layer between services and the database client.
 - **`lib/cachetypes`**: Common interfaces and types for the caching layer.
+- **`lib/fetchtypes`**: A common module for making HTTP requests.
+- **`lib/developersignaltypes`**: Types related to developer signals.
+- **`lib/webdxfeaturetypes`**: Types related to web features from the `web-platform-dx/web-features` repository.
+
 - **"Do's and Don'ts" (Libraries)**:
   - **DO** place reusable Go code shared between services here.
   - **DON'T** put service-specific logic in `lib/`.
@@ -124,7 +116,8 @@ A core architectural pattern in the Go codebase is the **mapper pattern**, used
 - `readOneMapper`: Defines how to select a single entity by its key.
 - `mergeMapper`: Defines how to merge an incoming entity with an existing one for updates.
 - `deleteByStructMapper`: Defines how to delete an entity.
-- **Implementations**: You can find many examples of mapper implementations throughout the `lib/gcpspanner/` directory (e.g., `webFeatureSpannerMapper`, `baselineStatusMapper`). **DO** look for existing mappers before writing a new one.
+- `childDeleteMapper`: Defines how to handle child deletions in batches before deleting the parent. See the `GetChildDeleteKeyMutations` method.
+- **Implementations**: You can find many examples of mapper implementations throughout the `lib/gcpspanner/` directory (e.g., `webFeatureSpannerMapper`, `baselineStatusMapper`, `latestFeatureDeveloperSignalsMapper`). **DO** look for existing mappers before writing a new one.
 
 ### 3.3. End-to-End Data Flow Example
 
@@ -135,9 +128,9 @@ This example illustrates how data is ingested by a workflow and then served by t
 The goal is to ingest feature definitions from the `web-platform-dx/web-features` repository into the Spanner `WebFeatures` table.
 
 - A developer runs `make dev_workflows`, which executes the `web_feature_consumer` job via `util/run_job.sh`.
-- The `web_feature_consumer` fetches the latest feature data.
-- It uses its dedicated adapter, `spanneradapters.WebFeatureConsumer`, to process and store the data.
-- The adapter calls `gcpspanner.Client`, which uses the generic `entitySynchronizer` to efficiently batch-write the feature data into the `WebFeatures` table in the database.
+- The `web_feature_consumer` fetches the latest feature data and processes it into a `webdxfeaturetypes.ProcessedWebFeaturesData` struct. This struct separates features, moved features, and split features.
+- It uses its dedicated adapter, `spanneradapters.WebFeaturesConsumer`, to process and store the data.
+- The adapter calls `gcpspanner.Client`, which uses the generic `entitySynchronizer` to efficiently batch-write the feature data into the `WebFeatures`, `MovedWebFeatures`, and `SplitWebFeatures` tables in the database.
 
 **2. Data Serving (`getFeature` API endpoint)**
 
@@ -231,6 +224,7 @@ The project's infrastructure is managed with **Terraform**.
 - **Creation**: Use `make spanner_new_migration` to create a new migration file.
 - **Cascade Deletes**: Prefer using `ON DELETE CASCADE` for foreign key relationships to maintain data integrity. Add an integration test to verify this behavior (see `lib/gcpspanner/web_features_fk_test.go`).
 - **Cascade Caveat**: If a cascade could delete thousands of child entities, it may exceed Spanner's mutation limit. In such cases, implement the `GetChildDeleteKeyMutations` method in the parent's `spannerMapper` to handle child deletions in batches before deleting the parent.
+- **Data Migrations**: For more complex data migrations, such as renaming a feature key, a generic migrator has been introduced in `lib/gcpspanner/spanneradapters/migration.go`. This can be used to migrate data between old and new keys.
 
 ### 5.5. Caching
 
@@ -328,6 +322,7 @@ First, analyze the nature of the incoming data and the goal of the ingestion. As
 
 - Is the incoming data a **complete set** that represents the entire desired state of a table? Or is it a **partial update** or a stream of new events?
 - Do I need to handle **deletions**? If a record is no longer in the source data, should it be removed from the database?
+- Does the data contain features that have been **moved** or **split**?
 
 **2. Choose the Right Ingestion Pattern**
 
@@ -347,9 +342,10 @@ Based on your analysis, choose one of the following patterns:
 
 **3. Implement the Workflow**
 
-1. **Implement the Mapper**: In the `lib/gcpspanner` package, create a new mapper struct and implement the required interfaces for your chosen pattern (e.g., `syncableEntityMapper` or `writeableEntityMapper`).
-2. **Implement the Client Method**: In `lib/gcpspanner/client.go`, add a new method that takes the data and uses the appropriate generic helper (e.g., `newEntitySynchronizer`) with your new mapper.
-3. **Update the Adapter**: In the `lib/gcpspanner/spanneradapters` package, update the consumer to call the new client method.
+1. **Process Data**: In the workflow's `main.go`, fetch the data and process it into the appropriate struct from `lib/webdxfeaturetypes` or other relevant type packages. For example, the `web_feature_consumer` uses `webdxfeaturetypes.ProcessedWebFeaturesData`.
+2. **Implement the Mapper**: In the `lib/gcpspanner` package, create a new mapper struct and implement the required interfaces for your chosen pattern (e.g., `syncableEntityMapper` or `writeableEntityMapper`).
+3. **Implement the Client Method**: In `lib/gcpspanner/client.go`, add a new method that takes the data and uses the appropriate generic helper (e.g., `newEntitySynchronizer`) with your new mapper.
+4. **Update the Adapter**: In the `lib/gcpspanner/spanneradapters` package, update the consumer to call the new client method. The adapter is responsible for converting the data from the workflow's format to the Spanner client's format.
 
 **4. Write Tests**
 
@@ -361,3 +357,23 @@ This is a critical step:
 **5. Verify**
 
 Run `make precommit` to ensure all linting checks and tests pass.
+
+## 7. Updating the Knowledge Base
+
+To keep this document up-to-date, you can ask me to analyze the latest commits and update my knowledge base. I will use the hidden marker near the top of this file to find the commits that have been made since my last analysis.
+
+### 7.1. Prompt for Updating
+
+You can use the following prompt to ask me to update my knowledge base:
+
+> Please update your knowledge base by analyzing the commits since the last analyzed commit stored in `GEMINI.md`.
+
+### 7.2. Process
+
+When you give me this prompt, I will:
+
+1. Read the `GEMINI.md` file to find the last analyzed commit SHA.
+2. Use `git log` to find all the commits that have been made since that SHA.
+3. Analyze the new commits to understand the changes.
+4. Update this document with the new information.
+5. Update the last analyzed commit SHA near the top of this file.
