| 1 | +# CLAUDE.md |
| 2 | + |
| 3 | +This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository. |
| 4 | + |
| 5 | +## Project Overview |
| 6 | + |
| 7 | +Community-driven transformers for cdviz-collector that convert various event sources (GitHub, ArgoCD, Kubewatch) into CDEvents format. Transformers are written in VRL (Vector Remap Language) and follow strict conventions for event transformation. |
| 8 | + |
| 9 | +## Development Commands |
| 10 | + |
| 11 | +### Testing |
| 12 | + |
| 13 | +```bash |
| 14 | +# Test all transformers |
| 15 | +mise run :test |
| 16 | + |
| 17 | +# Test specific transformer (from root) |
| 18 | +mise run //github_events:test |
| 19 | +mise run //argocd_notifications:test |
| 20 | +mise run //kubewatch_cloudevents:test |
| 21 | +mise run //passthrough:test |
| 22 | + |
| 23 | +# Test from transformer directory |
| 24 | +cd github_events && mise run :test |
| 25 | + |
| 26 | +# Review and update expected outputs (use when inputs change or transformer logic is updated) |
| 27 | +mise run //github_events:test -- --mode review |
| 28 | +``` |
| 29 | + |
| 30 | +### Code Formatting |
| 31 | + |
| 32 | +```bash |
| 33 | +# Format all code (runs dprint and obfuscation) |
| 34 | +mise run :format |
| 35 | + |
| 36 | +# Obfuscate sensitive data in sample inputs |
| 37 | +mise run :obfuscate |
| 38 | +``` |
| 39 | + |
| 40 | +### CI Tasks |
| 41 | + |
| 42 | +```bash |
| 43 | +# Run complete CI suite |
| 44 | +mise run :ci |
| 45 | +``` |
| 46 | + |
| 47 | +## Architecture |
| 48 | + |
| 49 | +### Repository Structure |
| 50 | + |
| 51 | +Each transformer is a self-contained directory with: |
| 52 | +- `transformer.vrl` - VRL transformation logic (main implementation) |
| 53 | +- `cdviz-collector.toml` - Configuration example showing how to use the transformer |
| 54 | +- `inputs/` - Sample input events (JSON files organized by event type) |
| 55 | +- `outputs/` - Expected CDEvents output (used for testing) |
| 56 | +- `mise.toml` - Test task definition |
| 57 | +- `README.md` - Usage documentation |
| 58 | + |
| 59 | +### VRL Transformers |
| 60 | + |
| 61 | +Transformers convert source events to CDEvents using VRL. The pattern is: |
| 62 | + |
| 63 | +1. Parse input event structure |
| 64 | +2. Extract relevant data from source-specific fields |
| 65 | +3. Map to CDEvents schema with proper field conventions |
| 66 | +4. Preserve source-specific data in `customData.<source>` hierarchy |
| 67 | + |
| 68 | +Example from github_events/transformer.vrl: |
| 69 | +- Detects event type by checking for specific fields (e.g., `.body.package`, `.body.workflow_run`) |
| 70 | +- Extracts timestamps and converts to ISO format |
| 71 | +- Builds CDEvents JSON with proper `context`, `subject`, and `customData` structure |
| 72 | +- Returns array of events (transformers can emit 0, 1, or multiple CDEvents per input) |
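
The four-step pattern above can be sketched as follows. This is a Python illustration only, since the real transformer is written in VRL. The function name `transform`, the CDEvents `version` and `type` strings, and the fields copied into `customData` are assumptions made for the example; they are not taken from `github_events/transformer.vrl`.

```python
from typing import Any, Dict, List

# Source URI suggested for `transform` mode in the context.source section.
SOURCE = "http://cdviz-collector.example.com?source=cli-transform"


def transform(event: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Map one input event to zero or more CDEvents (hypothetical sketch)."""
    body = event.get("body", {})

    # Steps 1-2: detect the event type by the presence of a source-specific field.
    run = body.get("workflow_run")
    if run is None:
        return []  # unknown event: emit nothing

    # Step 3: map to the CDEvents structure.
    cdevent = {
        "context": {
            "version": "0.4.1",  # assumed spec version
            "id": "0",  # lets cdviz-collector derive a content-based id
            "source": SOURCE,
            "type": "dev.cdevents.pipelinerun.finished.0.2.0",  # assumed type
            "timestamp": run["updated_at"],  # taken from the input, never "now"
        },
        "subject": {
            "id": run["url"],  # API URL rather than the human-facing html_url
            "type": "pipelineRun",
            "content": {},
        },
        # Step 4: keep source-specific data under customData.<source>.
        "customData": {
            "github": {"workflow_run": {"conclusion": run.get("conclusion")}}
        },
    }
    return [cdevent]
```

Returning a list keeps the "0, 1, or multiple CDEvents per input" contract explicit: an unrecognized event yields an empty list instead of an error.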
| 73 | + |
| 74 | +### Testing Mechanism |
| 75 | + |
| 76 | +The `cdviz-collector transform` command processes all JSON files in the `inputs/` directory through the transformer and compares the output against the `outputs/` directory: |
| 77 | + |
| 78 | +- `--mode check` (default): Fails if output doesn't match expected |
| 79 | +- `--mode review`: Interactive mode to accept/reject changes |
| 80 | + |
| 81 | +## CDEvents Field Conventions (Critical Rules) |
| 82 | + |
| 83 | +### context.id |
| 84 | +- Set to `"0"` (or omit) to enable automatic content-based ID generation by cdviz-collector |
| 85 | +- DO NOT manually generate IDs or reuse source event IDs |
| 86 | +- **Exception**: Keep `context.id` when the transformer is NOT creating a new CDEvent (filtering, normalizing, validating, or adding customData) |
| 87 | +- This ensures reproducible, deterministic IDs based on event content |
| 88 | + |
| 89 | +### context.source |
| 90 | +- Use the URI of the cdviz-collector service that creates or modifies the event |
| 91 | +- This identifies where the event was created/modified, not the original triggering system |
| 92 | +- Value depends on cdviz-collector's running mode: |
| 93 | + - **`connect` mode (server)**: Use cdviz-collector URI with `source` as query parameter |
| 94 | + - **`send` mode**: Use URL of triggering system (pipeline, workflow, etc.) |
| 95 | + - **`transform` mode**: Use `http://cdviz-collector.example.com?source=cli-transform` |
| 96 | +- cdviz-collector provides suggested value in metadata that transformers can use or override |
| 97 | +- Customize using `http.root_url` in `cdviz-collector.toml` |
| 98 | + |
| 99 | +### context.timestamp |
| 100 | +- Extract timestamp from source event data when available (e.g., `.body.workflow_run.updated_at`) |
| 101 | +- Parse and format as ISO 8601: `parse_timestamp(..., "%+")` then `format_timestamp!(..., format: "%+")` |
| 102 | +- Avoid `now()` or automatic timestamps to ensure reproducible outputs for testing |
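
The parse-then-format round trip can be illustrated outside VRL. The sketch below is a Python analogue, not the transformer's code: the helper name `normalize_timestamp` and the choice to emit UTC with a `Z` suffix are assumptions for the example.

```python
from datetime import datetime, timezone


def normalize_timestamp(raw: str) -> str:
    """Parse an ISO 8601 timestamp from the source event and re-emit it in UTC."""
    # Older Python versions reject a trailing "Z", so map it to an explicit offset.
    parsed = datetime.fromisoformat(raw.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Refuse ambiguous input rather than guessing a zone or falling back to "now".
        raise ValueError(f"timestamp has no timezone: {raw!r}")
    return parsed.astimezone(timezone.utc).isoformat().replace("+00:00", "Z")
```

Because the value depends only on the input event, running the same input twice produces byte-identical output, which is what the `outputs/` comparison relies on.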
| 103 | + |
| 104 | +### subject.id |
| 105 | +- Use globally unique, hierarchical URI/URL identifying the subject entity |
| 106 | +- Can be a URL, PURL, or absolute path starting with `/` |
| 107 | +- Prefer API URIs over human-facing view URIs |
| 108 | +- **DO NOT use `subject.source`** - make `subject.id` fully self-describing and globally unique |
| 109 | +- The ID should work as a standalone identifier/reference in any context |
| 110 | +- Examples: |
| 111 | + - Absolute paths: `/namespace/my-service`, `/cluster/us-1/staging` |
| 112 | + - API URLs: `https://github.com/org/repo/workflow/run`, `https://jenkins.example.com/job/job_name/` |
| 113 | + - For artifacts: Use PURL format (see artifactId section) |
| 114 | + |
| 115 | +### subject.type |
| 116 | +- Must match CDEvents subject types: `artifact`, `pipelineRun`, `taskRun`, `ticket`, `change`, `branch`, etc. |
| 117 | + |
| 118 | +### environment.id |
| 119 | +- Follow same rules as `subject.id` - it's a reference to an environment subject |
| 120 | +- Often subjects don't know their environment, so this may need to be injected |
| 121 | +- Define as absolute path starting with `/` for consistency |
| 122 | +- Use hierarchical paths ordered from most to least stable: `/level/region/owner` |
| 123 | +- Be consistent across all apps and configurations |
| 124 | +- Examples: `/production`, `/pro/us-1/cluster-33`, `/staging`, `/dev/ephemeral-42` |
| 125 | +- **Why**: Enables environment-level dashboards, filtering, and alerts |
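
One way to keep environment paths consistent across configurations is to build them from ordered segments in a single place. The helper below is a hypothetical Python sketch (the name `environment_id` is not part of cdviz-collector); callers are expected to pass segments from most to least stable.

```python
def environment_id(*segments: str) -> str:
    """Join segments into an absolute, slash-separated environment path."""
    # Strip stray slashes so "/dev/" and "dev" produce the same path.
    cleaned = [segment.strip("/") for segment in segments]
    if not cleaned or any(not segment for segment in cleaned):
        raise ValueError("environment id needs at least one non-empty segment")
    return "/" + "/".join(cleaned)
```

For example, `environment_id("dev", "ephemeral-42")` yields `/dev/ephemeral-42`.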
| 126 | + |
| 127 | +### artifactId |
| 128 | +- Follow Package URL (PURL) specification: `pkg:type/namespace/name@version?qualifiers` |
| 129 | +- Use appropriate type if supported, otherwise fallback to `generic` |
| 130 | +- Common types: `oci`, `npm`, `maven`, `gem`, `nuget`, `github` |
| 131 | +- For OCI images: Use image digest as version (NOT git commit SHA), include `repository_url` and `tag` as qualifiers |
| 132 | +- Examples: |
| 133 | + - OCI: `pkg:oci/my-app@sha256:abc123...?repository_url=ghcr.io/myorg/my-app&tag=v1.2.3` |
| 134 | + - NPM: `pkg:npm/<name>@<version>` |
| 135 | + - Maven: `pkg:maven/org.springframework/<artifact>@<version>` |
| 136 | + - Generic: `pkg:generic/<name>@<version>` |
| 137 | + |
| 138 | +**Common PURL Pitfalls**: |
| 139 | +- **Digest as Version**: Use the image digest as the version for immutability, NOT a mutable tag or the source code commit SHA |
| 140 | +- **OCI Namespace**: `pkg:oci/` does NOT support namespace in path - use `repository_url` query parameter |
| 141 | +- **Type-Specific Rules**: Each PURL type has unique encoding rules - consult the specification |
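
The OCI rules above can be checked mechanically. The following Python sketch assembles an OCI PURL in the same shape as the example in this section. The function name `oci_purl` is hypothetical, and the sketch deliberately performs no percent-encoding, so consult the PURL specification for the exact encoding rules before relying on it.

```python
def oci_purl(name: str, digest: str, repository_url: str, tag: str = "") -> str:
    """Build a pkg:oci PURL from an image name, digest, and registry location."""
    if "/" in name:
        # pkg:oci has no namespace segment; the registry path goes in repository_url.
        raise ValueError("name must not contain '/'; use repository_url instead")
    if ":" not in digest:
        # A bare hex string is more likely a git commit SHA than an image digest.
        raise ValueError("digest must look like '<algorithm>:<hex>'")
    qualifiers = {"repository_url": repository_url}
    if tag:
        qualifiers["tag"] = tag
    # Emit qualifiers sorted by key so the result is deterministic.
    query = "&".join(f"{key}={value}" for key, value in sorted(qualifiers.items()))
    return f"pkg:oci/{name.lower()}@{digest}?{query}"
```

Rejecting a name with `/` and a digest without an algorithm prefix turns the first two pitfalls into early errors instead of malformed identifiers.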
| 142 | + |
| 143 | +### customData |
| 144 | +- Preserve source-specific information not covered by CDEvents standard fields |
| 145 | +- Structure as JSON object with source name at first level: `customData.github`, `customData.argocd` |
| 146 | +- For webhook events, mirror original event structure (complete or filtered) |
| 147 | +- Avoid duplicating data already in standard CDEvents fields |
| 148 | +- Additional first-level keys may be added for information useful to other consumers |
| 149 | + |
| 150 | +## Transformer Development Guidelines |
| 151 | + |
| 152 | +### Metadata and Transformer Chaining |
| 153 | + |
| 154 | +- Use `metadata` to transfer information between transformers |
| 155 | +- Use `metadata` from extractors to initialize information |
| 156 | +- Use the first transformer to initialize information when: |
| 157 | + - Not possible via extractor (pre-0.19) |
| 158 | + - Sharing information/transformers between multiple sources and transformer chains |
| 159 | + |
| 160 | +### Adding a New Event Type to Existing Transformer |
| 161 | + |
| 162 | +1. Add sample input JSON file to `inputs/<event-type>/` |
| 163 | +2. Add expected output JSON file to `outputs/<event-type>/` |
| 164 | +3. Update transformer.vrl with new event detection logic (typically an `else if exists(.body.field_name)` block) |
| 165 | +4. Map source fields to CDEvents schema following field conventions from RULES.md |
| 166 | +5. Extract timestamps from input data (avoid `now()` for reproducibility) |
| 167 | +6. Run `mise run :test -- --mode review` to validate and accept outputs |
| 168 | +7. Update transformer's README.md to document the new event type |
| 169 | + |
| 170 | +### Creating a New Transformer |
| 171 | + |
| 172 | +1. Create directory with transformer name |
| 173 | +2. Add `transformer.vrl` with transformation logic |
| 174 | +3. Add `cdviz-collector.toml` with configuration example |
| 175 | +4. Create `inputs/` and `outputs/` directories with test cases |
| 176 | +5. Add `mise.toml` with test task (copy pattern from existing transformers) |
| 177 | +6. Add `README.md` with usage documentation |
| 178 | +7. Update root README.md table with new transformer |
| 179 | + |
| 180 | +### VRL Error Handling |
| 181 | + |
| 182 | +- Use `!` suffix for functions that should fail loudly (e.g., `string!()`, `to_string!()`, `format_timestamp!()`) |
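| 183 | +- Use `??` operator for fallback values (e.g., `parse_timestamp(...) ?? now()`); note that a `now()` fallback makes outputs non-reproducible, so prefer a fallback taken from the input where possible |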
| 184 | +- Use `exists()` to check for optional fields before accessing them |
| 185 | +- Filter null values from arrays: `filter(array) -> |_index, item| { !is_nullish(item) }` |
| 186 | + |
| 187 | +## Important Files |
| 188 | + |
| 189 | +- `RULES.md` - Comprehensive CDEvents field conventions (authoritative source for field usage) |
| 190 | +- `AGENTS.md` - Quick reference for build/test commands and conventions |
| 191 | +- `CONTRIBUTING.md` - Contribution workflow, prerequisites (mise, docker), commit signing requirements |
| 192 | +- `mise.toml` - Root task definitions and tool dependencies |
| 193 | +- `dprint.jsonc` - Code formatting configuration |
| 194 | + |
| 195 | +## Technology Stack |
| 196 | + |
| 197 | +- **VRL (Vector Remap Language)**: Transformation language for cdviz-collector |
| 198 | +- **mise-en-place**: Task runner and environment manager |
| 199 | +- **dprint**: Code formatter for JSON, Markdown, YAML |
| 200 | +- **cdviz-collector**: Event collection and transformation tool |
| 201 | + |
| 202 | +## Remote Transformer Usage |
| 203 | + |
| 204 | +Transformers can be used remotely without cloning the repository by configuring cdviz-collector.toml: |
| 205 | + |
| 206 | +```toml |
| 207 | +[remote.transformers-community] |
| 208 | +type = "github" |
| 209 | +owner = "cdviz-dev" |
| 210 | +repo = "transformers-community" |
| 211 | + |
| 212 | +[transformers.github_events] |
| 213 | +type = "vrl" |
| 214 | +template_rfile = "transformers-community:///github_events/transformer.vrl" |
| 215 | +``` |
| 216 | + |
| 217 | +## Commit Requirements |
| 218 | + |
| 219 | +- All commits must be signed off: `git commit -s` |
| 220 | +- This adds a `Signed-off-by` line to the commit message (required by CLA) |