|
| 1 | +# Rules |
| 2 | + |
| 3 | +Opinionated rules for creating CDEvents and transformers. |
| 4 | + |
| 5 | +## CDEvents Best Practices |
| 6 | + |
| 7 | +Choosing good values for key fields improves observability, event correlation, and entity tracking. |
| 8 | + |
| 9 | +Always follow the [official CDEvents specification](https://github.com/cdevents/spec/blob/main/spec.md). |
| 10 | + |
| 11 | +### context.source - Event Origin |
| 12 | + |
| 13 | +#### Official definition |
| 14 | + |
| 15 | +Extract from [context.source](https://github.com/cdevents/spec/blob/main/spec.md#source-context) |
| 16 | + |
| 17 | +> Type: URI-Reference |
| 18 | +> Description: defines the context in which an event happened. The main purpose of the source is to provide global uniqueness for source + id. |
| 19 | +> The source MAY identify a single producer or a group of producer that belong to the same application. |
| 20 | +> When selecting the format for the source, it may be useful to think about how clients may use it. Using the root use cases as reference: |
| 21 | +> |
| 22 | +> - A client may want to react only to events sent by a specific service, like the instance of Tekton that runs in a specific cluster or the instance of Jenkins managed by team X |
| 23 | +> - A client may want to collate all events coming from a specific source for monitoring, observability or visualization purposes |
| 24 | +> |
| 25 | +> Constraints: |
| 26 | +> |
| 27 | +> - REQUIRED |
| 28 | +> - MUST be a non-empty URI-reference |
| 29 | +> - An absolute URI is RECOMMENDED |
| 30 | +
|
| 31 | +#### Complementary rules |
| 32 | + |
| 33 | +- Use the URI of the latest service that creates or modifies the event, regardless of what triggered it (webhook, another event, etc.) |
| 34 | +- Prefer the URI of the service (or sub-service) generating the event, regardless of subject or event type |
| 35 | +- Prefer API URIs over human-facing view URIs |
| 36 | +- Use query parameters to provide additional information |
| 37 | + |
| 38 | +**Why**: Allows consumers to identify where the event producer is configured |
| 39 | + |
| 40 | +```yaml |
| 41 | +# ✅ Good - Specific service identifiers |
| 42 | +"source": "https://github.com/myorg/myrepo/workflow-a" # Event sent from specific workflow |
| 43 | +"source": "https://jenkins.example.com/job/job_name" |
| 44 | +"source": "https://cdviz-collector.example.com/?source=source_name" # Use query params when needed |
| 45 | + |
| 46 | +# ❌ Avoid - Too generic, conflicts in larger scopes |
| 47 | +"source": "github.com/myorg/myrepo" |
| 48 | +"source": "myrepo" |
| 49 | +``` |
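The "absolute URI is RECOMMENDED" constraint can be checked mechanically. The following is a minimal sketch, assuming that "has a scheme" is a good enough test for "absolute"; the helper name `is_absolute_uri` is hypothetical and not part of CDEvents or cdviz-collector.

```python
from urllib.parse import urlparse


def is_absolute_uri(value: str) -> bool:
    """Heuristic: treat a value as an absolute URI when it has a scheme."""
    return bool(urlparse(value).scheme)


# Values taken from the examples above.
print(is_absolute_uri("https://jenkins.example.com/job/job_name"))
print(is_absolute_uri("github.com/myorg/myrepo"))
print(is_absolute_uri("myrepo"))
```

A check like this could run in transformer tests to flag `context.source` values that are bare host names or short labels.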
| 50 | +
|
| 51 | +### subject.id - Event Subject Identifier |
| 52 | +
|
| 53 | +#### Official definition |
| 54 | +
|
| 55 | +Extract from [subject.id](https://github.com/cdevents/spec/blob/main/spec.md#id-subject): |
| 56 | +
|
| 57 | +> Identifier for a subject. Subsequent events associated to the same subject MUST use the same subject id. |
| 58 | +> Constraints: |
| 59 | +> |
| 60 | +> - REQUIRED |
| 61 | +> - MUST be a non-empty string |
| 62 | +> - MUST be unique within the given source (in the scope of the producer) |
| 63 | +
|
| 64 | +#### Complementary rules |
| 65 | +
|
| 66 | +Use **unique, hierarchical identifiers** scoped to your organization or globally. |
| 67 | +
|
| 68 | +- Use a URI (URL, PURL, or absolute path starting with `/`) |
| 69 | +- Prefer API URIs over human-facing view URIs |
| 70 | +- **DO NOT use `subject.source`** - it's confusing and optional. Instead, make `subject.id` globally unique and let `context.source` identify the event origin |
| 71 | + |
| 72 | +**Why**: |
| 73 | + |
| 74 | +- The ID should be a standalone identifier that can be used as a reference or link in any context |
| 75 | +- Manipulating a single `id` field is simpler than managing `id` + optional `source` |
| 76 | + |
| 77 | +```yaml |
| 78 | +# ✅ Good - Globally unique, hierarchical, semantic |
| 79 | +"subject.id": "/namespace/my-service" |
| 80 | +"subject.id": "/cluster/us-1/staging" |
| 81 | +"subject.id": "https://github.com/org-id/repo-id/workflow-id/run-id" |
| 82 | +"subject.id": "https://jenkins.example.com/job/job_name/" |
| 83 | +
|
| 84 | +# ❌ Avoid - Not globally unique or too generic |
| 85 | +"subject.id": "550e8400-e29b-41d4-a716-446655440000" # UUID |
| 86 | +"subject.id": "run-12345" # Not globally unique |
| 87 | +"subject.id": "production" # Too generic, not a path |
| 88 | +``` |
| 89 | + |
| 90 | +### environment.id - Deployment Environment |
| 91 | + |
| 92 | +Follow the same rules as `subject.id` since `environment.id` is a reference to an environment subject. However, often: |
| 93 | + |
| 94 | +- The subject/system doesn't know its environment, so this information isn't in the source event |
| 95 | +- Environments may lack clear URIs or scopes (VPC, Kubernetes cluster, region, etc.) |
| 96 | + |
| 97 | +Guidelines: |
| 98 | + |
| 99 | +- Define `environment.id` as an absolute path starting with `/` |
| 100 | +- Use your organization name for consistency |
| 101 | +- Be consistent across all apps and configurations - use the same naming convention |
| 102 | +- Use hierarchical paths like `/level/region/owner` ordered from most to least stable |
| 103 | +- Consider how you want to group data in dashboards and reports |
| 104 | + |
| 105 | +**Why**: Enables environment-level dashboards, filtering, and alerts. |
| 106 | + |
| 107 | +```yaml |
| 108 | +"environment": {"id": "/production"} |
| 109 | +"environment": {"id": "/pro"} |
| 110 | +"environment": {"id": "/pro/us-1/cluster-33"} |
| 111 | +"environment": {"id": "/staging"} |
| 112 | +"environment": {"id": "/dev/ephemeral-42"} |
| 113 | +``` |
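To keep the naming convention consistent across apps, the path can be built by one shared helper instead of being typed by hand. This is only a sketch; the function name `environment_id` and its normalization rules (trimming extra slashes, rejecting empty input) are assumptions for illustration.

```python
def environment_id(*segments: str) -> str:
    """Join path segments into an absolute, slash-separated environment id."""
    cleaned = [s.strip("/") for s in segments if s.strip("/")]
    if not cleaned:
        raise ValueError("at least one non-empty segment is required")
    return "/" + "/".join(cleaned)


# Segments are passed in the order the guidelines above describe.
print(environment_id("pro", "us-1", "cluster-33"))
print(environment_id("/staging/"))
```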
| 114 | + |
| 115 | +### artifactId - Package URL (PURL) |
| 116 | + |
| 117 | +- Follow the same rules as `subject.id` since `artifactId` is a reference to an artifact subject |
| 118 | +- Follow the [Package URL specification](https://github.com/package-url/purl-spec) for your artifact type |
| 119 | +- Use the appropriate type if supported; otherwise fall back to `generic` (official CDEvents requirement) |
| 120 | + |
| 121 | +**Why**: Enables universal artifact identification, dependency tracking, and interoperability with other tools |
| 122 | + |
| 123 | +**Common Patterns**: |
| 124 | + |
| 125 | +```yaml |
| 126 | +# OCI images (Docker/container registries) |
| 127 | +# Note: OCI type doesn't support namespace - use query params for registry/repo |
| 128 | +"artifactId": "pkg:oci/my-app@sha256:abc123def456...?repository_url=ghcr.io/myorg/my-app&tag=v1.2.3" |
| 129 | +"artifactId": "pkg:oci/nginx@sha256:def456abc123...?repository_url=docker.io/library/nginx&tag=latest" |
| 130 | +
|
| 131 | +# NPM packages |
| 132 | +"artifactId": "pkg:npm/[email protected]" |
| 133 | +
|
| 134 | +# Maven artifacts |
| 135 | +"artifactId": "pkg:maven/org.springframework/[email protected]" |
| 136 | +
|
| 137 | +# Generic packages |
| 138 | +"artifactId": "pkg:generic/[email protected]" |
| 139 | +``` |
| 140 | + |
| 141 | +**Common Pitfalls**: |
| 142 | + |
| 143 | +- **Digest vs Tag**: Use digest (`@sha256:...`) for immutability - this is the image digest, NOT the source code commit SHA |
| 144 | +- **Version Semantics**: For OCI, the version is the image digest, not the git commit that built it |
| 145 | +- **OCI Namespace Limitation**: `pkg:oci/` does NOT support namespace in the path - use `repository_url` query parameter |
| 146 | +- **Registry Encoding**: OCI requires `repository_url` query parameter; other types encode registries differently |
| 147 | +- **Type-Specific Rules**: Each PURL type has unique encoding rules - consult the specification |
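The OCI pitfalls above can be guarded against in code. The sketch below mirrors the shape of the OCI examples in this document; the helper `oci_purl` is hypothetical, and it does not apply the percent-encoding rules of the Package URL specification, so a dedicated PURL library is probably the safer choice for real use.

```python
from typing import Optional


def oci_purl(name: str, digest: str, repository_url: str, tag: Optional[str] = None) -> str:
    """Build an OCI PURL in the same shape as the examples above (no percent-encoding)."""
    # The version must be the image digest, not a tag or a git commit SHA.
    if not digest.startswith("sha256:"):
        raise ValueError("expected an image digest such as 'sha256:...'")
    # Registry and repository go in a qualifier, never in the path.
    qualifiers = {"repository_url": repository_url}
    if tag:
        qualifiers["tag"] = tag
    query = "&".join(f"{key}={value}" for key, value in sorted(qualifiers.items()))
    return f"pkg:oci/{name}@{digest}?{query}"


print(oci_purl("my-app", "sha256:abc123", "ghcr.io/myorg/my-app", tag="v1.2.3"))
```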
| 148 | + |
| 149 | +## Rules for Transformers |
| 150 | + |
| 151 | +### Use metadata for transformer chaining |
| 152 | + |
| 153 | +- Use `metadata` to transfer information between transformers |
| 154 | +- Use `metadata` from extractors to initialize information (not available with the `transform` subcommand) |
| 155 | +- Use the first transformer to initialize information when: |
| 156 | + - Not possible via extractor (pre-0.19 or `transform` subcommand) |
| 157 | + - Sharing information/transformers between multiple sources and transformer chains |
| 158 | + |
| 159 | +Example of "first" transformer: |
| 160 | + |
| 161 | +```toml |
| 162 | +[transformers.init_metadata] |
| 163 | +type = "vrl" |
| 164 | +template = """ |
| 165 | +.metadata = object(.metadata) ?? {} |
| 166 | +
|
| 167 | +[{ |
| 168 | + "metadata": merge(.metadata, { |
| 169 | + "environment_id": "/cluster/A-dev", |
| 170 | + }), |
| 171 | + "headers": .headers, |
| 172 | + "body": .body, |
| 173 | +}] |
| 174 | +""" |
| 175 | +``` |
| 176 | + |
| 177 | +### Automatic `context.id` generation |
| 178 | + |
| 179 | +- Let cdviz-collector generate `context.id` by setting it to `"0"` |
| 180 | +- Do NOT omit `context.id`: the output must include it to be a valid CDEvent |
| 181 | +- Do NOT reuse IDs from incoming events (webhooks, Kafka messages, etc.) |
| 182 | +- **Exception**: Keep `context.id` when the transformer's purpose is NOT to create a new CDEvent (filtering, normalizing, validating, or adding customData) |
| 183 | + |
| 184 | +**Why**: |
| 185 | + |
| 186 | +- Ensures content-based deduplication |
| 187 | +- Enables reproducible, deterministic IDs for testing |
| 188 | + |
| 189 | +### `context.timestamp` generation |
| 190 | + |
| 191 | +- Extract timestamp from input data (events, files) when available |
| 192 | +- Avoid `now()` or automatic timestamps, which make the output non-reproducible |
| 193 | + |
| 194 | +**Why**: |
| 195 | + |
| 196 | +- Creates reproducible output for the same input |
| 197 | +- Ensures the same automatic ID generation, enabling reliable testing with transform CLI |
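To illustrate why the timestamp matters for ID generation, here is a small sketch of a content-based ID. It is not cdviz-collector's actual algorithm; the `content_id` function and the use of SHA-256 over canonical JSON are assumptions chosen only to show the principle.

```python
import hashlib
import json


def content_id(event: dict) -> str:
    """Derive a deterministic id from the event content (illustrative only)."""
    # Sorting keys makes the serialization independent of key order.
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


event = {
    "context": {"timestamp": "2024-01-01T00:00:00Z"},
    "subject": {"id": "/namespace/my-service"},
}
later = {
    "context": {"timestamp": "2024-01-01T00:00:05Z"},
    "subject": {"id": "/namespace/my-service"},
}

print(content_id(event) == content_id(dict(event)))  # same content, same id
print(content_id(event) == content_id(later))  # different timestamp, different id
```

With any scheme of this kind, a timestamp taken from `now()` changes the content on every run, so the same input would no longer map to the same ID.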
| 198 | + |
| 199 | +### Define `context.source` |
| 200 | + |
| 201 | +As defined in the CDEvents rules above, `context.source` should be the URI of the cdviz-collector service that creates or modifies the event. |
| 202 | + |
| 203 | +The value depends on cdviz-collector's running mode and external address: |
| 204 | + |
| 205 | +- **`connect` mode (server)**: Use the cdviz-collector URI with `source` as a query parameter |
| 206 | +- **`send` mode**: Use the URL of the triggering system (pipeline, workflow, etc.) |
| 207 | +- **`transform` mode**: Use `http://cdviz-collector.example.com?source=cli-transform` |
| 208 | + |
| 209 | +To simplify development, cdviz-collector provides a suggested value in metadata. Transformers may use or override it. |
| 210 | + |
| 211 | +- Customize the URL using `http.root_url` in `cdviz-collector.toml` (default: `http://cdviz-collector.example.com`) |
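For `connect` mode, the value can be assembled from the root URL and the source name. This is a sketch that follows the form of the `?source=source_name` example earlier in this document; the helper `collector_source` is hypothetical, and the value suggested in metadata by cdviz-collector may be formatted differently.

```python
from urllib.parse import urlencode


def collector_source(root_url: str, source_name: str) -> str:
    """Build a context.source value from a collector root URL and a source name."""
    return f"{root_url.rstrip('/')}/?{urlencode({'source': source_name})}"


print(collector_source("https://cdviz-collector.example.com", "source_name"))
```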
| 212 | + |
| 213 | +### Use `customData` for source-specific information |
| 214 | + |
| 215 | +- Use `customData` to preserve complementary information not covered by CDEvents standard fields |
| 216 | +- Structure as a JSON object with the source name at the first level (`github`, `argocd`, etc.) |
| 217 | +- For webhook events, mirror the original event structure under the first level (can be complete or filtered) |
| 218 | +- Additional first-level keys may be added for information useful to other consumers |
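The structure described above can be sketched as a small helper that nests a webhook payload, complete or filtered, under the source name. The function `custom_data` and the sample payload fields are hypothetical and only illustrate the shape.

```python
from typing import Iterable, Optional


def custom_data(source_name: str, payload: dict, keep: Optional[Iterable[str]] = None) -> dict:
    """Nest the original payload (complete or filtered) under the source name."""
    if keep is None:
        body = dict(payload)
    else:
        body = {key: payload[key] for key in keep if key in payload}
    return {source_name: body}


# Made-up webhook payload, for illustration only.
payload = {"action": "completed", "sender": "octocat", "installation": 42}
print(custom_data("github", payload, keep=["action", "sender"]))
```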