| 1 | +# RFC 0012: Crashtracker Errors Intake Payload Schema |
| 2 | + |
| 3 | +The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", |
| 4 | +"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this |
| 5 | +document are to be interpreted as described in |
| 6 | +[IETF RFC 2119](https://datatracker.ietf.org/doc/html/rfc2119). |
| 7 | + |
| 8 | +## Summary |
| 9 | + |
| 10 | +This RFC specifies the payloads produced by the crashtracker |
| 11 | +`ErrorsIntakePayload` (`libdd-crashtracker/src/crash_info/errors_intake.rs`) |
| 12 | +and uploaded to the `/api/v2/errorsintake` API (direct) or the |
| 13 | +`/evp_proxy/v4/api/v2/errorsintake` agent proxy. It formalizes the |
| 14 | +shape, required fields, extensibility expectations, and how crash |
| 15 | +metadata is translated into `ddtags`. |
| 16 | + |
| 17 | +## Motivation |
| 18 | + |
| 19 | +The crashtracker structured log format (RFC 0011) documents the content |
| 20 | +stored locally, but downstream systems must also understand how that |
| 21 | +information is serialized for the Errors Intake pipeline. Having a |
| 22 | +single reference schema: |
| 23 | + |
| 24 | +- Guarantees consistency between language integrations that link |
| 25 | + `libdatadog`. |
| 26 | +- Enables backend and observability teams to validate new fields before |
| 27 | + rollout. |
| 28 | +- Clarifies how crashtracker concepts map onto `error-tracking-intake`. |
| 29 | + |
| 30 | +## Scope |
| 31 | + |
| 32 | +This RFC covers: |
| 33 | + |
| 34 | +- The JSON payload emitted by `ErrorsIntakePayload::from_crash_info` |
| 35 | + and `ErrorsIntakePayload::from_crash_ping`. |
| 36 | +- The mapping from `CrashInfo` to `error`, `ddtags`, and other |
| 37 | + top-level fields. |
| 38 | +- Expectations for optional / future fields and forward compatibility. |
| 39 | + |
| 40 | +This RFC does **not** redefine the crash info schema itself—see RFC 0011 |
| 41 | +for the authoritative crash report format. |
| 42 | + |
| 43 | +## Transport Overview |
| 44 | + |
| 45 | +The payload described here is identical regardless of delivery path: |
| 46 | + |
| 47 | +- **Direct submission:** HTTPS requests to |
| 48 | + `https://error-tracking-intake.<site>/api/v2/errorsintake` with |
| 49 | + the `DD-API-KEY` header. |
| 50 | +- **Agent proxy:** HTTP POST to |
| 51 | + `http(s)://<agent>/evp_proxy/v4/api/v2/errorsintake` with the |
| 52 | + `X-Datadog-EVP-Subdomain: error-tracking-intake` header. |
| 53 | + |
| 54 | +Producers MAY also write payloads to a `file://` endpoint (primarily for |
| 55 | +tests) using the same schema. |
| 56 | + |
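The two delivery paths differ only in URL and headers, which can be sketched as follows. This is a minimal illustration, not the crashtracker implementation: `Target`, `direct_target`, and `agent_proxy_target` are hypothetical names, and the site and agent address used in `main` are example values.

```rust
/// Hypothetical description of a delivery target; not a libdatadog type.
#[derive(Debug, PartialEq)]
struct Target {
    url: String,
    /// (header name, header value) pairs required by the delivery path.
    headers: Vec<(String, String)>,
}

const INTAKE_PATH: &str = "/api/v2/errorsintake";

/// Direct submission: the intake host is derived from the Datadog site.
fn direct_target(site: &str, api_key: &str) -> Target {
    Target {
        url: format!("https://error-tracking-intake.{}{}", site, INTAKE_PATH),
        headers: vec![("DD-API-KEY".to_string(), api_key.to_string())],
    }
}

/// Agent proxy: the same path, prefixed with the EVP proxy route.
fn agent_proxy_target(agent_base: &str) -> Target {
    Target {
        url: format!(
            "{}/evp_proxy/v4{}",
            agent_base.trim_end_matches('/'),
            INTAKE_PATH
        ),
        headers: vec![(
            "X-Datadog-EVP-Subdomain".to_string(),
            "error-tracking-intake".to_string(),
        )],
    }
}

fn main() {
    // Example values only; real sites, keys, and agent addresses vary.
    let direct = direct_target("datadoghq.com", "<api-key>");
    println!("{} {:?}", direct.url, direct.headers);
    let proxy = agent_proxy_target("http://localhost:8126/");
    println!("{} {:?}", proxy.url, proxy.headers);
}
```

The payload body is the same for both targets, so only this selection step depends on the delivery path.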
| 57 | +## Payload Structure |
| 58 | + |
| 59 | +### Top-Level Fields |
| 60 | + |
| 61 | +- `timestamp`: **[required]** UNIX epoch in milliseconds. Derived from |
| 62 | + `CrashInfo.timestamp`. |
| 63 | +- `ddsource`: **[required]** Always the string `"crashtracker"` to allow |
| 64 | + downstream filtering. |
| 65 | +- `ddtags`: **[required]** A comma-separated `key:value` string that |
| 66 | + encodes service metadata, runtime metadata, crash info, counters, and |
| 67 | + signal details (see [Tag Encoding](#tag-encoding)). |
| 68 | +- `error`: **[required]** An `ErrorObject` describing the crash details |
| 69 | + (see [Error Object](#error-object)). |
| 70 | +- `trace_id`: **[optional]** String trace identifier. Reserved for |
| 71 | + future correlation; currently unset by crashtracker. |
| 72 | +- `os_info`: **[required]** Same structure as defined in RFC 0011: |
| 73 | + - `architecture`: **[required]** (e.g. `"arm64"`) |
| 74 | + - `bitness`: **[required]** (e.g. `"64-bit"`) |
| 75 | + - `os_type`: **[required]** (e.g. `"Linux"`) |
| 76 | + - `version`: **[required]** (e.g. `"6.8.0"`) |
| 77 | +- `sig_info`: **[optional]** Present for Unix signal-based crashes. |
| 78 | + Reuses the fields from RFC 0011 (`si_addr`, `si_code`, `si_code_human_readable`, |
| 79 | + `si_signo`, `si_signo_human_readable`). |
| 80 | + |
| 81 | +### Error Object |
| 82 | + |
| 83 | +The nested `error` object is serialized from `ErrorObject`: |
| 84 | + |
| 85 | +- `type`: A human-readable error kind. For signal-based |
| 86 | + crashes, this is the signal human-readable name (e.g. `"SIGSEGV"`). |
| 87 | + Otherwise `"Unknown"` unless upstream sets `CrashInfo.error.kind`. |
| 88 | +- `message`: **[optional]** Human-readable summary. For signals the |
| 89 | + default is `"Process terminated by signal <SIG>"`. |
| 90 | +- `stack`: **[optional]** When the crashing thread stack contains at |
| 91 | + least one frame, this field embeds the crash stacktrace as defined in |
| 92 | + RFC 0011 (format `"Datadog Crashtracker 1.0"` plus frames). |
| 93 | +- `is_crash`: Boolean. `true` for crash payloads and `false` for crash pings. |
| 94 | +- `fingerprint`: **[optional]** Corresponds to `CrashInfo.fingerprint` |
| 95 | + for deduplication. |
| 96 | +- `source_type`: Always `"Crashtracking"` so downstream |
| 97 | + consumers can distinguish crashtracker-originated data from other |
| 98 | + producers. |
| 99 | +- `experimental`: **[optional]** Pass-through copy of |
| 100 | + `CrashInfo.experimental`, which allows experimentation without |
| 101 | + schema churn. MUST contain valid JSON when present. |
| 102 | + |
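The defaults for `type` and `message` described above can be sketched as follows. This is an illustrative reading of the rules, not the `ErrorObject` code: the function names are hypothetical, and the precedence (signal name first, then `CrashInfo.error.kind`, then `"Unknown"`) is an assumption drawn from the field descriptions.

```rust
/// Picks the `type` value. Assumed precedence: signal name, then the
/// upstream-provided kind, then the "Unknown" fallback.
fn error_type(signal_name: Option<&str>, kind: Option<&str>) -> String {
    signal_name.or(kind).unwrap_or("Unknown").to_string()
}

/// Builds the default `message` for a signal-based crash.
fn default_signal_message(signal_name: &str) -> String {
    format!("Process terminated by signal {}", signal_name)
}

fn main() {
    println!("{}", error_type(Some("SIGSEGV"), None));
    println!("{}", default_signal_message("SIGSEGV"));
    // "Panic" is a made-up kind value used only for illustration.
    println!("{}", error_type(None, Some("Panic")));
    println!("{}", error_type(None, None));
}
```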
| 103 | +### Tag Encoding |
| 104 | + |
| 105 | +`ddtags` is constructed by concatenating comma-delimited `key:value` |
| 106 | +pairs. Consumers SHOULD tolerate new tags and |
| 107 | +order changes. |
| 108 | + |
| 109 | +1. **Service tags** (always present): |
| 110 | + - `service:<name>` (defaults to `"unknown"` if missing) |
| 111 | + - `env:<env>` **[optional]** |
| 112 | + - `version:<service_version>` **[optional]** |
| 113 | +2. **Runtime tags**: |
| 114 | + - `language_name:<language>` This tag should always be present, because the Error Tracking product uses it to identify the runtime of the crashing service. Values should follow the naming convention defined [here](https://github.com/DataDog/logs-backend/blob/122abe4e9cef1b76cffccb2eb6fa10607fcc4c87/domains/event-platform/libs/processing/processing-common/src/main/java/com/dd/logs/processing/processors/errortracking/LanguageDetection.java#L95-L113). |
| 115 | + - `language_version:<version>` **[optional]** |
| 116 | + - `tracer_version:<version>` **[optional]** |
| 117 | +3. **Crash info tags** (always present): |
| 118 | + - `data_schema_version:<value>` |
| 119 | + - `fingerprint:<value>` **[optional]** |
| 120 | + - `incomplete:<true|false>` |
| 121 | + - `is_crash:<true|false>` |
| 122 | + - `uuid:<CrashInfo.uuid>` |
| 123 | + - `<counter_name>:<counter_value>` for each entry in `CrashInfo.counters` |
| 124 | +4. **Signal tags** *(conditional on `sig_info`)*: |
| 125 | + - `si_addr:<hex>` **[optional]** |
| 126 | + - `si_code:<int>` |
| 127 | + - `si_code_human_readable:<string>` |
| 128 | + - `si_signo:<int>` |
| 129 | + - `si_signo_human_readable:<string>` |
| 130 | + |
| 131 | +Tags are appended as literal strings; no escaping is performed beyond |
| 132 | +the standard JSON encoding of the `ddtags` field itself. The receiver |
| 133 | +MUST accept unknown tags and preserve existing ones. |
| 134 | + |
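The tag groups above can be assembled with plain string concatenation. The sketch below assumes a trimmed-down, hypothetical input (`TagInputs` and `SigTags` are illustrative and are not the `CrashInfo` types), emits the groups in the order listed, and leaves out `language_version` and `tracer_version` for brevity.

```rust
/// Hypothetical subset of the signal fields that become tags.
struct SigTags<'a> {
    si_addr: Option<&'a str>,
    si_code: i32,
    si_code_human_readable: &'a str,
    si_signo: i32,
    si_signo_human_readable: &'a str,
}

/// Hypothetical, simplified view of the metadata that feeds `ddtags`.
struct TagInputs<'a> {
    service: Option<&'a str>,
    env: Option<&'a str>,
    version: Option<&'a str>,
    language_name: &'a str,
    data_schema_version: &'a str,
    fingerprint: Option<&'a str>,
    incomplete: bool,
    is_crash: bool,
    uuid: &'a str,
    counters: &'a [(&'a str, i64)],
    sig_info: Option<SigTags<'a>>,
}

fn build_ddtags(input: &TagInputs<'_>) -> String {
    let mut tags: Vec<String> = Vec::new();

    // 1. Service tags; `service` falls back to "unknown".
    tags.push(format!("service:{}", input.service.unwrap_or("unknown")));
    if let Some(env) = input.env {
        tags.push(format!("env:{}", env));
    }
    if let Some(version) = input.version {
        tags.push(format!("version:{}", version));
    }

    // 2. Runtime tags (only `language_name` in this sketch).
    tags.push(format!("language_name:{}", input.language_name));

    // 3. Crash info tags, followed by one tag per counter.
    tags.push(format!("data_schema_version:{}", input.data_schema_version));
    if let Some(fingerprint) = input.fingerprint {
        tags.push(format!("fingerprint:{}", fingerprint));
    }
    tags.push(format!("incomplete:{}", input.incomplete));
    tags.push(format!("is_crash:{}", input.is_crash));
    tags.push(format!("uuid:{}", input.uuid));
    for (name, value) in input.counters {
        tags.push(format!("{}:{}", name, value));
    }

    // 4. Signal tags, only when signal information is available.
    if let Some(sig) = &input.sig_info {
        if let Some(addr) = sig.si_addr {
            tags.push(format!("si_addr:{}", addr));
        }
        tags.push(format!("si_code:{}", sig.si_code));
        tags.push(format!("si_code_human_readable:{}", sig.si_code_human_readable));
        tags.push(format!("si_signo:{}", sig.si_signo));
        tags.push(format!("si_signo_human_readable:{}", sig.si_signo_human_readable));
    }

    // Tags are joined as literal strings; no escaping is applied.
    tags.join(",")
}

fn main() {
    let input = TagInputs {
        service: None,
        env: Some("prod"),
        version: None,
        language_name: "native",
        data_schema_version: "1.4",
        fingerprint: None,
        incomplete: false,
        is_crash: true,
        uuid: "example-uuid",
        counters: &[("collecting_sample", 1)],
        sig_info: None,
    };
    println!("{}", build_ddtags(&input));
}
```

Because values are not escaped, a value that itself contains a comma would be indistinguishable from a tag separator on the receiving side.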
| 135 | +## Extensibility & Compatibility |
| 136 | + |
| 137 | +- The schema follows the crash info semver. Because the payload embeds |
| 138 | + `CrashInfo`, additions to crash info fields may appear inside the |
| 139 | + `stack` or `experimental` objects without changing Errors Intake |
| 140 | + expectations. |
| 141 | +- Producers MAY add additional top-level fields provided they do not |
| 142 | + conflict with existing keys. Consumers MUST ignore unknown fields. |
| 143 | +- Additional tags MAY be appended to `ddtags`. Downstream systems MUST |
| 144 | + be resilient to new `key:value` pairs. |
| 145 | +- Future work MAY populate `trace_id` when a crash occurs within a |
| 146 | + traced span; consumers MUST handle both presence and absence. |
| 147 | + |
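On the consumer side, resilience to new tags and to order changes mostly means looking tags up by key instead of by position. The sketch below shows one way to do that; `parse_ddtags` and `tag_value` are hypothetical helpers, not part of any Datadog library.

```rust
/// Splits a `ddtags` string into (key, value) pairs.
/// Each tag is split on its first ':' so values may contain further colons.
/// Entries without a ':' are skipped in this sketch.
fn parse_ddtags(ddtags: &str) -> Vec<(&str, &str)> {
    ddtags
        .split(',')
        .filter_map(|tag| tag.split_once(':'))
        .collect()
}

/// Looks a tag up by key, independent of its position in the string.
fn tag_value<'a>(tags: &[(&'a str, &'a str)], key: &str) -> Option<&'a str> {
    tags.iter().find(|(k, _)| *k == key).map(|(_, v)| *v)
}

fn main() {
    // `future_tag` stands in for a tag this consumer does not know about.
    let tags = parse_ddtags("service:checkout,future_tag:x,is_crash:true");
    println!("{:?}", tag_value(&tags, "is_crash"));
    println!("{:?}", tag_value(&tags, "missing"));
}
```

Unknown tags stay in the parsed list, so a consumer that forwards the data can preserve them without understanding them.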
| 148 | +## Example Payload |
| 149 | + |
| 150 | +``` |
| 151 | +{ |
| 152 | + "timestamp": 1733420830123, |
| 153 | + "ddsource": "crashtracker", |
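| 154 | + "ddtags": "service:checkout,env:prod,version:1.4.2,language_name:native,data_schema_version:1.4,fingerprint:sigsegv-main,incomplete:false,is_crash:true,uuid:f7e2...,collecting_sample:1,si_addr:0x0000000000001234,si_code:1,si_code_human_readable:SEGV_MAPERR,si_signo:11,si_signo_human_readable:SIGSEGV", |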
| 155 | + "error": { |
| 156 | + "type": "SIGSEGV", |
| 157 | + "message": "Process terminated by signal SIGSEGV", |
| 158 | + "stack": { |
| 159 | + "format": "Datadog Crashtracker 1.0", |
| 160 | + "frames": [ |
| 161 | + { |
| 162 | + "function": "main", |
| 163 | + "file": "app.rs", |
| 164 | + "line": 42 |
| 165 | + } |
| 166 | + // more frames ... |
| 167 | + ] |
| 168 | + }, |
| 169 | + "is_crash": true, |
| 170 | + "fingerprint": "sigsegv-main", |
| 171 | + "source_type": "Crashtracking" |
| 172 | + }, |
| 173 | + "trace_id": null, |
| 174 | + "os_info": { |
| 175 | + "architecture": "x86_64", |
| 176 | + "bitness": "64-bit", |
| 177 | + "os_type": "Linux", |
| 178 | + "version": "6.8.0" |
| 179 | + }, |
| 180 | + "sig_info": { |
| 181 | + "si_code": 1, |
| 182 | + "si_code_human_readable": "SEGV_MAPERR", |
| 183 | + "si_signo": 11, |
| 184 | + "si_signo_human_readable": "SIGSEGV", |
| 185 | + "si_addr": "0x0000000000001234" |
| 186 | + } |
| 187 | +} |
| 188 | +``` |
| 189 | + |
| 190 | +This example omits optional fields such as `experimental`, `span_ids`, |
| 191 | +and additional stack metadata for brevity. Refer to RFC 0011 for the |
| 192 | +full stack trace schema. |
| 193 | + |