
Commit 398a1d1

emielver and Emiel V authored
Add clearer examples of constraints (#685)
Co-authored-by: Emiel V <[email protected]>
1 parent 8927be1 commit 398a1d1

1 file changed: +128 -9 lines changed


docs/data-management/certified-metrics/eppo-schema.md

Lines changed: 128 additions & 9 deletions
@@ -54,20 +54,139 @@ Numerators and denominators follow a similar schema, with some fields only being
5454
| Property | Description | Example |
5555
| -------- | ----------- | ------- |
5656
| `fact_name` | The name of a fact as specified in `fact_source`* | Purchase Revenue |
57-
| `operation` | The [aggregation method](/data-management/metrics/simple-metric#aggregation-methods) to use. <br></br><br></br>For numerator aggregations options are `sum, count, count_distinct, distinct_entity, threshold, conversion, retention`. <br></br><br></br>For denominator aggregations, valid options are `sum, count, count_distinct, distinct_entity` | `sum` |
58-
| `aggregation_timeframe_start_value` <br></br> (optional) | Timeframe units since assignment after which events are included | 2 |
59-
| `aggregation_timeframe_end_value` <br></br> (optional) | Timeframe units since assignment after which events are excluded | 7 |
60-
| `aggregation_timeframe_unit` <br></br> (optional) | The time unit to use: `minutes`, `hours`, `days`, or `weeks` | `days` |
61-
| `winsorization_lower_percentile` <br></br> (optional) | Percentile at which to clip aggregated metrics | 0.001 |
62-
| `winsorization_upper_percentile` <br></br> (optional) | Percentile at which to clip aggregated metrics | 0.999 |
57+
| `operation` | The [aggregation method](/data-management/metrics/simple-metric#aggregation-methods) to use. <br></br><br></br>For numerator aggregations, valid options are `sum, count, count_distinct, distinct_entity, threshold, conversion, retention`. <br></br><br></br>For denominator aggregations, valid options are `sum, count, count_distinct, distinct_entity`. <br></br><br></br>**Note**: See [Constraints and Limitations](#constraints-and-limitations) for operation-specific parameter restrictions. | `sum` |
58+
| `aggregation_timeframe_start_value` <br></br> (optional) | Timeframe units since assignment after which events are included. <br></br><br></br>**Constraint**: Cannot be used with `conversion` operations. Requires `aggregation_timeframe_unit` to be specified. | 2 |
59+
| `aggregation_timeframe_end_value` <br></br> (optional) | Timeframe units since assignment after which events are excluded. <br></br><br></br>**Constraint**: Cannot be used with `conversion` operations. Requires `aggregation_timeframe_unit` to be specified. | 7 |
60+
| `aggregation_timeframe_unit` <br></br> (optional) | The time unit to use: `minutes`, `hours`, `days`, or `weeks`. <br></br><br></br>**Constraint**: Required when any timeframe parameters are used. | `days` |
61+
| `winsorization_lower_percentile` <br></br> (optional) | Percentile at which to clip aggregated metrics. <br></br><br></br>**Constraint**: Only supported for `sum`, `count`, `last_value`, and `first_value` operations. | 0.001 |
62+
| `winsorization_upper_percentile` <br></br> (optional) | Percentile at which to clip aggregated metrics. <br></br><br></br>**Constraint**: Only supported for `sum`, `count`, `last_value`, and `first_value` operations. | 0.999 |
6363
| `filters` <br></br> (optional) | A list of filters to apply to the metric, each containing a fact property, an operation (`equals` or `not_equals`), and a list of values | <pre><code>- fact_property: Source <br></br> operation: equals <br></br> values: <br></br> - organic <br></br> - search </code></pre> |
64-
| `retention_threshold_days` <br></br> (optional, numerators only) | Number of days to use in retention calculation (only used if `operation` = `retention`) | 7 |
65-
| `conversion_threshold_days` <br></br> (optional, numerators only) | Number of days to use in conversion calculation (only used if `operation` = `conversion`) | 7 |
66-
| `threshold_metric_settings` <br></br> (optional, numerators only) | Setting for threshold metrics | <pre><code>comparison_operator: gt <br></br>aggregation_type: sum <br></br>breach_value: 0 <br></br>timeframe_unit: days <br></br>timeframe_value: 3 </code></pre> |
64+
| `retention_threshold_days` <br></br> (optional, numerators only) | Number of days to use in retention calculation. <br></br><br></br>**Constraint**: Only used with `operation` = `retention`. Cannot be combined with other advanced aggregation parameters. | 7 |
65+
| `conversion_threshold_days` <br></br> (optional, numerators only) | Number of days to use in conversion calculation. <br></br><br></br>**Constraint**: Only used with `operation` = `conversion`. Cannot be combined with other advanced aggregation parameters or timeframe parameters. | 7 |
66+
| `threshold_metric_settings` <br></br> (optional, numerators only) | Settings for threshold metrics. <br></br><br></br>**Constraint**: Required when `operation` = `threshold`. Cannot be combined with other advanced aggregation parameters. | <pre><code>comparison_operator: gt <br></br>aggregation_type: sum <br></br>breach_value: 0 <br></br>timeframe_unit: days <br></br>timeframe_value: 3 </code></pre> |
6767

6868

6969
*Note that `fact_name` can reference facts defined in a different YAML file.
7070

71+
## Constraints and Limitations
72+
73+
When defining certified metrics, keep the following constraints in mind. These validation rules catch common configuration errors before a metric is used.
74+
75+
### Winsorization Constraints
76+
77+
Winsorization parameters (`winsorization_lower_percentile` and `winsorization_upper_percentile`) can **only** be used with the following aggregation operations:
78+
- `sum`
79+
- `count`
80+
- `last_value`
81+
- `first_value`
82+
83+
**Cannot be used with:**
84+
- `count_distinct`
85+
- `distinct_entity`
86+
- `threshold`
87+
- `retention`
88+
- `conversion`
89+
90+
:::warning
91+
Attempting to use winsorization with unsupported operations like `count_distinct` or `threshold` will result in a validation error.
92+
:::
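
For contrast, here is a minimal sketch of a configuration that the rules above would allow, pairing winsorization with a `sum` operation. The fact name `Purchase Revenue` and the percentile values are borrowed from the property table and are illustrative only.

```yaml
numerator:
  fact_name: Purchase Revenue
  operation: sum                          # sum is one of the operations that supports winsorization
  winsorization_lower_percentile: 0.001   # illustrative value
  winsorization_upper_percentile: 0.999   # illustrative value
```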
93+
94+
### Advanced Aggregation Parameter Constraints
95+
96+
Each advanced aggregation parameter (`retention_threshold_days`, `conversion_threshold_days`, `threshold_metric_settings`) is tied to a single operation:
97+
98+
#### Retention Metrics
99+
- **Must use**: `retention_threshold_days` parameter
100+
- **Cannot use**: Other advanced aggregation parameters
101+
- **Operation**: Must be `retention`
102+
103+
#### Conversion Metrics
104+
- **Must use**: `conversion_threshold_days` parameter
105+
- **Cannot use**: Other advanced aggregation parameters or timeframe parameters
106+
- **Operation**: Must be `conversion`
107+
108+
#### Threshold Metrics
109+
- **Must use**: `threshold_metric_settings` parameter
110+
- **Cannot use**: Other advanced aggregation parameters
111+
- **Operation**: Must be `threshold`
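
As a sketch of how the retention and conversion rules above might look in practice, the two fragments below each pair an operation with its single matching parameter. The fact names `App Open` and `Signup` are hypothetical, and the `---` separator only divides the two independent examples.

```yaml
# Retention: only retention_threshold_days accompanies the operation
numerator:
  fact_name: App Open            # hypothetical fact name
  operation: retention
  retention_threshold_days: 7
---
# Conversion: only conversion_threshold_days, and no timeframe parameters
numerator:
  fact_name: Signup              # hypothetical fact name
  operation: conversion
  conversion_threshold_days: 7
```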
112+
113+
### Timeframe Parameter Constraints
114+
115+
Timeframe parameters (`aggregation_timeframe_start_value`, `aggregation_timeframe_end_value`, `aggregation_timeframe_unit`) have the following restrictions:
116+
117+
- **Cannot be used** with `conversion` operations
118+
- **Must include** `aggregation_timeframe_unit` if any timeframe parameters are specified
119+
- The deprecated `aggregation_timeframe_value` parameter should be replaced with `aggregation_timeframe_end_value`
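
A minimal sketch of a timeframe configuration consistent with these restrictions, reusing the example values from the property table (start of 2, end of 7, unit of `days`). The fact name is illustrative.

```yaml
numerator:
  fact_name: Purchase Revenue
  operation: sum                         # any operation other than conversion
  aggregation_timeframe_start_value: 2   # include events from 2 days after assignment
  aggregation_timeframe_end_value: 7     # exclude events after 7 days since assignment
  aggregation_timeframe_unit: days       # required whenever a start or end value is set
```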
120+
121+
### Threshold Metric Settings
122+
123+
When using `threshold_metric_settings` for threshold operations, the settings object must include:
124+
- `comparison_operator`: The comparison operator to use (e.g., `gt`, `lt`, `gte`, `lte`, `eq`)
125+
- `aggregation_type`: The aggregation type to apply before threshold comparison
126+
- `breach_value`: The threshold value to compare against
127+
- `timeframe_unit`: The time unit for the threshold calculation
128+
- `timeframe_value`: The time value for the threshold calculation
129+
130+
### Denominator Operation Constraints
131+
132+
For ratio metrics, denominator aggregations are limited to:
133+
- `sum`
134+
- `count`
135+
- `count_distinct`
136+
- `distinct_entity`
137+
138+
**Cannot use** advanced operations like `threshold`, `retention`, or `conversion` in denominators.
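
The fragment below sketches a ratio metric whose denominator stays within the allowed operations. It assumes the denominator is declared under a `denominator` key alongside `numerator`; that key name and the fact name are assumptions for illustration, and other metric fields are omitted.

```yaml
numerator:
  fact_name: Purchase Revenue
  operation: sum
denominator:                     # assumed key name, mirroring numerator
  fact_name: Purchase Revenue
  operation: count               # one of sum, count, count_distinct, distinct_entity
```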
139+
140+
### Common Validation Examples
141+
142+
Here are some examples of **invalid** configurations that will trigger validation errors:
143+
144+
#### ❌ Invalid: Winsorization with count_distinct
145+
```yaml
numerator:
  fact_name: User ID
  operation: count_distinct
  winsorization_lower_percentile: 0.01 # ERROR: Cannot winsorize count_distinct
```
151+
152+
#### ❌ Invalid: Threshold without threshold_metric_settings
153+
```yaml
numerator:
  fact_name: Revenue
  operation: threshold # ERROR: Must include threshold_metric_settings
```
158+
159+
#### ❌ Invalid: Mixed advanced aggregation parameters
160+
```yaml
numerator:
  fact_name: User ID
  operation: retention
  retention_threshold_days: 7
  conversion_threshold_days: 14 # ERROR: Cannot mix retention and conversion parameters
```
167+
168+
#### ❌ Invalid: Timeframe with conversion
169+
```yaml
numerator:
  fact_name: User ID
  operation: conversion
  conversion_threshold_days: 7
  aggregation_timeframe_start_value: 1 # ERROR: Cannot use timeframe with conversion
```
176+
177+
#### ✅ Valid: Proper threshold configuration
178+
```yaml
numerator:
  fact_name: Revenue
  operation: threshold
  threshold_metric_settings:
    comparison_operator: gt
    aggregation_type: sum
    breach_value: 100
    timeframe_unit: days
    timeframe_value: 7
```
189+
71190
### Examples
72191
73192
#### Simple Revenue Metrics
