Scheduling a Data Metric Function on a table incurs serverless credit usage in Snowflake. Refer to [Billing and Pricing](https://docs.snowflake.com/en/user-guide/data-quality-intro#billing-and-pricing) for more details.
185
185
Please ensure you DROP the Data Metric Function created via dmf_associations.sql if the assertion is no longer in use.
186
186
:::
@@ -208,6 +208,127 @@ either via CLI or the UI visible as normal assertions.
208
208
209
209
`datahub ingest -c snowflake.yml`
210
210
211
+
## Ingesting External (User-Created) DMFs
212
+
213
+
In addition to DataHub-created DMFs, you can also ingest results from your own custom Snowflake Data Metric Functions. "External" here means DMFs that were created directly in Snowflake without using DataHub's assertion compiler - they exist outside of DataHub's management.
214
+
215
+
### Why Use External DMFs?
216
+
217
+
You might want to ingest external DMFs if:
218
+
219
+
- **Pre-existing DMFs**: You already have DMFs in Snowflake that were created before adopting DataHub, and you want to see their results in DataHub without recreating them
- **Custom logic**: You need DMF logic that isn't supported by DataHub's assertion compiler (e.g., complex multi-table checks)
- **Team workflows**: Different teams manage DMFs directly in Snowflake, but you want centralized visibility in DataHub
- **Gradual adoption**: You want to start monitoring existing data quality checks in DataHub before fully migrating to DataHub-managed assertions
223
+
224
+
### Enabling External DMF Ingestion
225
+
226
+
To ingest external DMFs, add the `include_externally_managed_dmfs` flag to your Snowflake recipe:
227
+
228
+
```yaml
source:
  type: snowflake
  config:
    # ... connection config ...

    # Enable assertion results ingestion (required)
    include_assertion_results: true

    # Enable external DMF ingestion (new)
    include_externally_managed_dmfs: true

    # Time window for assertion results
    start_time: "-7 days"
```
243
+
244
+
Both flags must be enabled for external DMF ingestion to work.
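
As a quick sanity check before running ingestion, the two-flag requirement can be verified against a recipe that has already been loaded into a Python dictionary. This is a minimal sketch, not part of DataHub: the helper name `missing_external_dmf_flags` is hypothetical, and it assumes the recipe follows the `source.config` layout shown above.

```python
from typing import Any, List, Mapping

# Both flags named in the recipe above must be true for external DMF ingestion.
REQUIRED_FLAGS = ("include_assertion_results", "include_externally_managed_dmfs")


def missing_external_dmf_flags(recipe: Mapping[str, Any]) -> List[str]:
    """Return the required flags that are absent or not set to True.

    Hypothetical helper for illustration; DataHub does not ship this function.
    """
    config = recipe.get("source", {}).get("config", {})
    return [flag for flag in REQUIRED_FLAGS if config.get(flag) is not True]


# Recipe equivalent to the YAML example, with connection config omitted.
recipe = {
    "source": {
        "type": "snowflake",
        "config": {
            "include_assertion_results": True,
            "include_externally_managed_dmfs": True,
            "start_time": "-7 days",
        },
    }
}

missing = missing_external_dmf_flags(recipe)
if missing:
    print("External DMF ingestion is disabled; missing flags:", ", ".join(missing))
else:
    print("Both flags are enabled.")
```

An empty list means both flags are set. Any returned name points to the flag that still needs to be enabled in the recipe.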
245
+
246
+
### Requirements for External DMFs
247
+
248
+
**External DMFs must return `1` for SUCCESS and `0` for FAILURE.**
249
+
250
+
DataHub interprets the `VALUE` column from Snowflake's `DATA_QUALITY_MONITORING_RESULTS` view as:
251
+
252
+
- `VALUE = 1` → Assertion **PASSED**
- `VALUE = 0` → Assertion **FAILED**
254
+
255
+
This is because DataHub cannot interpret arbitrary return values (e.g., "100 null rows" - is that good or bad?). You must build the pass/fail logic into your DMF.
256
+
257
+
:::warning What if my DMF returns other values?
258
+
If your DMF returns values other than 0 or 1, DataHub will mark the assertion result as **ERROR**:
259
+
260
+
- `VALUE = 1` → **PASSED**
- `VALUE = 0` → **FAILED**
- `VALUE != 0 and VALUE != 1` (e.g., 5, 100, -1) → **ERROR**
263
+
264
+
The ERROR state indicates that the DMF is not configured correctly for DataHub ingestion. You can identify these cases by:
265
+
266
+
1. Checking the ingestion logs for warnings like: `DMF 'my_dmf' returned invalid value 100. Expected 1 (pass) or 0 (fail). Marking as ERROR.`
267
+
2. Looking for assertions with ERROR status in the DataHub UI
268
+
:::
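
The mapping from `VALUE` to an assertion status described above can be summarized in a few lines of Python. This is an illustrative sketch of the documented rule only, not DataHub's actual implementation: the `DmfStatus` enum and `interpret_dmf_value` function are hypothetical names, and the input is assumed to be an integer.

```python
from enum import Enum


class DmfStatus(Enum):
    """Hypothetical status labels mirroring the three outcomes described above."""

    PASSED = "PASSED"
    FAILED = "FAILED"
    ERROR = "ERROR"


def interpret_dmf_value(value: int) -> DmfStatus:
    """Map a DMF result VALUE to a status: 1 passes, 0 fails, anything else is an error."""
    if value == 1:
        return DmfStatus.PASSED
    if value == 0:
        return DmfStatus.FAILED
    # Any other value (e.g., 5, 100, -1) carries no pass/fail meaning.
    return DmfStatus.ERROR


for sample in (1, 0, 100, -1):
    print(sample, "->", interpret_dmf_value(sample).value)
```

The point of the sketch is that a raw count such as `100` has no pass/fail meaning under this rule, so the threshold comparison has to happen inside the DMF itself.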
269
+
270
+
#### Example: Writing External DMFs Correctly
271
+
272
+
**WRONG** - Returns raw count (DataHub can't interpret this):
273
+
274
+
```sql
275
+
CREATE DATA METRIC FUNCTION my_null_check(ARGT TABLE(col VARCHAR))
- `snowflake_reference_id`: Snowflake's unique identifier for the DMF-table binding
328
+
- `snowflake_dmf_columns`: Comma-separated list of columns the DMF operates on
329
+
330
+
You can view external DMF assertions in the **Quality** tab of the associated dataset in the DataHub UI. They will show pass/fail history alongside any DataHub-created assertions.
331
+
211
332
## Caveats
212
333
213
334
- Snowflake currently supports at most 1000 DMF-table associations, so you cannot define more than 1000 assertions for Snowflake.