Commit a533ea0

Update Bigeye docs (#855)
1 parent 27e4ebe commit a533ea0

File tree

2 files changed

+56
-5
lines changed

src/cookbooks/data_monitoring/bigquery_etl_integration.md

Lines changed: 55 additions & 4 deletions
@@ -3,23 +3,74 @@
33
Monitors can be defined alongside derived datasets in bigquery-etl. Monitoring in Bigeye for a specific table can be enabled by adding `monitoring` metadata to the `metadata.yaml` file:
44

55
```yaml
6-
friendly_name: Some Table
6+
friendly_name: Some Table [warn]
77
monitoring:
88
  enabled: true # Enables monitoring for the table in Bigeye and deploys freshness and volume metrics
99
  collection: Test # An existing collection these monitors should be part of in Bigeye
1010
```
1111
1212
Enabling monitoring for a table automatically deploys freshness and volume metrics for this table.
1313
14+
Bigeye monitors are triggered automatically via Airflow for queries that have `monitoring` set to `enabled: true`. The checks run after the ETL run for the table has completed.
15+
16+
To indicate whether a failing check should block downstream Airflow tasks, add `[warn]` or `[fail]` to the name of the Bigeye metric. Metrics without either tag are treated as `[warn]` by default. A failing `[warn]` metric does not block downstream Airflow tasks, but the failure still appears in the Bigeye dashboard. A failing `[fail]` metric blocks the execution of downstream Airflow tasks.
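The naming convention can be illustrated with a short Python sketch. This is not the bigquery-etl or Airflow implementation: the helper name `is_blocking` is hypothetical, and exact, case-sensitive matching of the `[fail]` tag is an assumption.

```python
def is_blocking(metric_name: str) -> bool:
    """Return True if a failing check should block downstream Airflow tasks.

    Hypothetical helper: only metrics tagged `[fail]` block. Metrics tagged
    `[warn]`, and metrics without any tag, are treated as warnings.
    Assumption: the tag is matched exactly and case-sensitively.
    """
    return "[fail]" in metric_name


# Example metric names; only the first one appears in the docs above.
for name in ["Some Table [warn]", "Row count [fail]", "Null check"]:
    print(name, "->", "blocking" if is_blocking(name) else "warn only")
```

Under these assumptions, only the metric tagged `[fail]` is reported as blocking; the untagged metric falls back to the `[warn]` behaviour.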
17+
1418
## Bigconfig
1519

1620
Additional and custom monitors can be defined in a [Bigconfig](https://docs.bigeye.com/docs/bigconfig#example-template) `bigconfig.yml` file that is stored in the same directory as the table query. Bigconfig allows users to deploy other pre-defined monitors, such as row counts or null checks on a table or column level.
1721

22+
## Custom SQL Rules
23+
24+
> This is a temporary workaround until custom SQL rules are supported in Bigconfig; that support is currently in development.
25+
26+
Custom SQL rules can be configured in a separate `bigeye_custom_rules.sql` file alongside the query. This file can contain multiple rules:
27+
28+
```sql
29+
-- {
30+
-- "name": "Fenix releases version format",
31+
-- "alert_conditions": "value",
32+
-- "range": {
33+
-- "min": 0,
34+
-- "max": 1
35+
-- },
36+
-- "collections": ["Test"],
37+
-- "owner": "",
38+
-- "schedule": "Default Schedule - 13:00 UTC"
39+
-- }
40+
SELECT
41+
ROUND((COUNTIF(NOT REGEXP_CONTAINS(version, r"^[0-9]+\..+$"))) / COUNT(*) * 100, 2) AS perc
42+
FROM
43+
`{{ project_id }}.{{ dataset_id }}.{{ table_name }}`;
44+
45+
-- {
46+
-- "name": "Fenix releases product check",
47+
-- "alert_conditions": "value",
48+
-- "range": {
49+
-- "min": 0,
50+
-- "max": 1
51+
-- },
52+
-- "collections": ["Test"],
53+
-- "owner": "",
54+
-- "schedule": "Default Schedule - 13:00 UTC"
55+
-- }
56+
SELECT
57+
ROUND((COUNTIF(product != "fenix")) / COUNT(*) * 100, 2) AS perc
58+
FROM
59+
`{{ project_id }}.{{ dataset_id }}.{{ table_name }}`;
60+
```
61+
62+
The SQL comment preceding each rule's query must be a JSON object containing the configuration parameters for that rule:
63+
64+
- `name`: the name of the SQL rule. Include `[warn]` or `[fail]` in the name to indicate whether a rule failure should block downstream Airflow tasks
65+
- `alert_conditions`: one of `value` (alerts based on the returned value) or `count` (alerts based on whether the query returns rows)
66+
- `collections`: list of collections this rule should be a part of
67+
- `owner`: email address of the rule owner
68+
- `schedule`: optional schedule for when this rule should be triggered. The rule is also triggered as part of the Airflow run
69+
- `range`: optional range of allowed values when `"alert_conditions": "value"`
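
To make the file layout and the parameters above concrete, the following Python sketch splits a rules file into (configuration, SQL) pairs and evaluates the alert condition. It is an illustration under stated assumptions, not the parser used by bigquery-etl: the names `parse_rules` and `should_alert` are hypothetical, range bounds are assumed to be inclusive, `count` is assumed to alert when at least one row is returned, and the query itself is assumed to contain no `--` comment lines.

```python
import json
import re

# Sample rule taken from the example above.
SAMPLE = '''\
-- {
--   "name": "Fenix releases product check",
--   "alert_conditions": "value",
--   "range": {"min": 0, "max": 1},
--   "collections": ["Test"],
--   "owner": "",
--   "schedule": "Default Schedule - 13:00 UTC"
-- }
SELECT
  ROUND((COUNTIF(product != "fenix")) / COUNT(*) * 100, 2) AS perc
FROM
  `{{ project_id }}.{{ dataset_id }}.{{ table_name }}`;
'''


def parse_rules(text):
    """Split a rules file into (config, sql) pairs.

    Assumes each rule starts with a block of `--` comment lines holding one
    JSON object, followed by the rule SQL up to the next comment block.
    """
    rules, header, sql = [], [], []
    for line in text.splitlines():
        if line.lstrip().startswith("--"):
            if any(s.strip() for s in sql):
                # A comment line after SQL marks the start of the next rule.
                rules.append((json.loads("\n".join(header)), "\n".join(sql).strip()))
                header, sql = [], []
            header.append(re.sub(r"^\s*--\s?", "", line))
        elif header:
            sql.append(line)
    if header:
        rules.append((json.loads("\n".join(header)), "\n".join(sql).strip()))
    return rules


def should_alert(config, value=None, row_count=0):
    """Hypothetical evaluation of `alert_conditions` for one rule result."""
    if config["alert_conditions"] == "count":
        # Assumption: any returned row triggers an alert.
        return row_count > 0
    bounds = config.get("range") or {}
    low, high = bounds.get("min"), bounds.get("max")
    # Assumption: bounds are inclusive; a missing bound is not enforced.
    return (low is not None and value < low) or (high is not None and value > high)


for config, sql in parse_rules(SAMPLE):
    print(config["name"], "| alert at 2.5%:", should_alert(config, value=2.5))
```

With the sample rule, a returned percentage of 2.5 falls outside the allowed range of 0 to 1, so the sketch reports an alert; a value of 0.5 would not.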
70+
1871
## Deployment
1972

2073
To generate a `bigconfig.yml` file with the default metrics when monitoring is enabled, run: `bqetl monitoring update [PATH]`.
2174
The generated file can be edited manually. For tables that do not have a `bigconfig.yml` checked into the repository, the file is generated automatically before deployment to Bigeye. Files only need to be checked in if they contain customizations.
2275

23-
To manually deploy a Bigconfig file run: `bqetl monitoring deploy [PATH]`. The environment variable `BIGEYE_API_KEY` needs to be set to a valid API token that can be [created in Bigeye](https://docs.bigeye.com/docs/using-api-keys).
24-
25-
The deployment of Bigconfig files also runs automatically as part of the [artifact deployment process](../../concepts/pipeline/artifact_deployment.md), after tables and views have been deployed.
76+
The deployment of Bigconfig files runs automatically as part of the [artifact deployment process](../../concepts/pipeline/artifact_deployment.md), after tables and views have been deployed.

src/cookbooks/data_monitoring/deploying_metrics.md

Lines changed: 1 addition & 1 deletion
@@ -68,4 +68,4 @@ Custom rules are useful for addressing unique data quality requirements that sta
6868

6969
- Autothresholds are recommended for freshness and volume metrics, as they automatically adjust based on typical patterns. For other metrics, it's advisable to manually set thresholds to ensure accuracy and relevance.
7070

71-
- It is recommended to add metrics at the view level rather than directly on tables. This ensures that even if a table becomes obsolete or is upgraded, unnecessary checks on previous versions are avoided. The only exception to this rule is for freshness and volume metrics, which must be deployed directly on tables.
71+
- It is recommended to add metrics at the table level. This ensures that checks run as close to the source as possible. Also, running checks on tables is usually much cheaper in Bigeye than running them on views, because Bigeye makes use of the partition configurations.
