197 changes: 119 additions & 78 deletions README.md
@@ -2,114 +2,137 @@
<img alt="Logo" src="https://raw.githubusercontent.com/elementary-data/elementary/master/static/github_banner.png" width="1000"/>
</p>

# [dbt native data observability](https://www.elementary-data.com/)
# [dbt-native data observability](https://www.elementary-data.com/)

<p align="center">
<a href="https://join.slack.com/t/elementary-community/shared_invite/zt-uehfrq2f-zXeVTtXrjYRbdE_V6xq4Rg"><img src="https://img.shields.io/badge/join-Slack-ff69b4"/></a>
<a href="https://docs.elementary-data.com/quickstart"><img src="https://img.shields.io/badge/docs-quickstart-orange"/></a>
<img alt="License" src="https://img.shields.io/badge/license-Apache--2.0-ff69b4"/>
<img alt="Downloads" src="https://static.pepy.tech/personalized-badge/elementary-lineage?period=total&units=international_system&left_color=grey&right_color=orange"&left_text=Downloads"/>
<img alt="Downloads" src="https://static.pepy.tech/personalized-badge/elementary-lineage?period=total&units=international_system&left_color=grey&right_color=orange"&left_text=Downloads"/>
</p>

## What is Elementary?

This dbt package is part of Elementary, the dbt-native data observability solution for data and analytics engineers.
Set up in minutes, gain immediate visibility, detect data issues, send actionable alerts, and understand impact and root cause.
Available as self-hosted or Cloud service with premium features.
This dbt-native package powers **Elementary**, helping data and analytics engineers **detect data anomalies** and build **rich metadata tables** from their dbt runs and tests. Gain immediate visibility into data quality trends and uncover potential issues, all within dbt.

#### Table of Contents
Choose the observability tool that fits your needs:

- [Quick start - dbt package](#quick-start---dbt-package)
- [Get more out of Elementary](#get-more-out-of-elementary-dbt-package)
- [Run results and dbt artifacts](#run-results-and-dbt-artifacts)
- [Data anomaly detection as dbt tests](#data-anomaly-detection-as-dbt-tests)
✅ [**Elementary Open Source**](https://docs.elementary-data.com/oss/oss-introduction) – A powerful, self-hosted tool for teams that want full control.

✅ [**Elementary Cloud Platform**](https://docs.elementary-data.com/cloud/introduction) – A fully managed, enterprise-ready solution with **automated ML-powered anomaly detection, flexible data discovery, integrated incident management, and collaboration tools**—all with minimal setup and infrastructure maintenance.

### Table of Contents

- [What's Inside the Elementary dbt Package?](#whats-inside-the-elementary-dbt-package)
- [Get more out of Elementary dbt package](#get-more-out-of-elementary-dbt-package)
- [Data Anomaly Detection & Schema changes as dbt Tests](#data-anomaly-detection--schema-changes-as-dbt-tests)
- [Elementary Tables - Run Results and dbt Artifacts](#elementary-tables---run-results-and-dbt-artifacts)
- [AI-powered data validation and unstructured data tests](#ai-powered-data-validation-and-unstructured-data-tests)
- [How Elementary works?](#how-elementary-works)
- [Quickstart - dbt Package](#quickstart---dbt-package)
- [Community & Support](#community--support)
- [Contribution](#contributions)
- [Contributions](#contributions)

## Quick start - dbt package
### **What's Inside the Elementary dbt Package?**

1. Add to your `packages.yml`:
The **Elementary dbt package** is designed to enhance data observability within your dbt workflows. It includes two core components:

```yml packages.yml
packages:
  - package: elementary-data/elementary
    version: 0.18.0
    ## Docs: https://docs.elementary-data.com
```
- **Elementary Tests** – A collection of **anomaly detection tests** and other data quality checks that help identify unexpected trends, missing data, or schema changes directly within your dbt runs.
- **Metadata & Test Results Tables** – The package automatically generates and updates **metadata tables** in your data warehouse, capturing valuable information from your dbt runs and test results. These tables act as the backbone of your **observability setup**, enabling **alerts and reports** when connected to an Elementary observability platform.

2. Run `dbt deps`
## Get more out of Elementary dbt package

3. Add to your `dbt_project.yml`:
The **Elementary dbt package** helps you find anomalies in your data and build metadata tables from your dbt runs and tests—but there's even more you can do.

```yml
models:
  ## elementary models will be created in the schema '<your_schema>_elementary'
  ## for details, see docs: https://docs.elementary-data.com/
  elementary:
    +schema: "elementary"
```
To generate observability reports, send alerts, and govern your data quality effectively, connect your dbt package to one of the following options:

4. Run `dbt run --select elementary`
- **Elementary OSS**
- **A self-hosted, open-source CLI** that integrates seamlessly with your dbt project and the Elementary dbt package. It **enables alerting and provides the basic Elementary data observability report**, offering a comprehensive view of your dbt runs, all dbt test results, data lineage, and test coverage. It’s ideal for small teams of data and analytics engineers seeking a straightforward, non-collaborative setup for data observability. See the [quickstart](https://docs.elementary-data.com/oss/quickstart/quickstart-cli); our team and community offer support on [Slack](https://www.elementary-data.com/community) if needed.
- **Elementary Cloud**
- A **fully managed, enterprise-ready** solution designed for **scalability and automation**. It offers automated **ML-powered anomaly detection**, flexible **data discovery**, an integrated **incident management system**, and **collaboration features.** Delivering **high value with minimal setup and infrastructure maintenance**, it's ideal for teams looking to enhance data reliability without operational overhead. To learn more, [book a demo](https://cal.com/maayansa/elementary-intro-github-package) or [start a trial](https://www.elementary-data.com/signup).

Check out the [full documentation](https://docs.elementary-data.com/).
<kbd align="center">
<a href="https://storage.googleapis.com/elementary_static/elementary_demo.html"><img align="center" style="max-width:300px;" src="https://raw.githubusercontent.com/elementary-data/elementary/master/static/report_ui.gif"> </a>
</kbd>

## Get more out of Elementary dbt package
## Data Anomaly Detection & Schema changes as dbt Tests

Elementary has 3 offerings: This dbt package, Elementary Community (OSS) and Elementary (cloud service).
**Elementary tests are configured and executed like native tests in your project!**

- **dbt package**
- For basic data monitoring and dbt artifacts collection, Elementary offers a dbt package. The package adds logging, artifacts uploading, and Elementary tests (anomaly detection and schema) to your project.
- **Elementary Community**
- An open-source CLI tool you can deploy and orchestrate to send alerts and self-host the Elementary report. Best for data and analytics engineers that require basic observability capabilities or for evaluating features without vendor approval. Our community can provide great support on [Slack](https://www.elementary-data.com/community) if needed.
- **Elementary Cloud**
- Ideal for teams monitoring mission-critical data pipelines, requiring guaranteed uptime and reliability, short-time-to-value, advanced features, collaboration, and professional support. The solution is secure by design, and requires no access to your data from cloud. To learn more, [book a demo](https://cal.com/maayansa/elementary-intro-github-package) or [start a trial](https://www.elementary-data.com/signup).
Elementary dbt tests help track and alert on schema changes as well as key metrics and metadata over time, including freshness, volume, distribution, cardinality, and more.

## Run Results and dbt artifacts
**Seamlessly configured and run like native dbt tests,** Elementary tests detect anomalies and outliers, helping you catch data issues early.

The package automatically uploads dbt artifacts and run results to tables in your data warehouse:
Example of an Elementary test config in `schema.yml`:

Run results tables:
```

- dbt_run_results
- model_run_results
- snapshot_run_results
- dbt_invocations
- elementary_test_results (all dbt test results)
models:
  - name: all_events
    config:
      elementary:
        timestamp_column: 'loaded_at'
    columns:
      - name: event_count
        tests:
          - elementary.column_anomalies:
              column_anomalies:
                - average
              where_expression: "event_type in ('event_1', 'event_2') and country_name != 'unwanted country'"
              anomaly_sensitivity: 2
              time_bucket:
                period: day
                count: 1

Metadata tables:
```

- dbt_models
- dbt_tests
- dbt_sources
- dbt_exposures
- dbt_metrics
- dbt_snapshots
Elementary tests include:

Here you can find [additional details about the tables](https://docs.elementary-data.com/guides/modules-overview/dbt-package).
### **Anomaly Detection Tests**

## Data anomaly detection as dbt tests
- **Volume anomalies** - Monitors the row count of your table over time per time bucket.
- **Freshness anomalies** - Monitors the freshness of your table over time, as the expected time between data updates.
- **Event freshness anomalies** - Monitors the freshness of event data over time, as the expected time it takes each event to load - that is, the time between when the event actually occurs (the **`event timestamp`**), and when it is loaded to the database (the **`update timestamp`**).
- **Dimension anomalies** - Monitors the count of rows grouped by given **`dimensions`** (columns/expressions).
- **Column anomalies** - Executes column level monitors on a certain column, with a chosen metric.
- **All columns anomalies** - Executes column level monitors and anomaly detection on all the columns of the table.
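
The table-level monitors listed above are attached like any other dbt test. A minimal sketch, assuming a hypothetical model `orders` with an `updated_at` timestamp column and a `country` column; check the test names and parameters against the Elementary docs for your package version:

```yml
models:
  - name: orders  # hypothetical model name
    config:
      elementary:
        timestamp_column: updated_at  # assumed timestamp column
    tests:
      # Row count per daily time bucket
      - elementary.volume_anomalies:
          time_bucket:
            period: day
            count: 1
      # Expected time between data updates
      - elementary.freshness_anomalies
      # Row count grouped by the given dimensions
      - elementary.dimension_anomalies:
          dimensions:
            - country  # assumed column
```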

Elementary dbt tests collect metrics and metadata over time, such as freshness, volume, schema changes, distribution, cardinality, etc.
Executed as any other dbt tests, the Elementary tests alert on anomalies and outliers.
### **Schema Tests**

**Elementary tests are configured and executed like native tests in your project!**
- **Schema changes** - Alerts on a deleted table, deleted or added columns, or a change to a column's data type.
- **Schema changes from baseline** - Checks for schema changes against baseline columns defined in a source’s or model’s configuration.
- **JSON schema** - Validates that a string column matches a given JSON schema.
- **Exposure validation test** - Detects changes in your models’ columns that break downstream exposures.

Example of Elementary test config in `properties.yml`:
Read more about the available [Elementary tests and configuration](https://docs.elementary-data.com/data-tests/introduction).

```yml
models:
  - name: your_model_name
    config:
      elementary:
        timestamp_column: updated_at
    tests:
      - elementary.table_anomalies
      - elementary.all_columns_anomalies
```
## Elementary Tables - Run Results and dbt Artifacts

The **Elementary dbt package** automatically stores **dbt artifacts and run results** in your data warehouse, creating structured tables that provide visibility into your dbt runs and metadata.

### **Metadata Tables - dbt Artifacts**

These tables provide a comprehensive view of your dbt project structure and configurations:

- **dbt_models** – Details on all dbt models.
- **dbt_tests** – Stores information about dbt tests.
- **dbt_sources** – Tracks source tables and freshness checks.
- **dbt_exposures** – Logs downstream data usage.
- **dbt_metrics** – Captures dbt-defined metrics.
- **dbt_snapshots** – Stores historical snapshot data.
- **dbt_seeds** – Stores current metadata about seed files in the dbt project.
- **dbt_columns** – Stores detailed information about columns across the dbt project.

### **Run Results Tables**

Read about the available [Elementary tests and configuration](https://docs.elementary-data.com/data-tests/introduction).
These tables track execution details, test outcomes, and performance metrics from your dbt runs:

- **dbt_run_results** – Captures high-level details of each dbt run.
- **model_run_results** – Stores execution data for dbt models.
- **snapshot_run_results** – Logs results from dbt snapshots.
- **dbt_invocations** – Tracks each instance of dbt being run.
- **elementary_test_results** – Consolidates all dbt test results, including Elementary anomaly tests.

For a full breakdown of these tables, see the [documentation](https://docs.elementary-data.com/dbt/package-models).

## AI-powered data validation and unstructured data tests

@@ -135,15 +158,33 @@ models:

Learn more in our [AI data validations documentation](https://docs.elementary-data.com/data-tests/ai-data-tests/ai_data_validations).

## How Elementary works?
## Quickstart - dbt Package

Elementary dbt package creates tables of metadata and test results in your data warehouse, as part of your dbt runs.
1. Add to your `packages.yml`:

The cloud service or the CLI tool read the data from these tables, send alerts and present the results in the UI.
```
packages:
  - package: elementary-data/elementary
    version: 0.18.0
    ## Docs: <https://docs.elementary-data.com>

<kbd align="center">
<a href="https://storage.googleapis.com/elementary_static/elementary_demo.html"><img align="center" style="max-width:300px;" src="https://raw.githubusercontent.com/elementary-data/elementary/master/static/report_ui.gif"> </a>
</kbd>
```

2. Run `dbt deps`
3. Add to your `dbt_project.yml`:

```
models:
  ## elementary models will be created in the schema '<your_schema>_elementary'
  ## for details, see docs: <https://docs.elementary-data.com/>
  elementary:
    +schema: "elementary"

```

4. Run `dbt run --select elementary`

Check out the [full documentation](https://docs.elementary-data.com/).

## Community & Support

@@ -154,4 +195,4 @@ The cloud service or the CLI tool read the data from these tables, send alerts a

Thank you :orange_heart: Whether it's a bug fix, new feature, or additional documentation - we greatly appreciate contributions!

Check out the [contributions guide](https://docs.elementary-data.com/general/contributions) and [open issues](https://github.com/elementary-data/elementary/issues) in the main repo.
Check out the [contributions guide](https://docs.elementary-data.com/oss/general/contributions) and [open issues](https://github.com/elementary-data/elementary/issues) in the main repo.