diff --git a/README.md b/README.md
index db36faf2b..d94601dd2 100644
--- a/README.md
+++ b/README.md
@@ -2,114 +2,137 @@ Logo

-# [dbt native data observability](https://www.elementary-data.com/)
+# [dbt-native data observability](https://www.elementary-data.com/)

 License
-Downloads
+Downloads

 ## What is Elementary?
-This dbt package is part of Elementary, the dbt-native data observability solution for data and analytics engineers.
-Set up in minutes, gain immediate visibility, detect data issues, send actionable alerts, and understand impact and root cause.
-Available as self-hosted or Cloud service with premium features.
+This dbt-native package powers **Elementary**, helping data and analytics engineers **detect data anomalies** and build **rich metadata tables** from their dbt runs and tests. Gain immediate visibility into data quality trends and uncover potential issues, all within dbt.
-#### Table of Contents
+Choose the observability tool that fits your needs:
-- [Quick start - dbt package](#quick-start---dbt-package)
-- [Get more out of Elementary](#get-more-out-of-elementary-dbt-package)
-- [Run results and dbt artifacts](#run-results-and-dbt-artifacts)
-- [Data anomaly detection as dbt tests](#data-anomaly-detection-as-dbt-tests)
+✅ [**Elementary Open Source**](https://docs.elementary-data.com/oss/oss-introduction) – A powerful, self-hosted tool for teams that want full control.
+
+✅ [**Elementary Cloud Platform**](https://docs.elementary-data.com/cloud/introduction) – A fully managed, enterprise-ready solution with **automated ML-powered anomaly detection, flexible data discovery, integrated incident management, and collaboration tools**, all with minimal setup and infrastructure maintenance.
+
+### Table of Contents
+
+- [What's Inside the Elementary dbt Package?](#whats-inside-the-elementary-dbt-package)
+- [Get more out of Elementary dbt package](#get-more-out-of-elementary-dbt-package)
+- [Data Anomaly Detection & Schema changes as dbt Tests](#data-anomaly-detection--schema-changes-as-dbt-tests)
+- [Elementary Tables - Run Results and dbt Artifacts](#elementary-tables---run-results-and-dbt-artifacts)
 - [AI-powered data validation and unstructured data tests](#ai-powered-data-validation-and-unstructured-data-tests)
-- [How Elementary works?](#how-elementary-works)
+- [Quickstart - dbt Package](#quickstart---dbt-package)
 - [Community & Support](#community--support)
-- [Contribution](#contributions)
+- [Contributions](#contributions)
-## Quick start - dbt package
+### **What's Inside the Elementary dbt Package?**
-1. Add to your `packages.yml`:
+The **Elementary dbt package** is designed to enhance data observability within your dbt workflows. It includes two core components:
-```yml packages.yml
-packages:
-  - package: elementary-data/elementary
-    version: 0.18.0
-    ## Docs: https://docs.elementary-data.com
-```
+- **Elementary Tests** – A collection of **anomaly detection tests** and other data quality checks that help identify unexpected trends, missing data, or schema changes directly within your dbt runs.
+- **Metadata & Test Results Tables** – The package automatically generates and updates **metadata tables** in your data warehouse, capturing valuable information from your dbt runs and test results. These tables act as the backbone of your **observability setup**, enabling **alerts and reports** when connected to an Elementary observability platform.
-2. Run `dbt deps`
+## Get more out of Elementary dbt package
-3. Add to your `dbt_project.yml`:
+The **Elementary dbt package** helps you find anomalies in your data and build metadata tables from your dbt runs and tests, but there's even more you can do.
-```yml
-models:
-  ## elementary models will be created in the schema '_elementary'
-  ## for details, see docs: https://docs.elementary-data.com/
-  elementary:
-    +schema: "elementary"
-```
+To generate observability reports, send alerts, and govern your data quality effectively, connect your dbt package to one of the following options:
-4. Run `dbt run --select elementary`
+- **Elementary OSS**
+  - **A self-hosted, open-source CLI** that integrates seamlessly with your dbt project and the Elementary dbt package. It **enables alerting and provides the basic Elementary data observability report**, offering a comprehensive view of your dbt runs, all dbt test results, data lineage, and test coverage. It’s ideal for small teams of data and/or analytics engineers seeking a straightforward, non-collaborative setup for data observability. See the [quickstart](https://docs.elementary-data.com/oss/quickstart/quickstart-cli); our team and community can provide support on [Slack](https://www.elementary-data.com/community) if needed.
+- **Elementary Cloud**
+  - A **fully managed, enterprise-ready** solution designed for **scalability and automation**. It offers automated **ML-powered anomaly detection**, flexible **data discovery**, an integrated **incident management system**, and **collaboration features.** Delivering **high value with minimal setup and infrastructure maintenance**, it's ideal for teams looking to enhance data reliability without operational overhead. To learn more, [book a demo](https://cal.com/maayansa/elementary-intro-github-package) or [start a trial](https://www.elementary-data.com/signup).
-Check out the [full documentation](https://docs.elementary-data.com/).
+
+
+
-## Get more out of Elementary dbt package
+## Data Anomaly Detection & Schema changes as dbt Tests
-Elementary has 3 offerings: This dbt package, Elementary Community (OSS) and Elementary (cloud service).
+**Elementary tests are configured and executed like native tests in your project!**
-- **dbt package**
-  - For basic data monitoring and dbt artifacts collection, Elementary offers a dbt package. The package adds logging, artifacts uploading, and Elementary tests (anomaly detection and schema) to your project.
-- **Elementary Community**
-  - An open-source CLI tool you can deploy and orchestrate to send alerts and self-host the Elementary report. Best for data and analytics engineers that require basic observability capabilities or for evaluating features without vendor approval. Our community can provide great support on [Slack](https://www.elementary-data.com/community) if needed.
-- **Elementary Cloud**
-  - Ideal for teams monitoring mission-critical data pipelines, requiring guaranteed uptime and reliability, short-time-to-value, advanced features, collaboration, and professional support. The solution is secure by design, and requires no access to your data from cloud. To learn more, [book a demo](https://cal.com/maayansa/elementary-intro-github-package) or [start a trial](https://www.elementary-data.com/signup).
+Elementary dbt tests help track and alert on schema changes as well as key metrics and metadata over time, including freshness, volume, distribution, cardinality, and more.
-## Run Results and dbt artifacts
+**Seamlessly configured and run like native dbt tests,** Elementary tests detect anomalies and outliers, helping you catch data issues early.
-The package automatically uploads dbt artifacts and run results to tables in your data warehouse:
+Example of an Elementary test config in `schema.yml`:
-Run results tables:
+```
-- dbt_run_results
-- model_run_results
-- snapshot_run_results
-- dbt_invocations
-- elementary_test_results (all dbt test results)
+models:
+  - name: all_events
+    config:
+      elementary:
+        timestamp_column: 'loaded_at'
+    columns:
+      - name: event_count
+        tests:
+          - elementary.column_anomalies:
+              column_anomalies:
+                - average
+              where_expression: "event_type in ('event_1', 'event_2') and country_name != 'unwanted country'"
+              anomaly_sensitivity: 2
+              time_bucket:
+                period: day
+                count: 1
-Metadata tables:
+```
-- dbt_models
-- dbt_tests
-- dbt_sources
-- dbt_exposures
-- dbt_metrics
-- dbt_snapshots
+Elementary tests include:
-Here you can find [additional details about the tables](https://docs.elementary-data.com/guides/modules-overview/dbt-package).
+### **Anomaly Detection Tests**
-## Data anomaly detection as dbt tests
+- **Volume anomalies -** Monitors the row count of your table over time per time bucket.
+- **Freshness anomalies -** Monitors the freshness of your table over time, as the expected time between data updates.
+- **Event freshness anomalies -** Monitors the freshness of event data over time, as the expected time it takes each event to load - that is, the time between when the event actually occurs (the **`event timestamp`**), and when it is loaded to the database (the **`update timestamp`**).
+- **Dimension anomalies -** Monitors the count of rows grouped by given **`dimensions`** (columns/expressions).
+- **Column anomalies -** Executes column level monitors on a certain column, with a chosen metric.
+- **All columns anomalies** - Executes column level monitors and anomaly detection on all the columns of the table.
-Elementary dbt tests collect metrics and metadata over time, such as freshness, volume, schema changes, distribution, cardinality, etc.
-Executed as any other dbt tests, the Elementary tests alert on anomalies and outliers.
+### **Schema Tests**
-**Elementary tests are configured and executed like native tests in your project!**
+- **Schema changes -** Alerts on a deleted table, deleted or added columns, or change of data type of a column.
+- **Schema changes from baseline** - Checks for schema changes against baseline columns defined in a source’s or model’s configuration.
+- **JSON schema** - Allows validating that a string column matches a given JSON schema.
+- **Exposure validation test -** Detects changes in your models’ columns that break downstream exposures.
-Example of Elementary test config in `properties.yml`:
+Read more about the available [Elementary tests and configuration](https://docs.elementary-data.com/data-tests/introduction).
-```yml
-models:
-  - name: your_model_name
-    config:
-      elementary:
-        timestamp_column: updated_at
-    tests:
-      - elementary.table_anomalies
-      - elementary.all_columns_anomalies
-```
+## Elementary Tables - Run Results and dbt Artifacts
+
+The **Elementary dbt package** automatically stores **dbt artifacts and run results** in your data warehouse, creating structured tables that provide visibility into your dbt runs and metadata.
+
+### **Metadata Tables - dbt Artifacts**
+
+These tables provide a comprehensive view of your dbt project structure and configurations:
+
+- **dbt_models** – Details on all dbt models.
+- **dbt_tests** – Stores information about dbt tests.
+- **dbt_sources** – Tracks source tables and freshness checks.
+- **dbt_exposures** – Logs downstream data usage.
+- **dbt_metrics** – Captures dbt-defined metrics.
+- **dbt_snapshots** – Stores historical snapshot data.
+- **dbt_seeds -** Stores current metadata about seed files in the dbt project.
+- **dbt_columns** - Stores detailed information about columns across the dbt project.
+
+### **Run Results Tables**
-Read about the available [Elementary tests and configuration](https://docs.elementary-data.com/data-tests/introduction).
+These tables track execution details, test outcomes, and performance metrics from your dbt runs:
+
+- **dbt_run_results** – Captures high-level details of each dbt run.
+- **model_run_results** – Stores execution data for dbt models.
+- **snapshot_run_results** – Logs results from dbt snapshots.
+- **dbt_invocations** – Tracks each instance of dbt being run.
+- **elementary_test_results** – Consolidates all dbt test results, including Elementary anomaly tests.
+
+For a full breakdown of these tables, see the [documentation](https://docs.elementary-data.com/dbt/package-models).
 ## AI-powered data validation and unstructured data tests
@@ -135,15 +158,33 @@ models:
 Learn more in our [AI data validations documentation](https://docs.elementary-data.com/data-tests/ai-data-tests/ai_data_validations).
-## How Elementary works?
+## Quickstart - dbt Package
-Elementary dbt package creates tables of metadata and test results in your data warehouse, as part of your dbt runs.
+1. Add to your `packages.yml`:
-The cloud service or the CLI tool read the data from these tables, send alerts and present the results in the UI.
+```
+packages:
+  - package: elementary-data/elementary
+    version: 0.18.0
+    ## Docs:
-
-
-
+```
+
+2. Run `dbt deps`
+3. Add to your `dbt_project.yml`:
+
+```
+models:
+  ## elementary models will be created in the schema '_elementary'
+  ## for details, see docs:
+  elementary:
+    +schema: "elementary"
+
+```
+
+4. Run `dbt run --select elementary`
+
+Check out the [full documentation](https://docs.elementary-data.com/).
 ## Community & Support
@@ -154,4 +195,4 @@ The cloud service or the CLI tool read the data from these tables, send alerts a
 Thank you :orange_heart: Whether it's a bug fix, new feature, or additional documentation - we greatly appreciate contributions!
-Check out the [contributions guide](https://docs.elementary-data.com/general/contributions) and [open issues](https://github.com/elementary-data/elementary/issues) in the main repo.
+Check out the [contributions guide](https://docs.elementary-data.com/oss/general/contributions) and [open issues](https://github.com/elementary-data/elementary/issues) in the main repo.