feat(ingestion): add Omni BI platform source (INCUBATING) #16564
Merged
treff7es
merged 37 commits into
datahub-project:master
from
bearsandrhinos:feat/omni-source
Apr 3, 2026
751c2e7
feat(ingestion): add Omni BI platform source (INCUBATING)
bearsandrhinos 688218a
fix: address pre-PR review warnings in OmniSource
bearsandrhinos 8ada427
docs(omni): restructure docs to OSS folder format
bearsandrhinos b1689e4
fix(omni): resolve CI lint and markdown format failures
bearsandrhinos 14e85f1
fix(omni): resolve remaining CI failures
bearsandrhinos 96a33f1
fix(omni): fix import sort order to pass ruff 0.11.7
bearsandrhinos 307c9df
fix(omni): add omni entry point to pyproject.toml
bearsandrhinos 69ee24d
fix(omni): apply ruff format to test_omni_integration.py
bearsandrhinos 50a5079
fix(omni): resolve mypy type errors in source and integration tests
bearsandrhinos 0b6ab6c
fix(omni): resolve mypy errors in testQuick CI
bearsandrhinos 8022d42
Merge branch 'master' into feat/omni-source
bearsandrhinos e89e010
Update metadata-ingestion/src/datahub/ingestion/source/omni/omni.py
bearsandrhinos 257976d
Update metadata-ingestion/src/datahub/ingestion/source/omni/omni.py
bearsandrhinos b7eff21
Update metadata-ingestion/src/datahub/ingestion/source/omni/omni.py
bearsandrhinos 68ab468
fix(omni): add missing topics_scanned field to OmniSourceReport
bearsandrhinos 9b0b857
fix(omni): log when skipping YAML parse failures
bearsandrhinos 1199240
fix(omni): use Literal type for SemanticField confidence field
bearsandrhinos e6cd0a2
fix(omni): add base_url validator for http(s) scheme and trailing slash
bearsandrhinos 6ab5658
fix(omni): log when last_modified datetime parse fails
bearsandrhinos 00c3754
Update metadata-ingestion/tests/integration/omni/test_omni_integratio…
bearsandrhinos 89ed609
Update metadata-ingestion/src/datahub/ingestion/source/omni/omni.py
bearsandrhinos 5557e9f
fix(omni): fix linting errors and update golden file to standard MCP …
treff7es b1abb2c
Merge branch 'master' into feat/omni-source
treff7es aa00677
Omni API: log when pagination hits safety cap; document max pages
bearsandrhinos 10dbac8
Omni: fix _ingest_topic_payload typing; add dashboard tile helper (re…
bearsandrhinos 0098d6b
Omni: use DatasetSubTypes.TOPIC for topic dataset subtype
bearsandrhinos 5f46db1
Merge origin/master into feat/omni-source
bearsandrhinos cccf93b
fix(omni): alphabetize omni optional-deps after okta in pyproject.toml
bearsandrhinos 460ce6f
Merge branch 'master' into feat/omni-source
treff7es cd1f7e0
fix(ingest/omni): fix lint errors in omni source
treff7es 34fae34
fix(ingest/omni): re-raise exception after reporting failure
treff7es 22b2f0f
refactor(ingest/omni): inline _as_wu method
treff7es 9fa3fce
fix(ingest/omni): fix mypy type errors in omni source
treff7es 4afa12f
fix(ingest/omni): fix mypy attr-defined error in omni test
treff7es 944c183
fix(ingest/omni): regenerate pyproject.toml and uv.lock from setup.py
treff7es b295d9d
fix(ingest/omni): update golden file for correct browse paths
treff7es a041d60
fix(ingest/omni): correct lineage direction to match Omni data model
treff7es
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,18 @@ | ||
| ## Overview | ||
|
|
||
| Omni is a cloud-native business intelligence platform. Learn more in the [official Omni documentation](https://docs.omni.co/). | ||
|
|
||
| The DataHub integration for Omni covers BI entities such as dashboards, charts, semantic datasets, and related ownership context. Depending on module capabilities, it can also capture features such as lineage, usage, profiling, ownership, tags, and stateful deletion detection. | ||
|
|
||
| ## Concept Mapping | ||
|
|
||
| | Omni Concept | DataHub Concept | Notes | | ||
| | --------------- | ------------------------------------------------------------- | ------------------------------------------------------------------------------------- | | ||
| | `Folder` | [Container](../../metamodel/entities/container.md) | SubType `"Folder"` | | ||
| | `Dashboard` | [Dashboard](../../metamodel/entities/dashboard.md) | Published document with `hasDashboard=true` | | ||
| | `Tile` | [Chart](../../metamodel/entities/chart.md) | Each query presentation within a dashboard | | ||
| | `Topic` | [Dataset](../../metamodel/entities/dataset.md) | SubType `"Topic"` — the semantic join graph entry point | | ||
| | `View` | [Dataset](../../metamodel/entities/dataset.md) | SubType `"View"` — semantic layer table with dimensions and measures as schema fields | | ||
| | `Workbook` | [Dataset](../../metamodel/entities/dataset.md) | SubType `"Workbook"` — unpublished personal exploration document | | ||
| | Warehouse table | [Dataset](../../metamodel/entities/dataset.md) | Native platform entity (e.g. Snowflake, BigQuery); linked as upstream of Omni Views | | ||
| | Document owner | [User (a.k.a CorpUser)](../../metamodel/entities/corpuser.md) | Propagated as `TECHNICAL_OWNER` to Dashboard and Chart entities | | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,59 @@ | ||
| ### Capabilities | ||
|
|
||
| Use the **Important Capabilities** table above as the source of truth for supported features and whether additional configuration is required. | ||
|
|
||
| #### Physical table lineage | ||
|
|
||
| Omni Views reference physical warehouse tables via `sql_table_name` in model YAML. The connector resolves each reference to a DataHub dataset URN using the `connection_to_platform` mapping. If `normalize_snowflake_names: true` (default), database, schema, and table name components are uppercased to match the casing used by the DataHub Snowflake connector. | ||
|
|
||
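The resolution step described above can be sketched roughly as follows. This is a minimal illustration, not the connector's code: the helper name `build_table_urn`, its parameters, and the rule for prepending a database to a two-part name are assumptions made for the example. Only the general DataHub dataset URN layout and the uppercasing behaviour come from the text.

```python
from typing import Optional


def build_table_urn(
    platform: str,
    sql_table_name: str,
    database: Optional[str] = None,
    platform_instance: Optional[str] = None,
    env: str = "PROD",
    normalize_snowflake_names: bool = True,
) -> str:
    """Hypothetical helper: turn a `sql_table_name` into a dataset URN."""
    # Split "schema.table" or "db.schema.table" and drop surrounding quotes.
    parts = [p.strip('"') for p in sql_table_name.split(".")]
    # Assumption: a two-part name is missing its database, so prepend the
    # configured one (for example from `connection_to_database`).
    if database and len(parts) == 2:
        parts = [database] + parts
    # Uppercase every component for Snowflake when normalization is on.
    if platform == "snowflake" and normalize_snowflake_names:
        parts = [p.upper() for p in parts]
    name = ".".join(parts)
    if platform_instance:
        name = f"{platform_instance}.{name}"
    return f"urn:li:dataset:(urn:li:dataPlatform:{platform},{name},{env})"


print(build_table_urn("snowflake", "analytics.orders", database="analytics_prod"))
```

If URNs produced this way do not line up with the ones already in DataHub, comparing the casing of both sides is usually the first thing to check.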
| #### Column-level lineage | ||
|
|
||
| When `include_column_lineage: true` (default), the connector emits `FineGrainedLineage` entries by parsing `sql` expressions in model YAML and matching field references to known view columns. This enables precise field-level impact analysis across the full chain: | ||
|
|
||
| ``` | ||
| physical_table.column → semantic_view.field → dashboard_tile.field | ||
| ``` | ||
|
|
||
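To make the matching step concrete, here is a small sketch of extracting field references from a `sql` expression and keeping only those that resolve to known view columns. It assumes references are written as `${view.field}` or `${field}`; that syntax, the function name `resolve_field_refs`, and the `known_columns` structure are illustrative assumptions, not taken from the connector.

```python
import re
from typing import Dict, List, Set, Tuple

# Assumed reference syntax: ${view.field} or ${field} (view is optional).
FIELD_REF = re.compile(r"\$\{(?:(\w+)\.)?(\w+)\}")


def resolve_field_refs(
    sql: str,
    current_view: str,
    known_columns: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return (view, field) pairs referenced in `sql` that match known columns."""
    resolved: List[Tuple[str, str]] = []
    for view, field in FIELD_REF.findall(sql):
        # An unqualified reference is taken to mean the current view.
        view = view or current_view
        pair = (view, field)
        if field in known_columns.get(view, set()) and pair not in resolved:
            resolved.append(pair)
    return resolved


known = {"orders": {"amount", "tax"}, "users": {"id"}}
print(resolve_field_refs("${orders.amount} + ${tax} + ${users.missing}", "orders", known))
```

References that do not match a known column are dropped silently in this sketch, which mirrors why complex or fully derived expressions can end up without lineage.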
| #### Schema metadata | ||
|
|
||
| For each Omni Semantic View, the connector emits a `SchemaMetadata` aspect containing one `SchemaField` per dimension and measure defined in model YAML: | ||
|
|
||
| - **Dimensions**: emitted with inferred native type (string, date, timestamp, number, boolean) | ||
| - **Measures**: emitted with aggregation type and native type `NUMBER` | ||
| - Field descriptions are extracted from the YAML `description` attribute when present | ||
|
|
||
| #### Model and document filtering | ||
|
|
||
| Use `model_pattern` and `document_pattern` to restrict ingestion to specific models or dashboards: | ||
|
|
||
| ```yaml | ||
| model_pattern: | ||
| allow: | ||
| - "^prod-.*" | ||
| deny: | ||
| - ".*-dev$" | ||
|
|
||
| document_pattern: | ||
| allow: | ||
| - ".*" | ||
| ``` | ||
|
|
||
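The effect of the patterns above can be previewed with a few lines of standalone Python. This sketch assumes that deny patterns take precedence over allow patterns and that patterns are matched from the start of the name with `re.match`; the function `is_allowed` is a hypothetical stand-in for illustration, not the DataHub implementation.

```python
import re
from typing import Iterable


def is_allowed(
    name: str,
    allow: Iterable[str] = (".*",),
    deny: Iterable[str] = (),
) -> bool:
    """Hypothetical allow/deny check: any deny match rejects the name."""
    if any(re.match(pattern, name) for pattern in deny):
        return False
    return any(re.match(pattern, name) for pattern in allow)


allow = ["^prod-.*"]
deny = [".*-dev$"]
for model in ["prod-sales", "prod-sales-dev", "staging-sales"]:
    print(model, is_allowed(model, allow, deny))
```

Running candidate model names through a check like this before ingestion is a quick way to confirm that a regex does what was intended.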
| ### Limitations | ||
|
|
||
| - Access Filters, User Attributes, and Cache schedules are not yet ingested. | ||
| - Column lineage is limited to fields that appear in model YAML `sql` expressions; complex or fully derived expressions may not fully resolve. | ||
| - Large organizations with many models may approach Omni API rate limits; tune `max_requests_per_minute` accordingly. | ||
| - True end-to-end integration tests require a live Omni environment; the test suite uses deterministic mock API responses. | ||
|
|
||
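For intuition about what a `max_requests_per_minute` setting implies, the following sketch shows one common way to throttle a client: a sliding 60-second window of request timestamps. The class `MinuteRateLimiter` and its injectable `clock` and `sleep` arguments are assumptions for illustration; the source does not describe how the connector enforces the limit.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteRateLimiter:
    """Hypothetical sliding-window limiter: at most N requests per 60 seconds."""

    def __init__(
        self,
        max_requests_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        if max_requests_per_minute < 1:
            raise ValueError("max_requests_per_minute must be at least 1")
        self._max = max_requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()

    def acquire(self) -> float:
        """Wait until a request may be sent; return the seconds spent waiting."""
        waited = 0.0
        while True:
            now = self._clock()
            # Forget requests that left the 60-second window.
            while self._sent and now - self._sent[0] >= 60.0:
                self._sent.popleft()
            if len(self._sent) < self._max:
                self._sent.append(now)
                return waited
            # Window is full: sleep until the oldest request expires.
            delay = 60.0 - (now - self._sent[0])
            self._sleep(delay)
            waited += delay
```

A caller would invoke `acquire()` before each API request. Lowering the limit trades ingestion speed for headroom against the API's own rate limits.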
| ### Troubleshooting | ||
|
|
||
| If ingestion fails, validate credentials, permissions, and connectivity first. Then review the ingestion report and logs for source-specific errors. | ||
|
|
||
| Common issues: | ||
|
|
||
| | Symptom | Likely Cause | Resolution | | ||
| | ------------------------------------------------ | ----------------------------------------------------- | ----------------------------------------------------------------------------- | | ||
| | `403 Forbidden` on `/v1/connections` | API key lacks connection read scope | Grant the key connection read access or set `connection_to_platform` explicitly; otherwise ingestion continues with config fallbacks and physical lineage may be incomplete | | ||
| | Physical tables not linked to warehouse entities | `connection_to_platform` not configured | Add connection mapping for each Omni connection ID | | ||
| | Snowflake URN mismatch | Case mismatch between Omni and DataHub Snowflake URNs | Ensure `normalize_snowflake_names: true` (default) | | ||
| | Column lineage empty | View YAML has no `sql` expressions | Expected for views using direct `sql_table_name` without field-level SQL | |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,38 @@ | ||
| ### Overview | ||
|
|
||
| The `omni` module ingests metadata from the [Omni](https://omni.co/) BI platform into DataHub. It is intended for production ingestion workflows and supports the following: | ||
|
|
||
| - Folders (as Containers), Dashboards, and Chart tiles | ||
| - Semantic layer: Models, Topics, and Views with schema fields (dimensions and measures) | ||
| - Physical warehouse tables with upstream lineage stitched to existing DataHub entities | ||
| - Column-level (fine-grained) lineage from semantic view fields back to warehouse columns | ||
| - Ownership propagated from the Omni document API | ||
|
|
||
| Lineage is emitted as a five-hop chain: | ||
|
|
||
| ``` | ||
| Folder → Dashboard → Chart (tile) → Topic → Semantic View → Physical Table | ||
| ``` | ||
|
|
||
| ### Prerequisites | ||
|
|
||
| Before running ingestion, ensure you have the following: | ||
|
|
||
| 1. **An Omni Organization API key** with read access to models, documents, and connections. Generate API keys in Omni Admin → API Keys. | ||
|
|
||
| 2. **Connection mapping configuration** if you want physical table lineage to stitch with existing warehouse entities in DataHub. You will need to map each Omni connection ID to the corresponding DataHub platform name, platform instance, and database name: | ||
|
|
||
| ```yaml | ||
| connection_to_platform: | ||
| "conn_abc123": "snowflake" | ||
| connection_to_platform_instance: | ||
| "conn_abc123": "my_snowflake_account" | ||
| connection_to_database: | ||
| "conn_abc123": "ANALYTICS_PROD" | ||
| ``` | ||
|
|
||
| Connection IDs can be found by calling the Omni `/v1/connections` API or from the Omni Admin UI. | ||
|
|
||
| :::note | ||
| If the Omni API key does not have permission to list connections (`403 Forbidden`), the connector will fall back to the `connection_to_platform` config overrides and continue ingestion without failing. | ||
| ::: |
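The fallback behaviour in the note can be pictured with a short sketch. Everything named here is hypothetical: `ForbiddenError` stands in for whatever the API client raises on HTTP 403, `list_connections` is an injected callable, and the `id` and `dialect` keys are assumed response fields. The only behaviour taken from the text is that a 403 does not abort ingestion and the configured mapping is used instead.

```python
from typing import Callable, Dict, List


class ForbiddenError(Exception):
    """Hypothetical error raised by an API client on HTTP 403."""


def resolve_platforms(
    list_connections: Callable[[], List[dict]],
    connection_to_platform: Dict[str, str],
) -> Dict[str, str]:
    """Merge discovered connections with config overrides, tolerating a 403."""
    try:
        # Assumed response shape: [{"id": ..., "dialect": ...}, ...]
        discovered = {c["id"]: c["dialect"] for c in list_connections()}
    except ForbiddenError:
        # No permission to list connections: rely on the config alone.
        discovered = {}
    # Config entries win over anything discovered from the API.
    return {**discovered, **connection_to_platform}
```

With this shape, any connection that is neither discoverable nor configured simply has no platform mapping, which is why physical lineage can come out incomplete when the key lacks access.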
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,46 @@ | ||
| source: | ||
| type: omni | ||
| config: | ||
| # Coordinates | ||
| base_url: "https://your-org.omniapp.co/api" | ||
|
|
||
| # Credentials | ||
| api_key: "${OMNI_API_KEY}" | ||
|
|
||
| # Connection → warehouse stitching | ||
| # Map Omni connection IDs to DataHub platform names so that physical table | ||
| # URNs match what was ingested by your warehouse source connector. | ||
| connection_to_platform: | ||
| "conn_abc123": "snowflake" | ||
|
|
||
| # Optional: map connection IDs to platform instances | ||
| # connection_to_platform_instance: | ||
| # "conn_abc123": "my_snowflake_account" | ||
|
|
||
| # Optional: override the database name inferred from the Omni connection | ||
| # connection_to_database: | ||
| # "conn_abc123": "ANALYTICS_PROD" | ||
|
|
||
| # Optional: include workbook-only documents (not just published dashboards) | ||
| # include_workbook_only: false | ||
|
|
||
| # Optional: filter which models to ingest | ||
| # model_pattern: | ||
| # allow: | ||
| # - ".*" | ||
|
|
||
| # Optional: filter which documents (dashboards/workbooks) to ingest | ||
| # document_pattern: | ||
| # allow: | ||
| # - ".*" | ||
|
|
||
| # Optional: column-level lineage is enabled by default; set to false to disable | ||
| # include_column_lineage: true | ||
|
|
||
| # Optional: stateful ingestion with stale entity removal | ||
| stateful_ingestion: | ||
| enabled: true | ||
| remove_stale_metadata: true | ||
|
|
||
| sink: | ||
| # sink configs | ||
Empty file.