Merged
2 changes: 1 addition & 1 deletion docs/changelog.mdx
Original file line number Diff line number Diff line change
@@ -249,7 +249,7 @@ Learn more about connecting Elementary to [Jira](https://docs.elementary-data.co

**Known Issues:**

* dbt-databricks must be <1.10.2 (See issue 1931 for more details)
* dbt-databricks must be below 1.10.2 (See issue 1931 for more details)

**Full Changelog**: [v0.18.3...v0.19.0](https://github.com/elementary-data/elementary/compare/v0.18.3...v0.19.0)

2 changes: 1 addition & 1 deletion docs/oss/release-notes/releases/0.19.0.mdx
@@ -10,7 +10,7 @@ This update includes:

**Known Issues:**

- dbt-databricks must be <1.10.2 (See [issue 1931](https://github.com/elementary-data/elementary/issues/1931) for more details)
- dbt-databricks must be below 1.10.2 (See [issue 1931](https://github.com/elementary-data/elementary/issues/1931) for more details)

Check out the full details here: https://github.com/elementary-data/elementary/releases/tag/v0.19.0

15 changes: 9 additions & 6 deletions docs/snippets/cloud/integrations/databricks.mdx
@@ -1,5 +1,5 @@
import CreateServicePrincipal from '/snippets/dwh/databricks/create_service_principal.mdx';
import PermissionsAndSecurity from '/snippets/cloud/integrations/permissions-and-security.mdx';
import PermissionsAndSecurity from '/snippets/dwh/databricks/databricks_permissions_and_security.mdx';
import IpAllowlist from '/snippets/cloud/integrations/ip-allowlist.mdx';

You will connect Elementary Cloud to Databricks for syncing the Elementary schema (created by the [Elementary dbt package](/cloud/onboarding/quickstart-dbt-package)).
@@ -8,14 +8,17 @@ You will connect Elementary Cloud to Databricks for syncing the Elementary schem

<PermissionsAndSecurity />

### Fill the connection form
### Add an environment in Elementary (requires an admin user)

Provide the following fields:
In the Elementary platform, go to Environments in the left menu, and click on the "Create Environment" button.
Choose a name for your environment, and then choose Databricks as your data warehouse type.

- **Host**: The hostname of your Databricks account to connect to.
Provide the following fields in the form:

- **Server Host**: The hostname of your Databricks account to connect to.
- **Http path**: The path to the Databricks cluster or SQL warehouse.
- **Token**: The token you generated for Elementary. For more information, see [Generate a token](https://docs.databricks.com/aws/en/dev-tools/auth/pat#databricks-personal-access-tokens-for-service-principals) in the Databricks docs.
- **Access token**: The token you generated for the Elementary service principal (see step 7 under "Create service principal" above).
- **Catalog (optional)**: The name of the Databricks Catalog.
- **Elementary schema**: The name of your Elementary schema. Usually `[schema name]_elementary`.
- **Elementary schema**: The name of your Elementary schema. Usually `[your dbt target schema]_elementary`.

<IpAllowlist />
41 changes: 25 additions & 16 deletions docs/snippets/dwh/databricks/create_service_principal.mdx
@@ -1,37 +1,46 @@
### Create service principal

1. In your Databrick console, go to the admin settings by clicking your username in the to right corner -> Admin settings (add photo)
1. Open your Databricks console, and then open your relevant workspace.

2. Click on your Profile icon on the right and choose Settings.

<img
src="https://res.cloudinary.com/diuctyblm/image/upload/f_auto,q_auto/v1/dwh/databricks/admin_settings"
alt="Admin settings"
src="https://res.cloudinary.com/dgpojk42n/image/upload/v1763312536/databricks_01_choose_settings.png"
alt="Choose settings"
/>

2. Go to the Service principals tab, then click Add service principal (add photo)
3. On the sidebar, click on *Identity and access*, and then under the *Service Principals* row click on *Manage*.

<img
src="https://res.cloudinary.com/diuctyblm/image/upload/f_auto,q_auto/v1/dwh/databricks/service_principals_settings"
alt="Service principal settings"
src="https://res.cloudinary.com/dgpojk42n/image/upload/v1763313131/databricks_02_manage_service_principal_b5gc72.png"
alt="Manage service principals"
/>

3. Give the service principal a good name (e.g elementary) and click Add (add photo)
4. Click on the *Add service principal* button, choose "Add new" and give a name to the service principal. This will be used by Elementary Cloud
to access your Databricks instance.

<img
src="https://res.cloudinary.com/diuctyblm/image/upload/f_auto,q_auto/v1/dwh/databricks/add_service_principal"
src="https://res.cloudinary.com/dgpojk42n/image/upload/v1763313361/databricks_04_add_service_principal_e54raz.png"
alt="Add service principal"
/>

4. Then, from the service principal configuration view, copy the Application Id (add photo)
5. Click on your newly created service principal, add the "Databricks SQL access" entitlement, and click Update. Also, please copy the
"Application ID" field as it will be used later in the permissions section.

<img
src="https://res.cloudinary.com/diuctyblm/image/upload/f_auto,q_auto/v1/dwh/databricks/service_principal_id"
alt="Service principal ID"
src="https://res.cloudinary.com/dgpojk42n/image/upload/v1763313542/databricks_05_add_databricks_sql_access_zzdcf7.png"
alt="Add databricks SQL access"
/>

4. Finally, run the following query:
6. Next, you may also need to allow Token Usage for this service principal (if it is not allowed for all users). To do so, under the settings menu choose Advanced -> Personal Access Tokens -> Permission Settings.
Then add the service principal there.

<img
src="https://res.cloudinary.com/dgpojk42n/image/upload/v1763316575/databricks_06_token_usage_eufjwv.png"
alt="Token usage permission settings"
/>

```
GRANT SELECT ON SCHEMA <elementary_schema> TO `<service_principal_id>`;
```
7. Create a personal access token for your service principal. For more details, see [Databricks personal access tokens for service principals](https://docs.databricks.com/aws/en/dev-tools/auth/pat#databricks-personal-access-tokens-for-service-principals) in the Databricks docs.

Make sure to replace the `<elementary_schema>` and `<service_principal_id>` placeholders with the correct values
8. Finally, to enable Elementary's automated monitors feature, ensure that [predictive optimization](https://docs.databricks.com/aws/en/optimizations/predictive-optimization#enable-or-disable-predictive-optimization-for-your-account) is enabled in your account.
This is required for table statistics to be updated (Elementary relies on them to obtain up-to-date row counts).
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
### Permissions and security

#### Required permissions

Elementary Cloud requires the following permissions:

- **Elementary schema read-only access** - This is required by Elementary to read dbt metadata & test results collected by the Elementary dbt package as a part of your pipeline runs.
This permission does not give access to your data.

- **Information schema metadata access** - Elementary needs access to the `system.information_schema.tables` and `system.information_schema.columns` system tables, to get metadata
about existing tables and columns in your data warehouse. This is used to power features such as column-level lineage and automated volume & freshness monitors.

- **Read access needed for some metadata operations (optional)** - To enable Elementary's automated volume & freshness monitors, Elementary needs access to query history, as well
as to the Databricks APIs that return table statistics.
These operations require granting SELECT access on your tables. This is a Databricks limitation: Elementary **never** reads any data from your tables, only metadata. However, Databricks
does not currently offer a table-level, metadata-only permission, so SELECT is required.


#### Grants SQL template

Please use the following SQL statements to grant the permissions specified above (you should replace the placeholders with the correct values):

```sql
-- Grant read access on the elementary schema (usually [your dbt target schema]_elementary)
GRANT USE CATALOG ON CATALOG <catalog> TO `<service_principal_app_id>`;
GRANT USE SCHEMA, SELECT ON SCHEMA <elementary_schema> TO `<service_principal_app_id>`;

-- Grant access to information schema tables
GRANT USE CATALOG ON CATALOG system TO `<service_principal_app_id>`;
GRANT USE SCHEMA ON SCHEMA system.information_schema TO `<service_principal_app_id>`;
GRANT SELECT ON TABLE system.information_schema.tables TO `<service_principal_app_id>`;
GRANT SELECT ON TABLE system.information_schema.columns TO `<service_principal_app_id>`;

-- Grant select on tables for history & statistics access
-- (Optional, required for automated volume & freshness tests - see explanation above. You can also limit to specific schemas used by dbt instead of granting on the full catalog)
GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG <catalog> TO `<service_principal_app_id>`;
```
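
After running the grants, it can help to confirm that they took effect. The statements below are a minimal sketch, not a required setup step. They assume the same `<catalog>`, `<elementary_schema>` and `<service_principal_app_id>` placeholders as the template above, and rely on the Databricks `SHOW GRANTS` statement.

```sql
-- List the privileges the service principal holds on the Elementary schema
SHOW GRANTS `<service_principal_app_id>` ON SCHEMA <elementary_schema>;

-- List the privileges the service principal holds on the information schema
SHOW GRANTS `<service_principal_app_id>` ON SCHEMA system.information_schema;

-- Optional: only relevant if you granted the catalog-level SELECT above
SHOW GRANTS `<service_principal_app_id>` ON CATALOG <catalog>;
```

Depending on your workspace configuration, you may need to be an admin or the owner of the object to run these statements.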