diff --git a/docs/changelog.mdx b/docs/changelog.mdx
index 7c9c5f177..7dade9393 100644
--- a/docs/changelog.mdx
+++ b/docs/changelog.mdx
@@ -249,7 +249,7 @@ Learn more about connecting Elementary to [Jira](https://docs.elementary-data.co
 **Known Issues:**
-* dbt-databricks must be <1.10.2 (See issue 1931 for more details)
+* dbt-databricks must be below 1.10.2 (See issue 1931 for more details)
 **Full Changelog**: [v0.18.3...v0.19.0](https://github.com/elementary-data/elementary/compare/v0.18.3...v0.19.0)
diff --git a/docs/oss/release-notes/releases/0.19.0.mdx b/docs/oss/release-notes/releases/0.19.0.mdx
index 9293b4c32..6d410ff61 100644
--- a/docs/oss/release-notes/releases/0.19.0.mdx
+++ b/docs/oss/release-notes/releases/0.19.0.mdx
@@ -10,7 +10,7 @@ This update includes:
 **Known Issues:**
-- dbt-databricks must be <1.10.2 (See [issue 1931](https://github.com/elementary-data/elementary/issues/1931) for more details)
+- dbt-databricks must be below 1.10.2 (See [issue 1931](https://github.com/elementary-data/elementary/issues/1931) for more details)
 Check out the full details here: https://github.com/elementary-data/elementary/releases/tag/v0.19.0
diff --git a/docs/snippets/cloud/integrations/databricks.mdx b/docs/snippets/cloud/integrations/databricks.mdx
index d13bef953..e3b513c83 100644
--- a/docs/snippets/cloud/integrations/databricks.mdx
+++ b/docs/snippets/cloud/integrations/databricks.mdx
@@ -1,5 +1,5 @@
 import CreateServicePrincipal from '/snippets/dwh/databricks/create_service_principal.mdx';
-import PermissionsAndSecurity from '/snippets/cloud/integrations/permissions-and-security.mdx';
+import PermissionsAndSecurity from '/snippets/dwh/databricks/databricks_permissions_and_security.mdx';
 import IpAllowlist from '/snippets/cloud/integrations/ip-allowlist.mdx';
 You will connect Elementary Cloud to Databricks for syncing the Elementary schema (created by the [Elementary dbt package](/cloud/onboarding/quickstart-dbt-package)).
@@ -8,14 +8,17 @@ You will connect Elementary Cloud to Databricks for syncing the Elementary schem
-### Fill the connection form
+### Add an environment in Elementary (requires an admin user)
-Provide the following fields:
+In the Elementary platform, go to Environments in the left menu, and click on the "Create Environment" button.
+Choose a name for your environment, and then choose Databricks as your data warehouse type.
-- **Host**: The hostname of your Databricks account to connect to.
+Provide the following fields in the form:
+
+- **Server Host**: The hostname of your Databricks account to connect to.
 - **Http path**: The path to the Databricks cluster or SQL warehouse.
-- **Token**: The token you generated for Elementary. For more information, see [Generate a token](https://docs.databricks.com/aws/en/dev-tools/auth/pat#databricks-personal-access-tokens-for-service-principals) in the Databricks docs.
+- **Access token**: The token you generated for the Elementary service principal (see step 7 under "Create service principal" above)
 - **Catalog (optional)**: The name of the Databricks Catalog.
-- **Elementary schema**: The name of your Elementary schema. Usually `[schema name]_elementary`.
+- **Elementary schema**: The name of your Elementary schema. Usually `[your dbt target schema]_elementary`.
diff --git a/docs/snippets/dwh/databricks/create_service_principal.mdx b/docs/snippets/dwh/databricks/create_service_principal.mdx
index 094aa060c..b04eaa7cd 100644
--- a/docs/snippets/dwh/databricks/create_service_principal.mdx
+++ b/docs/snippets/dwh/databricks/create_service_principal.mdx
@@ -1,37 +1,46 @@
 ### Create service principal
-1. In your Databrick console, go to the admin settings by clicking your username in the to right corner -> Admin settings (add photo)
+1. Open your Databricks console, and then open your relevant workspace.
+
+2. Click on your Profile icon on the right and choose Settings.
 Admin settings
-2. Go to the Service principals tab, then click Add service principal (add photo)
+3. On the sidebar, click on *Identity and access*, and then under the *Service Principals* row click on *Manage*.
 Service principal settings
-3. Give the service principal a good name (e.g elementary) and click Add (add photo)
+4. Click on the *Add service principal* button, choose "Add new" and give a name to the service principal. This will be used by Elementary Cloud
+to access your Databricks instance.
 Add service principal
-4. Then, from the service principal configuration view, copy the Application Id (add photo)
+5. Click on your newly created service principal, add the "Databricks SQL access" entitlement, and click Update. Also, please copy the
+"Application ID" field as it will be used later in the permissions section.
 Service principal ID
-4. Finally, run the following query:
+6. Next, you may also need to allow Token Usage for this service principal (if it is not allowed for all users). To do so, under the settings menu choose Advanced -> Personal Access Tokens -> Permission Settings.
+Then add the service principal there.
+
+Add databricks SQL access
-```
-GRANT SELECT ON SCHEMA TO ``;
-```
+7. Create a personal access token for your service principal. For more details, please click [here](https://docs.databricks.com/aws/en/dev-tools/auth/pat#databricks-personal-access-tokens-for-service-principals)
-Make sure to replace the `` and `` placeholders with the correct values
+8. Finally, in order to enable Elementary's automated monitors feature, please ensure [predictive optimization](https://docs.databricks.com/aws/en/optimizations/predictive-optimization#enable-or-disable-predictive-optimization-for-your-account) is enabled in your account.
+This is required for table statistics to be updated (Elementary relies on this to obtain up-to-date row counts)
diff --git a/docs/snippets/dwh/databricks/databricks_permissions_and_security.mdx b/docs/snippets/dwh/databricks/databricks_permissions_and_security.mdx
new file mode 100644
index 000000000..32ef78686
--- /dev/null
+++ b/docs/snippets/dwh/databricks/databricks_permissions_and_security.mdx
@@ -0,0 +1,37 @@
+### Permissions and security
+
+#### Required permissions
+
+Elementary cloud requires the following permissions:
+
+- **Elementary schema read-only access** - This is required by Elementary to read dbt metadata & test results collected by the Elementary dbt package as a part of your pipeline runs.
+This permission does not give access to your data.
+
+- **Information schema metadata access** - Elementary needs access to the `system.information_schema.tables` and `system.information_schema.columns` system tables, to get metadata
+about existing tables and columns in your data warehouse. This is used to power features such as column-level lineage and automated volume & freshness monitors.
+
+- **Read access needed for some metadata operations (optional)** - In order to enable Elementary's automated volume & freshness monitors, Elementary needs access to query history, as well
+as Databricks APIs to obtain table statistics.
+These operations require granting SELECT access on your tables. This is a Databricks limitation - Elementary **never** reads any data from your tables, only metadata. However, there isn't
+today any table-level metadata-only permission available in Databricks, so SELECT is required.
+
+
+#### Grants SQL template
+
+Please use the following SQL statements to grant the permissions specified above (you should replace the placeholders with the correct values):
+
+```sql
+-- Grant read access on the elementary schema (usually [your dbt target schema]_elementary)
+GRANT USE CATALOG ON CATALOG TO ``;
+GRANT USE SCHEMA, SELECT ON SCHEMA TO ``;
+
+-- Grant access to information schema tables
+GRANT USE CATALOG ON CATALOG system TO ``;
+GRANT USE SCHEMA ON SCHEMA system.information_schema TO ``;
+GRANT SELECT ON TABLE system.information_schema.tables TO ``;
+GRANT SELECT ON TABLE system.information_schema.columns TO ``;
+
+-- Grant select on tables for history & statistics access
+-- (Optional, required for automated volume & freshness tests - see explanation above. You can also limit to specific schemas used by dbt instead of granting on the full catalog)
+GRANT USE CATALOG, USE SCHEMA, SELECT ON catalog to ``;
+```
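
Note on the optional grant: the last SQL comment in the new template says the SELECT grant can be limited to specific schemas used by dbt rather than the full catalog, but the template only shows the catalog-wide form. A minimal sketch of the schema-scoped variant is below. The catalog name `main`, the schema names `analytics` and `staging`, and the all-zero Application ID are hypothetical values chosen for illustration and do not come from this PR.

```sql
-- Hypothetical names throughout: replace `main`, `analytics`, `staging`,
-- and the Application ID with the values from your own workspace.

-- The service principal still needs USE CATALOG to reach schemas inside the catalog.
GRANT USE CATALOG ON CATALOG main TO `00000000-0000-0000-0000-000000000000`;

-- Grant USE SCHEMA and SELECT only on the schemas dbt writes to,
-- instead of on every schema in the catalog.
GRANT USE SCHEMA, SELECT ON SCHEMA main.analytics TO `00000000-0000-0000-0000-000000000000`;
GRANT USE SCHEMA, SELECT ON SCHEMA main.staging TO `00000000-0000-0000-0000-000000000000`;
```

With this form, schemas added later would each need their own grant, whereas the catalog-wide grant covers them automatically.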