Commit 3d5a315

Databricks Volumes and Delta Tables connectors: Add more how-to video links, and required permissions details (#465)
1 parent 9233984 commit 3d5a315

7 files changed: +108 -63 lines changed

snippets/general-shared-text/databricks-delta-table-api-placeholders.mdx

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -2,7 +2,7 @@
22
- `<server-hostname>` (_required_): The target Databricks cluster's or SQL warehouse's **Server Hostname** value.
33
- `<http-path>` (_required_): The cluster's or SQL warehouse's **HTTP Path** value.
44
- `<token>` (_required_ for PAT authentication): For Databricks personal access token (PAT) authentication, the target Databricks user's PAT value.
5-
- `<client-id>` and `<client-secret>` (_required_ for OAuth authentication): For Databricks OAuth machine-to-machine (M2M) authentication, the Databricks managed service principal's **UUID** (client ID or Application ID) and OAuth **Secret** (client secret) values.
5+
- `<client-id>` and `<client-secret>` (_required_ for OAuth authentication): For Databricks OAuth machine-to-machine (M2M) authentication, the Databricks managed service principal's **UUID** (or **Client ID** or **Application ID**) and OAuth **Secret** (client secret) values.
66
- `<catalog>` (_required_): The name of the catalog in Unity Catalog for the target volume and table in the Databricks workspace.
77
- `<database>`: The name of the database in Unity Catalog for the target volume and table. The default is `default` if not otherwise specified.
88
- `<table-name>` (_required_): The name of the target table in Unity Catalog.

snippets/general-shared-text/databricks-delta-table-cli-api.mdx

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ The following environment variables:
1313
- `DATABRICKS_HOST` - The Databricks cluster's or SQL warehouse's **Server Hostname** value, represented by `--server-hostname` (CLI) or `server_hostname` (Python).
1414
- `DATABRICKS_HTTP_PATH` - The cluster's or SQL warehouse's **HTTP Path** value, represented by `--http-path` (CLI) or `http_path` (Python).
1515
- `DATABRICKS_TOKEN` - For Databricks personal access token authentication, the token's value, represented by `--token` (CLI) or `token` (Python).
16-
- `DATABRICKS_CLIENT_ID` - For Databricks managed service principal authenticaton, the service principal's **UUID** value, represented by `--client-id` (CLI) or `client_id` (Python).
16+
- `DATABRICKS_CLIENT_ID` - For Databricks managed service principal authentication, the service principal's **UUID** (or **Client ID** or **Application ID**) value, represented by `--client-id` (CLI) or `client_id` (Python).
1717
- `DATABRICKS_CLIENT_SECRET` - For Databricks managed service principal authentication, the service principal's OAuth **Secret** value, represented by `--client-secret` (CLI) or `client_secret` (Python).
1818
- `DATABRICKS_CATALOG` - The name of the catalog in Unity Catalog, represented by `--catalog` (CLI) or `catalog` (Python).
1919
- `DATABRICKS_DATABASE` - The name of the schema (database) inside of the catalog, represented by `--database` (CLI) or `database` (Python). The default is `default` if not otherwise specified.
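The environment variables listed in this snippet could be collected along the following lines. This is only a sketch: the function name `load_databricks_settings` and the validation rules are assumptions for illustration, not part of the connector. The variable names, the Python parameter names, and the `default` fallback for the database come from the snippet text above.

```python
import os
from typing import Mapping, Optional


def load_databricks_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Collect Delta Tables connector settings from environment variables.

    Hypothetical helper: it only gathers and checks values, it does not
    connect to Databricks.
    """
    env = os.environ if env is None else env

    settings = {
        "server_hostname": env.get("DATABRICKS_HOST"),
        "http_path": env.get("DATABRICKS_HTTP_PATH"),
        "catalog": env.get("DATABRICKS_CATALOG"),
        # The snippet states that the database falls back to `default`.
        "database": env.get("DATABRICKS_DATABASE") or "default",
    }
    missing = [name for name, value in settings.items() if not value]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))

    token = env.get("DATABRICKS_TOKEN")
    client_id = env.get("DATABRICKS_CLIENT_ID")
    client_secret = env.get("DATABRICKS_CLIENT_SECRET")
    if token:
        # Personal access token (PAT) authentication.
        settings["token"] = token
    elif client_id and client_secret:
        # OAuth machine-to-machine (M2M) authentication.
        settings["client_id"] = client_id
        settings["client_secret"] = client_secret
    else:
        raise ValueError(
            "Set DATABRICKS_TOKEN, or both DATABRICKS_CLIENT_ID "
            "and DATABRICKS_CLIENT_SECRET."
        )
    return settings
```

Passing a plain dictionary as `env` makes the helper easy to exercise without touching the real process environment.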

snippets/general-shared-text/databricks-delta-table-platform.mdx

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@ Fill in the following fields:
44
- **Server Hostname** (_required_): The target Databricks cluster's or SQL warehouse's **Server Hostname** value.
55
- **HTTP Path** (_required_): The cluster's or SQL warehouse's **HTTP Path** value.
66
- **Token** (_required_ for PAT authentication): For Databricks personal access token (PAT) authentication, the target Databricks user's PAT value.
7-
- **UUID** and **OAuth Secret** (_required_ for OAuth authentication): For Databricks OAuth machine-to-machine (M2M) authentication, the Databricks managed service principal's **UUID** (client ID or Application ID) and OAuth **Secret** (client secret) values.
7+
- **UUID** and **OAuth Secret** (_required_ for OAuth authentication): For Databricks OAuth machine-to-machine (M2M) authentication, the Databricks managed service principal's **UUID** (or **Client ID** or **Application ID**) and OAuth **Secret** (client secret) values.
88
- **Catalog** (_required_): The name of the catalog in Unity Catalog for the target volume and table in the Databricks workspace.
99
- **Database**: The name of the database in Unity Catalog for the target volume and table. The default is `default` if not otherwise specified.
1010
- **Table Name** (_required_): The name of the target table in Unity Catalog.

snippets/general-shared-text/databricks-delta-table.mdx

Lines changed: 60 additions & 3 deletions
@@ -9,7 +9,7 @@
99
- A SQL warehouse for [AWS](https://docs.databricks.com/compute/sql-warehouse/create.html),
1010
[Azure](https://learn.microsoft.com/azure/databricks/compute/sql-warehouse/create), or
1111
[GCP](https://docs.gcp.databricks.com/compute/sql-warehouse/create.html).
12-
- A cluster for [AWS](https://docs.databricks.com/compute/use-compute.html),
12+
- An all-purpose cluster for [AWS](https://docs.databricks.com/compute/use-compute.html),
1313
[Azure](https://learn.microsoft.com/azure/databricks/compute/use-compute), or
1414
[GCP](https://docs.gcp.databricks.com/compute/use-compute.html).
1515

@@ -102,7 +102,7 @@
102102

103103
- A Databricks managed service principal.
104104
This service principal must have the appropriate access permissions to the catalog, schema, table, volume, and cluster or SQL warehouse.
105-
- The service principal's **UUID** value.
105+
- The service principal's **UUID** (or **Client ID** or **Application ID**) value.
106106
- The OAuth **Secret** value for the service principal.
107107

108108
To get this information, see Steps 1-3 of the instructions for [AWS](https://docs.databricks.com/dev-tools/auth/oauth-m2m.html),
@@ -112,4 +112,61 @@
112112
<Note>
113113
For Azure Databricks, this connector only supports Databricks managed service principals.
114114
Microsoft Entra ID managed service principals are not supported.
115-
</Note>
115+
</Note>
116+
117+
The following video shows how to create a Databricks managed service principal:
118+
119+
<iframe
120+
width="560"
121+
height="315"
122+
src="https://www.youtube.com/embed/wBmqv5DaA1E"
123+
title="YouTube video player"
124+
frameborder="0"
125+
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
126+
allowfullscreen
127+
></iframe>
128+
129+
- The Databricks workspace user or Databricks managed service principal must have the following _minimum_ set of permissions and privileges to write to an
130+
existing volume or table in Unity Catalog:
131+
132+
- To use an all-purpose cluster for access, `Can Restart` permission on that cluster. Learn how to check and set cluster permissions for
133+
[AWS](https://docs.databricks.com/compute/clusters-manage.html#compute-permissions),
134+
[Azure](https://learn.microsoft.com/azure/databricks/compute/clusters-manage#cluster-level-permissions), or
135+
[GCP](https://docs.gcp.databricks.com/compute/clusters-manage.html#compute-permissions).
136+
- To use a SQL warehouse for access, `Can use` permission on that SQL warehouse. Learn how to check and set SQL warehouse permissions for
137+
[AWS](https://docs.databricks.com/compute/sql-warehouse/create.html#manage-a-sql-warehouse),
138+
[Azure](https://learn.microsoft.com/azure/databricks/compute/sql-warehouse/create#manage), or
139+
[GCP](https://docs.gcp.databricks.com/compute/sql-warehouse/create.html#manage-a-sql-warehouse).
140+
- To access a Unity Catalog volume, the following privileges:
141+
142+
- `USE CATALOG` on the volume's parent catalog in Unity Catalog.
143+
- `USE SCHEMA` on the volume's parent schema in Unity Catalog.
144+
- `READ VOLUME` and `WRITE VOLUME` on the volume.
145+
146+
Learn how to check and set Unity Catalog privileges for
147+
[AWS](https://docs.databricks.com/data-governance/unity-catalog/manage-privileges/index.html#show-grant-and-revoke-privileges),
148+
[Azure](https://learn.microsoft.com/azure/databricks/data-governance/unity-catalog/manage-privileges/#grant), or
149+
[GCP](https://docs.gcp.databricks.com/data-governance/unity-catalog/manage-privileges/index.html#show-grant-and-revoke-privileges).
150+
151+
The following video shows how to grant a Databricks managed service principal privileges to a Unity Catalog volume:
152+
153+
<iframe
154+
width="560"
155+
height="315"
156+
src="https://www.youtube.com/embed/DykQRxgh2aQ"
157+
title="YouTube video player"
158+
frameborder="0"
159+
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
160+
allowfullscreen
161+
></iframe>
162+
163+
- To access a Unity Catalog table, the following privileges:
164+
165+
- `USE CATALOG` on the table's parent catalog in Unity Catalog.
166+
- `USE SCHEMA` on the table's parent schema in Unity Catalog.
167+
- `MODIFY` and `SELECT` on the table.
168+
169+
Learn how to check and set Unity Catalog privileges for
170+
[AWS](https://docs.databricks.com/data-governance/unity-catalog/manage-privileges/index.html#show-grant-and-revoke-privileges),
171+
[Azure](https://learn.microsoft.com/azure/databricks/data-governance/unity-catalog/manage-privileges/#grant), or
172+
[GCP](https://docs.gcp.databricks.com/data-governance/unity-catalog/manage-privileges/index.html#show-grant-and-revoke-privileges).
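The minimum Unity Catalog privileges listed above can be expressed as SQL `GRANT` statements. The sketch below only builds the statement strings; it does not run them. The helper names `quote` and `grant_statements` are hypothetical, and the principal value is assumed to be a workspace user name or the service principal's Application ID. The privilege names come from the list above.

```python
from typing import List, Optional


def quote(identifier: str) -> str:
    """Wrap an identifier in backticks, doubling any embedded backtick."""
    return "`" + identifier.replace("`", "``") + "`"


def grant_statements(
    principal: str,
    catalog: str,
    schema: str,
    volume: Optional[str] = None,
    table: Optional[str] = None,
) -> List[str]:
    """Build GRANT statements for the minimum privileges (illustrative only)."""
    who = quote(principal)
    cat = quote(catalog)
    sch = f"{cat}.{quote(schema)}"

    # Both volumes and tables need access to their parent catalog and schema.
    statements = [
        f"GRANT USE CATALOG ON CATALOG {cat} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {sch} TO {who}",
    ]
    if volume is not None:
        statements.append(
            f"GRANT READ VOLUME, WRITE VOLUME ON VOLUME {sch}.{quote(volume)} TO {who}"
        )
    if table is not None:
        statements.append(
            f"GRANT MODIFY, SELECT ON TABLE {sch}.{quote(table)} TO {who}"
        )
    return statements
```

The resulting statements would be run by someone who already holds sufficient privileges on those objects, for example in a notebook or the SQL editor. The cluster (`Can Restart`) and SQL warehouse (`Can use`) permissions are set separately and are not covered by these statements.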
Lines changed: 3 additions & 16 deletions
@@ -1,21 +1,8 @@
11
- `<name>` (_required_) - A unique name for this connector.
22
- `<host>` (_required_) - The Databricks workspace host URL.
3-
- `<client-id>` (_required_) - The application ID value for the Databricks-managed service principal that has access to the volume.
4-
- `<client-secret>` (_required_) - The associated OAuth secret value for the Databricks-managed service principal that has access to the volume.
3+
- `<client-id>` (_required_) - The **Client ID** (or **UUID** or **Application ID**) value for the Databricks managed service principal that has the appropriate privileges to the volume.
4+
- `<client-secret>` (_required_) - The associated OAuth **Secret** value for the Databricks managed service principal that has the appropriate privileges to the volume.
55
- `<catalog>` (_required_) - The name of the catalog to use.
66
- `<schema>` - The name of the associated schema. If not specified, `default` is used.
77
- `<volume>` (_required_) - The name of the associated volume.
8-
- `<volume-path>` - Any optional path to access within the volume.
9-
10-
To learn how to create a Databricks-managed service principal, get its application ID, and generate an associated OAuth secret,
11-
see the documentation for
12-
[AWS](https://docs.databricks.com/dev-tools/auth/oauth-m2m.html),
13-
[Azure](https://learn.microsoft.com/databricks/dev-tools/auth/oauth-m2m),
14-
or [GCP](https://docs.gcp.databricks.com/dev-tools/auth/oauth-m2m.html).
15-
16-
For Azure, only Databricks-managed service principals are supported. Microsoft Entra ID-managed service principals are not supported.
17-
18-
To learn how to grant a Databricks-managed service principal access to a volume, see the documentation for
19-
[AWS](https://docs.databricks.com/volumes/utility-commands.html#change-permissions-on-a-volume),
20-
[Azure](https://learn.microsoft.com/azure/databricks/volumes/utility-commands#change-permissions-on-a-volume),
21-
or [GCP](https://docs.gcp.databricks.com/volumes/utility-commands.html#change-permissions-on-a-volume).
8+
- `<volume-path>` - Any optional path to access within the volume.

snippets/general-shared-text/databricks-volumes-platform.mdx

Lines changed: 2 additions & 17 deletions
@@ -6,21 +6,6 @@ Fill in the following fields:
66
- **Schema** : The name of the associated schema. If not specified, **default** is used.
77
- **Volume** (_required_): The name of the associated volume.
88
- **Volume Path** : Any optional path to access within the volume.
9-
- **Client Secret** (_required_): The associated OAuth secret value for the Databricks-managed service principal that has access to the volume.
10-
- **Client ID** (_required_): The application ID value for the Databricks-managed service principal that has access to the volume.
11-
12-
To learn how to create a Databricks-managed service principal, get its application ID, and generate an associated OAuth secret,
13-
see the documentation for
14-
[AWS](https://docs.databricks.com/dev-tools/auth/oauth-m2m.html),
15-
[Azure](https://learn.microsoft.com/databricks/dev-tools/auth/oauth-m2m),
16-
or [GCP](https://docs.gcp.databricks.com/dev-tools/auth/oauth-m2m.html).
17-
18-
For Azure, only Databricks-managed service principals are supported. Microsoft Entra ID-managed service principals are not supported.
19-
20-
To learn how to grant a Databricks-managed service principal access to a volume, see the documentation for
21-
[AWS](https://docs.databricks.com/volumes/utility-commands.html#change-permissions-on-a-volume),
22-
[Azure](https://learn.microsoft.com/azure/databricks/volumes/utility-commands#change-permissions-on-a-volume),
23-
or [GCP](https://docs.gcp.databricks.com/volumes/utility-commands.html#change-permissions-on-a-volume).
24-
25-
9+
- **Client Secret** (_required_): The associated OAuth **Secret** value for the Databricks managed service principal that has the appropriate privileges to the volume.
10+
- **Client ID** (_required_): The **Client ID** (or **UUID** or **Application ID**) value for the Databricks managed service principal that has the appropriate privileges to the volume.
2611

snippets/general-shared-text/databricks-volumes.mdx

Lines changed: 40 additions & 24 deletions
@@ -10,8 +10,8 @@ allowfullscreen
1010

1111
The preceding video shows how to use Databricks personal access tokens (PATs), which are supported only for [Unstructured Ingest](/ingestion/overview).
1212

13-
To learn how to use Databricks-managed service principals, which are supported by both the [Unstructured Platform](/platform/overview) and Unstructured Ingest,
14-
see the additional videos later on this page.
13+
To learn how to use Databricks managed service principals, which are supported by both the [Unstructured Platform](/platform/overview) and Unstructured Ingest,
14+
see the additional video later on this page.
1515

1616
- The Databricks workspace URL. Get the workspace URL for
1717
[AWS](https://docs.databricks.com/workspace/workspace-details.html#workspace-instance-names-urls-and-ids),
@@ -29,7 +29,7 @@ see the additional videos later on this page.
2929
[Azure](https://learn.microsoft.com/azure/databricks/dev-tools/auth/),
3030
or [GCP](https://docs.gcp.databricks.com/dev-tools/auth/index.html).
3131

32-
The following videos show how to create a Databricks-managed service principal and then grant it access to a Databricks volume:
32+
The following video shows how to create a Databricks managed service principal:
3333

3434
<iframe
3535
width="560"
@@ -41,20 +41,9 @@ see the additional videos later on this page.
4141
allowfullscreen
4242
></iframe>
4343

44-
<iframe
45-
width="560"
46-
height="315"
47-
src="https://www.youtube.com/embed/DykQRxgh2aQ"
48-
title="YouTube video player"
49-
frameborder="0"
50-
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
51-
allowfullscreen
52-
></iframe>
53-
54-
For the [Unstructured Platform](/platform/overview), only the following Databricks authentication type is supported:
55-
56-
- For OAuth machine-to-machine (M2M) authentication (AWS, Azure, and GCP): The client ID and OAuth secret values for the corresponding service principal.
57-
Note that for Azure, only Databricks-managed service principals are supported. Microsoft Entra ID-managed service principals are not supported.
44+
For the [Unstructured Platform](/platform/overview), only Databricks OAuth machine-to-machine (M2M) authentication is supported for AWS, Azure, and GCP.
45+
You will need the **Client ID** (or **UUID** or **Application ID**) and OAuth **Secret** (client secret) values for the corresponding service principal.
46+
Note that for Azure, only Databricks managed service principals are supported. Microsoft Entra ID managed service principals are not supported.
5847

5948
For [Unstructured Ingest](/ingestion/overview), the following Databricks authentication types are supported:
6049

@@ -69,10 +58,37 @@ see the additional videos later on this page.
6958
- For Google Cloud Platform credentials authentication (GCP only): The local path to the corresponding Google Cloud service account's credentials file.
7059
- For Google Cloud Platform ID authentication (GCP only): The Google Cloud service account's email address.
7160

72-
- The Databricks catalog name for the volume. Get the catalog name for [AWS](https://docs.databricks.com/catalogs/manage-catalog.html), [Azure](https://learn.microsoft.com/azure/databricks/catalogs/manage-catalog), or [GCP](https://docs.gcp.databricks.com/catalogs/manage-catalog.html).
73-
- The Databricks schema name for the volume. Get the schema name for [AWS](https://docs.databricks.com/schemas/manage-schema.html), [Azure](https://learn.microsoft.com/azure/databricks/schemas/manage-schema), or [GCP](https://docs.gcp.databricks.com/schemas/manage-schema.html).
74-
- The Databricks volume name, and optionally any path in that volume that you want to access directly. Get the volume information for [AWS](https://docs.databricks.com/files/volumes.html), [Azure](https://learn.microsoft.com/azure/databricks/files/volumes), or [GCP](https://docs.gcp.databricks.com/files/volumes.html).
75-
- Make sure that the target user or service principal has access to the target volume. To learn more, see the documentation for
76-
[AWS](https://docs.databricks.com/volumes/utility-commands.html#change-permissions-on-a-volume),
77-
[Azure](https://learn.microsoft.com/azure/databricks/volumes/utility-commands#change-permissions-on-a-volume),
78-
or [GCP](https://docs.gcp.databricks.com/volumes/utility-commands.html#change-permissions-on-a-volume).
61+
- The name of the parent catalog in Unity Catalog for
62+
[AWS](https://docs.databricks.com/catalogs/create-catalog.html),
63+
[Azure](https://learn.microsoft.com/azure/databricks/catalogs/create-catalog), or
64+
[GCP](https://docs.gcp.databricks.com/catalogs/create-catalog.html) for the volume.
65+
- The name of the parent schema in Unity Catalog for
66+
[AWS](https://docs.databricks.com/schemas/create-schema.html),
67+
[Azure](https://learn.microsoft.com/azure/databricks/schemas/create-schema), or
68+
[GCP](https://docs.gcp.databricks.com/schemas/create-schema.html) for the volume.
69+
- The name of the volume in Unity Catalog for [AWS](https://docs.databricks.com/tables/managed.html),
70+
[Azure](https://learn.microsoft.com/azure/databricks/tables/managed), or
71+
[GCP](https://docs.gcp.databricks.com/tables/managed.html), and optionally any path in that volume that you want to access directly, beginning with the volume's root.
72+
- The Databricks workspace user or service principal must have the following _minimum_ set of privileges to read from or write to the
73+
existing volume in Unity Catalog:
74+
75+
- `USE CATALOG` on the volume's parent catalog in Unity Catalog.
76+
- `USE SCHEMA` on the volume's parent schema in Unity Catalog.
77+
- `READ VOLUME` and `WRITE VOLUME` on the volume.
78+
79+
Learn how to check and set Unity Catalog privileges for
80+
[AWS](https://docs.databricks.com/data-governance/unity-catalog/manage-privileges/index.html#show-grant-and-revoke-privileges),
81+
[Azure](https://learn.microsoft.com/azure/databricks/data-governance/unity-catalog/manage-privileges/#grant), or
82+
[GCP](https://docs.gcp.databricks.com/data-governance/unity-catalog/manage-privileges/index.html#show-grant-and-revoke-privileges).
83+
84+
The following video shows how to grant a Databricks managed service principal privileges to a Unity Catalog volume:
85+
86+
<iframe
87+
width="560"
88+
height="315"
89+
src="https://www.youtube.com/embed/DykQRxgh2aQ"
90+
title="YouTube video player"
91+
frameborder="0"
92+
allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
93+
allowfullscreen
94+
></iframe>
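The catalog, schema, volume, and optional path described in this snippet combine into a single location. The sketch below assumes the usual Databricks `/Volumes/<catalog>/<schema>/<volume>/<path>` layout; the helper name `volume_location` is hypothetical, and whether the connector builds its paths exactly this way is an assumption. The `default` schema fallback comes from the connector's field descriptions.

```python
from typing import Optional


def volume_location(
    catalog: str,
    volume: str,
    schema: Optional[str] = None,
    volume_path: Optional[str] = None,
) -> str:
    """Build a Unity Catalog volume path (illustrative only)."""
    # Fall back to the `default` schema when none is given.
    schema = schema or "default"
    base = f"/Volumes/{catalog}/{schema}/{volume}"

    # Normalize the optional sub-path so that stray slashes do not double up.
    sub_path = volume_path.strip("/") if volume_path else ""
    return f"{base}/{sub_path}" if sub_path else base
```

For example, a catalog named `main`, a volume named `docs`, and no schema or path would yield a location under the `default` schema at the volume's root.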
