
Commit 120de23

Databricks Volumes and Delta Tables connectors: New mini how-to video links; more links to 3rd-party docs; tables and volumes can be in the same schema or in separate ones (#466)
1 parent 3d5a315 commit 120de23

5 files changed, 164 insertions(+), 77 deletions(-)

snippets/general-shared-text/databricks-delta-table-api-placeholders.mdx

Lines changed: 8 additions & 2 deletions
@@ -4,9 +4,15 @@
  - `<token>` (_required_ for PAT authentication): For Databricks personal access token (PAT) authentication, the target Databricks user's PAT value.
  - `<client-id>` and `<client-secret>` (_required_ for OAuth authentication): For Databricks OAuth machine-to-machine (M2M) authentication, the Databricks managed service principal's **UUID** (or **Client ID** or **Application ID**) and OAuth **Secret** (client secret) values.
  - `<catalog>` (_required_): The name of the catalog in Unity Catalog for the target volume and table in the Databricks workspace.
- - `<database>`: The name of the database in Unity Catalog for the target volume and table. The default is `default` if not otherwise specified.
+ - `<database>`: The name of the schema (formerly known as a database) in Unity Catalog for the target table. The default is `default` if not otherwise specified.
+
+   If the target table and volume are in the same schema (formerly known as a database), then `<database>` and `<schema>` will have the same values.
+
  - `<table-name>` (_required_): The name of the target table in Unity Catalog.
- - `<schema>`: The name of the schema in Unity Catalog for the target volume and table. The default is `default` if not otherwise specified.
+ - `<schema>`: The name of the schema (formerly known as a database) in Unity Catalog for the target volume. The default is `default` if not otherwise specified.
+
+   If the target volume and table are in the same schema (formerly known as a database), then `<schema>` and `<database>` will have the same values.
+
  - `<volume>` (_required_): The name of the target volume in Unity Catalog.
  - `<volume-path>`: Any target folder path inside of the volume to use instead of the volume's root. If not otherwise specified, processing occurs at the volume's root.
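
For illustration: a minimal Python sketch of how the placeholder values above combine, assuming the standard Unity Catalog three-level naming and `/Volumes/` path layout. All names shown are hypothetical.

```python
# Hypothetical values for the placeholders described above.
catalog = "main"
database = "default"          # schema (formerly known as a database) that holds the table
table_name = "elements"
schema = "default"            # schema (formerly known as a database) that holds the volume
volume = "my_volume"
volume_path = "landing_zone"  # optional; use "" to process at the volume's root

# Three-level name of the target table in Unity Catalog.
full_table_name = f"{catalog}.{database}.{table_name}"

# Path at which Databricks exposes the target folder inside the volume.
full_volume_path = f"/Volumes/{catalog}/{schema}/{volume}/{volume_path}".rstrip("/")

print(full_table_name)   # main.default.elements
print(full_volume_path)  # /Volumes/main/default/my_volume/landing_zone
```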

snippets/general-shared-text/databricks-delta-table-cli-api.mdx

Lines changed: 10 additions & 5 deletions
@@ -16,8 +16,11 @@ The following environment variables:
  - `DATABRICKS_CLIENT_ID` - For Databricks managed service principal authentication, the service principal's **UUID** (or **Client ID** or **Application ID**) value, represented by `--client-id` (CLI) or `client_id` (Python).
  - `DATABRICKS_CLIENT_SECRET` - For Databricks managed service principal authentication, the service principal's OAuth **Secret** value, represented by `--client-secret` (CLI) or `client_secret` (Python).
  - `DATABRICKS_CATALOG` - The name of the catalog in Unity Catalog, represented by `--catalog` (CLI) or `catalog` (Python).
- - `DATABRICKS_DATABASE` - The name of the schema (database) inside of the catalog, represented by `--database` (CLI) or `database` (Python). The default is `default` if not otherwise specified.
- - `DATABRICKS_TABLE` - The name of the table inside of the schema (database), represented by `--table-name` (CLI) or `table_name` (Python). The default is `elements` if not otherwise specified.
+ - `DATABRICKS_DATABASE` - The name of the schema (formerly known as a database) inside of the catalog for the target table, represented by `--database` (CLI) or `database` (Python). The default is `default` if not otherwise specified.
+
+   If you are also using a volume, and the target table and volume are in the same schema (formerly known as a database), then `DATABRICKS_DATABASE` and `DATABRICKS_SCHEMA` will have the same values.
+
+ - `DATABRICKS_TABLE` - The name of the table inside of the schema (formerly known as a database), represented by `--table-name` (CLI) or `table_name` (Python). The default is `elements` if not otherwise specified.

  For the SQL-based implementation, add these environment variables:

@@ -26,7 +29,9 @@ For the SQL-based implementation, add these environment variables:

  For the volume-based implementation, add these environment variables:

- - `DATABRICKS_SCHEMA` - The name of the schema (database) inside of the catalog, represented by `--schema` (CLI) or `schema` (Python). This name of this database (schema) must be the same as
- the value of the `DATABRICKS_DATABASE` environment variable and is required for compatiblity. The default is `default` if not otherwise specified.
- - `DATABRICKS_VOLUME` - The name of the volume inside of the schema (database), represented by `--volume` (CLI) or `volume` (Python).
+ - `DATABRICKS_SCHEMA` - The name of the schema (formerly known as a database) inside of the catalog for the target volume, represented by `--schema` (CLI) or `schema` (Python). The default is `default` if not otherwise specified.
+
+   If the target volume and table are in the same schema (formerly known as a database), then `DATABRICKS_SCHEMA` and `DATABRICKS_DATABASE` will have the same values.
+
+ - `DATABRICKS_VOLUME` - The name of the volume inside of the schema (formerly known as a database), represented by `--volume` (CLI) or `volume` (Python).
  - `DATABRICKS_VOLUME_PATH` - Optionally, a specific path inside of the volume that you want to start accessing from, starting from the volume's root, represented by `--volume-path` (CLI) or `volume_path` (Python). The default is to start accessing from the volume's root if not otherwise specified.
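
For illustration: a minimal Python sketch that sets the environment variables above with hypothetical values by using `os.environ`; substitute the details of your own catalog, schema, table, and volume.

```python
import os

# Hypothetical values; substitute your own workspace's details.
os.environ["DATABRICKS_CLIENT_ID"] = "00000000-0000-0000-0000-000000000000"  # service principal UUID
os.environ["DATABRICKS_CLIENT_SECRET"] = "example-oauth-secret"
os.environ["DATABRICKS_CATALOG"] = "main"

# SQL-based implementation: where the target table lives.
os.environ["DATABRICKS_DATABASE"] = "default"   # schema (formerly known as a database) for the table
os.environ["DATABRICKS_TABLE"] = "elements"

# Volume-based implementation: where the target volume lives.
os.environ["DATABRICKS_SCHEMA"] = "default"     # schema (formerly known as a database) for the volume
os.environ["DATABRICKS_VOLUME"] = "my_volume"
os.environ["DATABRICKS_VOLUME_PATH"] = "landing_zone"  # optional; defaults to the volume's root

# Here the table and volume share a schema, so DATABRICKS_DATABASE and
# DATABRICKS_SCHEMA carry the same value.
```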

snippets/general-shared-text/databricks-delta-table-platform.mdx

Lines changed: 8 additions & 2 deletions
@@ -6,8 +6,14 @@ Fill in the following fields:
  - **Token** (_required_ for PAT authentication): For Databricks personal access token (PAT) authentication, the target Databricks user's PAT value.
  - **UUID** and **OAuth Secret** (_required_ for OAuth authentication): For Databricks OAuth machine-to-machine (M2M) authentication, the Databricks managed service principal's **UUID** (or **Client ID** or **Application ID**) and OAuth **Secret** (client secret) values.
  - **Catalog** (_required_): The name of the catalog in Unity Catalog for the target volume and table in the Databricks workspace.
- - **Database**: The name of the database in Unity Catalog for the target volume and table. The default is `default` if not otherwise specified.
+ - **Database**: The name of the schema (formerly known as a database) in Unity Catalog for the target table. The default is `default` if not otherwise specified.
+
+   If the target table and volume are in the same schema (formerly known as a database), then **Database** and **Schema** will have the same names.
+
  - **Table Name** (_required_): The name of the target table in Unity Catalog.
- - **Schema**: The name of the schema in Unity Catalog for the target volume and table. The default is `default` if not otherwise specified.
+ - **Schema**: The name of the schema (formerly known as a database) in Unity Catalog for the target volume. The default is `default` if not otherwise specified.
+
+   If the target volume and table are in the same schema (formerly known as a database), then **Schema** and **Database** will have the same names.
+
  - **Volume** (_required_): The name of the target volume in Unity Catalog.
  - **Volume Path**: Any target folder path inside of the volume to use instead of the volume's root. If not otherwise specified, processing occurs at the volume's root.

snippets/general-shared-text/databricks-delta-table.mdx

Lines changed: 84 additions & 28 deletions
@@ -9,10 +9,35 @@
  - A SQL warehouse for [AWS](https://docs.databricks.com/compute/sql-warehouse/create.html),
  [Azure](https://learn.microsoft.com/azure/databricks/compute/sql-warehouse/create), or
  [GCP](https://docs.gcp.databricks.com/compute/sql-warehouse/create.html).
+
+ The following video shows how to create a SQL warehouse if you do not already have one available, get its **Server Hostname** and **HTTP Path** values, and set permissions for someone other than the warehouse's owner to use it:
+
+ <iframe
+ width="560"
+ height="315"
+ src="https://www.youtube.com/embed/N-Aw9-U3_fE"
+ title="YouTube video player"
+ frameborder="0"
+ allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
+ allowfullscreen
+ ></iframe>
+
  - An all-purpose cluster for [AWS](https://docs.databricks.com/compute/use-compute.html),
  [Azure](https://learn.microsoft.com/azure/databricks/compute/use-compute), or
  [GCP](https://docs.gcp.databricks.com/compute/use-compute.html).

+ The following video shows how to create an all-purpose cluster if you do not already have one available, get its **Server Hostname** and **HTTP Path** values, and set permissions for someone other than the cluster's owner to use it:
+
+ <iframe
+ width="560"
+ height="315"
+ src="https://www.youtube.com/embed/apgibaelVY0"
+ title="YouTube video player"
+ frameborder="0"
+ allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
+ allowfullscreen
+ ></iframe>
+
  - The SQL warehouse's or cluster's **Server Hostname** and **HTTP Path** values for [AWS](https://docs.databricks.com/integrations/compute-details.html),
  [Azure](https://learn.microsoft.com/azure/databricks/integrations/compute-details), or
  [GCP](https://docs.gcp.databricks.com/integrations/compute-details.html).
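
For illustration: a minimal Python sketch, assuming the Databricks SQL Connector for Python (`databricks-sql-connector`), of how the **Server Hostname** and **HTTP Path** values are used to open a test connection to the warehouse or cluster. The hostname, path, and token values are hypothetical.

```python
from databricks import sql  # pip install databricks-sql-connector

# Hypothetical connection details copied from the warehouse's or cluster's
# connection details page, plus a hypothetical personal access token.
connection = sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abcdef1234567890",
    access_token="dapi-example-token",
)

cursor = connection.cursor()
cursor.execute("SELECT current_catalog(), current_schema()")
print(cursor.fetchone())
cursor.close()
connection.close()
```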
@@ -25,7 +50,7 @@
  for [AWS](https://docs.databricks.com/catalogs/create-catalog.html),
  [Azure](https://learn.microsoft.com/azure/databricks/catalogs/create-catalog), or
  [GCP](https://docs.gcp.databricks.com/catalogs/create-catalog.html).
- - A schema
+ - A schema (formerly known as a database)
  for [AWS](https://docs.databricks.com/schemas/create-schema.html),
  [Azure](https://learn.microsoft.com/azure/databricks/schemas/create-schema), or
  [GCP](https://docs.gcp.databricks.com/schemas/create-schema.html)

@@ -34,7 +59,19 @@
  for [AWS](https://docs.databricks.com/tables/managed.html),
  [Azure](https://learn.microsoft.com/azure/databricks/tables/managed), or
  [GCP](https://docs.gcp.databricks.com/tables/managed.html)
- within that schema.
+ within that schema (formerly known as a database).
+
+ The following video shows how to create a catalog, schema (formerly known as a database), and a table in Unity Catalog if you do not already have them available, and set privileges for someone other than their owner to use them:
+
+ <iframe
+ width="560"
+ height="315"
+ src="https://www.youtube.com/embed/ffNnq-6bpd4"
+ title="YouTube video player"
+ frameborder="0"
+ allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
+ allowfullscreen
+ ></iframe>

  This table must contain the following column names and their data types:

@@ -86,22 +123,33 @@
  );
  ```

+ <Info>
+ In Databricks, a table's _schema_ is different than a _schema_ (formerly known as a database) in a catalog-schema object relationship in Unity Catalog.
+ </Info>
+
  - Within Unity Catalog, a volume
  for [AWS](https://docs.databricks.com/volumes/utility-commands.html),
  [Azure](https://learn.microsoft.com/azure/databricks/volumes/utility-commands),
- or [GCP](https://docs.gcp.databricks.com/volumes/utility-commands.html)
- within the same schema as the table.
- - For Databricks personal access token authentication to the workspace, the
- Databricks personal access token value for
- [AWS](https://docs.databricks.com/dev-tools/auth/pat.html#databricks-personal-access-tokens-for-workspace-users),
- [Azure](https://learn.microsoft.com/azure/databricks/dev-tools/auth/pat#azure-databricks-personal-access-tokens-for-workspace-users), or
- [GCP](https://docs.gcp.databricks.com/dev-tools/auth/pat.html#databricks-personal-access-tokens-for-workspace-users).
- This token must be for the workspace user who
- has the appropriate access permissions to the catalog, schema, table, volume, and cluster or SQL warehouse,
+ or [GCP](https://docs.gcp.databricks.com/volumes/utility-commands.html). The volume can be in the same
+ schema (formerly known as a database) as the table, or the volume and table can be in separate schemas. In either case, both of these
+ schemas must share the same parent catalog.
+
+ The following video shows how to create a catalog, schema (formerly known as a database), and a volume in Unity Catalog if you do not already have them available, and set privileges for someone other than their owner to use them:
+
+ <iframe
+ width="560"
+ height="315"
+ src="https://www.youtube.com/embed/yF9DJphhQQc"
+ title="YouTube video player"
+ frameborder="0"
+ allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
+ allowfullscreen
+ ></iframe>
+
  - For Databricks managed service principal authentication (using Databricks OAuth M2M) to the workspace:

  - A Databricks managed service principal.
- This service principal must have the appropriate access permissions to the catalog, schema, table, volume, and cluster or SQL warehouse.
+ This service principal must have the appropriate access permissions to the catalog, schema (formerly known as a database), table, volume, and cluster or SQL warehouse.
  - The service principal's **UUID** (or **Client ID** or **Application ID**) value.
  - The OAuth **Secret** value for the service principal.

@@ -110,11 +158,11 @@
  [GCP](https://docs.gcp.databricks.com/dev-tools/auth/oauth-m2m.html).

  <Note>
- For Azure Databricks, this connector only supports Databricks managed service principals.
+ For Azure Databricks, this connector only supports Databricks managed service principals for authentication.
  Microsoft Entra ID managed service principals are not supported.
  </Note>

- The following video shows how to create a Databricks managed service principal:
+ The following video shows how to create a Databricks managed service principal if you do not already have one available:

  <iframe
  width="560"

@@ -126,6 +174,26 @@
  allowfullscreen
  ></iframe>

+ - For Databricks personal access token authentication to the workspace, the
+ Databricks personal access token value for
+ [AWS](https://docs.databricks.com/dev-tools/auth/pat.html#databricks-personal-access-tokens-for-workspace-users),
+ [Azure](https://learn.microsoft.com/azure/databricks/dev-tools/auth/pat#azure-databricks-personal-access-tokens-for-workspace-users), or
+ [GCP](https://docs.gcp.databricks.com/dev-tools/auth/pat.html#databricks-personal-access-tokens-for-workspace-users).
+ This token must be for the workspace user who
+ has the appropriate access permissions to the catalog, schema (formerly known as a database), table, volume, and cluster or SQL warehouse.
+
+ The following video shows how to create a Databricks personal access token if you do not already have one available:
+
+ <iframe
+ width="560"
+ height="315"
+ src="https://www.youtube.com/embed/OzEU2miAS6I"
+ title="YouTube video player"
+ frameborder="0"
+ allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
+ allowfullscreen
+ ></iframe>
+
  - The Databricks workspace user or Databricks managed service principal must have the following _minimum_ set of permissions and privileges to write to an
  existing volume or table in Unity Catalog:

@@ -140,30 +208,18 @@
  - To access a Unity Catalog volume, the following privileges:

  - `USE CATALOG` on the volume's parent catalog in Unity Catalog.
- - `USE SCHEMA` on the volume's parent schema in Unity Catalog.
+ - `USE SCHEMA` on the volume's parent schema (formerly known as a database) in Unity Catalog.
  - `READ VOLUME` and `WRITE VOLUME` on the volume.

  Learn how to check and set Unity Catalog privileges for
  [AWS](https://docs.databricks.com/data-governance/unity-catalog/manage-privileges/index.html#show-grant-and-revoke-privileges),
  [Azure](https://learn.microsoft.com/azure/databricks/data-governance/unity-catalog/manage-privileges/#grant), or
  [GCP](https://docs.gcp.databricks.com/data-governance/unity-catalog/manage-privileges/index.html#show-grant-and-revoke-privileges).

- The following videos shows how to grant a Databricks managed service principal privileges to a Unity Catalog volume:
-
- <iframe
- width="560"
- height="315"
- src="https://www.youtube.com/embed/DykQRxgh2aQ"
- title="YouTube video player"
- frameborder="0"
- allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture"
- allowfullscreen
- ></iframe>
-
  - To access a Unity Catalog table, the following privileges:

  - `USE CATALOG` on the table's parent catalog in Unity Catalog.
- - `USE SCHEMA` on the tables's parent schema in Unity Catalog.
+ - `USE SCHEMA` on the table's parent schema (formerly known as a database) in Unity Catalog.
  - `MODIFY` and `SELECT` on the table.

  Learn how to check and set Unity Catalog privileges for
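
For illustration, the privileges listed above correspond to Unity Catalog `GRANT` statements. The following Python sketch, assuming the Databricks SQL Connector for Python, issues those grants; the catalog, schema, table, volume, grantee, and connection values are hypothetical, and the same statements can also be run in a Databricks SQL editor.

```python
from databricks import sql  # pip install databricks-sql-connector

# Hypothetical object names and grantee; substitute your own.
CATALOG, SCHEMA, TABLE, VOLUME = "main", "default", "elements", "my_volume"
GRANTEE = "`someone@example.com`"  # workspace user, or a service principal's application ID

grants = [
    # Minimum privileges to write to an existing volume.
    f"GRANT USE CATALOG ON CATALOG {CATALOG} TO {GRANTEE}",
    f"GRANT USE SCHEMA ON SCHEMA {CATALOG}.{SCHEMA} TO {GRANTEE}",
    f"GRANT READ VOLUME, WRITE VOLUME ON VOLUME {CATALOG}.{SCHEMA}.{VOLUME} TO {GRANTEE}",
    # Minimum privileges to write to an existing table.
    f"GRANT SELECT, MODIFY ON TABLE {CATALOG}.{SCHEMA}.{TABLE} TO {GRANTEE}",
]

# Hypothetical connection details; see the connection sketch earlier on this page.
with sql.connect(
    server_hostname="adb-1234567890123456.7.azuredatabricks.net",
    http_path="/sql/1.0/warehouses/abcdef1234567890",
    access_token="dapi-example-token",
) as connection:
    with connection.cursor() as cursor:
        for statement in grants:
            cursor.execute(statement)
```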
