src/connections/storage/databricks-delta-lake/databricks-delta-lake-aws.md
+13 -13 (13 additions & 13 deletions)
@@ -15,9 +15,9 @@ This page will help you connect the Databricks Destination with AWS (S3).
15
15
Please note the following prerequisites for setup.
16
16
17
17
1. The target Databricks workspace must be Unity Catalog enabled. Segment doesn't support the Hive metastore. Visit the Databricks guide [enabling the Unity Catalog](https://docs.databricks.com/en/data-governance/unity-catalog/enable-workspaces.html){:target="_blank"} for more information.
18
-
2. The user completing the setup needs the following permissions:
19
-
- AWS: The ability to create an S3 bucket and IAM role.
20
-
- Databricks: Admin access at the account and workspace level.
18
+
2. You'll need the following permissions for setup:
19
+
- **AWS**: The ability to create an S3 bucket and IAM role.
20
+
- **Databricks**: Admin access at the account and workspace level.
21
21
22
22
## Authentication
23
23
@@ -26,11 +26,11 @@ Segment supports both OAuth and personal access token (PAT) for authentication.
26
26
## Key terms
27
27
28
28
As you set up Databricks, keep the following key terms in mind.
29
-
1. **Databricks Workspace URL**: The base URL for your Databricks workspace.
30
-
2. **Service principal Application ID**: The ID tied to the service principal you'll create for Segment.
31
-
3. **Service Principal Secret/Token**: The client secret or PAT you'll create for the service principal.
32
-
4. **Target Unity Catalog**: The catalog where Segment lands your data.
33
-
5. **Workspace Admin Token** (*PAT only*): The access token you'll generate for your Databricks workspace admin.
29
+
- **Databricks Workspace URL**: The base URL for your Databricks workspace.
30
+
- **Service principal Application ID**: The ID tied to the service principal you'll create for Segment.
31
+
- **Service Principal Secret/Token**: The client secret or PAT you'll create for the service principal.
32
+
- **Target Unity Catalog**: The catalog where Segment lands your data.
33
+
- **Workspace Admin Token** (*PAT only*): The access token you'll generate for your Databricks workspace admin.
34
34
35
35
## Setup
36
36
@@ -45,7 +45,7 @@ The workspace URL is used by you and Segment to access your workspace API.
45
45
46
46
The service principal is used by Segment to access your Databricks workspace and associated APIs.
47
47
1. Follow the Databricks [guide](https://docs.databricks.com/en/administration-guide/users-groups/service-principals.html#manage-service-principals-in-your-account){:target="_blank"} for adding a service principal to your account and assigning it to the workspace. The name can be anything, but Segment recommends something that identifies the purpose (for example, `Segment Storage Destinations`). Note the Application ID that Databricks generates for later use. Segment doesn't require Account admin or Marketplace admin roles.
48
-
2. (*OAuth only*) Follow the Databricks instructions to [generate an OAuth secret](https://docs.databricks.com/en/dev-tools/authentication-oauth.html#step-2-create-an-oauth-secret-for-a-service-principal){:target="_blank"}. Note the secret generated by Databricks for later use. Once you navigate away from this page the Secret is no longer visible. If you lose or forget the secret, you can delete the existing secret and create a new one.
48
+
2. (*OAuth only*) Follow the Databricks instructions to [generate an OAuth secret](https://docs.databricks.com/en/dev-tools/authentication-oauth.html#step-2-create-an-oauth-secret-for-a-service-principal){:target="_blank"}. Note the secret generated by Databricks for later use. Once you navigate away from this page the secret is no longer visible. If you lose or forget the secret, you can delete the existing secret and create a new one.
49
49
50
50
### Step 3: Enable entitlements for the service principal on the workspace
51
51
@@ -54,7 +54,7 @@ This step allows the Segment service principal to create and use a small SQL war
54
54
55
55
### Step 4: Create an external location and storage credentials
56
56
57
-
This step creates the storage location where Segment lands your Delta lake and the associated credentials Segment uses to access the storage.
57
+
This step creates the storage location where Segment lands your delta lake and the associated credentials Segment uses to access the storage.
58
58
1. Follow the Databricks guide for [managing external locations and storage credentials](https://docs.databricks.com/en/data-governance/unity-catalog/manage-external-locations-and-credentials.html){:target="_blank"}. This guide assumes the target S3 bucket already exists. If not, follow the [AWS guide](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html){:target="_blank"} for creating a bucket.
59
59
2. Once the external location and storage credentials are created in your Databricks workspace, update the permissions to allow access to the Segment service principal.
60
60
1. In your workspace, navigate to **Data > External Data > Storage Credentials**.
@@ -104,7 +104,7 @@ The Trust policy should look like:
107
-
The workspace admin access token is used by your Databricks workspace admin to generate a personal access token for the service principal.
107
+
Your Databricks workspace admin uses the workspace admin access token to generate a personal access token for the service principal.
108
108
1. Follow the Databricks guide for [generating personal access tokens](https://docs.databricks.com/en/dev-tools/auth.html#databricks-personal-access-tokens-for-workspace-users){:target="_blank"} for workspace users. Note the generated token for later use.
109
109
110
110
### Step 6: Enable personal access tokens for the workspace (PAT only)
@@ -115,7 +115,7 @@ This step allows the creation and use of personal access tokens for the workspac
115
115
116
116
### Step 7: Generate a personal access token for the service principal (PAT only)
117
117
118
-
The personal access token is the token used by Segment to access the Databricks workspace API. The Databricks UI doesn't allow for the creation of service principal tokens. Tokens must be generated using either the Databricks workspace API (*recommended*) or the Databricks CLI.
118
+
Segment uses the personal access token to access the Databricks workspace API. The Databricks UI doesn't allow for the creation of service principal tokens. Tokens must be generated using either the Databricks workspace API (*recommended*) or the Databricks CLI.
119
119
1. Generating a token requires the following values:
120
120
- **Databricks Workspace URL**: The base URL to your Databricks workspace.
121
121
- **Workspace Admin Token**: The token generated for your Databricks admin user.
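
As a rough illustration of the API route described in Step 7, the request could be assembled in Python along the following lines. This is a minimal sketch, not the method from the Segment docs: it assumes the Databricks token-management `on-behalf-of` endpoint and a JSON body with `application_id`, `lifetime_seconds`, and `comment` fields. The function name `build_token_request`, the 90-day lifetime, and all placeholder values are hypothetical. The request is built but not sent.

```python
import json
import urllib.request


def build_token_request(workspace_url, admin_token, application_id,
                        lifetime_seconds=7776000,
                        comment="Segment Storage Destinations"):
    """Build (but do not send) a request for a service principal token.

    Assumption: the workspace exposes the token-management
    on-behalf-of endpoint at the path used below.
    """
    url = workspace_url.rstrip("/") + "/api/2.0/token-management/on-behalf-of/tokens"
    body = json.dumps({
        "application_id": application_id,      # service principal Application ID
        "lifetime_seconds": lifetime_seconds,  # hypothetical default: 90 days
        "comment": comment,
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # The workspace admin token authorizes the call.
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values only; substitute your own before sending.
request = build_token_request(
    "https://example.cloud.databricks.com/",
    "<WORKSPACE_ADMIN_TOKEN>",
    "<SERVICE_PRINCIPAL_APPLICATION_ID>",
)
print(request.get_method(), request.full_url)
```

Sending the request with `urllib.request.urlopen(request)` should return a JSON document containing the generated token, assuming the endpoint behaves as described in the Databricks API reference. Store that token securely, since Segment uses it to access the workspace API.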
This catalog is the target catalog where Segment lands your schemas/tables.
142
142
1. Follow the Databricks guide for [creating a catalog](https://docs.databricks.com/en/data-governance/unity-catalog/create-catalogs.html#create-a-catalog){:target="_blank"}.
143
-
- Be sure to select the storage location created earlier. You can use any valid catalog name (for example, "Segment"). Note the catalog name for later use.
143
+
- Be sure to select the storage location created earlier. You can use any valid catalog name (for example, "Segment"). Note this name for later use.
144
144
2. Select the catalog you've just created.
145
145
1. Select the Permissions tab, then click **Grant**
146
146
2. Select the Segment service principal from the dropdown, and check `ALL PRIVILEGES`.
src/connections/storage/databricks-delta-lake/databricks-delta-lake-azure.md
+8 -8 (8 additions & 8 deletions)
@@ -14,17 +14,17 @@ This page will help you connect the Databricks Destination with Azure.
14
14
15
15
Please note the following pre-requisites for setup.
16
16
17
-
1. The target Databricks workspace must be Unity Catalog enabled. Segment doesn't support the Hive megastore. Visit the Databricks guide for [enabling Unity Catalog](https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/enable-workspaces){:target="_blank"} for more info.
18
-
2. The user completing setup needs the following permissions:
17
+
1. Your Databricks workspace must be Unity Catalog enabled. Segment doesn't support the Hive metastore. Visit the Databricks guide for [enabling Unity Catalog](https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/enable-workspaces){:target="_blank"} for more info.
18
+
2. You'll need the following permissions for setup:
19
19
- **Azure**: Ability to create service principals, as well as create and manage the destination storage container and its associated role assignments.
20
20
- **Databricks**: Admin access to the account and workspace level.
21
21
22
22
## Key terms
23
23
24
24
As you set up Databricks, keep the following key terms in mind.
25
25
26
-
1. **Databricks Workspace URL**: The base URL for your Databricks workspace.
27
-
2. **Target Unity Catalog**: The catalog where Segment lands your data.
26
+
- **Databricks Workspace URL**: The base URL for your Databricks workspace.
27
+
- **Target Unity Catalog**: The catalog where Segment lands your data.
28
28
29
29
## Set up Databricks with Azure
30
30
@@ -38,7 +38,7 @@ Check your browser's address bar when in your workspace. The workspace URL will
38
38
39
39
### Step 2: Add the Segment Storage Destinations service principal to your Entra ID (Active Directory)
40
40
41
-
The service principal is used by Segment to access your Databricks workspace APIs as well as your ADLS Gen2 storage container. You can use either Azure PowerShell or the Azure CLI.
41
+
Segment uses the service principal to access your Databricks workspace APIs as well as your ADLS Gen2 storage container. You can use either Azure PowerShell or the Azure CLI.
42
42
43
43
1. **Recommended**: Azure PowerShell
44
44
1. Log in to the Azure console with a user allowed to add new service principals.
@@ -84,15 +84,15 @@ This step allows Segment to access your workspace.
84
84
85
85
### Step 5: Enable entitlements for the service principal on the workspace
86
86
87
-
This step allows the Segment service principal to create and use a small SQL warehouse to create and update table schemas in the Unity Catalog.
87
+
This step allows the Segment service principal to create a small SQL warehouse for creating and updating table schemas in the Unity Catalog.
88
88
89
89
1. Follow the [managing workspace entitlements](https://learn.microsoft.com/en-us/azure/databricks/administration-guide/users-groups/service-principals#--manage-workspace-entitlements-for-a-service-principal){:target="_blank"} instructions for a service principal. Segment requires `Allow cluster creation` and `Databricks SQL access` entitlements.
90
90
91
91
### Step 6: Create an external location and storage credentials
92
92
93
-
This step creates the storage location where Segment lands your Delta lake and the associated credentials Segment uses to access the storage.
93
+
This step creates the storage location where Segment lands your delta lake and the associated credentials Segment uses to access the storage.
94
94
1. Follow the Databricks guide for [managing external locations and storage credentials](https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/manage-external-locations-and-credentials){:target="_blank"}.
95
-
- Use the storage container that you updated in step 3.
95
+
- Use the storage container you updated in step 3.
96
96
- For storage credentials, you can use a service principal or managed identity.
97
97
2. Once you create the external location and storage credentials in your Databricks workspace, update the permissions to allow access to the Segment service principal.
98
98
- In your workspace, navigate to **Data > External Data > Storage Credentials**. Click the name of the credentials created above and go to the Permissions tab. Click **Grant**, then select the Segment service principal from the dropdown. Select the following checkboxes: