src/connections/storage/catalog/databricks-delta-lake/databricks-profiles-sync.md
Lines changed: 22 additions & 18 deletions
@@ -3,22 +3,23 @@ title: Databricks Profiles Sync
3
3
plan: unify
4
4
---
5
5
6
-
With Databricks Profiles Sync, you can use Profiles Sync to sync Segment profiles into your Databricks Lakehouse.
6
+
With Databricks Profiles Sync, you can use [Profiles Sync](/docs/unify/profiles-sync/overview/) to sync Segment profiles into your Databricks Lakehouse.
7
+
7
8
8
-
<!--
9
-
Use Databricks as a warehouse destination and materialized view for Profiles Sync Warehouses
10
-
-->
11
9
## Getting started
12
10
13
-
Before starting with the Databricks Profiles Sync destination, note the following prerequisites for setup.
11
+
Before getting started with Databricks Profiles Sync, note the following prerequisites for setup.
14
12
15
13
- The target Databricks workspace must be Unity Catalog enabled. Segment doesn't support the Hive metastore. Visit the Databricks guide [enabling the Unity Catalog](https://docs.databricks.com/en/data-governance/unity-catalog/enable-workspaces.html){:target="_blank"} for more information.
16
-
Segment creates [managed tables](https://docs.databricks.com/en/data-governance/unity-catalog/create-tables.html#managed-tables){:target="_blank"} in the Unity catalog.
14
+
- Segment creates [managed tables](https://docs.databricks.com/en/data-governance/unity-catalog/create-tables.html#managed-tables){:target="_blank"} in the Unity catalog.
15
+
- Segment supports only OAuth [(M2M)](https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html){:target="_blank"} for authentication.
17
16
18
-
- Segment uses the service principal to access your Databricks workspace and associated APIs.
17
+
#### Service principal requirements and setup
18
+
19
+
Segment uses the service principal to access your Databricks workspace and associated APIs.
19
20
- Use the Databricks guide for [adding a service principal to your account](https://docs.databricks.com/en/administration-guide/users-groups/service-principals.html#manage-service-principals-in-your-account){:target="_blank"}. This name can be anything, but Segment recommends something that identifies the purpose (for example, "Segment Profiles Sync"). Note the Application ID that Databricks generates for later use. Segment doesn't require `Account admin` or `Marketplace admin` roles.
20
21
21
-
- The service principal needs the following setup:
22
+
The service principal needs the following setup:
22
23
- OAuth secret token generated. Follow the [Databricks guide for generating an OAuth secret](https://docs.databricks.com/en/dev-tools/authentication-oauth.html#step-2-create-an-oauth-secret-for-a-service-principal){:target="_blank"}. Note the secret generated by Databricks for later use. Once you navigate away from the page, the secret is no longer visible. If you lose or forget the secret, you can delete the existing secret and create a new one.
23
24
- [Catalog-level privileges](https://docs.databricks.com/en/data-governance/unity-catalog/manage-privileges/privileges.html#general-unity-catalog-privilege-types){:target="_blank"}, which include:
- Databricks SQL access [entitlement](https://docs.databricks.com/en/administration-guide/users-groups/service-principals.html#manage-workspace-entitlements-for-a-service-principal){:target="_blank"} at the workspace level.
31
32
- CAN USE [permissions](https://docs.databricks.com/en/security/auth-authz/access-control/sql-endpoint-acl.html#sql-warehouse-permissions){:target="_blank"} on the SQL warehouse that will be used for the sync.
32
33
33
-
- Segment supports only OAuth [(M2M)](https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html){:target="_blank"} for authentication.
34
-
- A SQL warehouse is required for compute. Segment recommends the following size:
34
+
35
+
36
+
#### Size and performance
37
+
38
+
A SQL warehouse is required for compute. Segment recommends the following size:
35
39
- **Size**: small
36
40
- **Type**: Serverless, otherwise Pro
37
41
- **Clusters**: Minimum of 2 - Maximum of 6
38
42
39
-
- To improve the query performance of the Delta Lake, Segment recommends to create compact jobs per table using OPTIMIZE following [Databricks recommendations](https://docs.databricks.com/en/delta/optimize.html#){:target="_blank"}.
43
+
- To improve the query performance of the Delta Lake, Segment recommends creating compact jobs per table using OPTIMIZE following [Databricks recommendations](https://docs.databricks.com/en/delta/optimize.html#){:target="_blank"}.
40
44
41
45
- If the SQL warehouse isn't running, Segment attempts to start the SQL warehouse to validate the connection when you hit the **Test Connection** button during setup. For a better experience, Segment recommends manually starting the warehouse in advance.
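
As a rough illustration of the per-table compaction recommendation above, the following Python sketch builds one `OPTIMIZE` statement for each table. The catalog, schema, and table names are hypothetical placeholders, and the helper functions are not part of any Segment or Databricks library. How you schedule and run the statements (for example, as a Databricks job) is up to you.

```python
# Sketch only: the catalog, schema, and table names used here are hypothetical.

def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any backticks inside it."""
    return "`" + name.replace("`", "``") + "`"


def build_optimize_statements(catalog: str, schema: str, tables: list[str]) -> list[str]:
    """Return one OPTIMIZE statement per table, in the order given."""
    return [
        "OPTIMIZE " + ".".join(quote_identifier(part) for part in (catalog, schema, table))
        for table in tables
    ]


if __name__ == "__main__":
    # Print the statements so they can be pasted into a scheduled SQL job.
    for statement in build_optimize_statements("segment", "profiles", ["user_traits", "user_identifiers"]):
        print(statement)
```

Generating one statement per table keeps each compaction job independent, so a slow or failing table doesn't block the others.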
Use the following five steps to connect your Databricks warehouse.
58
+
Use the five steps below to connect your Databricks warehouse.
55
59
56
60
> warning ""
57
61
> To configure your warehouse, you'll need read and write permissions.
58
62
59
63
### Step 1: Name your destination
60
64
61
-
Add a name to help you identify this warehouse in Segment. You can change this name at any time by navigating to the destination settings (**Connections > Destinations > Settings**) page.
65
+
Add a name to help you identify your warehouse in Segment. You can change this name at any time by navigating to the destination settings (**Connections > Destinations > Settings**) page.
62
66
63
67
### Step 2: Enter the Databricks compute resources URL
64
68
65
69
Segment uses the Databricks workspace URL to access your workspace API.
66
70
67
-
Check your browser's address bar when inside the workspace. The workspace URL will look something like: `https://<workspace-deployment-name>.cloud.databricks.com`. Remove any characters after this portion and note the URL for later use.
71
+
Check your browser's address bar when inside the workspace. The workspace URL should resemble: `https://<workspace-deployment-name>.cloud.databricks.com`. Remove any characters after this portion and note the URL for later use.
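
If you'd rather not trim the URL by hand, a small helper can do it. The sketch below assumes the value copied from the address bar is a full `https://` URL. The function name `workspace_base_url` and the example URL are hypothetical.

```python
from urllib.parse import urlsplit


def workspace_base_url(address_bar_url: str) -> str:
    """Keep only the scheme and host of a URL copied from the browser."""
    parts = urlsplit(address_bar_url.strip())
    if parts.scheme != "https" or not parts.hostname:
        raise ValueError("expected a full https:// workspace URL")
    # Drop the path, query string, and fragment.
    return f"https://{parts.hostname}"


if __name__ == "__main__":
    # Hypothetical URL as it might appear in the address bar.
    print(workspace_base_url("https://my-workspace.cloud.databricks.com/sql/warehouses?o=12345"))
```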
68
72
69
73
### Step 3: Enter a Unity catalog name
70
74
71
-
This catalog is the target catalog where Segment lands your schemasand tablestables.
75
+
This catalog is the target catalog where Segment lands your schemas and tables.
72
76
1. Follow the Databricks guide for [creating a catalog](https://docs.databricks.com/en/data-governance/unity-catalog/create-catalogs.html#create-a-catalog){:target="_blank"}. Be sure to select the storage location created earlier. You can use any valid catalog name (for example, "Segment"). Note this name for later use.
73
77
2. Select the catalog you've just created.
74
-
1. Select the Permissions tab, then click **Grant**
78
+
1. Select the Permissions tab, then click **Grant**.
75
79
2. Select the Segment service principal from the dropdown, and check `ALL PRIVILEGES`.
76
80
3. Click **Grant**.
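
If you prefer SQL to the Permissions tab, the grant above can likely be expressed as a single statement. The sketch below only builds the statement text. It assumes that the service principal is addressed by its Application ID and that `ALL PRIVILEGES` on the catalog matches the checkbox in the UI, so confirm both against the Databricks documentation before running it. The catalog name and Application ID shown are placeholders.

```python
# Sketch only: the catalog name and Application ID below are placeholders.

def quote_identifier(name: str) -> str:
    """Wrap an identifier in backticks, doubling any backticks inside it."""
    return "`" + name.replace("`", "``") + "`"


def build_catalog_grant(catalog: str, application_id: str) -> str:
    """Build a GRANT statement giving the service principal all catalog privileges."""
    return (
        f"GRANT ALL PRIVILEGES ON CATALOG {quote_identifier(catalog)} "
        f"TO {quote_identifier(application_id)}"
    )


if __name__ == "__main__":
    print(build_catalog_grant("segment", "00000000-0000-0000-0000-000000000000"))
```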
77
81
78
82
### Step 4: Add the SQL warehouse details from your Databricks warehouse
79
83
80
84
Next, add SQL warehouse details about your compute resource.
81
-
- **HTTP Path**: The connection details for your SQL warehouse
85
+
- **HTTP Path**: The connection details for your SQL warehouse.
82
86
- **Port**: The port number of your SQL warehouse.
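
Before entering these values, you might sanity-check them with something like the sketch below. It assumes the HTTP path begins with `/` (Databricks SQL warehouse paths typically look like `/sql/1.0/warehouses/<warehouse-id>`) and that the port defaults to 443. Both are assumptions to verify on your warehouse's connection details page. The class name `SqlWarehouseDetails` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SqlWarehouseDetails:
    """Connection details copied from the SQL warehouse settings page."""

    http_path: str
    port: int = 443  # Assumed default; use the value shown for your warehouse.

    def __post_init__(self) -> None:
        # Catch copy-and-paste mistakes before they reach the setup form.
        if not self.http_path.startswith("/"):
            raise ValueError("HTTP path should start with '/'")
        if not 1 <= self.port <= 65535:
            raise ValueError("port must be between 1 and 65535")


if __name__ == "__main__":
    # The warehouse ID here is a made-up placeholder.
    print(SqlWarehouseDetails(http_path="/sql/1.0/warehouses/abc123"))
```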
83
87
84
88
@@ -100,6 +104,6 @@ Select tables to sync, then click **Next**. Segment creates the warehouse and co
100
104
You can view sync status and the tables you're syncing from the Profiles Sync overview page.
101
105
102
106
103
-
Learn more about [using Selective Sync](/docs/unify/profiles-sync/using-selective-sync) with Profiles Sync.
107
+
Learn more about [using selective sync](/docs/unify/profiles-sync/#using-selective-sync) with Profiles Sync.