---
title: Databricks Delta Lake Destination
public: true
---

With the Databricks Delta Lake Destination, you can ingest event data from Segment into the bronze layer of your Databricks Delta Lake.

This page will help you get started with syncing Segment events into your Databricks Delta Lake Destination.

## Getting started

Before getting started with the Databricks Destination, note the following prerequisites.

- The target Databricks workspace must be Unity Catalog enabled. Segment doesn't support the Hive metastore. See the Databricks guide on [enabling the Unity Catalog](https://docs.databricks.com/en/data-governance/unity-catalog/enable-workspaces.html){:target="_blank"} for more information.
- Segment creates [managed tables](https://docs.databricks.com/en/data-governance/unity-catalog/create-tables.html#managed-tables){:target="_blank"} in the Unity catalog. The service principal needs permission to create schemas on the catalog, as well as to delete, drop, and vacuum tables.
- Segment supports only OAuth [(M2M)](https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html){:target="_blank"} for authentication.
- A SQL warehouse is required for compute. Segment recommends the following configuration:
  - **Size**: Small
  - **Type**: Serverless (or Pro if Serverless isn't available)
  - **Clusters**: Minimum of 2, maximum of 6

> success ""
> Segment recommends manually starting your SQL warehouse in advance. If the SQL warehouse isn't running, Segment attempts to start it to validate the connection when you click the **Test Connection** button during setup.

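If you'd rather confirm the warehouse is running before you test the connection, you can check (and start) it through the Databricks SQL Warehouses REST API. The sketch below uses only the Python standard library; the workspace URL, warehouse ID, and token shown are placeholders, and `ensure_running` is our own helper name, not a Segment or Databricks API.

```python
import json
import urllib.request

API_BASE = "/api/2.0/sql/warehouses"

def warehouse_url(workspace_url: str, warehouse_id: str, action: str = "") -> str:
    """Build a SQL Warehouses API URL; `action` may be e.g. 'start'."""
    url = f"{workspace_url.rstrip('/')}{API_BASE}/{warehouse_id}"
    return f"{url}/{action}" if action else url

def ensure_running(workspace_url: str, warehouse_id: str, token: str) -> str:
    """Return the warehouse's reported state, issuing a start if it isn't RUNNING."""
    headers = {"Authorization": f"Bearer {token}"}
    req = urllib.request.Request(warehouse_url(workspace_url, warehouse_id),
                                 headers=headers)
    with urllib.request.urlopen(req) as resp:
        state = json.load(resp)["state"]
    if state != "RUNNING":
        start = urllib.request.Request(
            warehouse_url(workspace_url, warehouse_id, "start"),
            headers=headers, data=b"", method="POST")
        urllib.request.urlopen(start)
    return state

# Placeholder values -- substitute your own workspace URL, warehouse ID, and token:
# ensure_running("https://<workspace>.cloud.databricks.com", "<warehouse-id>", "<token>")
```
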
## Set up Databricks in Segment

Use the following steps to set up Databricks in Segment:

1. Navigate to **Connections > Catalog**.
2. Select the **Destinations** tab.
3. Under Connection Type, select **Storage**, and click the **Databricks storage** tile.
4. (Optional) Select one or more sources to connect to the destination.
5. Follow the steps below to [connect your Databricks warehouse](#connect-your-databricks-warehouse).

## Connect your Databricks warehouse

Use the five steps below to connect your Databricks warehouse.

> warning ""
> You'll need read and write warehouse permissions for Segment to write to your database.

### Step 1: Name your destination

Add a name to help you identify this warehouse in Segment. You can change this name at any time by navigating to the destination settings page (**Connections > Destinations > Settings**).

### Step 2: Enter the Databricks compute resources URL

Segment uses your Databricks workspace URL to access your workspace API.

Check your browser's address bar when inside the workspace. The workspace URL should resemble `https://<workspace-deployment-name>.cloud.databricks.com`. Remove any characters after this portion and note the URL for later use.

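Trimming the URL copied from your browser down to just the scheme and host can be done by hand, or programmatically. A minimal sketch (the helper name `workspace_base_url` is ours):

```python
from urllib.parse import urlparse

def workspace_base_url(url: str) -> str:
    """Trim a URL copied from the browser down to https://<host>."""
    parsed = urlparse(url)
    return f"{parsed.scheme}://{parsed.netloc}"

# A URL copied while browsing the workspace (deployment name is illustrative):
print(workspace_base_url(
    "https://dbc-1234abcd-5678.cloud.databricks.com/sql/warehouses?o=12345"))
# -> https://dbc-1234abcd-5678.cloud.databricks.com
```
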
### Step 3: Enter a Unity catalog name

This catalog is the target catalog where Segment lands your schemas and tables.
1. Follow the Databricks guide for [creating a catalog](https://docs.databricks.com/en/data-governance/unity-catalog/create-catalogs.html#create-a-catalog){:target="_blank"}. Be sure to select the storage location created earlier. You can use any valid catalog name (for example, "Segment"). Note this name for later use.
2. Select the catalog you've just created.
   1. Select the Permissions tab, then click **Grant**.
   2. Select the Segment service principal from the dropdown, and check `ALL PRIVILEGES`.
   3. Click **Grant**.

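The permission steps above have a SQL equivalent: a `GRANT ALL PRIVILEGES ON CATALOG` statement you could run from a Databricks SQL editor instead of the UI. A small sketch that builds the statement (the helper name and the catalog/principal values are illustrative):

```python
def grant_all_privileges(catalog: str, principal: str) -> str:
    """Build the GRANT statement equivalent to checking ALL PRIVILEGES in the UI."""
    return f"GRANT ALL PRIVILEGES ON CATALOG `{catalog}` TO `{principal}`;"

print(grant_all_privileges("segment", "segment-storage-destinations"))
# -> GRANT ALL PRIVILEGES ON CATALOG `segment` TO `segment-storage-destinations`;
```
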
### Step 4: Add the SQL warehouse details from your Databricks warehouse

Next, add SQL warehouse details about your compute resource.
- **HTTP Path**: The connection details for your SQL warehouse.
- **Port**: The port number of your SQL warehouse.

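Both values appear on the warehouse's Connection details tab in Databricks. The HTTP Path typically has the shape `/sql/1.0/warehouses/<warehouse-id>`; assuming that shape, a sketch that sanity-checks the value before you paste it into Segment (the helper name is ours):

```python
import re

# Typical SQL warehouse HTTP Path shape, as shown on the warehouse's
# "Connection details" tab; assumed here, not guaranteed for every workspace.
HTTP_PATH_RE = re.compile(r"^/sql/1\.0/warehouses/(?P<id>[0-9a-f]+)$")

def warehouse_id_from_http_path(http_path: str) -> str:
    """Validate an HTTP Path and pull out the warehouse ID."""
    match = HTTP_PATH_RE.match(http_path)
    if match is None:
        raise ValueError(f"unexpected HTTP path: {http_path!r}")
    return match.group("id")

print(warehouse_id_from_http_path("/sql/1.0/warehouses/1ab234cd5678ef90"))
# -> 1ab234cd5678ef90
```
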
### Step 5: Add the service principal client ID and client secret

Segment uses the service principal to access your Databricks workspace and associated APIs.
1. Follow the Databricks guide for [adding a service principal to your account](https://docs.databricks.com/en/administration-guide/users-groups/service-principals.html#manage-service-principals-in-your-account){:target="_blank"}. This name can be anything, but Segment recommends something that identifies the purpose (for example, "Segment Storage Destinations"). Note the Application ID that Databricks generates for later use. Segment doesn't require the Account admin or Marketplace admin roles.
2. (*OAuth only*) Follow the Databricks instructions to [generate an OAuth secret](https://docs.databricks.com/en/dev-tools/authentication-oauth.html#step-2-create-an-oauth-secret-for-a-service-principal){:target="_blank"}. Note the secret generated by Databricks for later use. Once you navigate away from this page, the secret is no longer visible. If you lose or forget the secret, delete the existing secret and create a new one.

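You can verify the client ID and secret outside Segment by requesting a token directly: Databricks M2M OAuth uses the standard client-credentials grant against the workspace's `/oidc/v1/token` endpoint. A stdlib-only sketch, with all values as placeholders and the helper names ours:

```python
import base64
import json
import urllib.parse
import urllib.request

def token_request(workspace_url: str, client_id: str,
                  client_secret: str) -> urllib.request.Request:
    """Build a client-credentials token request for Databricks M2M OAuth."""
    credentials = base64.b64encode(
        f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}).encode()
    return urllib.request.Request(
        f"{workspace_url.rstrip('/')}/oidc/v1/token",
        data=body,
        headers={"Authorization": f"Basic {credentials}",
                 "Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

def fetch_access_token(workspace_url: str, client_id: str,
                       client_secret: str) -> str:
    """Exchange the service principal credentials for an access token."""
    with urllib.request.urlopen(
            token_request(workspace_url, client_id, client_secret)) as resp:
        return json.load(resp)["access_token"]

# Placeholder values -- a real call needs your workspace URL, client ID, and secret:
# fetch_access_token("https://<workspace>.cloud.databricks.com", "<client-id>", "<secret>")
```

If the call returns an `access_token`, the credentials you're about to give Segment are valid.
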
Once connected, you'll see a confirmation screen with next steps and more information on using your warehouse.