Segment Data Lakes provide a way to collect large quantities of data in a format that's optimized for targeted data science and data analytics workflows. You can read [more information about Data Lakes](/docs/connections/storage/data-lakes/) and learn [how they differ from Warehouses](/docs/connections/storage/data-lakes/comparison/) in Segment's Data Lakes documentation.

Segment supports two types of data lakes:

- [AWS Data Lakes](/docs/connections/storage/catalog/data-lakes/#set-up-segment-data-lakes)
- [Azure Data Lakes](/docs/connections/storage/catalog/data-lakes/#set-up-azure-data-lakes)
> note "Lake Formation"
> You can also set up your Segment Data Lakes using [Lake Formation](/docs/connections/storage/data-lakes/lake-formation/), a fully managed service built on top of the AWS Glue Data Catalog.
> warning "Optional configuration settings for log4j vulnerability"
> While Databricks released a statement that clusters are likely unaffected by the log4j vulnerability, out of an abundance of caution, Databricks recommends updating to log4j 2.15+ or adding the following options to the Spark configuration: <br/> `spark.driver.extraJavaOptions "-Dlog4j2.formatMsgNoLookups=true"`<br/>`spark.executor.extraJavaOptions "-Dlog4j2.formatMsgNoLookups=true"`
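
In the cluster's Spark config editor, these are two separate lines:

```py
spark.driver.extraJavaOptions "-Dlog4j2.formatMsgNoLookups=true"
spark.executor.extraJavaOptions "-Dlog4j2.formatMsgNoLookups=true"
```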

1. Connect to a [Hive metastore](https://docs.databricks.com/data/metastores/external-hive-metastore.html){:target="_blank"} on your Databricks cluster.
2. Copy the following Spark configuration, replacing the variables (`<example_variable>`) with information from your workspace:
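A minimal sketch of such a configuration, assuming an ADLS Gen2 storage account reached through a service principal's OAuth credentials and an external Hive metastore on Azure MySQL; every `<placeholder>`, driver, and version number below is illustrative rather than Segment's exact settings:

```py
## Configs so we can read from the storage account
## (sketch: OAuth via a service principal; all <placeholders> are illustrative)
spark.hadoop.fs.azure.account.auth.type.<storage_account>.dfs.core.windows.net OAuth
spark.hadoop.fs.azure.account.oauth.provider.type.<storage_account>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
spark.hadoop.fs.azure.account.oauth2.client.id.<storage_account>.dfs.core.windows.net <service_principal_application_id>
spark.hadoop.fs.azure.account.oauth2.client.secret.<storage_account>.dfs.core.windows.net <service_principal_secret>
spark.hadoop.fs.azure.account.oauth2.client.endpoint.<storage_account>.dfs.core.windows.net https://login.microsoftonline.com/<tenant_id>/oauth2/token

## Configs so the cluster can reach the external Hive metastore
## (the version must be one your Databricks runtime supports)
spark.sql.hive.metastore.version 2.3.7
spark.sql.hive.metastore.jars builtin
spark.hadoop.javax.jdo.option.ConnectionDriverName org.mariadb.jdbc.Driver
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mysql://<mysql_host>:3306/<metastore_database>?useSSL=true
spark.hadoop.javax.jdo.option.ConnectionUserName <mysql_user>
spark.hadoop.javax.jdo.option.ConnectionPassword <mysql_password>
```

Restart the cluster after saving the configuration so the settings take effect.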
After you set up the necessary resources in Azure, the next step is to set up the Data Lakes destination in Segment:
2. Click the **Configure Data Lakes** button, and select the source you'd like to receive data from. Click **Next**.
3. In the **Connection Settings** section, enter the following values:
- **Azure Storage Account**: The name of the Azure Storage account that you set up in [Step 1 - Create an ALDS-enabled storage account](#step-1---create-an-alds-enabled-storage-account).
- **Azure Storage Container**: The name of the Azure Storage container you created in [Step 1 - Create an ALDS-enabled storage account](#step-1---create-an-alds-enabled-storage-account).
- **Azure Subscription ID**: The ID of your [Azure subscription](https://docs.microsoft.com/en-us/azure/azure-portal/get-subscription-tenant-id){:target="_blank"}.
- **Azure Tenant ID**: The tenant ID of your [Azure Active Directory](https://docs.microsoft.com/en-us/azure/active-directory/fundamentals/active-directory-how-to-find-tenant){:target="_blank"}.
- **Databricks Cluster ID**: The ID of your [Databricks cluster](https://docs.databricks.com/workspace/workspace-details.html#cluster-url-and-id){:target="_blank"}.
- **Databricks Instance URL**: The URL of your [Databricks workspace](https://docs.databricks.com/workspace/workspace-details.html#workspace-instance-names-urls-and-ids){:target="_blank"}.
- **Databricks Workspace Name**: The name of your [Databricks workspace](https://docs.databricks.com/workspace/workspace-details.html#workspace-instance-names-urls-and-ids){:target="_blank"}.
- **Databricks Workspace Resource Group**: The resource group that hosts your Azure Databricks instance, visible in Azure on the overview page for your Databricks instance.
- **Region**: The location of the Azure Storage account you set up in [Step 1 - Create an ALDS-enabled storage account](#step-1---create-an-alds-enabled-storage-account).
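
For a sense of the expected shapes, here is a hypothetical set of values. The GUID format of the subscription and tenant IDs and the `adb-…azuredatabricks.net` instance URL format match what the Azure portal shows; every identifier below is made up:

```py
## Hypothetical example values (all identifiers are made up)
Azure Storage Account:               mydatalakestorage
Azure Storage Container:             segment-data
Azure Subscription ID:               12345678-90ab-cdef-1234-567890abcdef
Azure Tenant ID:                     87654321-09ba-fedc-4321-098765fedcba
Databricks Cluster ID:               0123-456789-reef123
Databricks Instance URL:             adb-1234567890123456.78.azuredatabricks.net
Databricks Workspace Name:           my-databricks-workspace
Databricks Workspace Resource Group: my-databricks-rg
Region:                              eastus
```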
After you've added these options to your config, restart your cluster so that your changes take effect. If you continue to encounter errors, [contact Segment Support](https://segment.com/help/contact/){:target="_blank"}.
#### What do I do if I get a "Version table does not exist" error when setting up the Azure MySQL database?
Check your Spark configs to ensure that the information you entered about the database is correct, then restart the cluster. The Databricks cluster automatically initializes the Hive Metastore, so an issue with your config file will stop the table from being created. If you continue to encounter errors, [contact Segment Support](https://segment.com/help/contact/){:target="_blank"}.
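
The settings most worth re-checking are the metastore connection values from the Spark configuration you set up earlier; as a sketch (placeholders illustrative, as before):

```py
## Metastore connection settings to verify against your Azure MySQL database
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mysql://<mysql_host>:3306/<metastore_database>?useSSL=true
spark.hadoop.javax.jdo.option.ConnectionUserName <mysql_user>
spark.hadoop.javax.jdo.option.ConnectionPassword <mysql_password>
```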