
Commit 44f24b4

Merge pull request #569 from segmentio/repo-sync

repo sync

2 parents e2b30d0 + f89d8b3

File tree

5 files changed: +16 -11 lines changed


.github/styles/Vocab/Docs/accept.txt

Lines changed: 4 additions & 3 deletions

@@ -17,6 +17,7 @@
 (?:U|u)rls?\b
 adset
 Adwords
+Aircall
 allowlist
 Amberflo
 Appboy
@@ -36,6 +37,7 @@ Cocoapods
 Contentful
 Criteo
 csv
+Databricks
 datetime
 deeplink
 Dev
@@ -71,6 +73,7 @@ Jimo
 Jivox
 Kameleoon
 Kissmetrics
+Leanplum
 Lightbox
 Littledata
 Mailchimp
@@ -117,6 +120,4 @@ waitlist
 WebKit
 Wootric
 Xcode
-Zendesk
-Leanplum
-Aircall
+Zendesk

(The final hunk re-sorts the tail of the list: Aircall and Leanplum, previously appended out of order after Zendesk, move to their alphabetical positions above, and Zendesk is restored as the last entry.)

src/connections/destinations/actions.md

Lines changed: 1 addition & 1 deletion

@@ -81,7 +81,7 @@ Moving from a classic destination to an actions-based destination is a manual pr
 ## Edit a destination action
 You can add or remove, disable and re-enable, and rename individual actions from the Actions tab on the destination's information page in the Segment app. Click an individual action to edit it.
 
-From the edit screen you can change the action's name and mapping, and toggle it on or off. See [Customizing mappings](#customizing-mappings) for more information.
+From the edit screen you can change the action's name and mapping, and toggle it on or off. See [Customizing mappings](#customize-mappings) for more information.
 
 ![](images/actions-list.png)
 

(The broken anchor is corrected to the target heading's generated ID: a heading such as "## Customize mappings" produces the anchor #customize-mappings, not #customizing-mappings.)

Two image files (20.2 KB and 36.5 KB) are also part of this commit — presumably the images/storageaccount.png and images/storagecontainer.png screenshots referenced below; binary previews are not shown.

src/connections/storage/catalog/data-lakes/index.md

Lines changed: 11 additions & 7 deletions

@@ -6,6 +6,9 @@ redirect_from: '/connections/destinations/catalog/data-lakes/'
 
 
 Segment Data Lakes provide a way to collect large quantities of data in a format that's optimized for targeted data science and data analytics workflows. You can read [more information about Data Lakes](/docs/connections/storage/data-lakes/) and learn [how they differ from Warehouses](/docs/connections/storage/data-lakes/comparison/) in Segment's Data Lakes documentation.
+Segment supports two types of data lakes:
+- [AWS Data Lakes](/docs/connections/storage/catalog/data-lakes/#set-up-segment-data-lakes)
+- [Azure Data Lakes](/docs/connections/storage/catalog/data-lakes/#set-up-azure-data-lakes)
 
 > note "Lake Formation"
 > You can also set up your Segment Data Lakes using [Lake Formation](/docs/connections/storage/data-lakes/lake-formation/), a fully managed service built on top of the AWS Glue Data Catalog.
@@ -255,8 +258,7 @@ curl -X POST 'https://<per-workspace-url>/api/2.0/preview/scim/v2/ServicePrincip
 > warning "Optional configuration settings for log4j vulnerability"
 > While Databricks released a statement that clusters are likely unaffected by the log4j vulnerability, out of an abundance of caution, Databricks recommends updating to log4j 2.15+ or adding the following options to the Spark configuration: <br/> `spark.driver.extraJavaOptions "-Dlog4j2.formatMsgNoLookups=true"`<br/>`spark.executor.extraJavaOptions "-Dlog4j2.formatMsgNoLookups=true"`
 
-1. Connect to a [Hive metastore](https://docs.databricks.com/data/metastores/external-hive-metastore.html){:target="_blank"} on your Databricks cluster.
-2. Copy the following Spark configuration, replacing the variables (`<example_variable>`) with information from your workspace: <br/>
+1. Connect to a [Hive metastore](https://docs.databricks.com/data/metastores/external-hive-metastore.html){:target="_blank"} on your Databricks cluster using the following Spark configuration, replacing the variables (`<example_variable>`) with information from your workspace: <br/>
 ```py
 ## Configs so we can read from the storage account
 spark.hadoop.fs.azure.account.oauth.provider.type.<storage_account_name>.dfs.core.windows.net org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider
@@ -267,7 +269,7 @@ spark.hadoop.fs.azure.account.oauth2.client.id.<storage_account_name>.dfs.core.w
 ##
 ##
 spark.hadoop.javax.jdo.option.ConnectionDriverName org.mariadb.jdbc.Driver
-spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mysql://<db-host>:<port>/<database-name>?useSSL=true&requireSSL=false
+spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mysql://<db-host>:<port>/<database-name>?useSSL=true&requireSSL=true&enabledSslProtocolSuites=TLSv1.2
 spark.hadoop.javax.jdo.option.ConnectionUserName <database_user>
 spark.hadoop.javax.jdo.option.ConnectionPassword <database_password>
 ##
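For illustration, with the placeholders filled in, the updated metastore connection line looks like the following sketch. The host, port, and database name below are invented examples, not values from this commit; only the TLS query parameters come from the new line above.

```py
## Hypothetical <db-host>, <port>, and <database-name> values; the TLS
## parameters (requireSSL=true, TLSv1.2 only) match the updated ConnectionURL.
spark.hadoop.javax.jdo.option.ConnectionURL jdbc:mysql://example-metastore.mysql.database.azure.com:3306/hivemetastore?useSSL=true&requireSSL=true&enabledSslProtocolSuites=TLSv1.2
```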
@@ -328,11 +330,13 @@ After you set up the necessary resources in Azure, the next step is to set up th
 2. Click the **Configure Data Lakes** button, and select the source you'd like to receive data from. Click **Next**.
 3. In the **Connection Settings** section, enter the following values:
 - **Azure Storage Account**: The name of the Azure Storage account that you set up in [Step 1 - Create an ALDS-enabled storage account](#step-1---create-an-alds-enabled-storage-account).
+![img.png](images/storageaccount.png)
 - **Azure Storage Container**: The name of the Azure Storage Container you created in [Step 1 - Create an ALDS-enabled storage account](#step-1---create-an-alds-enabled-storage-account).
-- **Azure Subscription ID**: The ID of your [Azure subscription](https://docs.microsoft.com/en-us/azure/azure-portal/get-subscription-tenant-id){:target="_blank"}.
-- **Azure Tenant ID**: The Tenant ID of your [Azure Active directory](https://docs.microsoft.com/en-us/azure/active-directory/fundamentals/active-directory-how-to-find-tenant){:target="_blank"}.
+![img_1.png](images/storagecontainer.png)
+- **Azure Subscription ID**: The ID of your [Azure subscription](https://docs.microsoft.com/en-us/azure/azure-portal/get-subscription-tenant-id){:target="_blank"}. <br> Enter it exactly as it appears in the Azure portal, in the format `********-****-****-****-************`.
+- **Azure Tenant ID**: The Tenant ID of your [Azure Active Directory](https://docs.microsoft.com/en-us/azure/active-directory/fundamentals/active-directory-how-to-find-tenant){:target="_blank"}. <br> Enter it exactly as it appears in the Azure portal, in the format `********-****-****-****-************`.
 - **Databricks Cluster ID**: The ID of your [Databricks cluster](https://docs.databricks.com/workspace/workspace-details.html#cluster-url-and-id){:target="_blank"}.
-- **Databricks Instance URL**: The ID of your [Databricks workspace](https://docs.databricks.com/workspace/workspace-details.html#workspace-instance-names-urls-and-ids){:target="_blank"}.
+- **Databricks Instance URL**: The URL of your [Databricks workspace](https://docs.databricks.com/workspace/workspace-details.html#workspace-instance-names-urls-and-ids){:target="_blank"}. <br> Use the format `adb-0000000000000000.00.azuredatabricks.net`.
 - **Databricks Workspace Name**: The name of your [Databricks workspace](https://docs.databricks.com/workspace/workspace-details.html#workspace-instance-names-urls-and-ids){:target="_blank"}.
 - **Databricks Workspace Resource Group**: The resource group that hosts your Azure Databricks instance. This is visible in Azure on the overview page for your Databricks instance.
 - **Region**: The location of the Azure Storage account you set up in [Step 1 - Create an ALDS-enabled storage account](#step-1---create-an-alds-enabled-storage-account).
@@ -481,4 +485,4 @@ spark.sql.hive.metastore.schema.verification.record.version false
 <br/>After you've added these options to your config, restart your cluster so that your changes can take effect. If you continue to encounter errors, [contact Segment Support](https://segment.com/help/contact/){:target="_blank"}.
 
 #### What do I do if I get a "Version table does not exist" error when setting up the Azure MySQL database?
-Check your Spark configs to ensure that the information you entered about the database is correct, then restart the cluster. The Databricks cluster automatically initializes the Hive Metastore, so an issue with your config file will stop the table from being created. If you continue to encounter errors, [contact Segment Support](https://segment.com/help/contact/){:target="_blank"}.
+Check your Spark configs to ensure that the information you entered about the database is correct, then restart the cluster. The Databricks cluster automatically initializes the Hive Metastore, so an issue with your config file will stop the table from being created. If you continue to encounter errors, [contact Segment Support](https://segment.com/help/contact/){:target="_blank"}.

(The deleted and re-added lines are textually identical; the change is presumably whitespace-only, such as adding a trailing newline.)
