src/connections/storage/catalog/data-lakes/index.md
53 additions & 9 deletions
@@ -88,18 +88,15 @@ To set up [Azure Data Lakes], create your [Azure resources](/docs/src/connection
88
88
89
89
### Prerequisites
90
90
91
-
Before you can configure your Azure resources, you must first [create an Azure subscription](https://azure.microsoft.com/en-us/free/){:target="_blank”}.
91
+
Before you can configure your Azure resources, you must first [create an Azure subscription](https://azure.microsoft.com/en-us/free/){:target="_blank”}, create an account with `Microsoft.Authorization/roleAssignments/write` permissions, and configure the [Azure Command Line Interface (Azure CLI)](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli){:target="_blank”}.
92
92
93
93
### Step 1 - Create an ADLS-enabled storage account
94
94
95
-
> note " "
96
-
> Take note of the Location, Storage Account Name, and the name of your Azure Storage Container: you'll need these variables when configuring the Azure Data Lakes destination in the Segment app.
97
-
98
95
1. Sign in to your [Azure environment](https://portal.azure.com){:target="_blank”}.
99
-
2. From the Azure home page, select **Create a resource**.
96
+
2. From the [Azure home page](https://portal.azure.com/#home){:target="_blank”}, select **Create a resource**.
100
97
3. Search for and select **Storage account**.
101
98
4. On the Storage account resource page, select the **Storage account** plan and click **Create**.
102
-
5. On the **Basic** tab, select an existing subscription and resource group, give your storage account a name, and update any necessary instance details. Take note of the **Region** you select in this step, as you'll need it when creating the [Azure Data Lakes] destination in the Segment app.
99
+
5. On the **Basic** tab, select an existing subscription and resource group, give your storage account a name, and update any necessary instance details.
103
100
6. Click **Next: Advanced**.
104
101
7. On the **Advanced Settings** tab in the Security section, select the following options:
105
102
- Require secure transfer for REST API operations
@@ -109,21 +106,68 @@ Before you can configure your Azure resources, you must first [create an Azure s
109
106
8. In the Data Lake Storage Gen2 section, select **Enable hierarchical namespace**. In the Blob storage section, select the **Hot** option.
110
107
9. Click **Next: Networking**.
111
108
10. On the **Networking** page, select **Disable public access and use private access**.
112
-
11. Click **Review + create**. Take note of your location and storage account name, and verify that all of the other settings are correct. When you feel satisified with your selections, clikc**Create**.
109
+
11. Click **Review + create**. Take note of your location and storage account name, and review your chosen settings. When you are satisfied with your selections, click **Create**.
113
110
12. After your resource is deployed, click **Go to resource**.
114
111
13. On the storage account overview page, select the **Containers** button in the Data storage tab.
115
112
14. Select **Container**. Give your container a name, and select the **Private** level of public access. Click **Create**.
116
113
117
-
118
-
### Step 2 - Set up KeyVault
114
+
> warning " "
115
+
> Before continuing, note the Location, Storage account name, and the Azure storage container name: you'll need these variables when configuring the Azure Data Lakes destination in the Segment app.
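Because the storage account name and container name are reused later in the Segment app, it can help to check them before creating anything. The following is a minimal sketch, assuming Azure's published naming rules (storage account names: 3 to 24 lowercase letters and digits; container names: 3 to 63 lowercase letters, digits, and single hyphens, starting and ending with a letter or digit). The function names are hypothetical and are not part of any Azure or Segment tooling.

```python
import re

# Storage account names: 3-24 characters, lowercase letters and digits only.
_ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")

# Container names: lowercase letters, digits, and hyphens; must start and end
# with a letter or digit, and must not contain consecutive hyphens.
_CONTAINER_RE = re.compile(r"[a-z0-9](?:-?[a-z0-9])+")


def is_valid_account_name(name: str) -> bool:
    """Check a candidate storage account name against the assumed rules."""
    return _ACCOUNT_RE.fullmatch(name) is not None


def is_valid_container_name(name: str) -> bool:
    """Check a candidate container name (length 3-63) against the assumed rules."""
    return 3 <= len(name) <= 63 and _CONTAINER_RE.fullmatch(name) is not None
```

Running both checks on the names you plan to use in steps 5 and 14 catches most naming errors before the Azure portal rejects them.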
116
+
117
+
### Step 2 - Set up Key Vault
118
+
119
+
1. From the [home page of your Azure portal](https://portal.azure.com/#home){:target="_blank”}, select **Create a resource**.
120
+
2. Search for and select **Key Vault**.
121
+
3. On the Key Vault resource page, select the **Key Vault** plan and click **Create**.
122
+
4. On the **Basic** tab, select an existing subscription and resource group, give your Key Vault a name, and update the **Days to retain deleted vaults** setting, if desired.
123
+
6. Click **Review + create**.
124
+
7. Review your chosen settings. When you are satisfied with your selections, click **Create**.
125
+
8. After your resource is deployed, click **Go to resource**.
126
+
9. On the Key Vault page, select the **Access control (IAM)** tab.
127
+
10. Click **Add** and select **Add role assignment**.
128
+
11. On the **Roles** tab, select the `Key Vault Secrets User` role. Click **Next**.
129
+
12. On the **Members** tab, assign access to a **User, group, or service principal**.
130
+
13. Click **Select members**.
131
+
14. Search for and select the `Databricks Resource Provider` service principal.
132
+
15.
119
133
120
134
### Step 3 - Set up Azure MySQL database
121
135
136
+
1. From the [home page of your Azure portal](https://portal.azure.com/#home){:target="_blank”}, select **Create a resource**.
137
+
2. Search for and select **Azure Database for MySQL**.
138
+
3. On the Azure Database for MySQL resource page, select the **Azure Database for MySQL** plan and click **Create**.
139
+
4. Select **Single server** and click **Create**.
140
+
5. On the **Basic** tab, select an existing subscription and resource group, enter server details and create an administrator account.
141
+
6. Click **Review + create**.
142
+
7. Review your chosen settings. When you are satisfied with your selections, click **Create**.
143
+
8. After your resource is deployed, click **Go to resource**.
144
+
9. From the resource page, select the **Connection security** tab.
145
+
10. Under the Firewall rules section, select **Yes** to allow access to Azure services, and click the **Allow current client IP address (xx.xxx.xxx.xx)** button to allow access from your current IP address.
146
+
11. Click **Save** to save the changes you made on the **Connection security** page, and select the **Server parameters** tab.
147
+
12. Update the `lower_case_table_names` value to 2, and click **Save**.
148
+
13. Select the **Overview** tab and click the **Restart** button to restart your database. Restarting your database applies the updated `lower_case_table_names` setting.
149
+
14. Once the server restarts successfully, open your Azure CLI.
150
+
15. Sign in to the MySQL server from your command line by entering the following command:
151
+
```bash
152
+
mysql --host=[HOSTNAME] --port=3306 --user=[USERNAME] --password=[PASSWORD]
153
+
```
154
+
16. Run the `CREATE DATABASE` command to create your Hive Metastore:
155
+
```sql
156
+
CREATE DATABASE <name>;
157
+
```
158
+
159
+
> warning " "
160
+
> Before continuing, note the MySQL server URL, username and password for the admin account, and your database name: you'll need these variables when configuring the Azure Data Lakes destination in the Segment app.
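The sign-in and `CREATE DATABASE` steps above can be sketched in Python. This is an illustration under stated assumptions, not the documented procedure: it assumes a single server deployment whose host name follows the `<server>.mysql.database.azure.com` pattern and whose admin user is written as `<user>@<server>`. The function names and the example values are hypothetical. The sketch omits the password value so that the `mysql` client prompts for it instead of leaving it in shell history.

```python
import shlex


def mysql_sign_in_command(server_name: str, admin_user: str, port: int = 3306) -> str:
    """Return a shell-safe mysql sign-in command for an assumed single server host."""
    host = f"{server_name}.mysql.database.azure.com"  # assumed host name pattern
    user = f"{admin_user}@{server_name}"  # assumed single server user format
    # Passing --password without a value makes the client prompt for it.
    argv = ["mysql", f"--host={host}", f"--port={port}", f"--user={user}", "--password"]
    return shlex.join(argv)


def create_database_statement(name: str) -> str:
    """Return a CREATE DATABASE statement with the identifier backtick-quoted."""
    quoted = "`" + name.replace("`", "``") + "`"
    return f"CREATE DATABASE {quoted};"


if __name__ == "__main__":
    # Hypothetical server, user, and database names.
    print(mysql_sign_in_command("my-hive-server", "segmentadmin"))
    print(create_database_statement("hivemetastore"))
```

Quoting the database name with backticks keeps the statement valid even if the name contains characters that MySQL would otherwise treat specially.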
161
+
162
+
122
163
### Step 4 - Set up Databricks
123
164
124
165
> note "Databricks pricing tier"
125
166
> If you create a Databricks instance only for [Azure Data Lakes] to use, only the standard pricing tier is required. However, if you use your Databricks instance for other applications, you may require premium pricing.
126
167
168
+
> warning " "
169
+
> Before continuing, note the Cluster ID, Workspace name, Workspace URL, and the Azure resource group for your Databricks workspace: you'll need these variables when configuring the Azure Data Lakes destination in the Segment app.
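The callouts in Steps 1, 3, and 4 each ask you to note values for later. One way to track them is a small checklist, sketched below. The class and field names are hypothetical; they mirror the values named in the callouts and are not the setting names used in the Segment app.

```python
from dataclasses import dataclass, fields


@dataclass
class AzureDataLakesValues:
    # Step 1: storage account
    location: str = ""
    storage_account_name: str = ""
    storage_container_name: str = ""
    # Step 3: Azure MySQL database
    mysql_server_url: str = ""
    mysql_admin_username: str = ""
    mysql_admin_password: str = ""
    mysql_database_name: str = ""
    # Step 4: Databricks
    databricks_cluster_id: str = ""
    databricks_workspace_name: str = ""
    databricks_workspace_url: str = ""
    databricks_resource_group: str = ""

    def missing(self) -> list[str]:
        """Return the names of fields that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]
```

Calling `missing()` before you open the Segment app shows which values you still need to collect from the Azure portal.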