Commit 8fa1e03

First draft of steps 2, 3 [DOC-493]
1 parent 4596fe1 commit 8fa1e03

File tree

1 file changed

+53
-9
lines changed
  • src/connections/storage/catalog/data-lakes

src/connections/storage/catalog/data-lakes/index.md

Lines changed: 53 additions & 9 deletions
@@ -88,18 +88,15 @@ To set up [Azure Data Lakes], create your [Azure resources](/docs/src/connection
8888

8989
### Prerequisites
9090

91-
Before you can configure your Azure resources, you must first [create an Azure subscription](https://azure.microsoft.com/en-us/free/){:target="_blank”}.
91+
Before you can configure your Azure resources, you must first [create an Azure subscription](https://azure.microsoft.com/en-us/free/){:target="_blank"}, create an account with `Microsoft.Authorization/roleAssignments/write` permissions, and configure the [Azure Command Line Interface (Azure CLI)](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli){:target="_blank"}.
9292

9393
### Step 1 - Create an ADLS-enabled storage account
9494

95-
> note " "
96-
> Take note of the Location, Storage Account Name, and the name of your Azure Storage Container: you'll need these variables when configuring the Azure Data Lakes destination in the Segment app.
97-
9895
1. Sign in to your [Azure environment](https://portal.azure.com){:target="_blank"}.
99-
2. From the Azure home page, select **Create a resource**.
96+
2. From the [Azure home page](https://portal.azure.com/#home){:target="_blank"}, select **Create a resource**.
10097
3. Search for and select **Storage account**.
10198
4. On the Storage account resource page, select the **Storage account** plan and click **Create**.
102-
5. On the **Basic** tab, select an existing subscription and resource group, give your storage account a name, and update any necessary instance details. Take note of the **Region** you select in this step, as you'll need it when creating the [Azure Data Lakes] destination in the Segment app.
99+
5. On the **Basic** tab, select an existing subscription and resource group, give your storage account a name, and update any necessary instance details.
103100
6. Click **Next: Advanced**.
104101
7. On the **Advanced Settings** tab in the Security section, select the following options:
105102
- Require secure transfer for REST API operations
@@ -109,21 +106,68 @@ Before you can configure your Azure resources, you must first [create an Azure s
109106
8. In the Data Lake Storage Gen2 section, select **Enable hierarchical namespace**. In the Blob storage section, select the **Hot** option.
110107
9. Click **Next: Networking**.
111108
10. On the **Networking** page, select **Disable public access and use private access**.
112-
11. Click **Review + create**. Take note of your location and storage account name, and verify that all of the other settings are correct. When you feel satisified with your selections, clikc **Create**.
109+
11. Click **Review + create**. Take note of your location and storage account name, and review your chosen settings. When you are satisfied with your selections, click **Create**.
113110
12. After your resource is deployed, click **Go to resource**.
114111
13. On the storage account overview page, select the **Containers** button in the Data storage tab.
115112
14. Select **Container**. Give your container a name, and select the **Private** level of public access. Click **Create**.
116113

117-
118-
### Step 2 - Set up KeyVault
114+
> warning " "
115+
> Before continuing, note the Location, Storage account name, and the Azure storage container name: you'll need these variables when configuring the Azure Data Lakes destination in the Segment app.
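
For readers who prefer the Azure CLI listed in the prerequisites, the portal steps above can be approximated from the command line. The following is a minimal dry-run sketch, not the procedure this page documents: the resource group, account, container, and region values are hypothetical placeholders, the `Standard_LRS` SKU is an assumption, and the private networking choice from step 10 is not covered. The script only prints the `az` commands rather than running them, after checking the storage account name against Azure's naming rule (3 to 24 lowercase letters and digits).

```shell
#!/bin/sh
# Hypothetical placeholder values; replace them with your own.
RESOURCE_GROUP="my-resource-group"
STORAGE_ACCOUNT="segmentdatalake01"
CONTAINER="segment-data"
LOCATION="eastus"

# Azure storage account names must be 3-24 lowercase letters or digits.
is_valid_storage_account_name() {
  printf '%s\n' "$1" | grep -Eq '^[a-z0-9]{3,24}$'
}

# Print (do not run) CLI commands that roughly mirror Step 1.
# The SKU is an assumption; the portal steps do not specify one.
print_storage_commands() {
  echo "az storage account create --name $STORAGE_ACCOUNT --resource-group $RESOURCE_GROUP --location $LOCATION --kind StorageV2 --sku Standard_LRS --enable-hierarchical-namespace true --https-only true --access-tier Hot"
  echo "az storage container create --name $CONTAINER --account-name $STORAGE_ACCOUNT --auth-mode login --public-access off"
}

if is_valid_storage_account_name "$STORAGE_ACCOUNT"; then
  print_storage_commands
else
  echo "Invalid storage account name: $STORAGE_ACCOUNT" >&2
fi
```

Printing the commands first makes it easy to confirm that the location, storage account name, and container name match the values you recorded for the Segment app.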
116+
117+
### Step 2 - Set up Key Vault
118+
119+
1. From the [home page of your Azure portal](https://portal.azure.com/#home){:target="_blank"}, select **Create a resource**.
120+
2. Search for and select **Key Vault**.
121+
3. On the Key Vault resource page, select the **Key Vault** plan and click **Create**.
122+
4. On the **Basic** tab, select an existing subscription and resource group, give your Key Vault a name, and update the **Days to retain deleted vaults** setting, if desired.
123+
5. Click **Review + create**.
124+
6. Review your chosen settings. When you are satisfied with your selections, click **Create**.
125+
7. After your resource is deployed, click **Go to resource**.
126+
8. On the Key Vault page, select the **Access control (IAM)** tab.
127+
9. Click **Add** and select **Add role assignment**.
128+
10. On the **Roles** tab, select the `Key Vault Secrets User` role. Click **Next**.
129+
11. On the **Members** tab, assign access to a **User, group, or service principal**.
130+
12. Click **Select members**.
131+
13. Search for and select the `Databricks Resource Provider` service principal.
132+
14.
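
The role assignment steps drafted so far can also be expressed with the Azure CLI. This is a hedged sketch that covers only the steps written above and prints the command instead of running it. The subscription ID, resource group, vault name, and the object ID of the `Databricks Resource Provider` service principal are hypothetical placeholders that you would need to look up in your own tenant.

```shell
#!/bin/sh
# Hypothetical placeholder values; replace them with your own.
SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP="my-resource-group"
KEY_VAULT="my-segment-vault"
ASSIGNEE_OBJECT_ID="<databricks-resource-provider-object-id>"

# Build the Azure resource ID that scopes the role assignment to one vault.
key_vault_scope() {
  echo "/subscriptions/$SUBSCRIPTION_ID/resourceGroups/$RESOURCE_GROUP/providers/Microsoft.KeyVault/vaults/$KEY_VAULT"
}

# Print (do not run) the command that grants the Key Vault Secrets User role.
print_role_assignment_command() {
  echo "az role assignment create --assignee \"$ASSIGNEE_OBJECT_ID\" --role \"Key Vault Secrets User\" --scope \"$(key_vault_scope)\""
}

print_role_assignment_command
```

Scoping the assignment to the vault's resource ID, rather than to the whole resource group, keeps the service principal's access limited to this Key Vault.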
119133

120134
### Step 3 - Set up Azure MySQL database
121135

136+
1. From the [home page of your Azure portal](https://portal.azure.com/#home){:target="_blank"}, select **Create a resource**.
137+
2. Search for and select **Azure Database for MySQL**.
138+
3. On the Azure Database for MySQL resource page, select the **Azure Database for MySQL** plan and click **Create**.
139+
4. Select **Single server** and click **Create**.
140+
5. On the **Basic** tab, select an existing subscription and resource group, enter your server details, and create an administrator account.
141+
6. Click **Review + create**.
142+
7. Review your chosen settings. When you are satisfied with your selections, click **Create**.
143+
8. After your resource is deployed, click **Go to resource**.
144+
9. From the resource page, select the **Connection security** tab.
145+
10. Under the Firewall rules section, select **Yes** to allow access to Azure services, and click the **Allow current client IP address (xx.xxx.xxx.xx)** button to allow access from your current IP address.
146+
11. Click **Save** to save the changes you made on the **Connection security** page, and select the **Server parameters** tab.
147+
12. Update the `lower_case_table_names` value to 2, and click **Save**.
148+
13. Select the **Overview** tab and click the **Restart** button to restart your database. Restarting your database applies the updated `lower_case_table_names` setting.
149+
14. Once the server restarts successfully, open your Azure CLI.
150+
15. Sign in to the MySQL server from your command line by entering the following command:
151+
```shell
152+
mysql --host=[HOSTNAME] --port=3306 --user=[USERNAME] --password=[PASSWORD]
153+
```
154+
16. Run the `CREATE DATABASE` command to create your Hive Metastore:
155+
```sql
156+
CREATE DATABASE <name>;
157+
```
158+
159+
> warning " "
160+
> Before continuing, note the MySQL server URL, username and password for the admin account, and your database name: you'll need these variables when configuring the Azure Data Lakes destination in the Segment app.
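
The sign-in and database creation steps above can be sketched as a small script. It is an illustration under assumptions, not the documented procedure: the server name, admin user, and database name (`hive_metastore`) are hypothetical, and the `user@server` login form and `.mysql.database.azure.com` host suffix assume an Azure Database for MySQL single server. The script only prints what it would run. Passing `--password` with no value makes the `mysql` client prompt for the password instead of leaving it in your shell history.

```shell
#!/bin/sh
# Hypothetical placeholder values; replace them with your own.
MYSQL_SERVER="my-metastore-server"
MYSQL_ADMIN="segmentadmin"
METASTORE_DB="hive_metastore"

# Build the client command. --password without a value prompts interactively.
mysql_login_command() {
  echo "mysql --host=$MYSQL_SERVER.mysql.database.azure.com --port=3306 --user=$MYSQL_ADMIN@$MYSQL_SERVER --password"
}

# SQL to check that the restart applied lower_case_table_names,
# followed by the statement that creates the Hive Metastore database.
metastore_sql() {
  cat <<EOF
SHOW VARIABLES LIKE 'lower_case_table_names';
CREATE DATABASE $METASTORE_DB;
EOF
}

mysql_login_command
metastore_sql
```

Running the `SHOW VARIABLES` statement before `CREATE DATABASE` is a quick way to check that the server parameter change survived the restart.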
161+
162+
122163
### Step 4 - Set up Databricks
123164

124165
> note "Databricks pricing tier"
125166
> If you create a Databricks instance only for [Azure Data Lakes] to use, only the standard pricing tier is required. However, if you use your Databricks instance for other applications, you may require premium pricing.
126167
168+
> warning " "
169+
> Before continuing, note the Cluster ID, Workspace name, Workspace URL, and the Azure Resource Group for your Databricks workspace: you'll need these variables when configuring the Azure Data Lakes destination in the Segment app.
170+
127171
### Step 5 - Set up a Service Principal
128172

129173
### Step 6 - Configure Databricks cluster
