Commit 95996bf

Merge pull request #226606 from normesta/normesta-reg-updates-11
Some tweaks
2 parents 00dc06a + 5292455 commit 95996bf

1 file changed: +11 -20 lines changed

articles/storage/blobs/data-lake-storage-use-databricks-spark.md

Lines changed: 11 additions & 20 deletions
@@ -7,7 +7,7 @@ author: normesta
 ms.subservice: data-lake-storage-gen2
 ms.service: storage
 ms.topic: tutorial
-ms.date: 02/01/2023
+ms.date: 02/07/2023
 ms.author: normesta
 ms.reviewer: dineshm
 ms.custom: devx-track-python, py-fresh-zinc
@@ -40,10 +40,6 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
 
 See [Tutorial: Connect to Azure Data Lake Storage Gen2](/azure/databricks/getting-started/connect-to-azure-storage) (Steps 1 through 3). After completing these steps, make sure to paste the tenant ID, app ID, and client secret values into a text file. You'll need those soon.
 
-- An Azure Databricks workspace. See [Create an Azure Databricks workspace](/azure/databricks/getting-started/#--create-an-azure-databricks-workspace).
-
-- An Azure Databricks cluster. See [Create a cluster](/azure/databricks/getting-started/quick-start#step-1-create-a-cluster).
-
 ## Download the flight data
 
 This tutorial uses flight data from the Bureau of Transportation Statistics to demonstrate how to perform an ETL operation. You must download this data to complete the tutorial.
@@ -78,26 +74,21 @@ Use AzCopy to copy data from your *.csv* file into your Data Lake Storage Gen2 a
 
 - Replace the `<container-name>` placeholder with the name of a container in your storage account.
 
-## Create a container and mount it
-
-In this section, you'll create a container and a folder in your storage account.
-
-1. In the [Azure portal](https://portal.azure.com), go to the Azure Databricks service that you created, and select **Launch Workspace**.
+## Create an Azure Databricks workspace, cluster, and notebook
 
-2. In the sidebar, select **Workspace**.
+1. Create an Azure Databricks workspace. See [Create an Azure Databricks workspace](/azure/databricks/getting-started/#--create-an-azure-databricks-workspace).
 
-3. In the Workspace folder, select **Create > Notebook**.
+2. Create a cluster. See [Create a cluster](/azure/databricks/getting-started/quick-start#step-1-create-a-cluster).
 
-> [!div class="mx-imgBorder"]
-> ![Screenshot of create notebook option.](./media/data-lake-storage-use-databricks-spark/create-notebook.png)
+3. Create a notebook. See [Create a notebook](/azure/databricks/notebooks/notebooks-manage#--create-a-notebook). Choose Python as the default language of the notebook.
 
-4. In the **Create Notebook** dialog, enter a name and then select **Python** in the **Default Language** drop-down list. This selection determines the default language of the notebook.
+## Create a container and mount it
 
-5. In the **Cluster** drop-down list, make sure that the cluster you created earlier is selected.
+1. In the **Cluster** drop-down list, make sure that the cluster you created earlier is selected.
 
-6. Click **Create**. The notebook opens with an empty cell at the top.
+2. Click **Create**. The notebook opens with an empty cell at the top.
 
-7. Copy and paste the following code block into the first cell, but don't run this code yet.
+3. Copy and paste the following code block into the first cell, but don't run this code yet.
 
 ```python
 configs = {"fs.azure.account.auth.type": "OAuth",
@@ -113,9 +104,9 @@ In this section, you'll create a container and a folder in your storage account.
 extra_configs = configs)
 ```
 
-18. In this code block, replace the `appId`, `clientSecret`, `tenant`, and `storage-account-name` placeholder values in this code block with the values that you collected while completing the prerequisites of this tutorial. Replace the `container-name` placeholder value with the name of the container.
+4. In this code block, replace the `appId`, `clientSecret`, `tenant`, and `storage-account-name` placeholder values in this code block with the values that you collected while completing the prerequisites of this tutorial. Replace the `container-name` placeholder value with the name of the container.
 
-19. Press the **SHIFT + ENTER** keys to run the code in this block.
+5. Press the **SHIFT + ENTER** keys to run the code in this block.
 
 Keep this notebook open as you will add commands to it later.
 
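
Reviewer note: the placeholder substitution described in the renumbered step 4 could be sketched roughly as follows. This is an illustrative sketch, not the tutorial's code. The diff shows only the first `configs` entry and the final `extra_configs = configs)` line, so the remaining keys below are an assumption based on the standard Hadoop ABFS OAuth client-credentials settings. The helper name `build_mount_args`, the sample values, and the mount point in the trailing comment are hypothetical.

```python
def build_mount_args(app_id, client_secret, tenant, storage_account, container):
    """Return (source, configs) suitable for passing to dbutils.fs.mount.

    Hypothetical helper: only the first config key appears in the diff;
    the other keys are assumed from the standard ABFS OAuth settings.
    """
    values = {
        "appId": app_id,
        "clientSecret": client_secret,
        "tenant": tenant,
        "storage-account-name": storage_account,
        "container-name": container,
    }
    # Reject values that are empty or still look like "<placeholder>".
    for name, value in values.items():
        if not value or (value.startswith("<") and value.endswith(">")):
            raise ValueError(f"placeholder '{name}' has not been replaced")

    configs = {
        "fs.azure.account.auth.type": "OAuth",
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": app_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant}/oauth2/token",
    }
    source = f"abfss://{container}@{storage_account}.dfs.core.windows.net/"
    return source, configs


# Dummy values for illustration only; use the values saved during the prerequisites.
source, configs = build_mount_args(
    "my-app-id", "my-secret", "my-tenant-id", "mystorageacct", "my-container"
)
print(source)

# Inside a Databricks notebook, where `dbutils` is predefined, the mount call
# would then look something like this (the mount point is a hypothetical name):
# dbutils.fs.mount(source=source, mount_point="/mnt/flightdata", extra_configs=configs)
```

Failing early on an unreplaced placeholder avoids a mount attempt with values that cannot authenticate.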
