articles/storage/blobs/data-lake-storage-use-databricks-spark.md
+11 -20 (11 additions & 20 deletions)
@@ -7,7 +7,7 @@ author: normesta
7
7
ms.subservice: data-lake-storage-gen2
8
8
ms.service: storage
9
9
ms.topic: tutorial
10
-
ms.date: 02/01/2023
10
+
ms.date: 02/07/2023
11
11
ms.author: normesta
12
12
ms.reviewer: dineshm
13
13
ms.custom: devx-track-python, py-fresh-zinc
@@ -40,10 +40,6 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
40
40
41
41
See [Tutorial: Connect to Azure Data Lake Storage Gen2](/azure/databricks/getting-started/connect-to-azure-storage) (Steps 1 through 3). After completing these steps, make sure to paste the tenant ID, app ID, and client secret values into a text file. You'll need those soon.
42
42
43
-
- An Azure Databricks workspace. See [Create an Azure Databricks workspace](/azure/databricks/getting-started/#--create-an-azure-databricks-workspace).
44
-
45
-
- An Azure Databricks cluster. See [Create a cluster](/azure/databricks/getting-started/quick-start#step-1-create-a-cluster).
46
-
47
43
## Download the flight data
48
44
49
45
This tutorial uses flight data from the Bureau of Transportation Statistics to demonstrate how to perform an ETL operation. You must download this data to complete the tutorial.
@@ -78,26 +74,21 @@ Use AzCopy to copy data from your *.csv* file into your Data Lake Storage Gen2 a
78
74
79
75
- Replace the `<container-name>` placeholder with the name of a container in your storage account.
80
76
81
-
## Create a container and mount it
82
-
83
-
In this section, you'll create a container and a folder in your storage account.
84
-
85
-
1. In the [Azure portal](https://portal.azure.com), go to the Azure Databricks service that you created, and select **Launch Workspace**.
77
+
## Create an Azure Databricks workspace, cluster, and notebook
86
78
87
-
2. In the sidebar, select **Workspace**.
79
+
1. Create an Azure Databricks workspace. See [Create an Azure Databricks workspace](/azure/databricks/getting-started/#--create-an-azure-databricks-workspace).
88
80
89
-
3. In the Workspace folder, select **Create > Notebook**.
81
+
2. Create a cluster. See [Create a cluster](/azure/databricks/getting-started/quick-start#step-1-create-a-cluster).
90
82
91
-
> [!div class="mx-imgBorder"]
92
-
> 
83
+
3. Create a notebook. See [Create a notebook](/azure/databricks/notebooks/notebooks-manage#--create-a-notebook). Choose Python as the default language of the notebook.
93
84
94
-
4. In the **Create Notebook** dialog, enter a name and then select **Python** in the **Default Language** drop-down list. This selection determines the default language of the notebook.
85
+
## Create a container and mount it
95
86
96
-
5. In the **Cluster** drop-down list, make sure that the cluster you created earlier is selected.
87
+
1. In the **Cluster** drop-down list, make sure that the cluster you created earlier is selected.
97
88
98
-
6. Click **Create**. The notebook opens with an empty cell at the top.
89
+
2. Click **Create**. The notebook opens with an empty cell at the top.
99
90
100
-
7. Copy and paste the following code block into the first cell, but don't run this code yet.
91
+
3. Copy and paste the following code block into the first cell, but don't run this code yet.
101
92
102
93
```python
103
94
configs = {"fs.azure.account.auth.type": "OAuth",
@@ -113,9 +104,9 @@ In this section, you'll create a container and a folder in your storage account.
113
104
extra_configs= configs)
114
105
```
115
106
116
-
18. In this code block, replace the `appId`, `clientSecret`, `tenant`, and `storage-account-name` placeholder values with the values that you collected while completing the prerequisites of this tutorial. Replace the `container-name` placeholder value with the name of the container.
107
+
4. In this code block, replace the `appId`, `clientSecret`, `tenant`, and `storage-account-name` placeholder values with the values that you collected while completing the prerequisites of this tutorial. Replace the `container-name` placeholder value with the name of the container.
117
108
118
-
19. Press the **SHIFT+ENTER** keys to run the code in this block.
109
+
5. Press the **SHIFT+ENTER** keys to run the code in this block.
119
110
120
111
Keep this notebook open as you will add commands to it later.
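The placeholder substitution described in the steps above can be sketched as a small helper that assembles the mount arguments from the values collected in the prerequisites. This is a minimal sketch, not the tutorial's own code: the diff elides most of the original block, so only the `fs.azure.account.auth.type` key and the `extra_configs` argument are taken from it. The helper name `build_mount_args`, the remaining configuration keys, the token endpoint URL, the `folder1` path, and the `/mnt/flightdata` mount point are all assumptions chosen for illustration.

```python
def build_mount_args(app_id, client_secret, tenant, storage_account,
                     container, folder="folder1",
                     mount_point="/mnt/flightdata"):
    """Assemble keyword arguments for a Databricks mount call.

    Hypothetical helper: apart from the auth-type key, the keys, folder,
    and mount point below are assumptions, not taken from the diff.
    """
    configs = {
        # This key and value appear in the tutorial's code block.
        "fs.azure.account.auth.type": "OAuth",
        # Assumed OAuth client-credential settings for the ABFS driver.
        "fs.azure.account.oauth.provider.type":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        "fs.azure.account.oauth2.client.id": app_id,
        "fs.azure.account.oauth2.client.secret": client_secret,
        "fs.azure.account.oauth2.client.endpoint":
            f"https://login.microsoftonline.com/{tenant}/oauth2/token",
    }
    # abfss URI: container first, then the storage account's DFS endpoint.
    source = (f"abfss://{container}@{storage_account}"
              f".dfs.core.windows.net/{folder}")
    return {"source": source,
            "mount_point": mount_point,
            "extra_configs": configs}


# Example values only; replace them with your own.
args = build_mount_args("my-app-id", "my-secret", "my-tenant",
                        "mystorageacct", "mycontainer")
print(args["source"])
# Inside a Databricks notebook, where dbutils is predefined, the mount
# could then be performed with: dbutils.fs.mount(**args)
```

Building the arguments in one place keeps each placeholder confined to a single parameter, which makes a missed substitution easier to spot before the cell is run.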