articles/storage/blobs/data-lake-storage-use-databricks-spark.md
3 additions & 28 deletions
@@ -21,7 +21,6 @@ This tutorial shows you how to connect your Azure Databricks cluster to data sto
 In this tutorial, you will:
 
 > [!div class="checklist"]
-> - Create a Databricks cluster
 > - Ingest unstructured data into a storage account
 > - Run analytics on your data in Blob storage
 
@@ -45,38 +44,14 @@ If you don't have an Azure subscription, create a [free account](https://azure.m
 
 - An Azure Databricks cluster. See [Create a cluster](/azure/databricks/getting-started/quick-start#step-1-create-a-cluster).
 
-###Download the flight data
+## Download the flight data
 
 This tutorial uses flight data from the Bureau of Transportation Statistics to demonstrate how to perform an ETL operation. You must download this data to complete the tutorial.
 
 1. Download the [On_Time_Reporting_Carrier_On_Time_Performance_1987_present_2016_1.zip](https://github.com/Azure-Samples/AzureStorageSnippets/blob/master/blobs/tutorials/On_Time_Reporting_Carrier_On_Time_Performance_1987_present_2016_1.zip) file. This file contains the flight data.
 
 2. Unzip the contents of the zipped file and make a note of the file name and the path of the file. You need this information in a later step.
 
-## Create an Azure Databricks service
-
-In this section, you create an Azure Databricks service by using the Azure portal.
-
-1. In the Azure portal, select **Create a resource** > **Analytics** > **Azure Databricks**.
-
-
-
-2. Under **Azure Databricks Service**, provide the following values to create a Databricks service:
-
-|Property |Description |
-|---------|---------|
-|**Workspace name**| Provide a name for your Databricks workspace. |
-|**Subscription**| From the drop-down, select your Azure subscription. |
-|**Resource group**| Specify whether you want to create a new resource group or use an existing one. A resource group is a container that holds related resources for an Azure solution. For more information, see [Azure Resource Group overview](../../azure-resource-manager/management/overview.md). |
-|**Location**| Select **West US 2**. For other available regions, see [Azure services available by region](https://azure.microsoft.com/regions/services/). |
-|**Pricing Tier**| Select **Standard**. |
-
-
-
-3. The account creation takes a few minutes. To monitor the operation status, view the progress bar at the top.
-
-4. Select **Pin to dashboard** and then select **Create**.
-
 ## Ingest data
 
 ### Copy source data into the storage account
@@ -113,8 +88,8 @@ In this section, you'll create a container and a folder in your storage account.
 
 3. In the Workspace folder, select **Create > Notebook**.
 
-> [!div class="mx-imgBorder"]
-> 
+> [!div class="mx-imgBorder"]
+> 
 
 4. In the **Create Notebook** dialog, enter a name and then select **Python** in the **Default Language** drop-down list. This selection determines the default language of the notebook.