1. Update the **Transformation** notebook with your storage connection information.

   In the imported notebook, go to **command 5** as shown in the following code snippet.

   - Replace `<storage name>` and `<access key>` with your own storage connection information.
   - Use the storage account created earlier that contains the `sinkdata` container.

   ```python
   # Supply storageName and accessKey values
   # ...
   print(e)  # Otherwise print the whole stack trace.
   ```
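The replacement in command 5 comes down to two string values that feed the Spark storage configuration. As a minimal sketch (the variable names `storageName` and `accessKey` come from the snippet above; the configuration-key and `wasbs://` path formats are standard Azure Blob storage conventions, shown here with placeholder values):

```python
# Placeholder values: replace with your own storage connection information.
storageName = "mystorageaccount"
accessKey = "myaccesskey"

# The sink container created in the prerequisites.
containerName = "sinkdata"

# Spark configuration key and WASB path for the sink container.
confKey = "fs.azure.account.key.{}.blob.core.windows.net".format(storageName)
sinkPath = "wasbs://{}@{}.blob.core.windows.net/".format(containerName, storageName)

print(confKey)
print(sinkPath)
```

In a Databricks notebook, `confKey` and `accessKey` would then be passed to `spark.conf.set` so Spark can read and write the `sinkdata` container.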
1. Generate a **Databricks access token** for Data Factory to access Databricks.

   1. In your Databricks workspace, select your user profile icon in the upper right.

   **Save the access token** for later use in creating a Databricks linked service. The access token looks something like `dapi32db32cbb4w6eee18b7d87e45exxxxxx`.
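The token is an opaque string, so a quick sanity check before pasting it into a linked service can catch a truncated copy. A small sketch, using the placeholder token from the step above (the prefix-and-length check is an illustrative assumption, not a documented token format guarantee):

```python
import re

# The placeholder token from the step above; real tokens differ.
token = "dapi32db32cbb4w6eee18b7d87e45exxxxxx"

def looks_like_databricks_token(t):
    # Personal access tokens start with "dapi" followed by a long
    # lowercase alphanumeric string (illustrative check only).
    return re.fullmatch(r"dapi[0-9a-z]{30,}", t) is not None

print(looks_like_databricks_token(token))
```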
## How to use this template
1. Go to the **Transformation with Azure Databricks** template and create new linked services for the following connections.

   - **Source Blob Connection** – to access the source data.

     For this exercise, you can use the public blob storage that contains the source files. Reference the following screenshot for configuration. Use the following **SAS URL** to connect to the source storage (read-only access):

   - **Azure Databricks** – to connect to the Databricks cluster.

     Create a Databricks linked service by using the access key that you generated earlier. You may opt to select an *interactive cluster* if you have one. This example uses the *New job cluster* option.

1. Select **Use this template**. You'll see a pipeline created.
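For reference, the Azure Databricks linked service behind the *New job cluster* option is, in JSON form, roughly shaped like the dictionary below. This is a hedged sketch: the property names follow the Data Factory `AzureDatabricks` linked service type, but the domain, token, and cluster values are placeholders you must supply yourself.

```python
import json

# Sketch of an AzureDatabricks linked service payload (placeholder values).
linked_service = {
    "name": "AzureDatabricksLinkedService",
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://<your-region>.azuredatabricks.net",
            "accessToken": {
                "type": "SecureString",
                "value": "<your access token>",
            },
            # New job cluster settings (example sizes; adjust as needed).
            "newClusterVersion": "<runtime version>",
            "newClusterNumOfWorker": "1",
            "newClusterNodeType": "Standard_DS3_v2",
        },
    },
}

print(json.dumps(linked_service, indent=2))
```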
## Pipeline introduction and configuration
In the new pipeline, most settings are configured automatically with default values. Review the configurations of your pipeline and make any necessary changes:

- A _Validation_ activity **Availability flag** is created for checking the source. The Dataset value should be set to *SourceAvailabilityDataset*, which was created earlier.
- A _Copy data_ activity **file-to-blob** is created for copying the dataset from the source to the sink. Check the source and sink tabs to change these settings.

1. Select the **Settings** tab. For *Notebook path*, the template defines a path by default. You may need to browse and select the correct notebook path uploaded in **Prerequisite** 2.
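Besides the notebook path, a Notebook activity's **Settings** tab can pass base parameters into the notebook, where they are read with Databricks widgets (`dbutils.widgets.get`). A hedged sketch of that pattern, with an illustrative parameter name and a fallback so the function also runs outside Databricks:

```python
def get_notebook_param(name, default, dbutils=None):
    """Read a base parameter passed by a Data Factory Notebook activity.

    Inside Databricks, dbutils.widgets.get(name) returns the value;
    outside Databricks (or when the widget is absent), fall back to a default.
    """
    if dbutils is None:
        return default
    try:
        return dbutils.widgets.get(name)
    except Exception:
        return default

# "input" is an illustrative parameter name, not one defined by this template.
input_path = get_notebook_param("input", "wasbs://sinkdata@<storage name>.blob.core.windows.net/")
print(input_path)
```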