Commit ac02d53

Merge pull request #171718 from Clare-Zheng82/0908-Update_Databricks_doc_and_resolve_review_comments
Update Databricks notebook document and resolve non-blocking issues in PR 171347
2 parents 181ef35 + dffcbdd commit ac02d53

3 files changed (+12, -12 lines changed)

articles/data-factory/transform-data-using-databricks-notebook.md

Lines changed: 12 additions & 12 deletions
@@ -7,7 +7,7 @@ ms.topic: tutorial
 ms.author: abnarain
 author: nabhishek
 ms.custom: seo-lt-2019
-ms.date: 08/31/2021
+ms.date: 09/08/2021
 ---
 
 # Run a Databricks notebook with the Databricks Notebook Activity in Azure Data Factory
@@ -104,7 +104,7 @@ In this section, you author a Databricks linked service. This linked service con
 
 1. For **Access Token**, generate it from Azure Databricks workplace. You can find the steps [here](https://docs.databricks.com/api/latest/authentication.html#generate-token).
 
-1. For **Cluster version**, select **4.2** (with Apache Spark 2.3.1, Scala 2.11).
+1. For **Cluster version**, select the version you want to use.
 
 1. For **Cluster node type**, select **Standard\_D3\_v2** under **General Purpose (HDD)** category for this tutorial.
 
@@ -144,13 +144,13 @@ In this section, you author a Databricks linked service. This linked service con
 
 1. Create a **New Folder** in Workplace and call it as **adftutorial**.
 
-![Screenshot showing how to create a new folder.](media/transform-data-using-databricks-notebook/databricks-notebook-activity-image13.png)
+:::image type="content" source="media/transform-data-using-databricks-notebook/databricks-notebook-activity-image13.png" alt-text="Screenshot showing how to create a new folder.":::
 
 1. [Screenshot showing how to create a new notebook.](https://docs.databricks.com/user-guide/notebooks/index.html#creating-a-notebook) (Python), let’s call it **mynotebook** under **adftutorial** Folder, click **Create.**
 
-![Screenshot showing how to create a new notebook.](media/transform-data-using-databricks-notebook/databricks-notebook-activity-image14.png)
+:::image type="content" source="media/transform-data-using-databricks-notebook/databricks-notebook-activity-image14.png" alt-text="Screenshot showing how to create a new notebook.":::
 
-![Screenshot showing how to set the properties of the new notebook.](media/transform-data-using-databricks-notebook/databricks-notebook-activity-image15.png)
+:::image type="content" source="media/transform-data-using-databricks-notebook/databricks-notebook-activity-image15.png" alt-text="Screenshot showing how to set the properties of the new notebook.":::
 
 1. In the newly created notebook "mynotebook'" add the following code:
 
@@ -163,7 +163,7 @@ In this section, you author a Databricks linked service. This linked service con
 print (y)
 ```
 
-![Screenshot showing how to create widgets for parameters.](media/transform-data-using-databricks-notebook/databricks-notebook-activity-image16.png)
+:::image type="content" source="media/transform-data-using-databricks-notebook/databricks-notebook-activity-image16.png" alt-text="Screenshot showing how to create widgets for parameters.":::
 
 1. The **Notebook Path** in this case is **/adftutorial/mynotebook**.
 
@@ -197,25 +197,25 @@ The **Pipeline run** dialog box asks for the **name** parameter. Use **/path/fil
 
 1. Switch to the **Monitor** tab. Confirm that you see a pipeline run. It takes approximately 5-8 minutes to create a Databricks job cluster, where the notebook is executed.
 
-![Screenshot showing how to monitor the pipeline.](media/transform-data-using-databricks-notebook/databricks-notebook-activity-image22.png)
+:::image type="content" source="media/transform-data-using-databricks-notebook/databricks-notebook-activity-image-22.png" alt-text="Screenshot showing how to monitor the pipeline.":::
 
 1. Select **Refresh** periodically to check the status of the pipeline run.
 
-1. To see activity runs associated with the pipeline run, select **View Activity Runs** in the **Actions** column.
+1. To see activity runs associated with the pipeline run, select **pipeline1** link in the **Pipeline name** column.
 
-![Screenshot showing how to view the activity runs.](media/transform-data-using-databricks-notebook/databricks-notebook-activity-image23.png)
+1. In the **Activity runs** page, select **Output** in the **Activity name** column to view the output of each activity, and you can find the link to Databricks logs in the **Output** pane for more detailed Spark logs.
 
-You can switch back to the pipeline runs view by selecting the **Pipelines** link at the top.
+1. You can switch back to the pipeline runs view by selecting the **All pipeline runs** link in the breadcrumb menu at the top.
 
 ## Verify the output
 
 You can log on to the **Azure Databricks workspace**, go to **Clusters** and you can see the **Job** status as *pending execution, running, or terminated*.
 
-![Screenshot showing how to view the job cluster and the job.](media/transform-data-using-databricks-notebook/databricks-notebook-activity-image24.png)
+:::image type="content" source="media/transform-data-using-databricks-notebook/databricks-notebook-activity-image24.png" alt-text="Screenshot showing how to view the job cluster and the job.":::
 
 You can click on the **Job name** and navigate to see further details. On successful run, you can validate the parameters passed and the output of the Python notebook.
 
-![Screenshot showing how to view the run details and output.](media/transform-data-using-databricks-notebook/databricks-output.png)
+:::image type="content" source="media/transform-data-using-databricks-notebook/databricks-output.png" alt-text="Screenshot showing how to view the run details and output.":::
 
 ## Next steps
 
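
Most of the changed lines in this file swap the standard Markdown image syntax `![alt](path)` for the `:::image type="content" ...:::` form. The commit does not say how the conversion was done. As a rough illustration only, the pattern could be sketched in Python as below. The helper name `convert_image_syntax` and the regular expression are assumptions made for this example, not something taken from the commit or from any docs tooling.

```python
import re

# Matches Markdown images of the form ![alt text](path/to/image.png).
# Plain links such as [text](url) have no leading "!" and are left alone.
MD_IMAGE = re.compile(r'!\[(?P<alt>[^\]]*)\]\((?P<src>[^)\s]+)\)')


def convert_image_syntax(line: str) -> str:
    """Rewrite Markdown image links on one line as :::image::: directives.

    Hypothetical helper for illustration; it copies the path and the
    alt text across unchanged.
    """
    def to_directive(match: re.Match) -> str:
        src = match.group("src")
        alt = match.group("alt")
        return f':::image type="content" source="{src}" alt-text="{alt}":::'

    return MD_IMAGE.sub(to_directive, line)
```

A sketch like this copies the image path verbatim, so it would not produce the one filename change in this diff (`databricks-notebook-activity-image22.png` to `databricks-notebook-activity-image-22.png`), which appears to be a separate, deliberate edit.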
