Commit 058508f

Updates for Databricks documentation changes
1 parent 482f9ee commit 058508f

2 files changed: +14 -13 lines changed

articles/data-factory/transform-data-databricks-notebook.md

Lines changed: 13 additions & 12 deletions
@@ -6,34 +6,35 @@ ms.custom: synapse
 author: nabhishek
 ms.author: abnarain
 ms.topic: conceptual
-ms.date: 10/03/2024
+ms.date: 01/16/2025
 ms.subservice: orchestration
 ---
 
 # Transform data by running a Databricks notebook
+
 [!INCLUDE[appliesto-adf-asa-md](includes/appliesto-adf-asa-md.md)]
 
 The Azure Databricks Notebook Activity in a [pipeline](concepts-pipelines-activities.md) runs a Databricks notebook in your Azure Databricks workspace. This article builds on the [data transformation activities](transform-data.md) article, which presents a general overview of data transformation and the supported transformation activities. Azure Databricks is a managed platform for running Apache Spark.
 
-You can create a Databricks notebook with an ARM template using JSON, or directly through the Azure Data Factory Studio user interface. For a step-by-step walkthrough of how to create a Databricks notebook activity using the user interface, reference the tutorial [Run a Databricks notebook with the Databricks Notebook Activity in Azure Data Factory](transform-data-using-databricks-notebook.md).
+You can create a Databricks notebook with an ARM template using JSON, or directly through the Azure Data Factory Studio user interface. For a step-by-step walkthrough of how to create a Databricks notebook activity using the user interface, reference the tutorial [Run a Databricks notebook with the Databricks Notebook Activity in Azure Data Factory](transform-data-using-databricks-notebook.md).
 
 ## Add a Notebook activity for Azure Databricks to a pipeline with UI
 
 To use a Notebook activity for Azure Databricks in a pipeline, complete the following steps:
 
 1. Search for _Notebook_ in the pipeline Activities pane, and drag a Notebook activity to the pipeline canvas.
-1. Select the new Notebook activity on the canvas if it is not already selected.
+1. Select the new Notebook activity on the canvas if it isn't already selected.
 1. Select the **Azure Databricks** tab to select or create a new Azure Databricks linked service that will execute the Notebook activity.
 
    :::image type="content" source="media/transform-data-databricks-notebook/notebook-activity.png" alt-text="Shows the UI for a Notebook activity.":::
 
-1. Select the **Settings** tab and specify the notebook path to be executed on Azure Databricks, optional base parameters to be passed to the notebook, and any additional libraries to be installed on the cluster to execute the job.
+1. Select the **Settings** tab and specify the notebook path to be executed on Azure Databricks, optional base parameters to be passed to the notebook, and any other libraries to be installed on the cluster to execute the job.
 
   :::image type="content" source="media/transform-data-databricks-notebook/notebook-settings.png" alt-text="Shows the UI for the Settings tab for a Notebook activity.":::
 
 ## Databricks Notebook activity definition
 
-Here is the sample JSON definition of a Databricks Notebook Activity:
+Here's the sample JSON definition of a Databricks Notebook Activity:
 
 ```json
 {
@@ -73,7 +74,7 @@ definition:
 |type|For Databricks Notebook Activity, the activity type is DatabricksNotebook.|Yes|
 |linkedServiceName|Name of the Databricks Linked Service on which the Databricks notebook runs. To learn about this linked service, see [Compute linked services](compute-linked-services.md) article.|Yes|
 |notebookPath|The absolute path of the notebook to be run in the Databricks Workspace. This path must begin with a slash.|Yes|
-|baseParameters|An array of Key-Value pairs. Base parameters can be used for each activity run. If the notebook takes a parameter that is not specified, the default value from the notebook will be used. Find more on parameters in [Databricks Notebooks](https://docs.databricks.com/api/latest/jobs.html#jobsparampair).|No|
+|baseParameters|An array of Key-Value pairs. Base parameters can be used for each activity run. If the notebook takes a parameter that isn't specified, the default value from the notebook will be used. Find more on parameters in [Databricks Notebooks](https://docs.databricks.com/api/latest/jobs.html#jobsparampair).|No|
 |libraries|A list of libraries to be installed on the cluster that will execute the job. It can be an array of \<string, object>.|No|
 
 ## Supported libraries for Databricks activities
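The sample JSON definition opened at the end of the first hunk is cut off by the diff context. For reference, a minimal DatabricksNotebook activity consistent with the properties table above might look like the sketch below; the linked service reference, notebook path, parameter names, and library path are illustrative placeholders, not values taken from this commit:

```json
{
    "activity": {
        "name": "MyNotebookActivity",
        "description": "Runs a Databricks notebook with parameters",
        "type": "DatabricksNotebook",
        "linkedServiceName": {
            "referenceName": "MyDatabricksLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "notebookPath": "/Users/user@example.com/mynotebook",
            "baseParameters": {
                "inputpath": "input/folder1/",
                "outputpath": "output/"
            },
            "libraries": [
                { "jar": "dbfs:/docs/library.jar" }
            ]
        }
    }
}
```

Here *notebookPath* begins with a slash as required, *baseParameters* supplies the key-value pairs the notebook can read, and *libraries* lists packages to install on the cluster, matching the properties described in the table.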
@@ -126,28 +127,28 @@ You can pass parameters to notebooks using *baseParameters* property in databricks activity.
 
 In certain cases, you might require to pass back certain values from notebook back to the service, which can be used for control flow (conditional checks) in the service or be consumed by downstream activities (size limit is 2 MB).
 
-1. In your notebook, you may call [dbutils.notebook.exit("returnValue")](/azure/databricks/notebooks/notebook-workflows#notebook-workflows-exit) and corresponding "returnValue" will be returned to the service.
+1. In your notebook, you can call [dbutils.notebook.exit("returnValue")](/azure/databricks/notebooks/notebook-workflows#python-1) and corresponding "returnValue" will be returned to the service.
 
 2. You can consume the output in the service by using expression such as `@{activity('databricks notebook activity name').output.runOutput}`.
 
 > [!IMPORTANT]
-> If you are passing JSON object you can retrieve values by appending property names. Example: `@{activity('databricks notebook activity name').output.runOutput.PropertyName}`
+> If you're passing JSON object, you can retrieve values by appending property names. Example: `@{activity('databricks notebook activity name').output.runOutput.PropertyName}`
 
 ## How to upload a library in Databricks
 
 ### You can use the Workspace UI:
 
-1. [Use the Databricks workspace UI](/azure/databricks/libraries/#create-a-library)
+1. [Use the Databricks workspace UI](/azure/databricks/libraries/cluster-libraries#install-a-library-on-a-cluster)
 
-2. To obtain the dbfs path of the library added using UI, you can use [Databricks CLI](/azure/databricks/dev-tools/cli/#install-the-cli).
+2. To obtain the dbfs path of the library added using UI, you can use [Databricks CLI](/azure/databricks/dev-tools/cli/fs-commands#list-the-contents-of-a-directory).
 
 Typically the Jar libraries are stored under dbfs:/FileStore/jars while using the UI. You can list all through the CLI: *databricks fs ls dbfs:/FileStore/job-jars*
 
 ### Or you can use the Databricks CLI:
 
-1. Follow [Copy the library using Databricks CLI](/azure/databricks/dev-tools/cli/#copy-a-file-to-dbfs)
+1. Follow [Copy the library using Databricks CLI](/azure/databricks/dev-tools/cli/fs-commands#copy-a-directory-or-a-file)
 
-2. Use Databricks CLI [(installation steps)](/azure/databricks/dev-tools/cli/#install-the-cli)
+2. Use Databricks CLI [(installation steps)](/azure/databricks/dev-tools/cli/commands#compute-commands)
 
 As an example, to copy a JAR to dbfs:
 `dbfs cp SparkPi-assembly-0.1.jar dbfs:/docs/sparkpi.jar`
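To make step 2 of the pass-values-back flow concrete: a downstream activity in the same pipeline can capture the value returned by `dbutils.notebook.exit` through the `runOutput` expression, for example in a Set Variable activity. The following is a hedged sketch, not part of this commit; the activity, variable, and notebook-activity names are placeholders:

```json
{
    "name": "CaptureNotebookOutput",
    "type": "SetVariable",
    "dependsOn": [
        {
            "activity": "databricks notebook activity name",
            "dependencyConditions": [ "Succeeded" ]
        }
    ],
    "typeProperties": {
        "variableName": "notebookReturnValue",
        "value": "@{activity('databricks notebook activity name').output.runOutput}"
    }
}
```

If the notebook exits with a JSON object, appending a property name (`.output.runOutput.PropertyName`) retrieves a single field, as the IMPORTANT note above describes.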

articles/data-factory/transform-data-using-databricks-notebook.md

Lines changed: 1 addition & 1 deletion
@@ -203,7 +203,7 @@ You can log on to the **Azure Databricks workspace**, go to **Job Runs** and you
 
 You can select the **Job name** and navigate to see further details. On successful run, you can validate the parameters passed and the output of the Python notebook.
 
-## Related content
+## Summary
 
 The pipeline in this sample triggers a Databricks Notebook activity and passes a parameter to it. You learned how to: