Commit 5f58429

Merge pull request #298825 from whhender/databricks-jobs-activity

Adding job activity article

2 parents 845fd9e + 4acbab5

File tree

2 files changed: +78 −0 lines changed


articles/data-factory/TOC.yml

Lines changed: 2 additions & 0 deletions

```diff
@@ -699,6 +699,8 @@ items:
       href: transform-data-using-custom-activity.md
     - name: Databricks Jar activity
       href: transform-data-databricks-jar.md
+    - name: Databricks Job activity
+      href: transform-data-databricks-job.md
       displayName: data bricks
     - name: Databricks Notebook activity
       href: transform-data-databricks-notebook.md
```
articles/data-factory/transform-data-databricks-job.md

Lines changed: 76 additions & 0 deletions (new file)
---
title: Transform data with Databricks Job
titleSuffix: Azure Data Factory & Azure Synapse
description: Learn how to process or transform data by running a Databricks job in Azure Data Factory pipelines.
ms.custom: synapse
author: n0elleli
ms.author: noelleli
ms.reviewer: whhender
ms.topic: how-to
ms.date: 04/24/2025
ms.subservice: orchestration
---
# Transform data by running a Databricks job

[!INCLUDE[appliesto-adf-asa-md](includes/appliesto-adf-xxx-md.md)]

The Azure Databricks Job activity (Preview) in a [pipeline](concepts-pipelines-activities.md) runs Databricks jobs in your Azure Databricks workspace, including serverless jobs. This article builds on the [data transformation activities](transform-data.md) article, which presents a general overview of data transformation and the supported transformation activities. Azure Databricks is a managed platform for running Apache Spark.

You can create a Databricks Job activity directly through the Azure Data Factory Studio user interface.

> [!IMPORTANT]
> The Azure Databricks Job activity is currently in preview. This information relates to a prerelease product that may be substantially modified before it's released. Microsoft makes no warranties, expressed or implied, with respect to the information provided here.

## Add a Job activity for Azure Databricks to a pipeline with UI

To use a Job activity for Azure Databricks in a pipeline, complete the following steps:

1. Search for _Job_ in the pipeline **Activities** pane, and drag a Job activity to the pipeline canvas.
1. Select the new Job activity on the canvas if it isn't already selected.
1. Select the **Azure Databricks** tab to select or create a new Azure Databricks linked service that executes the Job activity.
1. Select the **Settings** tab and specify the job to be executed on Azure Databricks, optional base parameters to be passed to the job, and any other libraries to be installed on the cluster to execute the job.
33+
34+
## Databricks Job activity definition

Here's the sample JSON definition of a Databricks Job activity:

```json
{
    "activity": {
        "name": "MyActivity",
        "description": "MyActivity description",
        "type": "DatabricksJob",
        "linkedServiceName": {
            "referenceName": "MyDatabricksLinkedservice",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "jobId": "012345678910112",
            "jobParameters": {
                "testParameter": "testValue"
            }
        }
    }
}
```
57+
58+
## Databricks Job activity properties

The following table describes the JSON properties used in the JSON definition:

|Property|Description|Required|
|---|---|---|
|name|Name of the activity in the pipeline.|Yes|
|description|Text describing what the activity does.|No|
|type|For the Databricks Job activity, the activity type is DatabricksJob.|Yes|
|linkedServiceName|Name of the Databricks linked service on which the Databricks job runs. To learn about this linked service, see the [Compute linked services](compute-linked-services.md) article.|Yes|
|jobId|The ID of the job to be run in the Databricks workspace.|Yes|
|jobParameters|Key-value pairs passed to the job for each activity run. If the job takes a parameter that isn't specified, the default value defined in the job is used. Find more on parameters in [Databricks Jobs](https://docs.databricks.com/api/latest/jobs.html#jobsparampair).|No|
71+
72+
73+
## Passing parameters between jobs and pipelines
74+
75+
You can pass parameters to jobs using *jobParameters* property in Databricks activity.
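For example, here's a minimal sketch of a *typeProperties* block that forwards a pipeline parameter to the job at run time. The pipeline parameter name `inputPath` and the job parameter name `source_path` are hypothetical illustrations, not names defined by the service:

```json
{
    "typeProperties": {
        "jobId": "012345678910112",
        "jobParameters": {
            "source_path": "@pipeline().parameters.inputPath"
        }
    }
}
```

The `@pipeline().parameters.inputPath` expression is evaluated when the activity runs, so each pipeline run can pass a different value to the same Databricks job.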
76+