docs/resources/job.md
@@ -10,7 +10,7 @@ The `databricks_job` resource allows you to manage [Databricks Jobs](https://doc
-> **Note** In Terraform configuration, it is recommended to define tasks in alphabetical order of their `task_key` arguments, so that you get a consistent and readable diff. Whenever tasks are added or removed, or `task_key` is renamed, you'll observe a change in the majority of tasks. This is because the current version of the provider treats `task` blocks as an ordered list. Alternatively, the `task` block could have been an unordered set, though end-users would then see the entire block replaced upon a change in a single property of the task.
It is possible to create [a Databricks job](https://docs.databricks.com/data-engineering/jobs/jobs-user-guide.html) using `task` blocks. A single task is defined with a `task` block containing one of the `*_task` blocks, a `task_key`, and additional arguments described below.
```hcl
resource "databricks_job" "this" {
  # ...
}
```
@@ -88,13 +88,44 @@ The resource supports the following arguments:
* `library` - (Optional) (Set) An optional list of libraries to be installed on the cluster that will execute the job. Please consult the [libraries section](cluster.md#libraries) of the [databricks_cluster](cluster.md) resource.
* `retry_on_timeout` - (Optional) (Bool) An optional policy to specify whether to retry a job when it times out. The default behavior is to not retry on timeout.
* `max_retries` - (Optional) (Integer) An optional maximum number of times to retry an unsuccessful run. A run is considered unsuccessful if it completes with a `FAILED` or `INTERNAL_ERROR` lifecycle state. The value -1 means to retry indefinitely and the value 0 means to never retry. The default behavior is to never retry.
* `timeout_seconds` - (Optional) (Integer) An optional timeout applied to each run of this job. The default behavior is to have no timeout.
* `min_retry_interval_millis` - (Optional) (Integer) An optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried.
* `max_concurrent_runs` - (Optional) (Integer) An optional maximum allowed number of concurrent runs of the job. Defaults to *1*.
* `email_notifications` - (Optional) (List) An optional set of email addresses notified when runs of this job begin, complete, or fail. The default behavior is to not send any emails. This field is a block and is [documented below](#email_notifications-configuration-block).
* `webhook_notifications` - (Optional) (List) An optional set of system destinations (for example, webhook destinations or Slack) to be notified when runs of this job begin, complete, or fail. The default behavior is to not send any notifications. This field is a block and is documented below.
* `notification_settings` - (Optional) An optional block controlling the notification settings on the job level (described below).
* `schedule` - (Optional) (List) An optional periodic schedule for this job. The default behavior is that the job runs when triggered by clicking Run Now in the Jobs UI or sending an API request to `runNow`. This field is a block and is documented below.
* `health` - (Optional) An optional block that specifies the health conditions for the job (described below).
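As a rough sketch, here is how several of these job-level arguments fit together; the resource name, schedule, and email address are hypothetical:

```hcl
resource "databricks_job" "nightly" {
  name                = "Nightly ETL" # hypothetical job name
  max_concurrent_runs = 1
  timeout_seconds     = 3600

  # runs daily at 02:00 UTC (hypothetical schedule)
  schedule {
    quartz_cron_expression = "0 0 2 * * ?"
    timezone_id            = "UTC"
  }

  email_notifications {
    on_failure = ["ops@example.com"] # hypothetical address
  }

  # task blocks omitted for brevity
}
```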
### task Configuration Block
This block describes individual tasks:

* `task_key` - (Required) string specifying a unique key for a given task.
* `*_task` - (Required) one of the specific task blocks described below:
  * `dbt_task`
  * `notebook_task`
  * `pipeline_task`
  * `python_wheel_task`
  * `spark_jar_task`
  * `spark_python_task`
  * `spark_submit_task`
  * `sql_task`
* `library` - (Optional) (Set) An optional list of libraries to be installed on the cluster that will execute the job. Please consult the [libraries section](cluster.md#libraries) of the [databricks_cluster](cluster.md) resource.
* `depends_on` - (Optional) block specifying dependencies for a given task.
* `retry_on_timeout` - (Optional) (Bool) An optional policy to specify whether to retry a job when it times out. The default behavior is to not retry on timeout.
* `max_retries` - (Optional) (Integer) An optional maximum number of times to retry an unsuccessful run. A run is considered unsuccessful if it completes with a `FAILED` or `INTERNAL_ERROR` lifecycle state. The value -1 means to retry indefinitely and the value 0 means to never retry. The default behavior is to never retry. A run can have one of the following lifecycle states: `PENDING`, `RUNNING`, `TERMINATING`, `TERMINATED`, `SKIPPED`, or `INTERNAL_ERROR`.
* `timeout_seconds` - (Optional) (Integer) An optional timeout applied to each run of this job. The default behavior is to have no timeout.
* `min_retry_interval_millis` - (Optional) (Integer) An optional minimal interval in milliseconds between the start of the failed run and the subsequent retry run. The default behavior is that unsuccessful runs are immediately retried.
* `email_notifications` - (Optional) (List) An optional set of email addresses notified when runs of this job begin, complete, or fail. The default behavior is to not send any emails. This field is a block and is [documented below](#email_notifications-configuration-block).
* `health` - (Optional) block described below that specifies health conditions for a given task.
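For illustration, a minimal sketch of a single `task` block inside a `databricks_job` resource; the task key, notebook path, and cluster reference are hypothetical:

```hcl
task {
  task_key = "ingest" # hypothetical task key

  notebook_task {
    notebook_path = "/Shared/ingest" # hypothetical notebook path
  }

  existing_cluster_id = databricks_cluster.shared.id # hypothetical cluster reference

  max_retries               = 2
  min_retry_interval_millis = 60000
  retry_on_timeout          = true
}
```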
### depends_on Configuration Block
This block describes dependencies of a given task:

* `task_key` - (Required) The name of the task this task depends on.
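A sketch of two tasks wired together with `depends_on`; the task keys and notebook path are hypothetical:

```hcl
task {
  task_key = "transform" # hypothetical downstream task

  depends_on {
    task_key = "ingest" # must match the upstream task's task_key
  }

  notebook_task {
    notebook_path = "/Shared/transform" # hypothetical notebook path
  }
}
```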
### tags Configuration Map
`tags` - (Optional) (Map) An optional map of the tags associated with the job. Specified tags will be used as cluster tags for job clusters.
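For example, a sketch of a `tags` map inside the job resource; keys and values are hypothetical:

```hcl
tags = {
  environment = "staging"  # hypothetical tag
  team        = "data-eng" # hypothetical tag
}
```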
### job_cluster Configuration Block

[Shared job cluster](https://docs.databricks.com/jobs.html#use-shared-job-clusters) specification. Allows multiple tasks in the same job run to reuse the cluster.
@@ -172,6 +201,7 @@ This block is used to specify Git repository information & branch/tag/commit tha
* `on_start` - (Optional) (List) list of emails to notify when the run starts.
* `on_success` - (Optional) (List) list of emails to notify when the run completes successfully.
* `on_failure` - (Optional) (List) list of emails to notify when the run fails.
* `on_duration_warning_threshold_exceeded` - (Optional) (List) list of emails to notify when the duration of a run exceeds the threshold specified by the `RUN_DURATION_SECONDS` metric in the `health` block.
* `no_alert_for_skipped_runs` - (Optional) (Bool) don't send alert for skipped runs. (It's recommended to use the corresponding setting in the `notification_settings` configuration block.)
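A sketch of an `email_notifications` block using these arguments; the addresses are hypothetical:

```hcl
email_notifications {
  on_start   = ["team@example.com"]   # hypothetical address
  on_failure = ["oncall@example.com"] # hypothetical address

  no_alert_for_skipped_runs = true
}
```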
### webhook_notifications Configuration Block
@@ -181,6 +211,7 @@ Each entry in `webhook_notification` block takes a list `webhook` blocks. The fi
* `on_start` - (Optional) (List) list of notification IDs to call when the run starts. A maximum of 3 destinations can be specified.
* `on_success` - (Optional) (List) list of notification IDs to call when the run completes successfully. A maximum of 3 destinations can be specified.
* `on_failure` - (Optional) (List) list of notification IDs to call when the run fails. A maximum of 3 destinations can be specified.
* `on_duration_warning_threshold_exceeded` - (Optional) (List) list of notification IDs to call when the duration of a run exceeds the threshold specified by the `RUN_DURATION_SECONDS` metric in the `health` block.
Note that the `id` is not to be confused with the name of the alert destination. The `id` can be retrieved through the API or from the URL of the Databricks UI: `https://<workspace host>/sql/destinations/<notification id>?o=<workspace id>`.
@@ -200,13 +231,30 @@ webhook_notifications {
-> **Note** The following configuration blocks can be standalone or nested inside a `task` block.

This block controls notification settings for both email and webhook notifications at the task level:
* `no_alert_for_skipped_runs` - (Optional) (Bool) don't send alert for skipped runs.
* `no_alert_for_canceled_runs` - (Optional) (Bool) don't send alert for cancelled runs.
* `alert_on_last_attempt` - (Optional) (Bool) do not send notifications to recipients specified in `on_start` for the retried runs and do not send notifications to recipients specified in `on_failure` until the last retry of the run.
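A sketch of a `notification_settings` block (per the note above, it can be standalone or nested inside a `task` block); the values are illustrative:

```hcl
notification_settings {
  no_alert_for_skipped_runs  = true
  no_alert_for_canceled_runs = true
  alert_on_last_attempt      = true
}
```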
### health Configuration Block
This block describes health conditions for a given job or an individual task. It consists of the following attributes:
* `rules` - (List) list of rules that are represented as objects with the following attributes:
  * `metric` - (Optional) string specifying the metric to check. The only supported metric is `RUN_DURATION_SECONDS` (check the [Jobs REST API documentation](https://docs.databricks.com/api/workspace/jobs/create) for the latest information).
  * `op` - (Optional) string specifying the operation used to evaluate the given metric. The only supported operation is `GREATER_THAN`.
  * `value` - (Optional) integer value used to compare to the given metric.
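A sketch of a `health` block that alerts when a run exceeds one hour; the threshold is illustrative:

```hcl
health {
  rules {
    metric = "RUN_DURATION_SECONDS"
    op     = "GREATER_THAN"
    value  = 3600 # one hour, illustrative threshold
  }
}
```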
### spark_jar_task Configuration Block
* `parameters` - (Optional) (List) Parameters passed to the main method.