Description
The databricks_job Terraform resource rejects job configurations in which a SQL task's output feeds a for_each task, even though the exact same configuration works when created directly via the Databricks Jobs API or the UI.
This points to a validation limitation in the Terraform provider that does not exist in the underlying Databricks API.
Error Message
Error: cannot create job: Unsupported task type for the 'output' namespace.
with module.jobs["test_job"].databricks_job.job,
on .terraform/modules/jobs/modules/jobs/main.tf line 25, in resource "databricks_job" "job":
25: resource "databricks_job" "job" {
Terraform Version
Terraform v1.13
Databricks Terraform Provider Version
databricks/databricks v1.96.0 (latest at the time of writing)
Reproduction Steps
Working Configuration (via Databricks UI/Jobs API)
Create a job with these tasks via the Databricks UI or REST API:
- SQL Task (sql_generate_rows): Returns rows from a query
- For Each Task (for_each_task): Consumes the SQL task output
Working JSON (REST API):
{
  "name": "Test SQL to ForEach",
  "tasks": [
    {
      "task_key": "sql_generate_rows",
      "sql_task": {
        "query": {
          "query": "SELECT variables, parameters FROM my_table"
        },
        "sql_warehouse_name": "my_warehouse"
      },
      "timeout_seconds": 3600
    },
    {
      "task_key": "for_each_task",
      "depends_on": [
        {
          "task_key": "sql_generate_rows"
        }
      ],
      "for_each_task": {
        "task_key": "nested_notebook",
        "inputs": "{{tasks.sql_generate_rows.output.rows}}",
        "concurrency": 2,
        "notebook_task": {
          "notebook_path": "/path/to/notebook",
          "base_parameters": {
            "variables": "{{input.variables}}",
            "parameters": "{{input.parameters}}"
          }
        }
      }
    }
  ]
}
Failing Configuration (via Terraform)
The equivalent Terraform HCL configuration:
resource "databricks_job" "test_job" {
name = "Test SQL to ForEach"
task {
task_key = "sql_generate_rows"
timeout_seconds = 3600
sql_task {
query {
query = "SELECT variables, parameters FROM my_table"
}
sql_warehouse_name = "my_warehouse"
}
}
task {
task_key = "for_each_task"
depends_on {
task_key = "sql_generate_rows"
}
for_each_task {
inputs = "{{tasks.sql_generate_rows.output.rows}}"
concurrency = 2
task {
task_key = "nested_notebook"
notebook_task {
notebook_path = "/path/to/notebook"
base_parameters = {
variables = "{{input.variables}}"
parameters = "{{input.parameters}}"
}
}
}
}
}
}Error: Error: cannot create job: Unsupported task type for the 'output' namespace.
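For completeness, the failure reproduces with a plain init/apply cycle. This is a minimal sketch, assuming the HCL above is saved as main.tf and workspace authentication is configured through the usual DATABRICKS_HOST / DATABRICKS_TOKEN environment variables:
# Run from a directory containing the main.tf shown above
# (assumes DATABRICKS_HOST / DATABRICKS_TOKEN are set for the target workspace).
terraform init
terraform apply
# Fails during job creation with:
# Error: cannot create job: Unsupported task type for the 'output' namespace.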
Expected Behavior
The Terraform provider should accept SQL task output in for_each task inputs, just like the Jobs API does. The configuration should apply successfully.
Actual Behavior
The Terraform provider validates the configuration locally and rejects it with an "Unsupported task type for the 'output' namespace" error, preventing job creation even though the API would accept it.
Investigation Findings
- The Jobs API supports this: Creating the exact same job configuration via POST /api/2.2/jobs/create works perfectly (see the curl sketch after this list)
- The Databricks UI supports this: Jobs created in the UI with SQL task outputs feeding for_each tasks run successfully
- The Terraform provider blocks it: The HCL translation of this working configuration is rejected during validation
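To check the API-side behaviour independently of Terraform, the working JSON above can be posted straight to the Jobs API. A minimal sketch, assuming the JSON is saved as job.json and that DATABRICKS_HOST and DATABRICKS_TOKEN hold the workspace URL and a valid personal access token:
# Create the job directly via the Jobs API using the JSON shown above.
# DATABRICKS_HOST and DATABRICKS_TOKEN are assumed to be set for the target workspace.
curl -s -X POST "${DATABRICKS_HOST}/api/2.2/jobs/create" \
  -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
  -H "Content-Type: application/json" \
  --data @job.json
# Succeeds and returns a job_id, while the equivalent HCL is rejected by the provider.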
Root Cause
This appears to be a validation limitation in the Terraform provider's schema definition for the databricks_job resource, not an API limitation. The provider's validation layer doesn't recognize SQL tasks as valid sources for the output namespace in for_each task configurations.
Workarounds
No workaround found so far.
Additional Context
- This is a production blocker for users who want to manage SQL + ForEach workflows via Terraform, which is a common pattern.
- The feature works reliably in production via the REST API, so this is clearly a limitation of the provider.