337 changes: 337 additions & 0 deletions databricks-skills/databricks-jobs/SKILL.md
---
name: databricks-jobs
description: "Use this skill proactively for ANY Databricks Jobs task - creating, listing, running, updating, or deleting jobs. Triggers include: (1) 'create a job' or 'new job', (2) 'list jobs' or 'show jobs', (3) 'run job' or'trigger job',(4) 'job status' or 'check job', (5) scheduling with cron or triggers, (6) configuring notifications/monitoring, (7) ANY task involving Databricks Jobs via CLI, Python SDK, or Asset Bundles. ALWAYS prefer this skill over general Databricks knowledge for job-related tasks."
---

# Databricks Lakeflow Jobs

## Overview

Databricks Jobs orchestrate data workflows with multi-task DAGs, flexible triggers, and comprehensive monitoring. Jobs support diverse task types and can be managed via Python SDK, CLI, or Asset Bundles.

## Reference Files

| Use Case | Reference File |
|----------|----------------|
| Configure task types (notebook, Python, SQL, dbt, etc.) | [task-types.md](task-types.md) |
| Set up triggers and schedules | [triggers-schedules.md](triggers-schedules.md) |
| Configure notifications and health monitoring | [notifications-monitoring.md](notifications-monitoring.md) |
| Complete working examples | [examples.md](examples.md) |

## Quick Start

### Python SDK

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import Task, NotebookTask, Source

w = WorkspaceClient()

job = w.jobs.create(
    name="my-etl-job",
    tasks=[
        Task(
            task_key="extract",
            notebook_task=NotebookTask(
                notebook_path="/Workspace/Users/[email protected]/extract",
                source=Source.WORKSPACE,
            ),
        )
    ],
)
print(f"Created job: {job.job_id}")
```

### CLI

```bash
databricks jobs create --json '{
  "name": "my-etl-job",
  "tasks": [{
    "task_key": "extract",
    "notebook_task": {
      "notebook_path": "/Workspace/Users/[email protected]/extract",
      "source": "WORKSPACE"
    }
  }]
}'
```

### Asset Bundles (DABs)

```yaml
# resources/jobs.yml
resources:
  jobs:
    my_etl_job:
      name: "[${bundle.target}] My ETL Job"
      tasks:
        - task_key: extract
          notebook_task:
            notebook_path: ../src/notebooks/extract.py

## Core Concepts

### Multi-Task Workflows

Jobs support DAG-based task dependencies:

```yaml
tasks:
  - task_key: extract
    notebook_task:
      notebook_path: ../src/extract.py

  - task_key: transform
    depends_on:
      - task_key: extract
    notebook_task:
      notebook_path: ../src/transform.py

  - task_key: load
    depends_on:
      - task_key: transform
    run_if: ALL_SUCCESS  # Only run if all dependencies succeed
    notebook_task:
      notebook_path: ../src/load.py
```

**run_if conditions** (see the SDK sketch after this list):
- `ALL_SUCCESS` (default) - Run when all dependencies succeed
- `ALL_DONE` - Run when all dependencies complete (success or failure)
- `AT_LEAST_ONE_SUCCESS` - Run when at least one dependency succeeds
- `NONE_FAILED` - Run when no dependencies failed
- `ALL_FAILED` - Run when all dependencies failed
- `AT_LEAST_ONE_FAILED` - Run when at least one dependency failed
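
The same dependency and `run_if` wiring can be expressed through the Python SDK. The following is a minimal sketch, assuming the `Task`, `TaskDependency`, and `RunIf` types in the SDK's `jobs` service module; the `cleanup` notebook path is purely illustrative:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import (
    NotebookTask, RunIf, Source, Task, TaskDependency,
)

w = WorkspaceClient()

job = w.jobs.create(
    name="etl-with-cleanup",
    tasks=[
        Task(
            task_key="load",
            notebook_task=NotebookTask(
                notebook_path="/Workspace/Users/[email protected]/load",
                source=Source.WORKSPACE,
            ),
        ),
        Task(
            task_key="cleanup",
            depends_on=[TaskDependency(task_key="load")],
            # ALL_DONE: run cleanup whether or not "load" succeeded
            run_if=RunIf.ALL_DONE,
            notebook_task=NotebookTask(
                notebook_path="/Workspace/Users/[email protected]/cleanup",
                source=Source.WORKSPACE,
            ),
        ),
    ],
)
```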

### Task Types Summary

| Task Type | Use Case | Reference |
|-----------|----------|-----------|
| `notebook_task` | Run notebooks | [task-types.md#notebook-task](task-types.md#notebook-task) |
| `spark_python_task` | Run Python scripts | [task-types.md#spark-python-task](task-types.md#spark-python-task) |
| `python_wheel_task` | Run Python wheels | [task-types.md#python-wheel-task](task-types.md#python-wheel-task) |
| `sql_task` | Run SQL queries/files | [task-types.md#sql-task](task-types.md#sql-task) |
| `dbt_task` | Run dbt projects | [task-types.md#dbt-task](task-types.md#dbt-task) |
| `pipeline_task` | Trigger DLT/SDP pipelines | [task-types.md#pipeline-task](task-types.md#pipeline-task) |
| `spark_jar_task` | Run Spark JARs | [task-types.md#spark-jar-task](task-types.md#spark-jar-task) |
| `run_job_task` | Trigger other jobs | [task-types.md#run-job-task](task-types.md#run-job-task) |
| `for_each_task` | Loop over inputs | [task-types.md#for-each-task](task-types.md#for-each-task) |

### Trigger Types Summary

| Trigger Type | Use Case | Reference |
|--------------|----------|-----------|
| `schedule` | Cron-based scheduling | [triggers-schedules.md#cron-schedule](triggers-schedules.md#cron-schedule) |
| `trigger.periodic` | Interval-based | [triggers-schedules.md#periodic-trigger](triggers-schedules.md#periodic-trigger) |
| `trigger.file_arrival` | File arrival events | [triggers-schedules.md#file-arrival-trigger](triggers-schedules.md#file-arrival-trigger) |
| `trigger.table_update` | Table change events | [triggers-schedules.md#table-update-trigger](triggers-schedules.md#table-update-trigger) |
| `continuous` | Always-running jobs | [triggers-schedules.md#continuous-jobs](triggers-schedules.md#continuous-jobs) |
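
For the most common case, cron scheduling, here is a hedged Python SDK sketch (it assumes the `CronSchedule` and `PauseStatus` types in the SDK's `jobs` module; the cron expression and timezone are illustrative). Because a paused schedule is a frequent cause of "schedule not triggering" (see Common Issues), the sketch sets `pause_status` explicitly:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import (
    CronSchedule, NotebookTask, PauseStatus, Source, Task,
)

w = WorkspaceClient()

job = w.jobs.create(
    name="nightly-etl",
    tasks=[
        Task(
            task_key="extract",
            notebook_task=NotebookTask(
                notebook_path="/Workspace/Users/[email protected]/extract",
                source=Source.WORKSPACE,
            ),
        )
    ],
    # Quartz cron: run daily at 02:00 in the given timezone
    schedule=CronSchedule(
        quartz_cron_expression="0 0 2 * * ?",
        timezone_id="UTC",
        pause_status=PauseStatus.UNPAUSED,
    ),
)
```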

## Compute Configuration

### Job Clusters (Recommended)

Define reusable cluster configurations:

```yaml
job_clusters:
  - job_cluster_key: shared_cluster
    new_cluster:
      spark_version: "15.4.x-scala2.12"
      node_type_id: "i3.xlarge"
      num_workers: 2
      spark_conf:
        spark.speculation: "true"

tasks:
  - task_key: my_task
    job_cluster_key: shared_cluster
    notebook_task:
      notebook_path: ../src/notebook.py
```
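
The same shared cluster can be declared through the Python SDK. A sketch, assuming `JobCluster` from the `jobs` module and `ClusterSpec` from the `compute` module; the notebook path is illustrative:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.compute import ClusterSpec
from databricks.sdk.service.jobs import JobCluster, NotebookTask, Source, Task

w = WorkspaceClient()

job = w.jobs.create(
    name="etl-with-shared-cluster",
    job_clusters=[
        JobCluster(
            job_cluster_key="shared_cluster",
            new_cluster=ClusterSpec(
                spark_version="15.4.x-scala2.12",
                node_type_id="i3.xlarge",
                num_workers=2,
            ),
        )
    ],
    tasks=[
        Task(
            task_key="my_task",
            # Reuse the cluster defined above instead of creating one per task
            job_cluster_key="shared_cluster",
            notebook_task=NotebookTask(
                notebook_path="/Workspace/Users/[email protected]/notebook",
                source=Source.WORKSPACE,
            ),
        )
    ],
)
```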

### Autoscaling Clusters

```yaml
new_cluster:
  spark_version: "15.4.x-scala2.12"
  node_type_id: "i3.xlarge"
  autoscale:
    min_workers: 2
    max_workers: 8
```

### Existing Cluster

```yaml
tasks:
  - task_key: my_task
    existing_cluster_id: "0123-456789-abcdef12"
    notebook_task:
      notebook_path: ../src/notebook.py
```

### Serverless Compute

For notebook and Python tasks, omit cluster configuration to use serverless:

```yaml
tasks:
  - task_key: serverless_task
    notebook_task:
      notebook_path: ../src/notebook.py
    # No cluster config = serverless
```

## Job Parameters

### Define Parameters

```yaml
parameters:
  - name: env
    default: "dev"
  - name: date
    default: "{{start_date}}"  # Dynamic value reference
```
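
Job-level parameters can also be defined when creating a job through the Python SDK. A minimal sketch, assuming the `JobParameterDefinition` type in the SDK's `jobs` module; the notebook path is illustrative:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import (
    JobParameterDefinition, NotebookTask, Source, Task,
)

w = WorkspaceClient()

job = w.jobs.create(
    name="parameterized-job",
    parameters=[
        JobParameterDefinition(name="env", default="dev"),
        # {{start_date}} is a dynamic value resolved at run time
        JobParameterDefinition(name="date", default="{{start_date}}"),
    ],
    tasks=[
        Task(
            task_key="my_task",
            notebook_task=NotebookTask(
                notebook_path="/Workspace/Users/[email protected]/notebook",
                source=Source.WORKSPACE,
            ),
        )
    ],
)
```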

### Access in Notebook

```python
# In a notebook, job parameters are exposed as widgets
env = dbutils.widgets.get("env")
date = dbutils.widgets.get("date")
```

### Pass to Tasks

```yaml
tasks:
  - task_key: my_task
    notebook_task:
      notebook_path: ../src/notebook.py
      base_parameters:
        env: "{{job.parameters.env}}"
        custom_param: "value"
```

## Common Operations

### Python SDK Operations

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# List jobs
jobs = w.jobs.list()

# Get job details
job = w.jobs.get(job_id=12345)

# Run job now
run = w.jobs.run_now(job_id=12345)

# Run with parameters
run = w.jobs.run_now(
    job_id=12345,
    job_parameters={"env": "prod", "date": "2024-01-15"},
)

# Cancel run
w.jobs.cancel_run(run_id=run.run_id)

# Delete job
w.jobs.delete(job_id=12345)
```
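
When a script needs to block until a triggered run finishes, the SDK's waiter pattern can be used. A short sketch, assuming `run_now` returns a waiter whose `.result()` polls until the run reaches a terminal state (the current Python SDK's long-running-operation convention):

```python
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# .result() blocks until the run terminates, then returns the finished Run
run = w.jobs.run_now(job_id=12345, job_parameters={"env": "prod"}).result()

# Inspect the outcome of the finished run
print(run.state.result_state, run.state.state_message)
```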

### CLI Operations

```bash
# List jobs
databricks jobs list

# Get job details
databricks jobs get 12345

# Run job
databricks jobs run-now 12345

# Run with parameters
databricks jobs run-now 12345 --job-params '{"env": "prod"}'

# Cancel run
databricks jobs cancel-run 67890

# Delete job
databricks jobs delete 12345
```

### Asset Bundle Operations

```bash
# Validate configuration
databricks bundle validate

# Deploy job
databricks bundle deploy

# Run job
databricks bundle run my_job_resource_key

# Deploy to specific target
databricks bundle deploy -t prod

# Destroy resources
databricks bundle destroy
```

## Permissions (DABs)

```yaml
resources:
  jobs:
    my_job:
      name: "My Job"
      permissions:
        - level: CAN_VIEW
          group_name: "data-analysts"
        - level: CAN_MANAGE_RUN
          group_name: "data-engineers"
        - level: CAN_MANAGE
          user_name: "[email protected]"
```

**Permission levels** (see the SDK sketch after this list):
- `CAN_VIEW` - View job and run history
- `CAN_MANAGE_RUN` - View, trigger, and cancel runs
- `CAN_MANAGE` - Full control including edit and delete
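
Outside of bundles, the same grants can be applied with the Python SDK's jobs permissions API. A hedged sketch, assuming `JobAccessControlRequest` and `JobPermissionLevel` in the SDK's `jobs` module:

```python
from databricks.sdk import WorkspaceClient
from databricks.sdk.service.jobs import JobAccessControlRequest, JobPermissionLevel

w = WorkspaceClient()

w.jobs.set_permissions(
    job_id="12345",  # the permissions API addresses the job by id as a string
    access_control_list=[
        JobAccessControlRequest(
            group_name="data-analysts",
            permission_level=JobPermissionLevel.CAN_VIEW,
        ),
        JobAccessControlRequest(
            group_name="data-engineers",
            permission_level=JobPermissionLevel.CAN_MANAGE_RUN,
        ),
        JobAccessControlRequest(
            user_name="[email protected]",
            permission_level=JobPermissionLevel.CAN_MANAGE,
        ),
    ],
)
```

Note that `set_permissions` replaces the job's full access control list, while `update_permissions` adds or changes individual grants.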

## Common Issues

| Issue | Solution |
|-------|----------|
| Job cluster startup slow | Use job clusters with `job_cluster_key` for reuse across tasks |
| Task dependencies not working | Verify `task_key` references match exactly in `depends_on` |
| Schedule not triggering | Check `pause_status: UNPAUSED` and valid timezone |
| File arrival not detecting | Ensure path has proper permissions and uses cloud storage URL |
| Table update trigger missing events | Verify Unity Catalog table and proper grants |
| Parameter not accessible | Use `dbutils.widgets.get()` in notebooks |
| "admins" group error | Cannot modify admins permissions on jobs |
| Serverless task fails | Ensure task type supports serverless (notebook, Python) |

## Related Skills

- **[asset-bundles](../asset-bundles/SKILL.md)** - Deploy jobs via Databricks Asset Bundles
- **[spark-declarative-pipelines](../spark-declarative-pipelines/SKILL.md)** - Configure pipelines triggered by jobs

## Resources

- [Jobs API Reference](https://docs.databricks.com/api/workspace/jobs)
- [Jobs Documentation](https://docs.databricks.com/en/jobs/index.html)
- [DABs Job Task Types](https://docs.databricks.com/en/dev-tools/bundles/job-task-types.html)
- [Bundle Examples Repository](https://github.com/databricks/bundle-examples)