
Single Node Jobs creation leads to Missing required field: settings.cluster_spec.new_cluster.size Databricks CLI 0.200.2 #663

@zyeiy2


Setup

  1. Create a simple job via the Databricks UI and export the job definition JSON.
  2. Create the job via the Databricks CLI using the exported JSON (a sketch of both steps follows).
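For reference, a minimal sketch of both steps, assuming the new Databricks CLI and jq; the job ID 123 and the file name b.json are placeholders:

# Step 1: export the definition of the job created in the UI
# (123 is a placeholder job ID; jobs get returns a wrapper object,
# so extract only the settings payload with jq)
databricks jobs get 123 | jq '.settings' > b.json

# Step 2: create a new job from the exported definition
databricks jobs create --json @b.json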

Description
While creating a job on Azure Databricks, the Databricks CLI logged several events. The CLI was run on Linux, version 0.200.2.

The DEFAULT profile from "/root/.databrickscfg" was used for authentication; it is configured correctly, since all other CLI operations work fine. Both API versions 2.1 and 2.0 were tested in .databrickscfg and led to the same result.
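For context, the profile looks roughly like this (host and token are placeholders; jobs-api-version is the key that was switched between 2.0 and 2.1, and stems from the legacy CLI's config format):

[DEFAULT]
# placeholder workspace URL and personal access token
host = https://adb-1234567890123456.7.azuredatabricks.net
token = dapiXXXXXXXXXXXXXXXXXXXXXXXX
# tested with both 2.1 and 2.0, same result
jobs-api-version = 2.1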

However, the job creation failed due to a missing required field in the cluster specification, settings.cluster_spec.new_cluster.size. The CLI returned the error code "INVALID_PARAMETER_VALUE" with the message "Cluster validation error: Missing required field: settings.cluster_spec.new_cluster.size". Note that the payload below does include "num_workers": 0, as required for a single-node cluster, so the zero value appears to be dropped somewhere before the request reaches the API.

Running this command led to the error above:
databricks jobs create --json @b.json

Find the content of the b.json file below.

{
    "run_as": {
        "user_name": "[email protected]"
    },
    "name": "UC Test",
    "email_notifications": {
        "no_alert_for_skipped_runs": false
    },
    "webhook_notifications": {},
    "timeout_seconds": 0,
    "max_concurrent_runs": 1,
    "tasks": [
        {
            "task_key": "Task",
            "notebook_task": {
                "notebook_path": "/Shared/z-job/NB_Start_Job",
                "source": "WORKSPACE"
            },
            "job_cluster_key": "Job_cluster",
            "timeout_seconds": 0,
            "email_notifications": {}
        }
    ],
    "job_clusters": [
        {
            "job_cluster_key": "Job_cluster",
            "new_cluster": {
                "spark_version": "12.2.x-scala2.12",
                "spark_conf": {
                    "spark.databricks.delta.preview.enabled": "true",
                    "spark.master": "local[*, 4]",
                    "spark.databricks.cluster.profile": "singleNode"
                },
                "azure_attributes": {
                    "first_on_demand": 1,
                    "availability": "ON_DEMAND_AZURE",
                    "spot_bid_max_price": -1
                },
                "node_type_id": "Standard_DS3_v2",
                "custom_tags": {
                    "ResourceClass": "SingleNode"
                },
                "spark_env_vars": {
                    "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
                },
                "enable_elastic_disk": true,
                "data_security_mode": "SINGLE_USER",
                "runtime_engine": "STANDARD",
                "num_workers": 0
            }
        }
    ],
    "format": "MULTI_TASK"
}
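To isolate whether the payload or the CLI is at fault, the same file can be posted directly to the Jobs 2.1 REST endpoint. This is a sketch, assuming DATABRICKS_HOST and DATABRICKS_TOKEN hold the workspace URL and a personal access token:

# Send the same job definition straight to the Jobs 2.1 API,
# bypassing the CLI's request serialization
curl -s -X POST "${DATABRICKS_HOST}/api/2.1/jobs/create" \
  -H "Authorization: Bearer ${DATABRICKS_TOKEN}" \
  -H "Content-Type: application/json" \
  --data @b.json

If this direct call succeeds while the CLI call fails, the "num_workers": 0 field is presumably being lost inside the CLI before the request is sent.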

The log of the executed command ends with the same error:
time=2023-08-01T07:52:58.409+02:00 level=ERROR source=root.go:96 msg="failed execution" exit_code=1 error="Cluster validation error: Missing required field: settings.cluster_spec.new_cluster.size"
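One way to inspect the request body the CLI actually sends (and whether num_workers survives serialization) is to re-run with debug logging; the --log-level flag exists in the new CLI, though how much of the HTTP payload it prints is an assumption here:

# Re-run with verbose logging to inspect the outgoing request
databricks jobs create --json @b.json --log-level debug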
