1 change: 1 addition & 0 deletions NEXT_CHANGELOG.md
@@ -9,6 +9,7 @@
### Dependency updates

### Bundles
* Add `default-minimal` template for users who want a clean slate without sample code ([#3885](https://github.com/databricks/cli/pull/3885))
* Add validation that `served_models` and `served_entities` are not used at the same time, along with client-side translation logic. ([#3880](https://github.com/databricks/cli/pull/3880))

### API Changes
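
For context on the `served_models`/`served_entities` entry above: a minimal sketch of a bundle resource that would satisfy the new validation, using only `served_entities` (the endpoint and model names are hypothetical):

```
resources:
  model_serving_endpoints:
    my_endpoint:  # hypothetical resource key
      name: my-endpoint
      config:
        # Use served_entities (or the legacy served_models), never both.
        served_entities:
          - entity_name: main.default.my_model  # hypothetical UC model
            entity_version: "1"
            workload_size: Small
            scale_to_zero_enabled: true
```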
1 change: 1 addition & 0 deletions acceptance/bundle/help/bundle-init/output.txt
@@ -5,6 +5,7 @@ Initialize using a bundle template to get started quickly.
TEMPLATE_PATH optionally specifies which template to use. It can be one of the following:
- default-python: The default Python template for Notebooks and Lakeflow
- default-sql: The default SQL template for .sql files that run with Databricks SQL
- default-minimal: The minimal template, for advanced users
- dbt-sql: The dbt SQL template (databricks.com/blog/delivering-cost-effective-data-real-time-dbt-and-databricks)
- mlops-stacks: The Databricks MLOps Stacks template (github.com/databricks/mlops-stacks)
- pydabs: A variant of the 'default-python' template that defines resources in Python instead of YAML
6 changes: 6 additions & 0 deletions acceptance/bundle/templates/default-minimal/input.json
@@ -0,0 +1,6 @@
{
"project_name": "my_default_minimal",
"include_job": "no",
"include_pipeline": "no",
"include_python": "no"
}
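
These keys pre-answer the template prompts so initialization can run non-interactively; the acceptance script later in this diff passes the file via `--config-file`. An equivalent manual invocation might look like this (paths are illustrative):

```
databricks bundle init default-minimal \
  --config-file ./input.json \
  --output-dir output
```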
5 changes: 5 additions & 0 deletions acceptance/bundle/templates/default-minimal/out.test.toml

Some generated files are not rendered by default.

33 changes: 33 additions & 0 deletions acceptance/bundle/templates/default-minimal/output.txt
@@ -0,0 +1,33 @@

>>> [CLI] bundle init default-minimal --config-file ./input.json --output-dir output
Welcome to the minimal Databricks Asset Bundle template!

This template creates a minimal project structure without sample code, ideal for advanced users.
(For getting started with Python or SQL code, use the default-python or default-sql templates instead.)

Your workspace at [DATABRICKS_URL] is used for initialization.
(See https://docs.databricks.com/dev-tools/cli/profiles.html for how to change your profile.)

✨ Your new project has been created in the 'my_default_minimal' directory!

To get started, refer to the project README.md file and the documentation at https://docs.databricks.com/dev-tools/bundles/index.html.

>>> [CLI] bundle validate -t dev
Name: my_default_minimal
Target: dev
Workspace:
Host: [DATABRICKS_URL]
User: [USERNAME]
Path: /Workspace/Users/[USERNAME]/.bundle/my_default_minimal/dev

Validation OK!

>>> [CLI] bundle validate -t prod
Name: my_default_minimal
Target: prod
Workspace:
Host: [DATABRICKS_URL]
User: [USERNAME]
Path: /Workspace/Users/[USERNAME]/.bundle/my_default_minimal/prod

Validation OK!
@@ -0,0 +1,3 @@
# Typings for Pylance in Visual Studio Code
# see https://github.com/microsoft/pyright/blob/main/docs/builtins.md
from databricks.sdk.runtime import *
@@ -0,0 +1,7 @@
{
"recommendations": [
"databricks.databricks",
"redhat.vscode-yaml",
"ms-python.black-formatter"
]
}
@@ -0,0 +1,39 @@
{
"jupyter.interactiveWindow.cellMarker.codeRegex": "^# COMMAND ----------|^# Databricks notebook source|^(#\\s*%%|#\\s*\\<codecell\\>|#\\s*In\\[\\d*?\\]|#\\s*In\\[ \\])",
"jupyter.interactiveWindow.cellMarker.default": "# COMMAND ----------",
"python.testing.pytestArgs": [
"."
],
"files.exclude": {
"**/*.egg-info": true,
"**/__pycache__": true,
".pytest_cache": true,
"dist": true,
},
"files.associations": {
"**/.gitkeep": "markdown"
},

// Pylance settings (VS Code)
// Set typeCheckingMode to "basic" to enable type checking!
"python.analysis.typeCheckingMode": "off",
"python.analysis.extraPaths": ["src", "lib", "resources"],
"python.analysis.diagnosticMode": "workspace",
"python.analysis.stubPath": ".vscode",

// Pyright settings (Cursor)
// Set typeCheckingMode to "basic" to enable type checking!
"cursorpyright.analysis.typeCheckingMode": "off",
"cursorpyright.analysis.extraPaths": ["src", "lib", "resources"],
"cursorpyright.analysis.diagnosticMode": "workspace",
"cursorpyright.analysis.stubPath": ".vscode",

// General Python settings
"python.defaultInterpreterPath": "./.venv/bin/python",
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
"[python]": {
"editor.defaultFormatter": "ms-python.black-formatter",
"editor.formatOnSave": true,
},
}
@@ -0,0 +1,62 @@
# my_default_minimal

The 'my_default_minimal' project was generated using the default-minimal template.

* `src/`: Python source code for this project.
* `resources/`: Resource configurations (jobs, pipelines, etc.)
* `tests/`: Unit tests for the shared Python code.
* `fixtures/`: Fixtures for data sets (primarily used for testing).


## Getting started

Choose how you want to work on this project:

(a) Directly in your Databricks workspace, see
https://docs.databricks.com/dev-tools/bundles/workspace.

(b) Locally with an IDE like Cursor or VS Code, see
https://docs.databricks.com/dev-tools/vscode-ext.html.

(c) With command line tools, see https://docs.databricks.com/dev-tools/cli/databricks-cli.html

If you're developing with an IDE, dependencies for this project should be installed using uv:

* Make sure you have the uv package manager installed.
It's an alternative to tools like pip: https://docs.astral.sh/uv/getting-started/installation/.
* Run `uv sync --dev` to install the project's dependencies.


# Using this project from the CLI

The Databricks workspace and IDE extensions provide a graphical interface for working
with this project. It's also possible to interact with it directly using the CLI:

1. Authenticate to your Databricks workspace, if you have not done so already:
```
$ databricks configure
```
2. To deploy a development copy of this project, type:
```
$ databricks bundle deploy --target dev
```
(Note that "dev" is the default target, so the `--target` parameter
is optional here.)
This deploys everything that's defined for this project.
3. Similarly, to deploy a production copy, type:
```
$ databricks bundle deploy --target prod
```
4. To run a job or pipeline, use the "run" command:
```
$ databricks bundle run
```
5. Finally, to run tests locally, use `pytest`:
```
$ uv run pytest
```
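
Note that this template starts with an empty `resources/` directory, so step 4 only becomes meaningful once a job or pipeline is defined there; a hypothetical invocation with a resource key `sample_job` would be:

```
$ databricks bundle run sample_job
```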
@@ -0,0 +1,42 @@
# This is a Databricks asset bundle definition for my_default_minimal.
# See https://docs.databricks.com/dev-tools/bundles/index.html for documentation.
bundle:
name: my_default_minimal
uuid: [UUID]

include:
- resources/*.yml
- resources/*/*.yml

# Variable declarations. These variables are assigned in the dev/prod targets below.
variables:
catalog:
description: The catalog to use
schema:
description: The schema to use

targets:
dev:
# The default target uses 'mode: development' to create a development copy.
# - Deployed resources get prefixed with '[dev my_user_name]'
# - Any job schedules and triggers are paused by default.
# See also https://docs.databricks.com/dev-tools/bundles/deployment-modes.html.
mode: development
default: true
workspace:
host: [DATABRICKS_URL]
variables:
catalog: hive_metastore
schema: ${workspace.current_user.short_name}
prod:
mode: production
workspace:
host: [DATABRICKS_URL]
# We explicitly deploy to /Workspace/Users/[USERNAME] to make sure we only have a single copy.
root_path: /Workspace/Users/[USERNAME]/.bundle/${bundle.name}/${bundle.target}
variables:
catalog: hive_metastore
schema: prod
permissions:
- user_name: [USERNAME]
level: CAN_MANAGE
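
The `catalog` and `schema` variables above are declared but not yet consumed, since the template ships an empty `resources/` directory. A hypothetical `resources/sample_job.yml` picked up by the `include` globs might reference them like this:

```
# resources/sample_job.yml (hypothetical file)
resources:
  jobs:
    sample_job:
      name: sample_job
      tasks:
        - task_key: main
          notebook_task:
            notebook_path: ../src/sample_notebook.ipynb
            base_parameters:
              catalog: ${var.catalog}
              schema: ${var.schema}
```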
@@ -0,0 +1,9 @@
# Test fixtures directory

Add JSON or CSV files here. In tests, use them with `load_fixture()`:

```
def test_using_fixture(load_fixture):
data = load_fixture("my_data.json")
assert data.count() >= 1
```
@@ -0,0 +1,10 @@
.databricks/
build/
dist/
__pycache__/
*.egg-info
.venv/
scratch/**
!scratch/README.md
**/explorations/**
!**/explorations/README.md
@@ -0,0 +1,32 @@
[project]
name = "my_default_minimal"
version = "0.0.1"
authors = [{ name = "[USERNAME]" }]
requires-python = ">=3.10,<=3.13"
dependencies = [
# Any dependencies for jobs and pipelines in this project can be added here
# See also https://docs.databricks.com/dev-tools/bundles/library-dependencies
#
# LIMITATION: for pipelines, dependencies are cached during development;
# add dependencies to the 'environment' section of your pipeline.yml file instead
]

[dependency-groups]
dev = [
"pytest",
"databricks-dlt",
"databricks-connect>=15.4,<15.5",
]

[project.scripts]
main = "my_default_minimal.main:main"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src"]

[tool.black]
line-length = 125
@@ -0,0 +1,98 @@
"""This file configures pytest.
This file is in the root since it can be used for tests in any place in this
project, including tests under resources/.
"""

import os, sys, pathlib
from contextlib import contextmanager


try:
from databricks.connect import DatabricksSession
from databricks.sdk import WorkspaceClient
from pyspark.sql import SparkSession
import pytest
import json
import csv
except ImportError:
raise ImportError(
"Test dependencies not found.\n\nRun tests using 'uv run pytest'. See http://docs.astral.sh/uv to learn more about uv."
)


@pytest.fixture()
def spark() -> SparkSession:
"""Provide a SparkSession fixture for tests.
Minimal example:
def test_uses_spark(spark):
df = spark.createDataFrame([(1,)], ["x"])
assert df.count() == 1
"""
return DatabricksSession.builder.getOrCreate()


@pytest.fixture()
def load_fixture(spark: SparkSession):
"""Provide a callable to load JSON or CSV from fixtures/ directory.
Example usage:
def test_using_fixture(load_fixture):
data = load_fixture("my_data.json")
assert data.count() >= 1
"""

def _loader(filename: str):
path = pathlib.Path(__file__).parent.parent / "fixtures" / filename
suffix = path.suffix.lower()
if suffix == ".json":
rows = json.loads(path.read_text())
return spark.createDataFrame(rows)
if suffix == ".csv":
with path.open(newline="") as f:
rows = list(csv.DictReader(f))
return spark.createDataFrame(rows)
raise ValueError(f"Unsupported fixture type for: {filename}")

return _loader


def _enable_fallback_compute():
"""Enable serverless compute if no compute is specified."""
conf = WorkspaceClient().config
if conf.serverless_compute_id or conf.cluster_id or os.environ.get("SPARK_REMOTE"):
return

url = "https://docs.databricks.com/dev-tools/databricks-connect/cluster-config"
print("☁️ no compute specified, falling back to serverless compute", file=sys.stderr)
print(f" see {url} for manual configuration", file=sys.stdout)

os.environ["DATABRICKS_SERVERLESS_COMPUTE_ID"] = "auto"


@contextmanager
def _allow_stderr_output(config: pytest.Config):
"""Temporarily disable pytest output capture."""
capman = config.pluginmanager.get_plugin("capturemanager")
if capman:
with capman.global_and_fixture_disabled():
yield
else:
yield


def pytest_configure(config: pytest.Config):
"""Configure pytest session."""
with _allow_stderr_output(config):
_enable_fallback_compute()

# Initialize Spark session eagerly, so it is available even when
# SparkSession.builder.getOrCreate() is used. For DB Connect 15+,
# we validate version compatibility with the remote cluster.
if hasattr(DatabricksSession.builder, "validateSession"):
DatabricksSession.builder.validateSession().getOrCreate()
else:
DatabricksSession.builder.getOrCreate()
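
A sketch of a test module that uses the `spark` and `load_fixture` fixtures defined in this conftest (the module path and fixture file name are hypothetical; the JSON file would live under `fixtures/`):

```
# tests/sample_test.py (hypothetical path)


def test_fixture_has_rows(load_fixture):
    # Loads fixtures/my_data.json into a Spark DataFrame via the conftest fixture.
    data = load_fixture("my_data.json")
    assert data.count() >= 1


def test_spark_smoke(spark):
    # The spark fixture returns a Databricks Connect session.
    df = spark.createDataFrame([(1, "a")], ["id", "name"])
    assert df.count() == 1
```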
16 changes: 16 additions & 0 deletions acceptance/bundle/templates/default-minimal/script
@@ -0,0 +1,16 @@
trace $CLI bundle init default-minimal --config-file ./input.json --output-dir output

cd output/my_default_minimal

# Verify that empty directories are preserved
[ -d "src" ] || exit 1
[ -d "resources" ] || exit 1

trace $CLI bundle validate -t dev
trace $CLI bundle validate -t prod

# Rename .gitignore so it does not affect this repository's git behaviour (#2318)
mv .gitignore out.gitignore
rm -r .databricks

cd ../../
@@ -25,5 +25,8 @@ main = "my_lakeflow_pipelines.main:main"
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src"]

[tool.black]
line-length = 125