1 change: 1 addition & 0 deletions NEXT_CHANGELOG.md
@@ -9,6 +9,7 @@
### Dependency updates

### Bundles
* Add `default-minimal` template for users who want a clean slate without sample code ([#3885](https://github.com/databricks/cli/pull/3885))
* Add validation that served_models and served_entities are not used at the same time. Add client-side translation logic. ([#3880](https://github.com/databricks/cli/pull/3880))

### API Changes
1 change: 1 addition & 0 deletions acceptance/bundle/help/bundle-init/output.txt
@@ -5,6 +5,7 @@ Initialize using a bundle template to get started quickly.
TEMPLATE_PATH optionally specifies which template to use. It can be one of the following:
- default-python: The default Python template for Notebooks and Lakeflow
- default-sql: The default SQL template for .sql files that run with Databricks SQL
- default-minimal: The minimal template, for advanced users
- dbt-sql: The dbt SQL template (databricks.com/blog/delivering-cost-effective-data-real-time-dbt-and-databricks)
- mlops-stacks: The Databricks MLOps Stacks template (github.com/databricks/mlops-stacks)
- pydabs: A variant of the 'default-python' template that defines resources in Python instead of YAML
6 changes: 6 additions & 0 deletions acceptance/bundle/templates/default-minimal/input.json
@@ -0,0 +1,6 @@
{
"project_name": "my_default_minimal",
"include_job": "no",
"include_pipeline": "no",
"include_python": "no"
}
5 changes: 5 additions & 0 deletions acceptance/bundle/templates/default-minimal/out.test.toml


33 changes: 33 additions & 0 deletions acceptance/bundle/templates/default-minimal/output.txt
@@ -0,0 +1,33 @@

>>> [CLI] bundle init default-minimal --config-file ./input.json --output-dir output
Welcome to the minimal Databricks Asset Bundle template!

This template creates a minimal project structure without sample code, ideal for advanced users.
(For getting started with Python or SQL code, use the default-python or default-sql templates instead.)

Your workspace at [DATABRICKS_URL] is used for initialization.
(See https://docs.databricks.com/dev-tools/cli/profiles.html for how to change your profile.)

✨ Your new project has been created in the 'my_default_minimal' directory!

To get started, refer to the project README.md file and the documentation at https://docs.databricks.com/dev-tools/bundles/index.html.

>>> [CLI] bundle validate -t dev
Name: my_default_minimal
Target: dev
Workspace:
  Host: [DATABRICKS_URL]
  User: [USERNAME]
  Path: /Workspace/Users/[USERNAME]/.bundle/my_default_minimal/dev

Validation OK!

>>> [CLI] bundle validate -t prod
Name: my_default_minimal
Target: prod
Workspace:
  Host: [DATABRICKS_URL]
  User: [USERNAME]
  Path: /Workspace/Users/[USERNAME]/.bundle/my_default_minimal/prod

Validation OK!
@@ -0,0 +1,3 @@
# Typings for Pylance in Visual Studio Code
# see https://github.com/microsoft/pyright/blob/main/docs/builtins.md
from databricks.sdk.runtime import *
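As a brief illustrative aside (not part of the template): with this stub in place, Pylance and Pyright resolve the Databricks notebook globals re-exported by `databricks.sdk.runtime`, so editor code like the following type-checks without warnings. The file below is a hypothetical example, not generated output.

```
# Hypothetical example file (not generated by the template).
# The .vscode/__builtins__.pyi stub lets the editor resolve these globals,
# which databricks.sdk.runtime provides at runtime on Databricks.
df = spark.range(10)        # `spark` resolves via the stub
files = dbutils.fs.ls("/")  # `dbutils` resolves the same way
display(df)                 # `display` is re-exported by databricks.sdk.runtime
```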
@@ -0,0 +1,7 @@
{
"recommendations": [
"databricks.databricks",
"redhat.vscode-yaml",
"ms-python.black-formatter"
]
}
@@ -0,0 +1,39 @@
{
"jupyter.interactiveWindow.cellMarker.codeRegex": "^# COMMAND ----------|^# Databricks notebook source|^(#\\s*%%|#\\s*\\<codecell\\>|#\\s*In\\[\\d*?\\]|#\\s*In\\[ \\])",
"jupyter.interactiveWindow.cellMarker.default": "# COMMAND ----------",
"python.testing.pytestArgs": [
"."
],
"files.exclude": {
"**/*.egg-info": true,
"**/__pycache__": true,
".pytest_cache": true,
"dist": true,
},
"files.associations": {
"**/.gitkeep": "markdown"
},

// Pylance settings (VS Code)
// Set typeCheckingMode to "basic" to enable type checking!
"python.analysis.typeCheckingMode": "off",
"python.analysis.extraPaths": ["src", "lib", "resources"],
"python.analysis.diagnosticMode": "workspace",
"python.analysis.stubPath": ".vscode",

// Pyright settings (Cursor)
// Set typeCheckingMode to "basic" to enable type checking!
"cursorpyright.analysis.typeCheckingMode": "off",
"cursorpyright.analysis.extraPaths": ["src", "lib", "resources"],
"cursorpyright.analysis.diagnosticMode": "workspace",
"cursorpyright.analysis.stubPath": ".vscode",

// General Python settings
"python.defaultInterpreterPath": "./.venv/bin/python",
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
"[python]": {
"editor.defaultFormatter": "ms-python.black-formatter",
"editor.formatOnSave": true,
},
}
@@ -0,0 +1,62 @@
# my_default_minimal

The 'my_default_minimal' project was generated using the default-minimal template.

* `src/`: Python source code for this project.
* `resources/`: Resource configurations (jobs, pipelines, etc.)
* `tests/`: Unit tests for the shared Python code.
* `fixtures/`: Fixtures for data sets (primarily used for testing).


## Getting started

Choose how you want to work on this project:

(a) Directly in your Databricks workspace, see
https://docs.databricks.com/dev-tools/bundles/workspace.

(b) Locally with an IDE like Cursor or VS Code, see
https://docs.databricks.com/dev-tools/vscode-ext.html.

(c) With command line tools, see https://docs.databricks.com/dev-tools/cli/databricks-cli.html

If you're developing with an IDE, dependencies for this project should be installed using uv:

* Make sure you have the uv package manager installed.
  It's an alternative to tools like pip: https://docs.astral.sh/uv/getting-started/installation/.
* Run `uv sync --dev` to install the project's dependencies.


## Using this project with the CLI

The Databricks workspace and IDE extensions provide a graphical interface for working
with this project. It's also possible to interact with it directly using the CLI:

1. Authenticate to your Databricks workspace, if you have not done so already:
   ```
   $ databricks configure
   ```

2. To deploy a development copy of this project, type:
   ```
   $ databricks bundle deploy --target dev
   ```
   (Note that "dev" is the default target, so the `--target` parameter
   is optional here.)

   This deploys everything that's defined for this project.

3. Similarly, to deploy a production copy, type:
   ```
   $ databricks bundle deploy --target prod
   ```

4. To run a job or pipeline, use the "run" command:
   ```
   $ databricks bundle run
   ```

5. Finally, to run tests locally, use `pytest`:
   ```
   $ uv run pytest
   ```
@@ -0,0 +1,42 @@
# This is a Databricks asset bundle definition for my_default_minimal.
# See https://docs.databricks.com/dev-tools/bundles/index.html for documentation.
bundle:
  name: my_default_minimal
  uuid: [UUID]

include:
  - resources/*.yml
  - resources/*/*.yml

# Variable declarations. These variables are assigned in the dev/prod targets below.
variables:
  catalog:
    description: The catalog to use
  schema:
    description: The schema to use

targets:
  dev:
    # The default target uses 'mode: development' to create a development copy.
    # - Deployed resources get prefixed with '[dev my_user_name]'
    # - Any job schedules and triggers are paused by default.
    # See also https://docs.databricks.com/dev-tools/bundles/deployment-modes.html.
    mode: development
    default: true
    workspace:
      host: [DATABRICKS_URL]
    variables:
      catalog: hive_metastore
      schema: ${workspace.current_user.short_name}
  prod:
    mode: production
    workspace:
      host: [DATABRICKS_URL]
      # We explicitly deploy to /Workspace/Users/[USERNAME] to make sure we only have a single copy.
      root_path: /Workspace/Users/[USERNAME]/.bundle/${bundle.name}/${bundle.target}
    variables:
      catalog: hive_metastore
      schema: prod
    permissions:
      - user_name: [USERNAME]
        level: CAN_MANAGE
@@ -0,0 +1,9 @@
# Test fixtures directory

Add JSON or CSV files here. In tests, use them with `load_fixture()`:

```
def test_using_fixture(load_fixture):
    data = load_fixture("my_data.json")
    assert data.count() >= 1
```
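As a sketch only (the file name and column are assumptions, not shipped fixtures), the same fixture also handles CSV files, which `conftest.py` reads with `csv.DictReader` and converts to a Spark DataFrame:

```
def test_using_csv_fixture(load_fixture):
    # Assumes a hypothetical fixtures/users.csv with an "age" column.
    users = load_fixture("users.csv")  # returns a Spark DataFrame
    assert users.count() >= 1
    assert "age" in users.columns      # CSV headers become column names
```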
@@ -0,0 +1,10 @@
.databricks/
build/
dist/
__pycache__/
*.egg-info
.venv/
scratch/**
!scratch/README.md
**/explorations/**
!**/explorations/README.md
@@ -0,0 +1,32 @@
[project]
name = "my_default_minimal"
version = "0.0.1"
authors = [{ name = "[USERNAME]" }]
requires-python = ">=3.10,<=3.13"
dependencies = [
    # Any dependencies for jobs and pipelines in this project can be added here
    # See also https://docs.databricks.com/dev-tools/bundles/library-dependencies
    #
    # LIMITATION: for pipelines, dependencies are cached during development;
    # add dependencies to the 'environment' section of your pipeline.yml file instead
]

[dependency-groups]
dev = [
"pytest",
"databricks-dlt",
"databricks-connect>=15.4,<15.5",
]

[project.scripts]
main = "my_default_minimal.main:main"

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src"]

[tool.black]
line-length = 125
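The `[project.scripts]` entry references `my_default_minimal.main:main`, which the minimal template itself does not generate. A minimal sketch of a user-added module that would satisfy that entry point (the file and its contents are assumptions, not template output):

```
# Hypothetical module importable as my_default_minimal.main (not generated by the template).
def main() -> None:
    """Entry point referenced by [project.scripts] in pyproject.toml."""
    print("Hello from my_default_minimal")


if __name__ == "__main__":
    main()
```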
@@ -0,0 +1,98 @@
"""This file configures pytest.

This file is in the root since it can be used for tests in any place in this
project, including tests under resources/.
"""

import os, sys, pathlib
from contextlib import contextmanager


try:
    from databricks.connect import DatabricksSession
    from databricks.sdk import WorkspaceClient
    from pyspark.sql import SparkSession
    import pytest
    import json
    import csv
    import os
except ImportError:
    raise ImportError(
        "Test dependencies not found.\n\nRun tests using 'uv run pytest'. See http://docs.astral.sh/uv to learn more about uv."
    )


@pytest.fixture()
def spark() -> SparkSession:
"""Provide a SparkSession fixture for tests.

Minimal example:
def test_uses_spark(spark):
df = spark.createDataFrame([(1,)], ["x"])
assert df.count() == 1
"""
return DatabricksSession.builder.getOrCreate()


@pytest.fixture()
def load_fixture(spark: SparkSession):
"""Provide a callable to load JSON or CSV from fixtures/ directory.

Example usage:

def test_using_fixture(load_fixture):
data = load_fixture("my_data.json")
assert data.count() >= 1
"""

def _loader(filename: str):
path = pathlib.Path(__file__).parent.parent / "fixtures" / filename
suffix = path.suffix.lower()
if suffix == ".json":
rows = json.loads(path.read_text())
return spark.createDataFrame(rows)
if suffix == ".csv":
with path.open(newline="") as f:
rows = list(csv.DictReader(f))
return spark.createDataFrame(rows)
raise ValueError(f"Unsupported fixture type for: {filename}")

return _loader


def _enable_fallback_compute():
"""Enable serverless compute if no compute is specified."""
conf = WorkspaceClient().config
if conf.serverless_compute_id or conf.cluster_id or os.environ.get("SPARK_REMOTE"):
return

url = "https://docs.databricks.com/dev-tools/databricks-connect/cluster-config"
print("☁️ no compute specified, falling back to serverless compute", file=sys.stderr)
print(f" see {url} for manual configuration", file=sys.stdout)

os.environ["DATABRICKS_SERVERLESS_COMPUTE_ID"] = "auto"


@contextmanager
def _allow_stderr_output(config: pytest.Config):
"""Temporarily disable pytest output capture."""
capman = config.pluginmanager.get_plugin("capturemanager")
if capman:
with capman.global_and_fixture_disabled():
yield
else:
yield


def pytest_configure(config: pytest.Config):
"""Configure pytest session."""
with _allow_stderr_output(config):
_enable_fallback_compute()

# Initialize Spark session eagerly, so it is available even when
# SparkSession.builder.getOrCreate() is used. For DB Connect 15+,
# we validate version compatibility with the remote cluster.
if hasattr(DatabricksSession.builder, "validateSession"):
DatabricksSession.builder.validateSession().getOrCreate()
else:
DatabricksSession.builder.getOrCreate()
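Taken together, these fixtures mean a project test only needs to declare `spark` or `load_fixture` as parameters; serverless fallback and session validation are handled in `pytest_configure`. A minimal sketch of such a test, run with `uv run pytest` (the file name and assertions are illustrative assumptions, not template output):

```
# Hypothetical tests/test_example.py (illustrative only, not generated by the template).
from pyspark.sql import SparkSession


def test_spark_fixture_creates_dataframe(spark: SparkSession):
    # `spark` comes from conftest.py and is a Databricks Connect session.
    df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])
    assert df.count() == 2
```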
11 changes: 11 additions & 0 deletions acceptance/bundle/templates/default-minimal/script
@@ -0,0 +1,11 @@
trace $CLI bundle init default-minimal --config-file ./input.json --output-dir output

cd output/my_default_minimal
trace $CLI bundle validate -t dev
trace $CLI bundle validate -t prod

# Do not affect this repository's git behaviour #2318
mv .gitignore out.gitignore
rm -r .databricks

cd ../../
@@ -25,5 +25,8 @@ main = "my_lakeflow_pipelines.main:main"
requires = ["hatchling"]
build-backend = "hatchling.build"

[tool.hatch.build.targets.wheel]
packages = ["src"]

[tool.black]
line-length = 125