Merged
100 commits
a002e2d
Release v0.217.1 (#1356)
andrewnester Apr 10, 2024
7bbc6bf
acc: add gron.py and update some scripts to use that (#4482)
denik Feb 10, 2026
f11e6d5
build(deps): bump golang.org/x/mod from 0.32.0 to 0.33.0 (#4517)
dependabot[bot] Feb 16, 2026
758b3c2
Add WAL for direct deployment state recovery
Jan 11, 2026
86253f1
Updated tests and enhanced kill caller with an offset
Jan 12, 2026
5c9344a
Updated existing tests
Jan 23, 2026
2d2a14f
test fixes
Feb 2, 2026
daf22b7
Fixes
Feb 7, 2026
7d9b2ec
fixed tests
Feb 9, 2026
48a94d1
updated tests
varundeepsaini Mar 24, 2026
ff3bd86
dedup
varundeepsaini Mar 24, 2026
1e86e3e
Update WAL corrupted entry outputs
varundeepsaini Mar 26, 2026
317a3b1
WIP
denik Mar 27, 2026
535b50e
Updated tests and enhanced kill caller with an offset
Jan 12, 2026
3dce662
Updated existing tests
Jan 23, 2026
1d55f0e
Merge simplified WAL handling into state.go
denik Mar 27, 2026
d3ee818
fixes
denik Mar 27, 2026
b50a116
fixes
denik Mar 27, 2026
2798878
rm unnecessary assert
denik Mar 27, 2026
adf57bf
Centralize state open/close lifecycle for direct engine
denik Mar 28, 2026
36c0e1f
lint
denik Mar 28, 2026
e21adfa
fixes
denik Apr 29, 2026
0f03dc5
lint
denik Apr 29, 2026
46e932a
restore test
denik Apr 30, 2026
cd76a8e
Skip state file write when WAL has no resource entries
denik Apr 30, 2026
c9f4b25
Revert per-engine test splits for no-resource deploys
denik Apr 30, 2026
5ed0438
fmt
denik Apr 30, 2026
3ad3284
Maintain stateIDs as single source of truth for resource IDs
denik Apr 30, 2026
214b847
Remove defer Close from processBundleRetInternal; align with main app…
denik Apr 30, 2026
874e6bc
Rename Close to Finalize; make plan a local var in processBundleRetIn…
denik Apr 30, 2026
47f72d8
Restore process.go structure to match main more closely
denik Apr 30, 2026
7f3b854
Fix migration count, remove unnecessary defer Finalize, fix errcheck
denik Apr 30, 2026
e59b130
Fix WAL validation: lowercase suffix, partial recovery, directory cre…
denik Apr 30, 2026
c7c679b
restore non-material changes: assertions and comment
denik Apr 30, 2026
24d8882
deduplicate UpgradeToWrite+defer Finalize in Deploy
denik May 1, 2026
6aa1dee
update out.test.toml
denik May 1, 2026
7956784
fix compilation in configsync/variables.go
denik May 4, 2026
9ad7d6a
use OpenWithData+UpgradeToWrite in migrate to avoid disk roundtrip
denik May 6, 2026
ec4ef80
use OpenWithData+UpgradeToWrite in uploadStateForYamlSync
denik May 7, 2026
bf02cb3
remove redundant defer Finalize in Deploy
denik May 10, 2026
9d83d67
move Finalize into destroyCore before files.Delete
denik May 10, 2026
137bf67
remove noise comment from bundle_apply.go
denik May 10, 2026
0630f47
fix gofumpt and test output
denik May 10, 2026
aa54390
shrink chain-10-jobs to chain-3-jobs
denik May 10, 2026
0a35900
fix test names in state_test.go: Close -> Finalize, restore SaveFinalize
denik May 10, 2026
62b131a
clean up WAL acceptance tests
denik May 10, 2026
5e25966
fix crash-after-create: handle Linux exit code 1 after KillCaller
denik May 10, 2026
9c224e9
update selftest
denik May 10, 2026
58e77d8
fix WAL acceptance test hygiene
denik May 11, 2026
acae5a9
destroyCore: warn on Finalize failure instead of aborting
denik May 11, 2026
f1b6847
deployCore: use Finalize return value instead of re-opening state
denik May 11, 2026
689e4c6
statemgmt.Load: accept state directly instead of engine
denik May 11, 2026
2878504
fmt
denik May 11, 2026
3712060
deployCore: move ParseResourcesState before PushResourcesState
denik May 11, 2026
85c4f67
simplify test
denik May 11, 2026
66241e4
update outputs
denik May 11, 2026
5efb5ab
fix Windows replacement for process kill during deployment
denik May 11, 2026
5c3ea72
formatting
denik May 11, 2026
fa01503
clean up
denik May 11, 2026
e13f5f8
rm unnecessarial SERIAL replacement
denik May 11, 2026
38e5bdf
rm noop replacement for lineage
denik May 11, 2026
19ede74
clean up
denik May 11, 2026
b87d213
testserver: replace KillCaller config with HTTP kill API
denik May 11, 2026
e805c26
remove blank line
denik May 11, 2026
705698e
wal tests: remove redundant server stubs covered by default handlers
denik May 11, 2026
cc2a6fa
wal tests: move test.toml comments to script, remove empty test.toml …
denik May 11, 2026
2ff30ee
Add databricks.yml
denik May 11, 2026
f65985d
clean up
denik May 11, 2026
b8909aa
add replace_ids.py
denik May 11, 2026
20320b1
clean up
denik May 11, 2026
a52e157
test more commands for validation
denik May 11, 2026
dfc7839
remove normal-deploy test
denik May 11, 2026
e5b688e
clean up
denik May 11, 2026
6a610c0
test recover in plan/deploy/summary
denik May 11, 2026
b929c62
clean up
denik May 11, 2026
17a953c
add assert_*.py
denik May 11, 2026
1dd8b1e
corrupted-wal-entry: use envsubst + template file for WAL generation
denik May 11, 2026
67c7b6b
kill_caller selftests: move test.toml comments to script, remove empt…
denik May 11, 2026
8f73fa9
formatting
denik May 12, 2026
b453e54
fix CI: commit missing test.tomls and fix assert_*.py permissions
denik May 12, 2026
07aeca0
fix: use TOML basic strings with \n escapes in Repls to avoid CRLF on…
denik May 12, 2026
000e9a4
refactor: merge duplicate IsDirect() blocks in dashboard.go
denik May 13, 2026
bb07c94
refactor: merge duplicate IsDirect() blocks in deploy.go
denik May 13, 2026
08d51a5
restore comment: Finalize is called even on Apply failure to save par…
denik May 13, 2026
3ca056b
update NEXT_CHANGELOG.md
denik May 13, 2026
4bc42da
rm databricksyyml
denik May 13, 2026
71380f3
add a warning on Close error
denik May 18, 2026
7825932
comment fix
denik May 18, 2026
9055392
update max entry to 10MB
denik May 18, 2026
ece1d44
update tests after rebase
denik May 18, 2026
e9f2ddc
clean up NEXT_CHANGELOG
denik May 18, 2026
a943970
don't recover from temp state file
denik May 20, 2026
f57cb5b
reuse struct
denik May 20, 2026
9ae36b7
handle WAL stat errors explicitly in Open
denik May 20, 2026
62345cf
add comments
denik May 20, 2026
1108a39
track lineage in memory
denik May 20, 2026
fce827e
remove single use inline functions reloadState and validateWALheader
denik May 20, 2026
34e7071
update test outpout
denik May 20, 2026
277c65d
fix NEXT_CHANGELOG.md conflict markers
denik May 20, 2026
f171b75
allow GetResourceEntry in read mode
denik May 20, 2026
1 change: 1 addition & 0 deletions NEXT_CHANGELOG.md
@@ -13,5 +13,6 @@
* Make sure warnings asking for approval are understood by agents ([#5239](https://github.com/databricks/cli/pull/5239))
* Support `replace_existing: true` on `postgres_branches` and `postgres_endpoints` so bundles can manage the implicitly-created production branch and primary read-write endpoint of a Lakebase project.
* Add `postgres_catalogs` resource to bind a Unity Catalog catalog to a Postgres database on a Lakebase Autoscaling branch ([#5265](https://github.com/databricks/cli/pull/5265)).
* engine/direct: Changes to state file now persisted to .wal file right away instead of being saved in the end ([#5149](https://github.com/databricks/cli/pull/5149))
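The changelog entry above describes a write-ahead log: each state change is appended to the `.wal` file as it happens instead of being written once at the end. A minimal Python sketch of that idea follows. The `{"k": ..., "v": ...}` record shape is taken from the test outputs in this PR; the function name `append_wal_entry` and the use of `fsync` are illustrative assumptions, not the CLI's actual Go implementation.

```python
import json
import os


def append_wal_entry(wal_path, key, value):
    """Append one resource entry as a single compact JSON line.

    Hypothetical helper: the record shape mirrors the WAL lines shown in
    this PR's test fixtures; flushing and fsync-ing after every entry is an
    assumption about how durability could be achieved.
    """
    line = json.dumps({"k": key, "v": value}, separators=(",", ":"))
    with open(wal_path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
        f.flush()
        os.fsync(f.fileno())  # ask the OS to persist the entry before returning
```

Because every entry is a complete line, a crash can at worst leave one truncated trailing line, which is the situation the `corrupted-wal-entry` test exercises.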

### Dependency updates
12 changes: 12 additions & 0 deletions acceptance/bin/assert_exists.py
@@ -0,0 +1,12 @@
#!/usr/bin/env python3
import os, sys

errors = 0

for filename in sys.argv[1:]:
    if not os.path.exists(filename):
        sys.stderr.write(f"Unexpected: {filename} does not exist.\n")
        errors += 1

if errors:
    sys.exit(1)
12 changes: 12 additions & 0 deletions acceptance/bin/assert_not_exists.py
@@ -0,0 +1,12 @@
#!/usr/bin/env python3
import os, sys

errors = 0

for filename in sys.argv[1:]:
    if os.path.exists(filename):
        sys.stderr.write(f"Unexpected: {filename} exists.\n")
        errors += 1

if errors:
    sys.exit(1)
39 changes: 39 additions & 0 deletions acceptance/bin/kill_after.py
@@ -0,0 +1,39 @@
#!/usr/bin/env python3
"""Set up a kill rule on the testserver for the current test token.

Usage: kill_after.py PATTERN OFFSET TIMES

  PATTERN  HTTP method and path, e.g. "POST /api/2.2/jobs/create"
  OFFSET   number of requests to let through before killing starts
  TIMES    number of times to kill the caller

The rule is scoped to the current DATABRICKS_TOKEN so it only affects
the test that registers it, even when tests share a server.
"""

import json
import os
import sys
import urllib.request

host = os.environ.get("DATABRICKS_HOST", "")
token = os.environ.get("DATABRICKS_TOKEN", "")

if not host:
    print("DATABRICKS_HOST not set", file=sys.stderr)
    sys.exit(1)

if len(sys.argv) != 4:
    print(f"usage: {sys.argv[0]} PATTERN OFFSET TIMES", file=sys.stderr)
    sys.exit(1)

pattern, offset, times = sys.argv[1], int(sys.argv[2]), int(sys.argv[3])

data = json.dumps({"pattern": pattern, "offset": offset, "times": times}).encode()
req = urllib.request.Request(
    f"{host}/__testserver/kill",
    data=data,
    headers={"Content-Type": "application/json", "Authorization": f"Bearer {token}"},
    method="POST",
)
urllib.request.urlopen(req)
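The OFFSET/TIMES semantics described in the docstring can be sketched as a small counter. This is only an illustration of the documented behaviour: the class `KillRule` and its method are hypothetical, the testserver's real handler is not shown in this diff, and the assumption that requests pass again once TIMES is exhausted is mine.

```python
from dataclasses import dataclass


@dataclass
class KillRule:
    """Hypothetical model of a kill rule registered via /__testserver/kill."""

    pattern: str  # e.g. "POST /api/2.2/jobs/create"
    offset: int   # matching requests to let through before killing starts
    times: int    # number of matching requests that then trigger a kill
    seen: int = 0  # matching requests observed so far

    def should_kill(self, method, path):
        # Requests that do not match the pattern never count and never kill.
        if f"{method} {path}" != self.pattern:
            return False
        self.seen += 1
        if self.seen <= self.offset:
            return False
        # Assumption: after `times` kills the rule is spent and requests pass.
        return self.seen <= self.offset + self.times
```

With `offset=2` and `times=1`, as used by the `chain-3-jobs` script, the first two `jobs/create` calls pass and the third one kills the caller.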
37 changes: 37 additions & 0 deletions acceptance/bundle/deploy/wal/chain-3-jobs/databricks.yml
@@ -0,0 +1,37 @@
bundle:
  name: wal-chain-test

resources:
  jobs:
    # Linear chain: job_01 -> job_02 -> job_03
    # Execution order: job_01 first, job_03 last
    job_01:
      name: "job-01"
      description: "first in chain"
      tasks:
        - task_key: "task"
          spark_python_task:
            python_file: ./test.py
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
    job_02:
      name: "job-02"
      description: "depends on ${resources.jobs.job_01.id}"
      tasks:
        - task_key: "task"
          spark_python_task:
            python_file: ./test.py
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
    job_03:
      name: "job-03"
      description: "depends on ${resources.jobs.job_02.id}"
      tasks:
        - task_key: "task"
          spark_python_task:
            python_file: ./test.py
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
3 changes: 3 additions & 0 deletions acceptance/bundle/deploy/wal/chain-3-jobs/out.test.toml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

110 changes: 110 additions & 0 deletions acceptance/bundle/deploy/wal/chain-3-jobs/output.txt
@@ -0,0 +1,110 @@
=== First deploy (crashes on job_03) ===

>>> errcode [CLI] bundle deploy
Uploading bundle files to /Workspace/Users/[USERNAME]/.bundle/wal-chain-test/default/files...
Deploying resources...
[PROCESS_KILLED]

Exit code: [KILLED]

=== WAL content after crash ===
{
  "cli_version": "[DEV_VERSION]",
  "lineage": "[UUID]",
  "serial": 1,
  "state_version": 2
}
{
  "k": "resources.jobs.job_01",
  "v": {
    "__id__": "[JOB_01_ID]",
    "state": {
      "deployment": {
        "kind": "BUNDLE",
        "metadata_file_path": "/Workspace/Users/[USERNAME]/.bundle/wal-chain-test/default/state/metadata.json"
      },
      "description": "first in chain",
      "edit_mode": "UI_LOCKED",
      "format": "MULTI_TASK",
      "max_concurrent_runs": 1,
      "name": "job-01",
      "queue": {
        "enabled": true
      },
      "tasks": [
        {
          "new_cluster": {
            "node_type_id": "[NODE_TYPE_ID]",
            "spark_version": "15.4.x-scala2.12"
          },
          "spark_python_task": {
            "python_file": "/Workspace/Users/[USERNAME]/.bundle/wal-chain-test/default/files/test.py"
          },
          "task_key": "task"
        }
      ]
    }
  }
}
{
  "k": "resources.jobs.job_02",
  "v": {
    "__id__": "[JOB_02_ID]",
    "depends_on": [
      {
        "label": "${resources.jobs.job_01.id}",
        "node": "resources.jobs.job_01"
      }
    ],
    "state": {
      "deployment": {
        "kind": "BUNDLE",
        "metadata_file_path": "/Workspace/Users/[USERNAME]/.bundle/wal-chain-test/default/state/metadata.json"
      },
      "description": "depends on [JOB_01_ID]",
      "edit_mode": "UI_LOCKED",
      "format": "MULTI_TASK",
      "max_concurrent_runs": 1,
      "name": "job-02",
      "queue": {
        "enabled": true
      },
      "tasks": [
        {
          "new_cluster": {
            "node_type_id": "[NODE_TYPE_ID]",
            "spark_version": "15.4.x-scala2.12"
          },
          "spark_python_task": {
            "python_file": "/Workspace/Users/[USERNAME]/.bundle/wal-chain-test/default/files/test.py"
          },
          "task_key": "task"
        }
      ]
    }
  }
}

=== Number of jobs saved in WAL ===
2

=== Bundle summary (reads from WAL) ===
Name: wal-chain-test
Target: default
Workspace:
  User: [USERNAME]
  Path: /Workspace/Users/[USERNAME]/.bundle/wal-chain-test/default
Resources:
  Jobs:
    job_01:
      Name: job-01
      URL: [DATABRICKS_URL]/jobs/[JOB_01_ID]?o=[NUMID]
    job_02:
      Name: job-02
      URL: [DATABRICKS_URL]/jobs/[JOB_02_ID]?o=[NUMID]
    job_03:
      Name: job-03
      URL: (not deployed)

=== WAL after successful deploy ===
WAL deleted (expected)
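The WAL layout visible in this output (one header object, then one `{"k", "v"}` record per saved resource) can be read back with a few lines of Python. This is a sketch under stated assumptions: `read_wal` is a hypothetical name, it expects the on-disk form with one compact JSON object per line (not the `jq`-pretty-printed form shown above), and "a later record for the same key wins" is my assumption about replay order, not something this diff confirms.

```python
import json


def read_wal(text):
    """Split WAL text into (header, entries).

    Assumes one JSON object per line: the first line is the header
    (lineage, serial, ...), every following line is a {"k": ..., "v": ...}
    record. Hypothetical helper, not the CLI's Go implementation.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return None, {}
    header = json.loads(lines[0])
    entries = {}
    for ln in lines[1:]:
        record = json.loads(ln)
        entries[record["k"]] = record["v"]  # assumption: later records override
    return header, entries
```

Counting the keys that start with `resources.jobs.` in `entries` gives the same number the test obtains with `grep -c`, which is 2 after the crash on `job_03`.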
24 changes: 24 additions & 0 deletions acceptance/bundle/deploy/wal/chain-3-jobs/script
@@ -0,0 +1,24 @@
# Linear chain: job_01 -> job_02 -> job_03
# Let first 2 jobs/create succeed, then kill on the 3rd
kill_after.py "POST /api/2.2/jobs/create" 2 1

echo "=== First deploy (crashes on job_03) ==="
trace errcode $CLI bundle deploy

echo ""
echo "=== WAL content after crash ==="
jq -S . .databricks/bundle/default/resources.json.wal 2>/dev/null || echo "No WAL file"

echo ""
echo "=== Number of jobs saved in WAL ==="
grep -c '"k":"resources.jobs' .databricks/bundle/default/resources.json.wal 2>/dev/null || echo "0"

echo ""
echo "=== Bundle summary (reads from WAL) ==="
$CLI bundle summary

echo ""
echo "=== WAL after successful deploy ==="
cat .databricks/bundle/default/resources.json.wal 2>/dev/null || echo "WAL deleted (expected)"

replace_ids.py
1 change: 1 addition & 0 deletions acceptance/bundle/deploy/wal/chain-3-jobs/test.py
@@ -0,0 +1 @@
print("test")
23 changes: 23 additions & 0 deletions acceptance/bundle/deploy/wal/corrupted-wal-entry/databricks.yml
@@ -0,0 +1,23 @@
bundle:
  name: wal-corrupted-test

resources:
  jobs:
    valid_job:
      name: "valid-job"
      tasks:
        - task_key: "task-a"
          spark_python_task:
            python_file: ./test.py
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
    another_valid:
      name: "another-valid"
      tasks:
        - task_key: "task-b"
          spark_python_task:
            python_file: ./test.py
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge

34 changes: 34 additions & 0 deletions acceptance/bundle/deploy/wal/corrupted-wal-entry/output.txt
@@ -0,0 +1,34 @@

>>> cat .databricks/bundle/default/resources.json.wal
{"lineage":"test-lineage-123","serial":6}
{"k":"resources.jobs.valid_job","v":{"__id__":"","state":{"name":"valid-job"}}}
{"k":"resources.jobs.another_valid","v":{"__id__":"","state":{"name":"another-valid"}}}
{"k":"resources.jobs.partial_write","v":{"__id__":"33","state":{"name":"partial-

>>> [CLI] bundle deploy
Warn: Skipping corrupted WAL entry at [TEST_TMP_DIR]/.databricks/bundle/default/resources.json.wal:4: unexpected end of JSON input
Warn: Saved 1 corrupted WAL entries to [TEST_TMP_DIR]/.databricks/bundle/default/resources.json.wal.corrupted
Uploading bundle files to /Workspace/Users/[USERNAME]/.bundle/wal-corrupted-test/default/files...
Deploying resources...
Updating deployment state...
Deployment complete!

>>> [CLI] bundle summary
Name: wal-corrupted-test
Target: default
Workspace:
  User: [USERNAME]
  Path: /Workspace/Users/[USERNAME]/.bundle/wal-corrupted-test/default
Resources:
  Jobs:
    another_valid:
      Name: another-valid
      URL: [DATABRICKS_URL]/jobs/[NUMID]?o=[NUMID]
    valid_job:
      Name: valid-job
      URL: [DATABRICKS_URL]/jobs/[NUMID]?o=[NUMID]

>>> cat .databricks/bundle/default/resources.json.wal.corrupted
{"k":"resources.jobs.partial_write","v":{"__id__":"33","state":{"name":"partial-
=== WAL after successful deploy ===
WAL deleted (expected)
7 changes: 7 additions & 0 deletions acceptance/bundle/deploy/wal/corrupted-wal-entry/resources.json
@@ -0,0 +1,7 @@
{
  "state_version": 1,
  "cli_version": "0.0.0",
  "lineage": "test-lineage-123",
  "serial": 5,
  "state": {}
}
4 changes: 4 additions & 0 deletions acceptance/bundle/deploy/wal/corrupted-wal-entry/resources.json.wal.tmpl
@@ -0,0 +1,4 @@
{"lineage":"test-lineage-123","serial":6}
{"k":"resources.jobs.valid_job","v":{"__id__":"$JOB1","state":{"name":"valid-job"}}}
{"k":"resources.jobs.another_valid","v":{"__id__":"$JOB2","state":{"name":"another-valid"}}}
{"k":"resources.jobs.partial_write","v":{"__id__":"33","state":{"name":"partial-
22 changes: 22 additions & 0 deletions acceptance/bundle/deploy/wal/corrupted-wal-entry/script
@@ -0,0 +1,22 @@
# Create pre-existing jobs in the testserver so WAL recovery triggers DoUpdate (reset) instead of DoCreate
JOB1=$($CLI jobs create --json '{"name":"valid-job"}' | jq -r '.job_id')
JOB2=$($CLI jobs create --json '{"name":"another-valid"}' | jq -r '.job_id')
echo "$JOB1:JOB1_ID" >> ACC_REPLS
echo "$JOB2:JOB2_ID" >> ACC_REPLS

mkdir -p .databricks/bundle/default
cp resources.json .databricks/bundle/default/

envsubst < resources.json.wal.tmpl > .databricks/bundle/default/resources.json.wal

trace cat .databricks/bundle/default/resources.json.wal
trace $CLI bundle deploy
trace $CLI bundle summary
trace cat .databricks/bundle/default/resources.json.wal.corrupted

printf "\n=== WAL after successful deploy ===\n"
if [ -f ".databricks/bundle/default/resources.json.wal" ]; then
    echo "WAL exists (unexpected)"
else
    echo "WAL deleted (expected)"
fi
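The behaviour this test checks (skip a truncated trailing entry, warn with its line number, and set the bad line aside in a `.corrupted` file) can be sketched in Python as follows. The function name `split_valid_and_corrupted` is hypothetical and the sketch only separates the lines; writing the `.corrupted` file and emitting warnings are left out, and the real logic lives in the CLI's Go code, which this excerpt does not show.

```python
import json


def split_valid_and_corrupted(lines):
    """Partition WAL lines into parsed objects and unparseable raw lines.

    Line numbers are 1-based and count the header line, matching the
    "resources.json.wal:4" position in the warning shown in output.txt.
    """
    valid, corrupted = [], []
    for lineno, ln in enumerate(lines, start=1):
        try:
            valid.append(json.loads(ln))
        except json.JSONDecodeError:
            corrupted.append((lineno, ln))  # keep the raw text for later inspection
    return valid, corrupted
```

Applied to the four lines of the fixture, this yields three parsed objects (the header plus two job entries) and one corrupted line at position 4.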
1 change: 1 addition & 0 deletions acceptance/bundle/deploy/wal/corrupted-wal-entry/test.py
@@ -0,0 +1 @@
print("test")
25 changes: 25 additions & 0 deletions acceptance/bundle/deploy/wal/crash-after-create/databricks.yml
@@ -0,0 +1,25 @@
bundle:
  name: wal-crash-test

resources:
  jobs:
    job_a:
      name: "test-job-a"
      description: "first job"
      tasks:
        - task_key: "task-a"
          spark_python_task:
            python_file: ./test.py
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
    job_b:
      name: "test-job-b"
      description: "depends on ${resources.jobs.job_a.id}"
      tasks:
        - task_key: "task-b"
          spark_python_task:
            python_file: ./test.py
          new_cluster:
            spark_version: 15.4.x-scala2.12
            node_type_id: i3.xlarge
4 changes: 4 additions & 0 deletions acceptance/bundle/deploy/wal/crash-after-create/out.test.toml

