Skip to content

Conversation

@anton-107
Copy link
Contributor

@anton-107 anton-107 commented Sep 29, 2025

Changes

This PR introduces a local file-based cache layer for the Databricks CLI to improve performance of repeated operations.

New libs/cache package:

  • Standalone generic function GetOrCompute[T any](ctx, cache, fingerprint, compute) that works with any type
  • File-based cache implementation (fileCache) storing JSON-encoded data
  • SHA256-based fingerprinting for cache keys from any struct
  • Automatic cleanup of expired cache files on initialization
  • Fail-open behavior: cache errors never block operations, just trigger recomputation
  • Cache isolation by CLI version: ~/.cache/databricks///

Cache Modes:

  • Measurement mode (default): Cache disabled, but still measures potential savings via telemetry
  • Enabled mode: Set DATABRICKS_CACHE_ENABLED=true to actually use cached values

New CLI command:

  • databricks cache clear - Removes all cached files across all CLI versions

Bundle Integration:

  • New InitializeCache() mutator to set up cache in bundle initialization phase
  • PopulateCurrentUser now uses cache for CurrentUser.Me() API call

Why

This is the first attempt to speed up subsequent databricks bundle commands that a bundle developer runs while developing a bundle

Tests

  • changed existing acceptance tests to use a dedicated cache folder
  • added new acceptance test for overall caching functionality and telemetry
  • added new acceptance test for clearing the cache
  • added unit tests for libs/cache

"bundle_mode": "TYPE_UNSPECIFIED",
"workspace_artifact_path_type": "WORKSPACE_FILE_SYSTEM"
"workspace_artifact_path_type": "WORKSPACE_FILE_SYSTEM",
"local_cache_measurements_ms": [...redacted...]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(optional) maybe worth recording the keys for the int map.

// TestFingerprintStability tests that the fingerprintToHash function returns the same hash for the same input.
func TestFingerprintStability(t *testing.T) {
fingerprint1 := struct {
Key string `json:"key"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if there's an embedded struct with an override for the same field:

struct foo {
  A string
  B string
}

struct bar {
  foo
  A string
}

Do bar{A:"abc"} and bar{foo{A:"abc"}} compute to the same has or different?

This is a scenario that arises in dashboards so atleast we should document what happens in a unit test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be a separate hash, but why dashboards matter here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but why dashboards matter here?

Maybe someday we might want to cache dashboards... Not a real usecase but atleast worth pointing out that the hashing is not bulletproof.


// getAuthorizationHeader extracts the Authorization header from the workspace client configuration.
// If it fails to authenticate, it returns an empty string.
func (b *Bundle) getAuthorizationHeader() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When doing the cache miss analysis, it'll be interesting to corellate that with auth type. I would expect oauth-u2m to have more misses due to short lived API tokens.

=== First call in a session is expected to be a cache miss:
[DEBUG_TIMESTAMP] Debug: [Local Cache] using cache key: [SHA256_HASH]
[DEBUG_TIMESTAMP] Debug: [Local Cache] failed to stat cache file: (redacted)
[DEBUG_TIMESTAMP] Debug: [Local Cache] cache miss, computing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should include function name that is being computed? This maybe more relevant in the future where is more than one function.

account Databricks Account Commands
api Perform Databricks API call
auth Authentication related commands
cache Local cache related commands
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the description it's not obvious that is user-level cache (and not whatever we have in .databricks). Perhaps make it explicit "user-level cache"? cc @juliacrawf-db

@andrewnester andrewnester requested a review from denik December 10, 2025 09:06
@pietern pietern changed the title [local cache] Initial implementation of the local cache layer Initial implementation of the local cache layer Dec 10, 2025
Use: "cache",
Short: "Local cache related commands",
Long: "Manage local cache used by the Databricks CLI for improved performance",
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the short/long for this.

Copy link
Contributor

@pietern pietern left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewnester Please take a look at the PR summary as well.

The implementation has drifted from the original summary.

@andrewnester andrewnester added this pull request to the merge queue Dec 10, 2025
Merged via the queue into main with commit 100a4a6 Dec 10, 2025
19 checks passed
@andrewnester andrewnester deleted the anton-107/cache-control-flow branch December 10, 2025 11:23
@eng-dev-ecosystem-bot
Copy link
Collaborator

Commit: 100a4a6

Run: 20096926907

Env 🔄​flaky 💚​RECOVERED 🙈​SKIP ✅​pass 🙈​skip Time
💚​ aws linux 10 1 413 627 47:06
💚​ aws windows 10 1 415 625 43:06
💚​ aws-ucws linux 10 1 574 506 64:19
🔄​ aws-ucws windows 3 9 1 574 504 87:30
🔄​ azure linux 5 4 3 409 625 61:12
🔄​ azure windows 4 2 3 414 623 57:22
🔄​ azure-ucws linux 3 3 3 569 504 62:34
🔄​ azure-ucws windows 3 3 3 571 502 59:14
💚​ gcp linux 4 3 394 634 39:34
💚​ gcp windows 4 3 396 632 43:22
22 interesting tests: 14 flaky, 7 RECOVERED, 1 SKIP
Test Name aws linux aws windows aws-ucws linux aws-ucws windows azure linux azure windows azure-ucws linux azure-ucws windows gcp linux gcp windows
🔄​ TestAccept 💚​R 💚​R 💚​R 🔄​f 💚​R 💚​R 🔄​f 🔄​f 💚​R 💚​R
🔄​ TestAccept/bundle/integration_whl/base ✅​p ✅​p ✅​p 🔄​f ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p
🔄​ TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p 🔄​f ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p
🔄​ TestAccept/bundle/resources/clusters/run/spark_python_task ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f ✅​p ✅​p ✅​p
🔄​ TestAccept/bundle/resources/clusters/run/spark_python_task/DATABRICKS_BUNDLE_ENGINE=direct ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f ✅​p ✅​p ✅​p
🙈​ TestAccept/bundle/resources/permissions 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
🔄​ TestAccept/bundle/resources/permissions/factcheck ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f 🙈​s 🙈​s
🔄​ TestAccept/bundle/resources/permissions/factcheck/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p ✅​p 🔄​f
💚​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions 💚​R 💚​R 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
💚​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=direct 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/with_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions 💚​R 💚​R 💚​R 💚​R 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S 🙈​S
💚​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=direct 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/resources/permissions/jobs/destroy_without_mgmtperms/without_permissions/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 💚​R 💚​R 💚​R
🔄​ TestAccept/bundle/resources/secret_scopes/permissions ✅​p ✅​p ✅​p ✅​p 🔄​f 🔄​f ✅​p ✅​p 🙈​s 🙈​s
🔄​ TestAccept/bundle/resources/secret_scopes/permissions/DATABRICKS_BUNDLE_ENGINE=terraform ✅​p ✅​p ✅​p ✅​p 🔄​f 🔄​f ✅​p ✅​p
🔄​ TestAccept/bundle/run/app-with-job 💚​R 💚​R 💚​R 💚​R 💚​R 🔄​f 💚​R 💚​R 💚​R 💚​R
💚​ TestAccept/bundle/run/app-with-job/DATABRICKS_BUNDLE_ENGINE=direct 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R 💚​R
🔄​ TestAccept/bundle/run/app-with-job/DATABRICKS_BUNDLE_ENGINE=terraform 💚​R 💚​R 💚​R 💚​R 💚​R 🔄​f 💚​R 💚​R 💚​R 💚​R
🔄​ TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=no/NBOOK=yes/PY=yes ✅​p ✅​p ✅​p ✅​p 🔄​f ✅​p ✅​p ✅​p ✅​p ✅​p
🔄​ TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=no/PY=no ✅​p ✅​p ✅​p ✅​p 🔄​f ✅​p ✅​p ✅​p ✅​p ✅​p
🔄​ TestAccept/bundle/templates/default-python/combinations/classic/DATABRICKS_BUNDLE_ENGINE=terraform/DLT=yes/NBOOK=no/PY=yes ✅​p ✅​p ✅​p ✅​p 🔄​f ✅​p ✅​p ✅​p ✅​p ✅​p
Top 50 slowest tests (at least 2 minutes):
duration env testname
22:23 azure linux TestAccept/bundle/integration_whl/interactive_single_user/DATABRICKS_BUNDLE_ENGINE=terraform
20:19 azure linux TestAccept/bundle/integration_whl/interactive_single_user/DATABRICKS_BUNDLE_ENGINE=direct
18:42 azure windows TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=USER_ISOLATION
18:00 azure windows TestAccept/bundle/integration_whl/interactive_single_user/DATABRICKS_BUNDLE_ENGINE=terraform
16:58 azure windows TestAccept/bundle/integration_whl/interactive_cluster/DATABRICKS_BUNDLE_ENGINE=terraform
14:15 azure linux TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=direct/DATA_SECURITY_MODE=USER_ISOLATION
14:03 gcp windows TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=SINGLE_USER
12:23 azure-ucws linux TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=SINGLE_USER
12:16 azure-ucws linux TestAccept/bundle/integration_whl/interactive_cluster/DATABRICKS_BUNDLE_ENGINE=direct
11:26 aws-ucws windows TestAccept/bundle/resources/model_serving_endpoints/running-endpoint/DATABRICKS_BUNDLE_ENGINE=terraform
11:18 aws-ucws windows TestAccept/bundle/resources/model_serving_endpoints/running-endpoint/DATABRICKS_BUNDLE_ENGINE=direct
11:16 azure linux TestAccept/bundle/resources/permissions/factcheck/DATABRICKS_BUNDLE_ENGINE=terraform
11:14 aws-ucws linux TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=direct
11:02 gcp windows TestAccept/bundle/integration_whl/interactive_single_user/DATABRICKS_BUNDLE_ENGINE=terraform
10:41 gcp windows TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=direct
10:13 gcp linux TestAccept/bundle/integration_whl/interactive_single_user/DATABRICKS_BUNDLE_ENGINE=terraform
10:03 aws-ucws linux TestAccept/bundle/integration_whl/interactive_single_user/DATABRICKS_BUNDLE_ENGINE=terraform
9:59 gcp linux TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=SINGLE_USER
9:54 azure-ucws linux TestAccept/bundle/resources/model_serving_endpoints/running-endpoint/DATABRICKS_BUNDLE_ENGINE=direct
9:38 azure-ucws linux TestAccept/bundle/resources/permissions/factcheck/DATABRICKS_BUNDLE_ENGINE=terraform
9:35 aws-ucws windows TestAccept/bundle/integration_whl/interactive_single_user/DATABRICKS_BUNDLE_ENGINE=terraform
9:08 azure-ucws windows TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=direct
9:08 azure-ucws windows TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=terraform
9:02 aws windows TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=USER_ISOLATION
8:47 aws linux TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=SINGLE_USER
8:43 aws-ucws linux TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=SINGLE_USER
8:39 aws linux TestAccept/bundle/integration_whl/interactive_single_user/DATABRICKS_BUNDLE_ENGINE=terraform
8:39 aws windows TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=SINGLE_USER
8:32 azure linux TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=SINGLE_USER
8:31 gcp linux TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=terraform
8:28 aws windows TestAccept/bundle/integration_whl/interactive_cluster/DATABRICKS_BUNDLE_ENGINE=direct
8:28 aws-ucws linux TestAccept/bundle/resources/model_serving_endpoints/running-endpoint/DATABRICKS_BUNDLE_ENGINE=terraform
8:19 aws-ucws linux TestAccept/bundle/resources/model_serving_endpoints/running-endpoint/DATABRICKS_BUNDLE_ENGINE=direct
8:15 azure-ucws linux TestAccept/bundle/resources/clusters/deploy/update-after-create/DATABRICKS_BUNDLE_ENGINE=direct
8:12 gcp windows TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=terraform
8:06 azure-ucws windows TestAccept/bundle/integration_whl/interactive_single_user/DATABRICKS_BUNDLE_ENGINE=terraform
8:06 aws-ucws windows TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=SINGLE_USER
8:05 azure-ucws linux TestAccept/bundle/resources/model_serving_endpoints/running-endpoint/DATABRICKS_BUNDLE_ENGINE=terraform
8:04 aws-ucws windows TestAccept/bundle/integration_whl/interactive_cluster/DATABRICKS_BUNDLE_ENGINE=terraform
8:03 aws linux TestAccept/bundle/resources/clusters/run/spark_python_task/DATABRICKS_BUNDLE_ENGINE=terraform
8:02 azure linux TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=terraform
7:59 azure windows TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=direct/DATA_SECURITY_MODE=USER_ISOLATION
7:58 aws-ucws windows TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=terraform
7:55 aws-ucws windows TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=direct
7:50 aws-ucws linux TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=terraform
7:49 aws windows TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=direct/DATA_SECURITY_MODE=SINGLE_USER
7:44 azure-ucws linux TestAccept/bundle/integration_whl/base/DATABRICKS_BUNDLE_ENGINE=direct
7:40 gcp windows TestAccept/bundle/integration_whl/interactive_cluster/DATABRICKS_BUNDLE_ENGINE=terraform
7:40 azure-ucws windows TestAccept/bundle/resources/model_serving_endpoints/running-endpoint/DATABRICKS_BUNDLE_ENGINE=direct
7:35 gcp windows TestAccept/bundle/integration_whl/interactive_cluster_dynamic_version/DATABRICKS_BUNDLE_ENGINE=terraform/DATA_SECURITY_MODE=USER_ISOLATION

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants