Add vector_search_indexes resource (direct engine) #5123
Open
janniklasrose wants to merge 47 commits into
1943af9 to
87018ce
Compare
janniklasrose added a commit that referenced this pull request Apr 30, 2026
## Changes

Persist `endpoint_uuid` in state and detect identity drift on `vector_search_endpoints`. The endpoint name is stable but its UUID changes if the endpoint is deleted and recreated by name (e.g. via the workspace UI). Without persisting the UUID:

- The bundle silently rebound permissions to a different backing endpoint without recreating the endpoint resource.
- Anything else referencing `endpoint_uuid` (most importantly the permissions object_id, but also indexes added on top in the next PR) raced the recreate.

`VectorSearchEndpointState` now embeds `vectorsearch.CreateEndpoint` and adds `EndpointUuid`. `DoCreate` records the UUID from the create response; `DoUpdate` copies it from `entry.RemoteState` so unrelated updates (e.g. `min_qps`) don't blank it out. `OverrideChangeDesc` classifies `endpoint_uuid` drift as `Recreate` when saved differs from remote, `Skip` otherwise.

`drift/recreated_same_name` flips from a "badness snapshot" (which captured the old behavior of permissions silently rebinding) to the recreate behavior, with a permissions block on the endpoint to verify the cascade rebinds correctly.

`drift/min_qps/out.plan.direct.json` regenerates to include the new `endpoint_uuid` skip entry in the detailed plan.

## Why

Splitting this out of the larger `vector_search_indexes` PR ([#5123](#5123)) so it can land independently. The index PR builds on the persisted UUID for orphan detection, but the endpoint UUID work stands on its own and is useful regardless.

## Tests

- `make fmtfull`, `make checks`, `make lintfull`: clean.
- `make test`: green (`libs/apps/runlocal` needed `NODE_OPTIONS=` for the harness leak; unrelated). `bundle/internal/schema TestRequiredAnnotationsForNewFields` panics, which is failing on `main` for unrelated reasons.
- `go test ./acceptance -run 'TestAccept/bundle/resources/vector_search_endpoints'`: all green, including the flipped `drift/recreated_same_name`.

_This PR was written by Claude Code._
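The `endpoint_uuid` classification rule described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the commit: `Action`, `classifyEndpointUUID`, and the constants are hypothetical names, and the non-empty check on the remote UUID follows a later commit note on this page saying the endpoint variant requires `remoteUuid != ""`.

```go
package main

import "fmt"

// Action is a hypothetical stand-in for the planner's change classification.
type Action string

const (
	Skip     Action = "skip"
	Recreate Action = "recreate"
)

// classifyEndpointUUID returns Recreate when the UUID saved in state differs
// from a non-empty UUID reported by the remote, and Skip otherwise.
func classifyEndpointUUID(savedUUID, remoteUUID string) Action {
	if remoteUUID != "" && savedUUID != remoteUUID {
		return Recreate
	}
	return Skip
}

func main() {
	// Same endpoint as at deploy time: nothing to do.
	fmt.Println(classifyEndpointUUID("uuid-1", "uuid-1"))
	// Endpoint was deleted and recreated under the same name.
	fmt.Println(classifyEndpointUUID("uuid-1", "uuid-2"))
}
```

How an empty saved UUID (state written before this change) should be treated is not covered by the description, so the sketch leaves that case to the simple comparison.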
2b22f02 to
44ade3f
Compare
Hardcode the 3-part index name so the diff against main is purely vector_search additions. The test still demonstrates leaf-only prefixing on a 3-part identifier; the cross-resource reference path is covered elsewhere. Co-authored-by: Isaac
RemapState was hardcoding IndexSubtype to the empty string, which would classify any remote with a populated subtype as drift on the next plan and force a needless recreate. Pass through remote.IndexSubtype like the other read-back fields. Co-authored-by: Isaac
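A simplified picture of that fix, assuming hypothetical `RemoteIndex` and `IndexState` shapes (the real structs come from the SDK and the resource package, and `EXAMPLE_SUBTYPE` is a placeholder value, not a real subtype):

```go
package main

import "fmt"

// RemoteIndex is a hypothetical, trimmed-down view of the read response.
type RemoteIndex struct {
	Name         string
	EndpointName string
	IndexType    string
	IndexSubtype string
}

// IndexState is a hypothetical, trimmed-down view of the comparable state.
type IndexState struct {
	Name         string
	EndpointName string
	IndexType    string
	IndexSubtype string
}

// remapState copies read-back fields from the remote into the state shape.
func remapState(remote RemoteIndex) IndexState {
	return IndexState{
		Name:         remote.Name,
		EndpointName: remote.EndpointName,
		IndexType:    remote.IndexType,
		IndexSubtype: remote.IndexSubtype, // previously hardcoded to ""
	}
}

func main() {
	remote := RemoteIndex{"main.default.my_index", "my_endpoint", "DELTA_SYNC", "EXAMPLE_SUBTYPE"}
	config := IndexState{"main.default.my_index", "my_endpoint", "DELTA_SYNC", "EXAMPLE_SUBTYPE"}
	// With the pass-through in place, config and remapped remote compare equal.
	fmt.Println("drift:", remapState(remote) != config)
}
```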
The Vector Search index API has no rename or update path, so any config-side change has to round-trip through delete + create. Add name and index_subtype to recreate_on_changes so the planner picks them up the same way it already does for endpoint_name, index_type, primary_key, and the spec blocks. Co-authored-by: Isaac
The leaf-prefix logic splits on the last dot in the 3-part UC name and
prepends the user prefix to whatever follows. If the name still has
literal ${...} tokens (e.g. ${var.catalog}.${var.schema}.${var.index}),
that split lands inside the trailing ref expression and rewrites the
variable name itself. Detect unresolved refs and bail; users who want
the dev prefix in this case can compose it into the variable.
Co-authored-by: Isaac
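The failure mode and the guard can be sketched as below. `naivePrefixLeaf` and `prefixLeaf` are hypothetical names, the `dev_jan_` prefix is borrowed from the PR description, and leaving dot-free names untouched is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// naivePrefixLeaf prepends prefix to whatever follows the last dot.
func naivePrefixLeaf(name, prefix string) string {
	i := strings.LastIndex(name, ".")
	if i < 0 {
		return name // not a dotted name; left alone in this sketch
	}
	return name[:i+1] + prefix + name[i+1:]
}

// prefixLeaf bails out when the name still contains an unresolved ${...} ref.
func prefixLeaf(name, prefix string) string {
	if strings.Contains(name, "${") {
		return name
	}
	return naivePrefixLeaf(name, prefix)
}

func main() {
	ref := "${var.catalog}.${var.schema}.${var.index}"
	// The last dot sits inside ${var.index}, so the variable name is rewritten:
	fmt.Println(naivePrefixLeaf(ref, "dev_jan_")) // ${var.catalog}.${var.schema}.${var.dev_jan_index}
	// With the guard, unresolved names pass through unchanged:
	fmt.Println(prefixLeaf(ref, "dev_jan_")) // ${var.catalog}.${var.schema}.${var.index}
	// Fully resolved names get the prefix on the leaf only:
	fmt.Println(prefixLeaf("main.default.my_index", "dev_jan_")) // main.default.dev_jan_my_index
}
```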
CreateIndex rejects any combination where the spec block doesn't match the index_type (e.g. DELTA_SYNC with direct_access_index_spec set, or DIRECT_ACCESS with neither block at all). Add a fast validator that reports those mismatches at validate time so the failure surfaces before the deploy starts running. Co-authored-by: Isaac
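A check of this kind might look like the sketch below. The struct, function name, and error messages are hypothetical; only the two mismatch examples (DELTA_SYNC with `direct_access_index_spec`, DIRECT_ACCESS with no block) come from the commit message, and the symmetric cases are an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// indexConfig is a hypothetical, trimmed-down view of the index config.
type indexConfig struct {
	IndexType           string
	HasDeltaSyncSpec    bool
	HasDirectAccessSpec bool
}

// validateSpecBlocks reports a mismatch between index_type and the spec blocks.
// Unknown index types are left for the API to judge.
func validateSpecBlocks(c indexConfig) error {
	switch c.IndexType {
	case "DELTA_SYNC":
		if c.HasDirectAccessSpec {
			return errors.New("index_type DELTA_SYNC must not set direct_access_index_spec")
		}
		if !c.HasDeltaSyncSpec {
			return errors.New("index_type DELTA_SYNC requires delta_sync_index_spec")
		}
	case "DIRECT_ACCESS":
		if c.HasDeltaSyncSpec {
			return errors.New("index_type DIRECT_ACCESS must not set delta_sync_index_spec")
		}
		if !c.HasDirectAccessSpec {
			return errors.New("index_type DIRECT_ACCESS requires direct_access_index_spec")
		}
	}
	return nil
}

func main() {
	fmt.Println(validateSpecBlocks(indexConfig{IndexType: "DELTA_SYNC", HasDirectAccessSpec: true}))
	fmt.Println(validateSpecBlocks(indexConfig{IndexType: "DIRECT_ACCESS"}))
	fmt.Println(validateSpecBlocks(indexConfig{IndexType: "DELTA_SYNC", HasDeltaSyncSpec: true}))
}
```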
This reverts commit edeceb2.
The out.test.toml format changed in #5146 ("acc: Format out.test.toml in diff-friendly and copypaste-friendly way"), and refschema picked up index_subtype and endpoint_uuid from the resource model. Pure regen from running ./task generate-refschema and ./task test-update. Co-authored-by: Isaac
Previously lookupEndpointUuid swallowed all non-404 errors and returned "",
which would feed empty remoteUuid into OverrideChangeDesc and propose a
destructive Recreate ("endpoint replaced out-of-band") on transient or
permission errors. The Recreate is dangerous: Delta Sync re-runs the
embedding pipeline, and Direct Access loses all upserted vectors.
Now the helper returns (string, error): 404 maps to ("", nil) — the orphan
signal — and any other error is propagated through DoRead/DoCreate so the
plan fails loudly instead of misclassifying it as drift.
Document the OverrideChangeDesc divergence from vector_search_endpoint
(which requires remoteUuid != ""): for indexes, an empty remoteUuid is the
orphan signal, and the lookup contract guarantees that case is unambiguous.
Add a Badness-marked test that deploys a bundle with both a vector_search_endpoint and a vector_search_index referencing it, then changes the endpoint_type to trigger an endpoint Recreate. The plan correctly recreates the endpoint but leaves the dependent index unchanged, so on a real workspace the endpoint delete would either fail (indexes still attached) or orphan the index. Root cause is in the planner (bundle/direct/bundle_plan.go): there is no logic to propagate Recreate from a dependency to its dependents. This is a framework-level concern that affects more than just VS, so it's deferred to a follow-up. The Badness entry documents the gap.
Add a Badness-marked validate test showing that the name_prefix preset
does not rewrite a vector_search_indexes.*.endpoint_name literal that
points at a bundle-managed (and therefore prefixed) endpoint. The output
shows vs_endpoint -> prefix_vs_endpoint while vs_index_literal still
targets the unprefixed name vs_endpoint.
The DABs idiom is to use ${resources.vector_search_endpoints.X.name}
(captured by vs_index_ref in the same fixture). That form resolves
correctly to the prefixed name at plan/deploy time, so users have a
working pattern. The literal form silently breaks though, and the
preset has enough information to rewrite it; tracked as Badness for a
follow-up fix in apply_presets.go.
Mirror the existing vector_search_endpoint bind test: pre-create both endpoint and index, bind the index into the bundle, deploy, unbind, and destroy. Verifies the index survives unbind+destroy as expected. Required by bundle/direct/dresources/README.md for new resource types.
Drop the Terraform-provider justification (already implied by "direct engine only") and the long list of internal mechanics. Keep the entry focused on what customers see.
CreateIndex returns immediately with metadata of an index whose embedding pipeline is still provisioning; queries against an index that isn't ready fail. Implement WaitAfterCreate so dependent resources (and the next plan) see a usable index. 75-minute timeout matches the terraform provider. Co-authored-by: Isaac
Previously most vector_search_indexes tests created the endpoint
out-of-band via the CLI and only declared the index in the bundle.
Move the endpoint into the same databricks.yml so the index can
reference it via ${resources.vector_search_endpoints.my_endpoint.name},
matching the pattern users will write and shrinking the script's
manual cleanup. Bundle destroy now tears down both resources.
Co-authored-by: Isaac
Vector search indexes have no update API. Previously DoUpdate was a
no-op, which meant a future SDK field that wasn't declared in
recreate_on_changes/ignore_remote_changes would be classified as
Update by the planner and silently no-op at deploy time.
Drop the no-op DoUpdate so the framework's existing check at
bundle_plan.go errors loudly ("resource does not support update
action but plan produced update") if a plan ever produces Update
for this resource. Add a reflection-based unit test that catches
the same gap earlier, mirroring the pattern in app_test.go.
Co-authored-by: Isaac
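One plausible shape for such a reflection-based check is sketched below; the actual test in the PR may differ. `indexConfig`, `futureConfig`, and `unclassifiedFields` are hypothetical, and the classified set reuses the `recreate_on_changes` field names listed in the PR description.

```go
package main

import (
	"fmt"
	"reflect"
	"strings"
)

// indexConfig is a hypothetical, trimmed-down mirror of the index config.
type indexConfig struct {
	Name                  string    `json:"name"`
	EndpointName          string    `json:"endpoint_name"`
	IndexType             string    `json:"index_type"`
	IndexSubtype          string    `json:"index_subtype,omitempty"`
	PrimaryKey            string    `json:"primary_key"`
	DeltaSyncIndexSpec    *struct{} `json:"delta_sync_index_spec,omitempty"`
	DirectAccessIndexSpec *struct{} `json:"direct_access_index_spec,omitempty"`
}

// futureConfig simulates an SDK bump that adds a field nobody classified.
type futureConfig struct {
	Name     string `json:"name"`
	NewField string `json:"new_field"`
}

var recreateOnChanges = map[string]bool{
	"name": true, "endpoint_name": true, "index_type": true, "index_subtype": true,
	"primary_key": true, "delta_sync_index_spec": true, "direct_access_index_spec": true,
}

// unclassifiedFields lists JSON field names of struct value v that are missing
// from classified; a non-empty result would otherwise surface as Update.
func unclassifiedFields(v any, classified map[string]bool) []string {
	var missing []string
	t := reflect.TypeOf(v)
	for i := 0; i < t.NumField(); i++ {
		name := strings.Split(t.Field(i).Tag.Get("json"), ",")[0]
		if name == "" || name == "-" {
			continue
		}
		if !classified[name] {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	fmt.Println(unclassifiedFields(indexConfig{}, recreateOnChanges))
	fmt.Println(unclassifiedFields(futureConfig{}, recreateOnChanges))
}
```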
This reverts commit b8483e7d82eadd2bb15f126a25d786bd402f829a.
Main reverted vector_search_endpoints UUID persistence in #5193, so the endpoint plan no longer carries a synthetic endpoint_uuid change to be classified as Skip via OverrideChangeDesc. Regenerate the with_endpoint plan output to match. Co-authored-by: Isaac
The 3-part UC name (catalog.schema.index) is the API primary key: CreateIndex addresses by name and DoCreate returns it as the deployment id. Prefixing it changed which remote object the bundle addressed, not just its display label. Mirrors #5209's same change for vector_search_endpoints. Drop the leaf-only prefix loop and the vectorSearchIndexPrefixPos helper in apply_presets.go, add VectorSearchIndex to the no-rename carve-out in apply_target_mode_test.go, and remove the now-obsolete TestVectorSearchIndexNamePrefixing. Co-authored-by: Isaac
… remote
- WaitAfterCreate now takes id per #5258; the saved config.Name is the same as id, so the body is unchanged.
- SDK v0.132.0 (#5237) returns delta_sync_index_spec.columns_to_sync (and the new columns_to_index field) on read. Drop the ignore_remote_changes rule and propagate both from remote in RemapState. Removes the drift/columns_to_sync acceptance test which was asserting the now-stale request-only behavior.
Co-authored-by: Isaac
Per denik's PR comment: explain that ForceSendFields is an SDK marshaling concern (which zero-valued fields to wire-serialize) that has no meaning on the read path, so copying it from the response struct would not be useful. Co-authored-by: Isaac
The test was a Badness fixture capturing the gap where a literal endpoint_name on a VS index would not follow the endpoint's name prefix. Now that neither VS endpoints (#5209) nor VS indexes are prefixed, the literal form correctly points at the (unprefixed) endpoint, and all three branches of the fixture produce identical output. Co-authored-by: Isaac
generate-schema picked up the missing placeholder for index_subtype after the SDK bump; previously this field wasn't in the resource and the schema_test caught the gap on rebase. Co-authored-by: Isaac
7bf9e4c to
67b417f
Compare
Changes
Adds `vector_search_indexes` as a first-class DABs resource on the direct engine, alongside the existing `vector_search_endpoints`. Direct engine only: vector search has no Terraform provider.

What's included:
- `bundle/config/resources/vector_search_index.go` (with `grants`) and `bundle/direct/dresources/vector_search_index.go` (state, lifecycle, drift classification). `RemapState` round-trips `index_subtype` so a populated remote subtype isn't classified as drift on the next plan.
- `recreate_on_changes` for immutable spec fields (`name`, `endpoint_name`, `index_type`, `index_subtype`, `primary_key`, `delta_sync_index_spec`, `direct_access_index_spec`); `delta_sync_index_spec.columns_to_sync` marked `ignore_remote_changes` (request-only field; see follow-up note below). The index API has no rename or update path, so any config-side change has to round-trip through delete + create.
- Each index records the `endpoint_uuid` of the endpoint it was created against. `DoRead` looks up the current endpoint UUID by name; if the endpoint was deleted out-of-band the lookup returns `""` and `OverrideChangeDesc` classifies the saved-vs-remote mismatch as `Recreate`. Builds on the endpoint UUID persistence merged in "Persist endpoint UUID for vector_search_endpoints drift detection" (#5127).
- `WaitAfterDelete` adapter method (sibling to `WaitAfterCreate`/`WaitAfterUpdate`). For VS indexes it polls `GetIndex` until 404 (15-minute cap). `apply.Recreate` runs `DoDelete → DeleteState → WaitAfterDelete → DoCreate → SaveState → WaitAfterCreate`, so a wait-time failure leaves the bundle consistent. Replaces the prior `SaveState("", nil, nil)` placeholder that produced `invalid state: empty id` planning failures on partial recreate.
- `bundle/phases/`. The message intentionally covers both Delta Sync ("re-runs the embedding pipeline") and Direct Access ("upserted vectors lost") in one paragraph. Picking a type-specific message from the bundle config would be wrong on type changes (`DELTA_SYNC` → `DIRECT_ACCESS` recreates would describe the destination type while the actual teardown is of the source type).
- The name prefix applies only to the leaf of `catalog.schema.name`, since catalog and schema are external references (the previous behavior produced invalid names like `dev_jan_main.default.my_index`). The mutator skips names that still carry literal `${...}` tokens, since the leaf split would otherwise inject the prefix inside the trailing ref expression itself.
- The testserver reports `Ready: true` immediately, matching the convention used by every other slow resource the testserver fakes (endpoints → `ONLINE`, database instances → `AVAILABLE`, apps → `RUNNING`).
- `index_type` / spec-block consistency is intentionally not validated client-side: the CreateIndex API rejects mismatched combinations at deploy time, and replicating that check in DABs would just duplicate backend logic.

Why
The direct engine recently gained `vector_search_endpoints` (#4887). This PR extends the support to indexes, which were the missing half. Along the way it surfaces and fixes a number of issues:
- `recreate` deploys hit it every time. Without a wait, every recreate failed on the immediate Create.
- `apply.Recreate` was writing a malformed empty-ID state entry as its "delete state" step, which then poisoned the next plan with `invalid state: empty id`.

Follow-ups
- `delta_sync_index_spec.columns_to_sync` is request-only in the SDK today: the field is accepted on `Create` but the `Get` response doesn't echo it back, which is why we mark it `ignore_remote_changes` here. There's an open backend PR to expose `columns_to_sync` on the read path; once the SDK is regenerated against that, we can drop the `ignore_remote_changes` entry and let normal drift detection handle the field.
- `vector_search_endpoints.budget_policy_id` drift (effective vs. requested) and the SDK doc-comment for `vector_search_endpoints.usage_policy_id` are intentionally not in this PR; both will be addressed by the next SDK bump and the corresponding `./task generate-schema` regen.

Tests
- `./task fmt`, `./task checks`, `./task lint`: all clean.
- `./task test`: unit tests green across `bundle/...`. `TestVectorSearchIndexNameWithUnresolvedRefsLeftAlone` in `apply_target_mode_test.go` exercises the leaf-prefix skip on `${var.catalog}.${var.schema}.${var.index}`.
- `acceptance/bundle/resources/vector_search_indexes/`: `basic`, `drift/columns_to_sync`, `drift/deleted_remotely`, `drift/orphaned_endpoint`, `recreate/index_type`, `recreate/mixed_types`, `grants/select`.
- `recreate/index_type/out.requests.recreate.direct.json` captures `GET → DELETE → GET → POST` with `--get` enabled in `print_requests.py`. The middle `GET` is the `WaitAfterDelete` poll; if a future change drops the wait the regenerated capture loses that line and the test fails.
- `acceptance/bundle/validate/presets_name_prefix` covers the leaf-only name prefix on a 3-part index name.
- `acceptance/bundle/invariant/configs/vector_search_index.yml.tmpl` exercises the resource through the invariant matrix; the testserver enforces endpoint existence on index create.
- `--profile tmp` against staging across initial deploy / drift / recreate / destroy.

This PR was written by Claude Code.
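The recreate ordering from the description (`DoDelete → DeleteState → WaitAfterDelete → DoCreate → SaveState → WaitAfterCreate`) can be illustrated with an in-memory sketch. `fakeDeployment` and its fields are hypothetical and do not reflect the real `apply.Recreate` signature; the point is only that a failure during the delete wait leaves no state entry behind, rather than one with an empty id.

```go
package main

import (
	"errors"
	"fmt"
)

// fakeDeployment is a hypothetical in-memory stand-in for bundle state.
type fakeDeployment struct {
	state  map[string]string // resource key -> remote id; no entry means "not deployed"
	steps  []string          // steps attempted, in order
	failAt string            // step name that should fail, if any
}

// run records the step, fails if requested, and otherwise applies its effect.
func (d *fakeDeployment) run(step string, effect func()) error {
	d.steps = append(d.steps, step)
	if step == d.failAt {
		return errors.New(step + " failed")
	}
	if effect != nil {
		effect()
	}
	return nil
}

// recreate walks the six steps in order and stops at the first failure.
func recreate(d *fakeDeployment, key, newID string) error {
	sequence := []struct {
		name   string
		effect func()
	}{
		{"DoDelete", nil},
		{"DeleteState", func() { delete(d.state, key) }},
		{"WaitAfterDelete", nil},
		{"DoCreate", nil},
		{"SaveState", func() { d.state[key] = newID }},
		{"WaitAfterCreate", nil},
	}
	for _, s := range sequence {
		if err := d.run(s.name, s.effect); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	d := &fakeDeployment{
		state:  map[string]string{"my_index": "main.default.my_index"},
		failAt: "WaitAfterDelete",
	}
	err := recreate(d, "my_index", "main.default.my_index")
	_, present := d.state["my_index"]
	fmt.Println(err, present, d.steps)
}
```

Because the state entry is removed before the wait, a timeout there leaves the next plan looking at a plain "create" rather than a malformed entry.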