[Performance] Cache SCIM and permission-assignment list calls to reduce API requests during plan by tagirb · Pull Request #5535 · databricks/terraform-provider-databricks

tagirb · 2026-03-27T14:45:22Z

[Performance] Cache SCIM and permission-assignment list calls to reduce API requests during plan

For large IAM deployments, terraform plan was issuing one individual API call per resource Read —
e.g. 620 GET /Users/{id} calls for 620 databricks_user resources, 9,001 per-group member
lookups for 9,001 databricks_group_member resources, and so on. Under high concurrency this also
caused 429 rate-limit errors.

Root cause: Each resource's Read handler called the API independently with no sharing between
parallel goroutines.

Fix: Each affected resource type now has a shared in-memory cache populated by a single list
call. All goroutines for resources of the same type share the one result. Mutations
(Create/Update/Delete) invalidate the cache so the next Read re-fetches fresh data.

Concurrency safety: Caches use sync.RWMutex (concurrent reads on a warm cache) combined with
golang.org/x/sync/singleflight (deduplicating in-flight cold-cache fetches so exactly one
goroutine calls the API while all others wait and share the result).
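
A rough sketch of the locking scheme described above, for illustration only and not the PR's actual code. `listCache`, `call`, `fetch`, and `demo` are hypothetical names, and a small hand-rolled in-flight record stands in for `golang.org/x/sync/singleflight` so that the example depends only on the standard library.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// call records one in-flight list fetch; waiters block on done.
// It stands in for golang.org/x/sync/singleflight in this sketch.
type call struct {
	done chan struct{}
	data map[string]string
	err  error
}

// listCache is a hypothetical id -> name cache filled by a single list call.
type listCache struct {
	mu       sync.RWMutex
	data     map[string]string // nil means the cache is cold
	inflight *call
	fetch    func() (map[string]string, error) // stand-in for the list API call
}

// get returns the cached list, running fetch at most once per cold period.
func (c *listCache) get() (map[string]string, error) {
	c.mu.RLock() // warm path: readers do not block each other
	d := c.data
	c.mu.RUnlock()
	if d != nil {
		return d, nil
	}

	c.mu.Lock()
	if c.data != nil { // filled while we waited for the write lock
		d = c.data
		c.mu.Unlock()
		return d, nil
	}
	if cl := c.inflight; cl != nil { // join the fetch already in progress
		c.mu.Unlock()
		<-cl.done
		return cl.data, cl.err
	}
	cl := &call{done: make(chan struct{})}
	c.inflight = cl
	c.mu.Unlock()

	cl.data, cl.err = c.fetch() // the API call runs outside the lock

	c.mu.Lock()
	if c.inflight == cl { // skip storing if invalidate ran during the fetch
		c.inflight = nil
		if cl.err == nil {
			c.data = cl.data
		}
	}
	c.mu.Unlock()
	close(cl.done)
	return cl.data, cl.err
}

// invalidate drops the list (and detaches any in-flight fetch) so the next
// get re-fetches; a Create/Update/Delete handler would call this.
func (c *listCache) invalidate() {
	c.mu.Lock()
	c.data, c.inflight = nil, nil
	c.mu.Unlock()
}

// demo runs `workers` concurrent reads on a cold cache, then invalidates and
// reads once more. It returns the fetch count after each phase.
func demo(workers int) (cold, afterInvalidate int64) {
	var fetches int64
	c := &listCache{fetch: func() (map[string]string, error) {
		atomic.AddInt64(&fetches, 1)
		return map[string]string{"1": "alice", "2": "bob"}, nil
	}}
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.get()
		}()
	}
	wg.Wait()
	cold = atomic.LoadInt64(&fetches)
	c.invalidate()
	c.get()
	return cold, atomic.LoadInt64(&fetches)
}

func main() {
	cold, after := demo(50)
	fmt.Println(cold, after) // prints "1 2"
}
```

In this sketch the fetch runs outside the lock, so warm readers are never blocked by a slow list call, and a result that was overtaken by an invalidation is handed to its waiters but not stored.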

Per-resource changes:

| Resource | Before | After |
| --- | --- | --- |
| `databricks_mws_permission_assignment` (~4,400) | 1 `ListByWorkspaceId` per resource | 1 per `workspace_id` |
| `databricks_permission_assignment` | 1 `List` per resource | 1 per host |
| `databricks_group` (~1,466) | 1 `GET /Groups/{id}` per resource | 1 `GET /Groups?attributes=id,...` total |
| `databricks_user` (~620) | 1 `GET /Users/{id}` per resource | 1 `GET /Users?attributes=id,...` total |
| `databricks_group_member` (~9,001) | up to 1,466 concurrent `GET /Groups/{id}?attributes=members` → 429s | 1 `GET /Groups?attributes=id,members` total |
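
To make the "1 per `workspace_id`" row concrete, here is a minimal sketch of a cache keyed by workspace ID. It is an assumption-laden illustration rather than the PR's `workspaceAssignmentsCache`: the `assignment`, `entry`, and `workspaceCache` types and the `list` callback are hypothetical, and it uses a per-key `sync.Once` instead of the RWMutex plus singleflight combination the PR describes.

```go
package main

import (
	"fmt"
	"sync"
)

// assignment is a hypothetical, trimmed-down permission assignment.
type assignment struct {
	PrincipalID int64
	Permissions []string
}

// entry holds the list result for one workspace; once guards the single fetch.
type entry struct {
	once sync.Once
	list []assignment
	err  error
}

// workspaceCache keeps one entry per workspace_id.
type workspaceCache struct {
	mu      sync.Mutex
	entries map[int64]*entry
	list    func(workspaceID int64) ([]assignment, error) // stand-in for the list API call
}

// get returns one workspace's assignments, listing that workspace at most
// once until it is invalidated. Note that this sketch also caches errors.
func (c *workspaceCache) get(workspaceID int64) ([]assignment, error) {
	c.mu.Lock()
	if c.entries == nil {
		c.entries = map[int64]*entry{}
	}
	e, ok := c.entries[workspaceID]
	if !ok {
		e = &entry{}
		c.entries[workspaceID] = e
	}
	c.mu.Unlock()
	e.once.Do(func() { e.list, e.err = c.list(workspaceID) })
	return e.list, e.err
}

// invalidate drops one workspace's entry, e.g. after Create/Update/Delete.
func (c *workspaceCache) invalidate(workspaceID int64) {
	c.mu.Lock()
	delete(c.entries, workspaceID)
	c.mu.Unlock()
}

// listCallsFor simulates one concurrent Read per element of workspaceIDs and
// returns how many list calls were actually issued.
func listCallsFor(workspaceIDs []int64) int {
	var mu sync.Mutex
	calls := 0
	c := &workspaceCache{list: func(id int64) ([]assignment, error) {
		mu.Lock()
		calls++
		mu.Unlock()
		return []assignment{{PrincipalID: id * 100, Permissions: []string{"USER"}}}, nil
	}}
	var wg sync.WaitGroup
	for _, id := range workspaceIDs {
		wg.Add(1)
		go func(id int64) {
			defer wg.Done()
			c.get(id)
		}(id)
	}
	wg.Wait()
	return calls
}

func main() {
	// Six resources spread over two workspaces need two list calls.
	fmt.Println(listCallsFor([]int64{1, 1, 1, 2, 2, 1})) // prints 2
}
```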

All caches fall back to individual reads for resources absent from the list response (e.g. created
concurrently after the cache was populated).
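
The fallback path can be pictured with a small sketch like the one below. The `user` type, `readUser`, and the `getByID` callback are hypothetical stand-ins; in the provider the fallback would presumably be the existing per-resource read such as `GET /Users/{id}`.

```go
package main

import "fmt"

// user is a hypothetical, trimmed-down SCIM user.
type user struct {
	ID       string
	UserName string
}

// readUser serves id from the cached list when present and otherwise falls
// back to one individual read. The bool reports whether the cache was hit.
func readUser(id string, cached map[string]user, getByID func(string) (user, error)) (user, bool, error) {
	if u, ok := cached[id]; ok {
		return u, true, nil
	}
	u, err := getByID(id) // resource missing from the list response
	return u, false, err
}

func main() {
	cached := map[string]user{"1": {ID: "1", UserName: "a@example.com"}}
	getByID := func(id string) (user, error) {
		return user{ID: id, UserName: "late@example.com"}, nil
	}
	_, hit, _ := readUser("1", cached, getByID)
	fmt.Println(hit) // prints true
	_, hit, _ = readUser("2", cached, getByID)
	fmt.Println(hit) // prints false
}
```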

Changes

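Adds shared in-memory list caches for five IAM resource types, reducing SCIM/IAM API calls during
a plan cycle from one per resource (O(N)) to one list call per resource type (per `workspace_id` for
`databricks_mws_permission_assignment`, per host for `databricks_permission_assignment`). No schema,
provider interface, or user-facing behaviour changes.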

Tests

- `make test` run locally
- relevant change in `docs/` folder: no user-facing schema or behaviour changes; existing resource documentation remains accurate
- covered with integration tests in `internal/acceptance`: existing acceptance tests in `scim/group_test.go`, `scim/user_test.go`, `scim/group_member_test.go` cover full CRUD paths for all changed resources
- using Go SDK: not applicable; all changed resources use the SDKv2 `client.Scim()` helper, not the Go SDK IAM client
- using TF Plugin Framework: not applicable; all changed resources (`databricks_group`, `databricks_user`, `databricks_group_member`, `databricks_permission_assignment`, `databricks_mws_permission_assignment`) are SDKv2 resources
- has entry in `NEXT_CHANGELOG.md` file

For large IAM deployments (thousands of databricks_group_member, databricks_group, databricks_user, and mws_permission_assignment resources), terraform plan was issuing one API call per resource Read, resulting in thousands of redundant requests and 429 rate-limit errors.

Each resource type now uses a shared in-memory cache backed by sync.RWMutex (concurrent warm reads) + singleflight (deduplicated cold-cache fetches). A single list call populates the cache for all resources of that type; subsequent reads are served from memory. Mutations (Create/Update/Delete) invalidate the relevant cache entry.

Changes:
- access/resource_permission_assignment.go: permAssignmentCache (1 List call per host instead of 1 per resource)
- mws/resource_mws_permission_assignment.go: workspaceAssignmentsCache (1 ListByWorkspaceId per workspace_id instead of 1 per resource)
- scim/resource_group.go + scim/groups.go: groupsListCache (1 GET /Groups?attributes=id,... instead of N GET /Groups/{id})
- scim/resource_user.go + scim/users.go: usersListCache (1 GET /Users?attributes=id,... instead of N GET /Users/{id})
- scim/resource_group_member.go: bulk groupCache (1 GET /Groups?attributes=id,members instead of N per-group reads, eliminating concurrent request storms that caused 429 errors)

github-actions · 2026-03-27T15:01:07Z

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/terraform

Inputs:

PR number: 5535
Commit SHA: 6f0c9e51a6ea82d65df7ac44a577933deab12e19

Checks will be approved automatically on success.

tagirb requested review from a team as code owners March 27, 2026 14:45

tagirb requested review from renaudhartert-db and removed request for a team March 27, 2026 14:45

tagirb mentioned this pull request Mar 27, 2026

[Enhancement] Implement caching for workspace permission assignments #5523

Closed

6 tasks

tagirb force-pushed the add-scim-caching branch from 94a1d73 to 6f0c9e5 Compare March 27, 2026 15:00

tagirb wants to merge 1 commit into databricks:main from tagirb:add-scim-caching
