AuthZService: improve authz caching #9

akshayutture-augment · 2025-11-14T23:42:14Z

Test 2

Summary by CodeRabbit

Refactor
- Optimized authorization service with enhanced caching mechanisms to improve permission checking performance
- Streamlined remote authorization client with improved cache management
- Added efficient denial caching for faster authorization decisions
Tests
- Expanded test coverage for cache-backed authorization and permission decision logic

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Replicated from ai-code-review-evaluation/grafana-coderabbit#2

* remove the use of client side cache for in-proc authz client Co-authored-by: Gabriel MABILLE <[email protected]> * add a permission denial cache, fetch perms if not in either of the caches Co-authored-by: Gabriel MABILLE <[email protected]> * Clean up tests Co-authored-by: Ieva <[email protected]> * Cache tests Co-authored-by: Ieva <[email protected]> * Add test to list + cache Co-authored-by: Ieva <[email protected]> * Add outdated cache test Co-authored-by: Ieva <[email protected]> * Re-organize metrics Co-authored-by: Ieva <[email protected]> --------- Co-authored-by: Gabriel MABILLE <[email protected]>

akshayutture-augment · 2025-11-14T23:48:43Z

@coderabbitai review

akshayutture-augment · 2025-11-19T15:41:20Z

@coderabbitai review

akshayutture-augment · 2025-11-22T17:47:44Z

@coderabbitai review

coderabbitai · 2025-11-22T17:47:49Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2025-11-22T17:48:09Z

Walkthrough

The pull request introduces permission denial caching to the RBAC authorization service. It replaces in-process RBAC client construction with a remote client using NoopCache, adds a utility function for generating permission denial cache keys, and integrates caching logic into authorization decision paths with early-deny mechanisms and cache-first lookups in Check and List methods.

Changes

Cohort / File(s)	Summary
Client & Cache Infrastructure `pkg/services/authz/rbac.go`	Replaces in-process RBAC client construction with remote authz client using NoopCache and tracer. Adds NoopCache type implementing Get/Set/Delete methods returning not-found/no-op responses.
Cache Key Generation `pkg/services/authz/rbac/cache.go`	Adds new public function `userPermDenialCacheKey` that constructs cache keys including name and parent parameters for permission denial caching.
Authorization Decision Caching `pkg/services/authz/rbac/service.go`	Introduces `permDenialCache` field to Service struct, initializes it in NewService. Adds `getCachedIdentityPermissions` helper method and integrates caching into Check method (early-deny path) and List method (cache-first approach before full permission evaluation).
Test Coverage `pkg/services/authz/rbac/service_test.go`	Removes one test case for getUserPermissions cache behavior. Adds TestService_CacheCheck and TestService_CacheList to validate cache usage, fallback behavior, outdated cache handling, and cache-deny entries. Initializes permDenialCache in test setup.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant Check as Check Method
    participant PermCache as permDenialCache
    participant Permissions as Full Permission<br/>Evaluation
    
    Client->>Check: Check(ctx, identity, action...)
    
    rect rgb(200, 220, 255)
    Note over Check,PermCache: Early Deny Path
    Check->>PermCache: Query denial cache
    opt Cache Hit (Denied)
        PermCache-->>Check: Return true (denied)
        Check-->>Client: Deny (fast path)
    end
    end
    
    rect rgb(220, 255, 220)
    Note over Check,Permissions: Cached Lookup Path
    Check->>PermCache: getCachedIdentityPermissions
    opt Cache Hit (Permissions Exist)
        PermCache-->>Check: Return cached permissions map
        Check-->>Client: Authorize/Deny (from cache)
    end
    end
    
    rect rgb(255, 240, 200)
    Note over Check,Permissions: Cache Miss - Full Evaluation
    Check->>Permissions: Full permission evaluation
    Permissions-->>Check: Return permissions
    Check->>PermCache: Store in cache
    PermCache-->>Check: ACK
    Check-->>Client: Authorize/Deny
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

pkg/services/authz/rbac/service.go: Significant logic density with new caching paths integrated into Check and List methods, plus new helper method getCachedIdentityPermissions requiring careful verification of all identity type branches and cache key generation integration.
Cache coherency: Verify that permDenialCache initialization and userPermDenialCacheKey parameter alignment (name, parent fields) are consistent across all call sites.
Test migration: Confirm removed test case behavior is adequately covered by new cache tests and that test setup properly initializes all cache fields.

Poem

🐰 Permission denied? Cached quick!

NoopCache whispers its silent trick,

Early denials skip the fuss,

While cached permits speed up the trust.

Authorization hops with delight! 🏃‍♂️✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 7.69% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: improving authz caching in the AuthZService by introducing new caching mechanisms (permDenialCache, getCachedIdentityPermissions) and optimizing permission checks.

✨ Finishing touches

📝 Generate docstrings

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch authz-service-improve-caching-pr

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

pkg/services/authz/rbac/service.go (1)
93-122: Scope denial caching to user-like identities to avoid cross-identity bleed

The early permDenialCache check in Check is a good optimization, but as implemented it uses userPermDenialCacheKey for all identity types and only keys on namespace, UserUID, action, name, and parent folder. Because validateSubject drops the identity type when parsing the subject (e.g. both anonymous:0 and user:0 yield userUID == "0"), a denial cached for one identity type can be incorrectly reused for another that shares the same UID string.

That can lead to unexpected hard denies for different principals until the short TTL expires (e.g., a user with UID "0" inheriting a denial cached for the anonymous subject "anonymous:0").

A low‑risk fix is to only use the denial cache for user/service‑account identities:
 func (s *Service) Check(ctx context.Context, req *authzv1.CheckRequest) (*authzv1.CheckResponse, error) {
@@
-	span.SetAttributes(
+	span.SetAttributes(
@@
-	permDenialKey := userPermDenialCacheKey(checkReq.Namespace.Value, checkReq.UserUID, checkReq.Action, checkReq.Name, checkReq.ParentFolder)
-	if _, ok := s.permDenialCache.Get(ctx, permDenialKey); ok {
-		s.metrics.permissionCacheUsage.WithLabelValues("true", checkReq.Action).Inc()
-		s.metrics.requestCount.WithLabelValues("false", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
-		return &authzv1.CheckResponse{Allowed: false}, nil
-	}
+	usePermDenialCache := types.IsIdentityType(checkReq.IdentityType, types.TypeUser, types.TypeServiceAccount)
+	if usePermDenialCache {
+		permDenialKey := userPermDenialCacheKey(checkReq.Namespace.Value, checkReq.UserUID, checkReq.Action, checkReq.Name, checkReq.ParentFolder)
+		if _, ok := s.permDenialCache.Get(ctx, permDenialKey); ok {
+			s.metrics.permissionCacheUsage.WithLabelValues("true", checkReq.Action).Inc()
+			s.metrics.requestCount.WithLabelValues("false", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
+			return &authzv1.CheckResponse{Allowed: false}, nil
+		}
+	}
@@
-	if !allowed {
-		s.permDenialCache.Set(ctx, permDenialKey, true)
-	}
+	if !allowed && usePermDenialCache {
+		permDenialKey := userPermDenialCacheKey(checkReq.Namespace.Value, checkReq.UserUID, checkReq.Action, checkReq.Name, checkReq.ParentFolder)
+		s.permDenialCache.Set(ctx, permDenialKey, true)
+	}
Alternatively, you could include the identity type in the denial cache key, but gating by type keeps the function name (userPermDenialCacheKey) and existing key structure intact.

Also applies to: 153-156, 56-61, 83-88

🧹 Nitpick comments (3)

pkg/services/authz/rbac/cache.go (1)

30-32: Denial cache key helper is consistent; consider a more explicit prefix

Key composition matches the existing userPermCacheKey, but because this is semantically different (explicit denials), using a distinct prefix (e.g. .perm_denial_…) would make debugging/log inspection easier and future‑proofs you if these keys ever end up in a shared cache.
pkg/services/authz/rbac/service_test.go (1)
1291-1323: Strengthen TestService_CacheList by asserting no store calls

This test correctly verifies that List can be served entirely from permCache, but it doesn’t assert that the backing store wasn’t hit. You can increase confidence in the cache‑only behavior by asserting the fake store’s call count stays at zero:
 	t.Run("List based on cached permissions", func(t *testing.T) {
 		s := setupService()
 		ctx := types.WithAuthInfo(context.Background(), callingService)
 		userID := &store.UserIdentifiers{UID: "test-uid", ID: 1}
@@
-		s.identityStore = &fakeIdentityStore{}
+		s.identityStore = &fakeIdentityStore{}
+
+		store, ok := s.store.(*fakeStore)
+		require.True(t, ok)
@@
 		require.NoError(t, err)
 		require.ElementsMatch(t, resp.Items, []string{"dash1", "dash2"})
 		require.ElementsMatch(t, resp.Folders, []string{"fold1"})
+
+		require.Zero(t, store.calls)
 	})
This keeps the test focused on verifying that a warm cache prevents any backend reads.
pkg/services/authz/rbac/service.go (1)
161-197: Differentiate cache miss from backend errors in getCachedIdentityPermissions callers

List (and similarly Check) currently treats any error from getCachedIdentityPermissions as a simple cache miss:
permissions, err := s.getCachedIdentityPermissions(...)
if err == nil {
    // use cache
} else {
    // treat as miss, hit DB
    permissions, err = s.getIdentityPermissions(...)
}
But getCachedIdentityPermissions can fail both with cache.ErrNotFound (true miss) and with real backend errors (e.g., from GetUserIdentifiers). In the latter case you’ll end up executing the same expensive chain (GetUserIdentifiers, getUserBasicRole, etc.) twice: once in the cache path, then again via getIdentityPermissions.

It would be cleaner and cheaper under failure to distinguish these:
 permissions, err := s.getCachedIdentityPermissions(ctx, listReq.Namespace, listReq.IdentityType, listReq.UserUID, listReq.Action)
-if err == nil {
-	s.metrics.permissionCacheUsage.WithLabelValues("true", listReq.Action).Inc()
-} else {
-	s.metrics.permissionCacheUsage.WithLabelValues("false", listReq.Action).Inc()
-
-	permissions, err = s.getIdentityPermissions(ctx, listReq.Namespace, listReq.IdentityType, listReq.UserUID, listReq.Action)
-	if err != nil {
-		// ...
-		return nil, err
-	}
-}
+if err == nil {
+	s.metrics.permissionCacheUsage.WithLabelValues("true", listReq.Action).Inc()
+} else {
+	if !errors.Is(err, cache.ErrNotFound) {
+		// propagate real errors instead of retrying the whole path
+		ctxLogger.Error("could not get cached user permissions", "subject", req.GetSubject(), "error", err)
+		s.metrics.requestCount.WithLabelValues("true", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
+		return nil, err
+	}
+	s.metrics.permissionCacheUsage.WithLabelValues("false", listReq.Action).Inc()
+
+	permissions, err = s.getIdentityPermissions(ctx, listReq.Namespace, listReq.IdentityType, listReq.UserUID, listReq.Action)
+	if err != nil {
+		// ...
+		return nil, err
+	}
+}
(Analogous handling could be applied in Check.)

This keeps the “cache miss => fall back to DB” behavior while avoiding duplicate backend work and clearer logging when the cache lookup itself fails for non‑miss reasons.

Also applies to: 342-368

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e650fa7 and d9dc93c.

📒 Files selected for processing (4)

pkg/services/authz/rbac.go (3 hunks)
pkg/services/authz/rbac/cache.go (1 hunks)
pkg/services/authz/rbac/service.go (6 hunks)
pkg/services/authz/rbac/service_test.go (3 hunks)

🔇 Additional comments (2)

pkg/services/authz/rbac/service_test.go (1)

1325-1345: Setup wiring for permDenialCache looks good

setupService now initializes permDenialCache with shortCacheTTL, matching the runtime Service configuration and keeping denial entries short‑lived to reduce the risk of stale negative decisions. No issues here.

pkg/services/authz/rbac.go (1)

101-105: Client-side cache configuration and NoopCache implementation are reasonable

Using a NoopCache for the in‑proc client avoids redundant caching on top of the server‑side caches, while the remote client now benefits from a small local cache (Expiry: 30s) to reduce RPC volume. The NoopCache implementation correctly satisfies the cache.Cache interface by always returning cache.ErrNotFound from Get and nil from Set/Delete, so existing client code should see it as a normal cache miss.

If you find yourself needing to tune cache behavior later, consider threading a cache.Cache (or config) into newRemoteRBACClient instead of hard‑coding the LocalCache parameters, but the current setup is fine for now.

Please double‑check against the current authzlib.WithCacheClientOption and cache.Cache interfaces that there are no additional behavioral expectations beyond returning ErrNotFound for a miss, especially around Delete.

Also applies to: 160-170, 239-251

coderabbitai · 2025-11-22T17:54:15Z

pkg/services/authz/rbac/service_test.go

+func TestService_CacheCheck(t *testing.T) {
+	callingService := authn.NewAccessTokenAuthInfo(authn.Claims[authn.AccessTokenClaims]{
+		Claims: jwt.Claims{
+			Subject:  types.NewTypeID(types.TypeAccessPolicy, "some-service"),
+			Audience: []string{"authzservice"},
+		},
+		Rest: authn.AccessTokenClaims{Namespace: "org-12"},
+	})
+
+	ctx := types.WithAuthInfo(context.Background(), callingService)
+	userID := &store.UserIdentifiers{UID: "test-uid", ID: 1}
+
+	t.Run("Allow based on cached permissions", func(t *testing.T) {
+		s := setupService()
+
+		s.idCache.Set(ctx, userIdentifierCacheKey("org-12", "test-uid"), *userID)
+		s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": true})
+
+		resp, err := s.Check(ctx, &authzv1.CheckRequest{
+			Namespace: "org-12",
+			Subject:   "user:test-uid",
+			Group:     "dashboard.grafana.app",
+			Resource:  "dashboards",
+			Verb:      "get",
+			Name:      "dash1",
+		})
+		require.NoError(t, err)
+		assert.True(t, resp.Allowed)
+	})
+	t.Run("Fallback to the database on cache miss", func(t *testing.T) {
+		s := setupService()
+
+		// Populate database permission but not the cache
+		store := &fakeStore{
+			userID:          userID,
+			userPermissions: []accesscontrol.Permission{{Action: "dashboards:read", Scope: "dashboards:uid:dash2"}},
+		}
+
+		s.store = store
+		s.permissionStore = store
+
+		s.idCache.Set(ctx, userIdentifierCacheKey("org-12", "test-uid"), *userID)
+
+		resp, err := s.Check(ctx, &authzv1.CheckRequest{
+			Namespace: "org-12",
+			Subject:   "user:test-uid",
+			Group:     "dashboard.grafana.app",
+			Resource:  "dashboards",
+			Verb:      "get",
+			Name:      "dash2",
+		})
+		require.NoError(t, err)
+		assert.True(t, resp.Allowed)
+	})
+	t.Run("Fallback to the database on outdated cache", func(t *testing.T) {
+		s := setupService()
+
+		store := &fakeStore{
+			userID:          userID,
+			userPermissions: []accesscontrol.Permission{{Action: "dashboards:read", Scope: "dashboards:uid:dash2"}},
+		}
+
+		s.store = store
+		s.permissionStore = store
+
+		s.idCache.Set(ctx, userIdentifierCacheKey("org-12", "test-uid"), *userID)
+		// The cache does not have the permission for dash2 (outdated)
+		s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": true})
+
+		resp, err := s.Check(ctx, &authzv1.CheckRequest{
+			Namespace: "org-12",
+			Subject:   "user:test-uid",
+			Group:     "dashboard.grafana.app",
+			Resource:  "dashboards",
+			Verb:      "get",
+			Name:      "dash2",
+		})
+		require.NoError(t, err)
+		assert.True(t, resp.Allowed)
+	})
+	t.Run("Should deny on explicit cache deny entry", func(t *testing.T) {
+		s := setupService()
+
+		s.idCache.Set(ctx, userIdentifierCacheKey("org-12", "test-uid"), *userID)
+
+		// Explicitly deny access to the dashboard
+		s.permDenialCache.Set(ctx, userPermDenialCacheKey("org-12", "test-uid", "dashboards:read", "dash1", "fold1"), true)
+
+		// Allow access to the dashboard to prove this is not checked
+		s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": false})
+
+		resp, err := s.Check(ctx, &authzv1.CheckRequest{
+			Namespace: "org-12",
+			Subject:   "user:test-uid",
+			Group:     "dashboard.grafana.app",
+			Resource:  "dashboards",
+			Verb:      "get",
+			Name:      "dash1",
+			Folder:    "fold1",
+		})
+		require.NoError(t, err)
+		assert.False(t, resp.Allowed)
+	})
+}


⚠️ Potential issue | 🟡 Minor

TestService_CacheCheck denial test comment doesn’t match data

In the “Should deny on explicit cache deny entry” subtest, the comment says the perm cache “allows access”, but the map stores false for the scope, so it actually represents a denial, and the value is irrelevant because the denial cache short‑circuits first. If you want this test to prove that permDenialCache overrides an allowing permission cache, flip the value to true:

- // Allow access to the dashboard to prove this is not checked - s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": false}) + // Allow access to the dashboard to prove this is not checked + s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": true})

That way the test clearly verifies that an explicit deny entry wins even when the positive permission cache would otherwise allow.

🤖 Prompt for AI Agents

In pkg/services/authz/rbac/service_test.go around lines 893 to 996, the "Should deny on explicit cache deny entry" subtest's comment says the perm cache "allows access" but the test data sets the scope value to false (a deny), which contradicts the intent; change the cached permission map value for "dashboards:uid:dash1" from false to true so the permCache represents an allowing permission and the test then verifies that permDenialCache still overrides an allow.

ai-code-review-evaluations deleted a comment from coderabbitai bot Nov 18, 2025

coderabbitai bot reviewed Nov 22, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

AuthZService: improve authz caching #9

AuthZService: improve authz caching #9

Uh oh!

akshayutture-augment commented Nov 14, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

akshayutture-augment commented Nov 14, 2025

Uh oh!

akshayutture-augment commented Nov 19, 2025

Uh oh!

akshayutture-augment commented Nov 22, 2025

Uh oh!

coderabbitai bot commented Nov 22, 2025

Uh oh!

coderabbitai bot commented Nov 22, 2025 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Nov 22, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

AuthZService: improve authz caching #9

Are you sure you want to change the base?

AuthZService: improve authz caching #9

Uh oh!

Conversation

akshayutture-augment commented Nov 14, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

akshayutture-augment commented Nov 14, 2025

Uh oh!

akshayutture-augment commented Nov 19, 2025

Uh oh!

akshayutture-augment commented Nov 22, 2025

Uh oh!

coderabbitai bot commented Nov 22, 2025

Uh oh!

coderabbitai bot commented Nov 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Poem

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Nov 22, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

akshayutture-augment commented Nov 14, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Nov 22, 2025 •

edited

Loading