Skip to content

Conversation

@akshayutture-augment
Copy link

@akshayutture-augment akshayutture-augment commented Nov 14, 2025

Test 2

Summary by CodeRabbit

  • Refactor

    • Optimized authorization service with enhanced caching mechanisms to improve permission checking performance
    • Streamlined remote authorization client with improved cache management
    • Added efficient denial caching for faster authorization decisions
  • Tests

    • Expanded test coverage for cache-backed authorization and permission decision logic

✏️ Tip: You can customize this high-level summary in your review settings.


Replicated from ai-code-review-evaluation/grafana-coderabbit#2

* remove the use of client side cache for in-proc authz client

Co-authored-by: Gabriel MABILLE <[email protected]>

* add a permission denial cache, fetch perms if not in either of the caches

Co-authored-by: Gabriel MABILLE <[email protected]>

* Clean up tests

Co-authored-by: Ieva <[email protected]>

* Cache tests

Co-authored-by: Ieva <[email protected]>

* Add test to list + cache

Co-authored-by: Ieva <[email protected]>

* Add outdated cache test

Co-authored-by: Ieva <[email protected]>

* Re-organize metrics

Co-authored-by: Ieva <[email protected]>

---------

Co-authored-by: Gabriel MABILLE <[email protected]>
@akshayutture-augment
Copy link
Author

@coderabbitai review

@akshayutture-augment
Copy link
Author

@coderabbitai review

1 similar comment
@akshayutture-augment
Copy link
Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Nov 22, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai
Copy link

coderabbitai bot commented Nov 22, 2025

Walkthrough

The pull request introduces permission denial caching to the RBAC authorization service. It replaces in-process RBAC client construction with a remote client using NoopCache, adds a utility function for generating permission denial cache keys, and integrates caching logic into authorization decision paths with early-deny mechanisms and cache-first lookups in Check and List methods.

Changes

Cohort / File(s) Summary
Client & Cache Infrastructure
pkg/services/authz/rbac.go
Replaces in-process RBAC client construction with remote authz client using NoopCache and tracer. Adds NoopCache type implementing Get/Set/Delete methods returning not-found/no-op responses.
Cache Key Generation
pkg/services/authz/rbac/cache.go
Adds new public function userPermDenialCacheKey that constructs cache keys including name and parent parameters for permission denial caching.
Authorization Decision Caching
pkg/services/authz/rbac/service.go
Introduces permDenialCache field to Service struct, initializes it in NewService. Adds getCachedIdentityPermissions helper method and integrates caching into Check method (early-deny path) and List method (cache-first approach before full permission evaluation).
Test Coverage
pkg/services/authz/rbac/service_test.go
Removes one test case for getUserPermissions cache behavior. Adds TestService_CacheCheck and TestService_CacheList to validate cache usage, fallback behavior, outdated cache handling, and cache-deny entries. Initializes permDenialCache in test setup.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant Client
    participant Check as Check Method
    participant PermCache as permDenialCache
    participant Permissions as Full Permission<br/>Evaluation
    
    Client->>Check: Check(ctx, identity, action...)
    
    rect rgb(200, 220, 255)
    Note over Check,PermCache: Early Deny Path
    Check->>PermCache: Query denial cache
    opt Cache Hit (Denied)
        PermCache-->>Check: Return true (denied)
        Check-->>Client: Deny (fast path)
    end
    end
    
    rect rgb(220, 255, 220)
    Note over Check,Permissions: Cached Lookup Path
    Check->>PermCache: getCachedIdentityPermissions
    opt Cache Hit (Permissions Exist)
        PermCache-->>Check: Return cached permissions map
        Check-->>Client: Authorize/Deny (from cache)
    end
    end
    
    rect rgb(255, 240, 200)
    Note over Check,Permissions: Cache Miss - Full Evaluation
    Check->>Permissions: Full permission evaluation
    Permissions-->>Check: Return permissions
    Check->>PermCache: Store in cache
    PermCache-->>Check: ACK
    Check-->>Client: Authorize/Deny
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

  • pkg/services/authz/rbac/service.go: Significant logic density with new caching paths integrated into Check and List methods, plus new helper method getCachedIdentityPermissions requiring careful verification of all identity type branches and cache key generation integration.
  • Cache coherency: Verify that permDenialCache initialization and userPermDenialCacheKey parameter alignment (name, parent fields) are consistent across all call sites.
  • Test migration: Confirm removed test case behavior is adequately covered by new cache tests and that test setup properly initializes all cache fields.

Poem

🐰 Permission denied? Cached quick!

NoopCache whispers its silent trick,

Early denials skip the fuss,

While cached permits speed up the trust.

Authorization hops with delight! 🏃‍♂️✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 7.69% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: improving authz caching in the AuthZService by introducing new caching mechanisms (permDenialCache, getCachedIdentityPermissions) and optimizing permission checks.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch authz-service-improve-caching-pr

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/services/authz/rbac/service.go (1)

93-122: Scope denial caching to user-like identities to avoid cross-identity bleed

The early permDenialCache check in Check is a good optimization, but as implemented it uses userPermDenialCacheKey for all identity types and only keys on namespace, UserUID, action, name, and parent folder. Because validateSubject drops the identity type when parsing the subject (e.g. both anonymous:0 and user:0 yield userUID == "0"), a denial cached for one identity type can be incorrectly reused for another that shares the same UID string.

That can lead to unexpected hard denies for different principals until the short TTL expires (e.g., a user with UID "0" inheriting a denial cached for the anonymous subject "anonymous:0").

A low‑risk fix is to only use the denial cache for user/service‑account identities:

 func (s *Service) Check(ctx context.Context, req *authzv1.CheckRequest) (*authzv1.CheckResponse, error) {
@@
-	span.SetAttributes(
+	span.SetAttributes(
@@
-	permDenialKey := userPermDenialCacheKey(checkReq.Namespace.Value, checkReq.UserUID, checkReq.Action, checkReq.Name, checkReq.ParentFolder)
-	if _, ok := s.permDenialCache.Get(ctx, permDenialKey); ok {
-		s.metrics.permissionCacheUsage.WithLabelValues("true", checkReq.Action).Inc()
-		s.metrics.requestCount.WithLabelValues("false", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
-		return &authzv1.CheckResponse{Allowed: false}, nil
-	}
+	usePermDenialCache := types.IsIdentityType(checkReq.IdentityType, types.TypeUser, types.TypeServiceAccount)
+	if usePermDenialCache {
+		permDenialKey := userPermDenialCacheKey(checkReq.Namespace.Value, checkReq.UserUID, checkReq.Action, checkReq.Name, checkReq.ParentFolder)
+		if _, ok := s.permDenialCache.Get(ctx, permDenialKey); ok {
+			s.metrics.permissionCacheUsage.WithLabelValues("true", checkReq.Action).Inc()
+			s.metrics.requestCount.WithLabelValues("false", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
+			return &authzv1.CheckResponse{Allowed: false}, nil
+		}
+	}
@@
-	if !allowed {
-		s.permDenialCache.Set(ctx, permDenialKey, true)
-	}
+	if !allowed && usePermDenialCache {
+		permDenialKey := userPermDenialCacheKey(checkReq.Namespace.Value, checkReq.UserUID, checkReq.Action, checkReq.Name, checkReq.ParentFolder)
+		s.permDenialCache.Set(ctx, permDenialKey, true)
+	}

Alternatively, you could include the identity type in the denial cache key, but gating by type keeps the function name (userPermDenialCacheKey) and existing key structure intact.

Also applies to: 153-156, 56-61, 83-88

🧹 Nitpick comments (3)
pkg/services/authz/rbac/cache.go (1)

30-32: Denial cache key helper is consistent; consider a more explicit prefix

Key composition matches the existing userPermCacheKey, but because this is semantically different (explicit denials), using a distinct prefix (e.g. .perm_denial_…) would make debugging/log inspection easier and future‑proofs you if these keys ever end up in a shared cache.

pkg/services/authz/rbac/service_test.go (1)

1291-1323: Strengthen TestService_CacheList by asserting no store calls

This test correctly verifies that List can be served entirely from permCache, but it doesn’t assert that the backing store wasn’t hit. You can increase confidence in the cache‑only behavior by asserting the fake store’s call count stays at zero:

 	t.Run("List based on cached permissions", func(t *testing.T) {
 		s := setupService()
 		ctx := types.WithAuthInfo(context.Background(), callingService)
 		userID := &store.UserIdentifiers{UID: "test-uid", ID: 1}
@@
-		s.identityStore = &fakeIdentityStore{}
+		s.identityStore = &fakeIdentityStore{}
+
+		store, ok := s.store.(*fakeStore)
+		require.True(t, ok)
@@
 		require.NoError(t, err)
 		require.ElementsMatch(t, resp.Items, []string{"dash1", "dash2"})
 		require.ElementsMatch(t, resp.Folders, []string{"fold1"})
+
+		require.Zero(t, store.calls)
 	})

This keeps the test focused on verifying that a warm cache prevents any backend reads.

pkg/services/authz/rbac/service.go (1)

161-197: Differentiate cache miss from backend errors in getCachedIdentityPermissions callers

List (and similarly Check) currently treats any error from getCachedIdentityPermissions as a simple cache miss:

permissions, err := s.getCachedIdentityPermissions(...)
if err == nil {
    // use cache
} else {
    // treat as miss, hit DB
    permissions, err = s.getIdentityPermissions(...)
}

But getCachedIdentityPermissions can fail both with cache.ErrNotFound (true miss) and with real backend errors (e.g., from GetUserIdentifiers). In the latter case you’ll end up executing the same expensive chain (GetUserIdentifiers, getUserBasicRole, etc.) twice: once in the cache path, then again via getIdentityPermissions.

It would be cleaner and cheaper under failure to distinguish these:

 permissions, err := s.getCachedIdentityPermissions(ctx, listReq.Namespace, listReq.IdentityType, listReq.UserUID, listReq.Action)
-if err == nil {
-	s.metrics.permissionCacheUsage.WithLabelValues("true", listReq.Action).Inc()
-} else {
-	s.metrics.permissionCacheUsage.WithLabelValues("false", listReq.Action).Inc()
-
-	permissions, err = s.getIdentityPermissions(ctx, listReq.Namespace, listReq.IdentityType, listReq.UserUID, listReq.Action)
-	if err != nil {
-		// ...
-		return nil, err
-	}
-}
+if err == nil {
+	s.metrics.permissionCacheUsage.WithLabelValues("true", listReq.Action).Inc()
+} else {
+	if !errors.Is(err, cache.ErrNotFound) {
+		// propagate real errors instead of retrying the whole path
+		ctxLogger.Error("could not get cached user permissions", "subject", req.GetSubject(), "error", err)
+		s.metrics.requestCount.WithLabelValues("true", "true", req.GetVerb(), req.GetGroup(), req.GetResource()).Inc()
+		return nil, err
+	}
+	s.metrics.permissionCacheUsage.WithLabelValues("false", listReq.Action).Inc()
+
+	permissions, err = s.getIdentityPermissions(ctx, listReq.Namespace, listReq.IdentityType, listReq.UserUID, listReq.Action)
+	if err != nil {
+		// ...
+		return nil, err
+	}
+}

(Analogous handling could be applied in Check.)

This keeps the “cache miss => fall back to DB” behavior while avoiding duplicate backend work and clearer logging when the cache lookup itself fails for non‑miss reasons.

Also applies to: 342-368

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e650fa7 and d9dc93c.

📒 Files selected for processing (4)
  • pkg/services/authz/rbac.go (3 hunks)
  • pkg/services/authz/rbac/cache.go (1 hunks)
  • pkg/services/authz/rbac/service.go (6 hunks)
  • pkg/services/authz/rbac/service_test.go (3 hunks)
🔇 Additional comments (2)
pkg/services/authz/rbac/service_test.go (1)

1325-1345: Setup wiring for permDenialCache looks good

setupService now initializes permDenialCache with shortCacheTTL, matching the runtime Service configuration and keeping denial entries short‑lived to reduce the risk of stale negative decisions. No issues here.

pkg/services/authz/rbac.go (1)

101-105: Client-side cache configuration and NoopCache implementation are reasonable

Using a NoopCache for the in‑proc client avoids redundant caching on top of the server‑side caches, while the remote client now benefits from a small local cache (Expiry: 30s) to reduce RPC volume. The NoopCache implementation correctly satisfies the cache.Cache interface by always returning cache.ErrNotFound from Get and nil from Set/Delete, so existing client code should see it as a normal cache miss.

If you find yourself needing to tune cache behavior later, consider threading a cache.Cache (or config) into newRemoteRBACClient instead of hard‑coding the LocalCache parameters, but the current setup is fine for now.

Please double‑check against the current authzlib.WithCacheClientOption and cache.Cache interfaces that there are no additional behavioral expectations beyond returning ErrNotFound for a miss, especially around Delete.

Also applies to: 160-170, 239-251

Comment on lines +893 to +996
func TestService_CacheCheck(t *testing.T) {
callingService := authn.NewAccessTokenAuthInfo(authn.Claims[authn.AccessTokenClaims]{
Claims: jwt.Claims{
Subject: types.NewTypeID(types.TypeAccessPolicy, "some-service"),
Audience: []string{"authzservice"},
},
Rest: authn.AccessTokenClaims{Namespace: "org-12"},
})

ctx := types.WithAuthInfo(context.Background(), callingService)
userID := &store.UserIdentifiers{UID: "test-uid", ID: 1}

t.Run("Allow based on cached permissions", func(t *testing.T) {
s := setupService()

s.idCache.Set(ctx, userIdentifierCacheKey("org-12", "test-uid"), *userID)
s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": true})

resp, err := s.Check(ctx, &authzv1.CheckRequest{
Namespace: "org-12",
Subject: "user:test-uid",
Group: "dashboard.grafana.app",
Resource: "dashboards",
Verb: "get",
Name: "dash1",
})
require.NoError(t, err)
assert.True(t, resp.Allowed)
})
t.Run("Fallback to the database on cache miss", func(t *testing.T) {
s := setupService()

// Populate database permission but not the cache
store := &fakeStore{
userID: userID,
userPermissions: []accesscontrol.Permission{{Action: "dashboards:read", Scope: "dashboards:uid:dash2"}},
}

s.store = store
s.permissionStore = store

s.idCache.Set(ctx, userIdentifierCacheKey("org-12", "test-uid"), *userID)

resp, err := s.Check(ctx, &authzv1.CheckRequest{
Namespace: "org-12",
Subject: "user:test-uid",
Group: "dashboard.grafana.app",
Resource: "dashboards",
Verb: "get",
Name: "dash2",
})
require.NoError(t, err)
assert.True(t, resp.Allowed)
})
t.Run("Fallback to the database on outdated cache", func(t *testing.T) {
s := setupService()

store := &fakeStore{
userID: userID,
userPermissions: []accesscontrol.Permission{{Action: "dashboards:read", Scope: "dashboards:uid:dash2"}},
}

s.store = store
s.permissionStore = store

s.idCache.Set(ctx, userIdentifierCacheKey("org-12", "test-uid"), *userID)
// The cache does not have the permission for dash2 (outdated)
s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": true})

resp, err := s.Check(ctx, &authzv1.CheckRequest{
Namespace: "org-12",
Subject: "user:test-uid",
Group: "dashboard.grafana.app",
Resource: "dashboards",
Verb: "get",
Name: "dash2",
})
require.NoError(t, err)
assert.True(t, resp.Allowed)
})
t.Run("Should deny on explicit cache deny entry", func(t *testing.T) {
s := setupService()

s.idCache.Set(ctx, userIdentifierCacheKey("org-12", "test-uid"), *userID)

// Explicitly deny access to the dashboard
s.permDenialCache.Set(ctx, userPermDenialCacheKey("org-12", "test-uid", "dashboards:read", "dash1", "fold1"), true)

// Allow access to the dashboard to prove this is not checked
s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": false})

resp, err := s.Check(ctx, &authzv1.CheckRequest{
Namespace: "org-12",
Subject: "user:test-uid",
Group: "dashboard.grafana.app",
Resource: "dashboards",
Verb: "get",
Name: "dash1",
Folder: "fold1",
})
require.NoError(t, err)
assert.False(t, resp.Allowed)
})
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

TestService_CacheCheck denial test comment doesn’t match data

In the “Should deny on explicit cache deny entry” subtest, the comment says the perm cache “allows access”, but the map stores false for the scope, so it actually represents a denial, and the value is irrelevant because the denial cache short‑circuits first. If you want this test to prove that permDenialCache overrides an allowing permission cache, flip the value to true:

-		// Allow access to the dashboard to prove this is not checked
-		s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": false})
+		// Allow access to the dashboard to prove this is not checked
+		s.permCache.Set(ctx, userPermCacheKey("org-12", "test-uid", "dashboards:read"), map[string]bool{"dashboards:uid:dash1": true})

That way the test clearly verifies that an explicit deny entry wins even when the positive permission cache would otherwise allow.

🤖 Prompt for AI Agents
In pkg/services/authz/rbac/service_test.go around lines 893 to 996, the "Should
deny on explicit cache deny entry" subtest's comment says the perm cache "allows
access" but the test data sets the scope value to false (a deny), which
contradicts the intent; change the cached permission map value for
"dashboards:uid:dash1" from false to true so the permCache represents an
allowing permission and the test then verifies that permDenialCache still
overrides an allow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants