Skip to content

feat: Role based Cadence-web#1108

Open
ribaraka wants to merge 22 commits intocadence-workflow:masterfrom
ribaraka:rbac
Open

feat: Role based Cadence-web#1108
ribaraka wants to merge 22 commits intocadence-workflow:masterfrom
ribaraka:rbac

Conversation

@ribaraka
Copy link
Copy Markdown
Contributor

@ribaraka ribaraka commented Dec 8, 2025

Motivation: cadence-workflow/cadence#6706
Plan&Findings: cadence-workflow/cadence#7508

This PR adds JWT-based auth plumbing to Cadence Web and introduces auth-aware workflow action gating.

  • Auth strategy: JWT auth is resolved through config, with token validity/expiry, groups, admin flag, and gRPC metadata derived from the cadence-authorization cookie.
  • Dynamic config: auth behavior is routed through dynamic config resolvers: CADENCE_WEB_AUTH_STRATEGY validates auth mode, DOMAIN_ACCESS defines how domain access is determined,
  • Auth endpoints: POST /api/auth/token to set the HttpOnly cookie, DELETE /api/auth/token to clear it, GET /api/auth/me to expose public auth context.
  • API middleware now provides auth context, user info, and gRPC metadata for all route handlers that use routeHandlersDefaultMiddlewares.
  • Workflow/domain actions: start/signal/terminate/etc. are disabled with “Not authorized” when the token lacks write access;
  • Login/logout UI: navbar shows JWT paste modal when unauthenticated.

@ribaraka ribaraka force-pushed the rbac branch 2 times, most recently from 319273d to 42b3e1c Compare December 9, 2025 03:49
@ribaraka ribaraka changed the title Role based Cadence-web feat: Role based Cadence-web Dec 9, 2025
@ribaraka ribaraka force-pushed the rbac branch 3 times, most recently from 165520e to b1131fc Compare December 9, 2025 04:36
@ribaraka ribaraka force-pushed the rbac branch 3 times, most recently from 84cb2e7 to 9bd8868 Compare December 18, 2025 15:59
@ribaraka ribaraka marked this pull request as ready for review December 18, 2025 15:59
@ribaraka
Copy link
Copy Markdown
Contributor Author

start page

rbac off:
image

rbac on:
image

@ribaraka
Copy link
Copy Markdown
Contributor Author

user menu (un-auth user):
image

input to enter a token:
image

added an admin token (domains are appeared + pop message):
image

user menu (after login):
image

added a reader user:
image

reader user inside a domain:
image

added a writer user:
image

writer user inside a domain:
image

user redirected to "start page" when token is expired:
image

@ribaraka
Copy link
Copy Markdown
Contributor Author

Unauthorised error messages are shown when user access protected domain
image

@ribaraka
Copy link
Copy Markdown
Contributor Author

@Assem-Hafez @demirkayaender, when you available, please review my initial version of the feature to ship with this PR.

@Assem-Uber
Copy link
Copy Markdown
Contributor

@ribaraka Thanks for opening the PR. Is this aligned with the requirements shared here?

@ribaraka
Copy link
Copy Markdown
Contributor Author

ribaraka commented Jan 5, 2026

@Assem-Uber, yes it is 😃

@cjrolo
Copy link
Copy Markdown

cjrolo commented Jan 23, 2026

Hello, is it possible to get a review on this PR? Thanks!

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds comprehensive Role-Based Access Control (RBAC) support to the Cadence Web UI. The implementation aligns with Cadence backend JWT authentication by extracting tokens from cookies, forwarding them via gRPC metadata, and using JWT claims (Admin flag, groups) to control UI visibility and enable/disable actions.

Changes:

  • Authentication infrastructure with JWT cookie handling (cadence-authorization), token validation, and expiration tracking
  • Domain access control filtering domains by READ_GROUPS/WRITE_GROUPS metadata and disabling workflow actions based on write permissions
  • Login/logout UI in the navigation bar with automatic session expiration handling and token refresh support

Reviewed changes

Copilot reviewed 33 out of 33 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
src/utils/auth/auth-context.ts Server-side auth context resolution with JWT decoding and cookie reading
src/utils/auth/auth-shared.ts Shared auth types and domain access logic based on group membership
src/utils/auth/tests/auth-context.test.ts Comprehensive tests for auth utilities
src/app/api/auth/token/route.ts POST/DELETE endpoints for setting/clearing auth cookie
src/app/api/auth/me/route.ts GET endpoint exposing public auth context
src/utils/route-handlers-middleware/middlewares/user-info.ts Middleware to populate gRPC metadata from auth token
src/utils/route-handlers-middleware/config/route-handlers-default-middlewares.config.ts Reordered middlewares so userInfo runs before grpcClusterMethods
src/hooks/use-user-info/use-user-info.ts Client hook for fetching current auth state
src/hooks/use-domain-access/use-domain-access.ts Client hook combining user info and domain metadata to determine access
src/views/domains-page/helpers/get-all-domains.ts Filters domains list by canRead permission
src/views/domains-page/domains-page.tsx Passes auth context to domain loading
src/views/redirect-domain/redirect-domain.tsx Uses auth context for domain access checks
src/views/workflow-actions/workflow-actions.tsx Checks write permission and disables actions accordingly
src/views/workflow-actions/workflow-actions-menu/workflow-actions-menu.tsx Shows "Not authorized" for actions when write access is false
src/views/domain-page/domain-page-start-workflow-button/domain-page-start-workflow-button.tsx Disables start workflow button when write access is denied
src/views/domain-workflows/domain-workflows.tsx Conditionally fetches cluster info based on auth state
src/components/app-nav-bar/app-nav-bar.tsx Adds login/logout UI, token modal, and automatic expiration handling
src/components/auth-token-modal/auth-token-modal.tsx Modal for pasting JWT tokens
src/components/snackbar-provider/snackbar-provider.tsx Changed duration from infinite to medium
README.md Documents RBAC configuration and JWT cookie mechanism
src/config/dynamic/dynamic.config.ts Adds CADENCE_WEB_RBAC_ENABLED config

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +20 to +43
export default function AuthTokenModal({ isOpen, onClose, onSubmit }: Props) {
const [token, setToken] = useState('');
const [isSubmitting, setIsSubmitting] = useState(false);
const [error, setError] = useState<string | null>(null);

const handleSubmit = async () => {
if (!token.trim()) {
setError('Please paste a JWT token first');
return;
}

setIsSubmitting(true);
setError(null);
try {
await onSubmit(token.trim());
setToken('');
} catch (e) {
setError(
e instanceof Error ? e.message : 'Failed to save authentication token'
);
} finally {
setIsSubmitting(false);
}
};
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The modal's internal state (token, error) is not reset when the modal is closed or when it's reopened. If a user encounters an error, closes the modal, and reopens it, the previous error message will still be displayed. Consider adding a useEffect to reset the state when isOpen changes to false, or call an onClose callback that resets the state.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added reset effect to prevent stale state on reopen.

Comment on lines 41 to 52
it('skips cluster fetch for non-admin token when RBAC is disabled', async () => {
await act(async () => {
await setup({
rbacEnabled: false,
isAuthenticated: true,
isAdmin: false,
skipClusterRequest: true,
});
} catch (error) {
if (error instanceof Error) {
renderErrorMessage = error.message;
}
}
});

expect(renderErrorMessage).toEqual('Failed to fetch cluster info');
expect(await screen.findByText('Basic Workflows')).toBeInTheDocument();
});
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test case "skips cluster fetch for non-admin token when RBAC is disabled" seems to contradict the expected behavior. The test expects that when RBAC is disabled but the user is authenticated as non-admin, the cluster fetch is skipped. However, when RBAC is disabled, access control should not be enforced, meaning cluster info should be fetchable regardless of authentication status. The condition (!isRbacEnabled && !isAuthenticated) only allows the fetch when BOTH RBAC is disabled AND the user is NOT authenticated, which doesn't make logical sense - why would an authenticated user have less access than an unauthenticated one when RBAC is off?

Expected behavior should be: when RBAC is disabled, always allow the fetch; when RBAC is enabled, only allow for admins.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fair point. but this test case is obsolete. I deleted it

Comment on lines +18 to +42
export async function POST(request: NextRequest) {
try {
const body = await request.json();
if (!body?.token || typeof body.token !== 'string') {
return NextResponse.json(
{ message: 'A valid token is required' },
{ status: 400, headers: { 'Cache-Control': 'no-store' } }
);
}

const normalizedToken = body.token.trim().replace(/^bearer\s+/i, '');
if (!normalizedToken) {
return NextResponse.json(
{ message: 'A valid token is required' },
{ status: 400, headers: { 'Cache-Control': 'no-store' } }
);
}

const response = NextResponse.json({ ok: true });
response.headers.set('Cache-Control', 'no-store');
response.cookies.set(CADENCE_AUTH_COOKIE_NAME, normalizedToken, {
...COOKIE_OPTIONS,
secure: getCookieSecureAttribute(request),
});
return response;
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The POST endpoint accepts any string as a JWT token without validation. While the token is validated later when used, accepting and storing potentially malformed or malicious tokens could lead to issues. Consider adding basic JWT format validation (e.g., checking for three dot-separated base64url-encoded parts) before setting the cookie. This would provide early feedback to users if they paste an invalid token and prevent unnecessary storage of invalid tokens.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added basic JWT format validation (e.g., checking for three dot-separated base64url-encoded parts) before setting the cookie.

Comment on lines +96 to +114
const logout = useCallback(
async (reason: 'manual' | 'expired') => {
if (logoutInFlightRef.current) return;
logoutInFlightRef.current = true;
logoutReasonRef.current = reason;
try {
await request('/api/auth/token', { method: 'DELETE' });
} catch (e) {
logoutReasonRef.current = null;
const message = e instanceof Error ? e.message : 'Failed to sign out';
enqueue({ message });
} finally {
setIsModalOpen(false);
await refetch();
router.refresh();
router.replace('/domains');
logoutInFlightRef.current = false;
}
/>
},
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the logout function's catch block, logoutReasonRef.current is set to null when there's an error (line 104). However, this happens before the finally block refetches auth data. If the DELETE request fails but the cookie is somehow cleared or expires anyway, the useEffect on line 118 will detect the authentication state change and try to show a message based on logoutReasonRef.current, which will be null, resulting in a "Session expired" message even for manual logouts that encountered errors. Consider only resetting the reason after the state transition is complete, or handle the error case explicitly in the effect.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoided reseting logoutReasonRef.current on error catching

Comment on lines +148 to +163
useEffect(() => {
if (!isRbacEnabled || !isAuthenticated || expiresAtMs === undefined) return;
const timeoutMs = expiresAtMs - Date.now();
if (timeoutMs <= 0) {
void logout('expired');
return;
}

const id = window.setTimeout(() => {
void logout('expired');
}, timeoutMs);

return () => {
window.clearTimeout(id);
};
}, [expiresAtMs, isAuthenticated, isRbacEnabled, logout]);
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The token expiration handling creates a timeout to automatically logout when the token expires. However, there's a potential issue if expiresAtMs changes before the timeout fires (e.g., if a new token is set). The cleanup function correctly clears the old timeout, but if the token is refreshed with a new expiration, the user will be immediately logged out if the old token had already expired (line 151-154 checks timeoutMs <= 0). This could create a race condition where a user logs in with a new token but gets immediately logged out if the check happens before the new expiresAtMs is set. Consider adding a check that expiresAtMs is for the current token or add additional guards.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a guard so auto‑logout only fires if the expiration is still current

README.md Outdated
Comment on lines +38 to +50
#### RBAC Authentication (JWT cookie)

When `CADENCE_WEB_RBAC_ENABLED=true`, cadence-web authenticates using a cookie:

- Cookie name: `cadence-authorization`
- Cookie value: raw JWT string

To integrate an upstream proxy / IdP, set the cookie for the cadence-web origin:

```
Set-Cookie: cadence-authorization=<JWT>; Path=/; HttpOnly; SameSite=Lax; Secure
```
You can also set/clear the cookie via `POST /api/auth/token` and `DELETE /api/auth/token`; or use `Login with JWT` button in the UI.
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RBAC documentation could be more comprehensive. It mentions the cookie mechanism but doesn't specify:

  1. What JWT claims are required/recognized (Admin, groups/Groups, name, sub, exp)
  2. How domain-level READ_GROUPS and WRITE_GROUPS metadata should be configured
  3. The format expected for groups (array or comma/space-separated string)
  4. How the JWT should be signed (what algorithms are accepted, if any validation occurs)

Consider adding an example JWT payload structure and domain metadata configuration to help users integrate their identity providers correctly.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

@ribaraka ribaraka Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provided in the auth-strategy-design.md doc

Comment on lines +1 to +17
'use client';
import { useQuery } from '@tanstack/react-query';

import { type PublicAuthContext } from '@/utils/auth/auth-shared';
import request from '@/utils/request';
import { type RequestError } from '@/utils/request/request-error';

export default function useUserInfo() {
return useQuery<PublicAuthContext, RequestError>({
queryKey: ['auth', 'me'],
queryFn: async () => {
const res = await request('/api/auth/me', { method: 'GET' });
return res.json();
},
staleTime: 30_000,
});
}
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new hooks useUserInfo and useDomainAccess do not have unit tests. Other hooks in this codebase (e.g., use-config-value) have corresponding test files in __tests__ directories. These hooks contain non-trivial logic including conditional query execution, error handling, and memoization that should be tested. Consider adding test coverage for:

  • useUserInfo: API call, error handling, staleTime behavior
  • useDomainAccess: RBAC-enabled/disabled paths, error states, loading states, access calculation logic

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added unit tests

Comment on lines +1 to +59
'use client';
import { useMemo } from 'react';

import { useQuery } from '@tanstack/react-query';

import { getDomainAccessForUser } from '@/utils/auth/auth-shared';
import getDomainDescriptionQueryOptions from '@/views/shared/hooks/use-domain-description/get-domain-description-query-options';
import { type UseDomainDescriptionParams } from '@/views/shared/hooks/use-domain-description/use-domain-description.types';

import useUserInfo from '../use-user-info/use-user-info';

export default function useDomainAccess(params: UseDomainDescriptionParams) {
const userInfoQuery = useUserInfo();
const isRbacEnabled = userInfoQuery.data?.rbacEnabled === true;

const domainQuery = useQuery({
...getDomainDescriptionQueryOptions(params),
enabled: isRbacEnabled,
});

const access = useMemo(() => {
if (userInfoQuery.isError) {
return { canRead: false, canWrite: false };
}

if (!userInfoQuery.data) {
return undefined;
}

if (!userInfoQuery.data.rbacEnabled) {
return { canRead: true, canWrite: true };
}

if (domainQuery.data) {
return getDomainAccessForUser(domainQuery.data, userInfoQuery.data);
}

if (domainQuery.isError) {
return { canRead: false, canWrite: false };
}

return undefined;
}, [
domainQuery.data,
domainQuery.isError,
userInfoQuery.data,
userInfoQuery.isError,
]);

const isLoading =
userInfoQuery.isLoading || (isRbacEnabled && domainQuery.isLoading);

return {
access,
isLoading,
isError: userInfoQuery.isError || domainQuery.isError,
userInfoQuery,
};
}
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The useDomainAccess hook lacks unit tests. This hook contains complex conditional logic for determining domain access based on RBAC settings, authentication state, and domain metadata. Test coverage should include scenarios like: RBAC enabled vs disabled, authenticated vs unauthenticated users, admin vs non-admin users, domain query success/failure, and the various access permission combinations (canRead/canWrite).

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added unit tests

const isAuthenticated = authInfo?.isAuthenticated === true;

const shouldFetchClusterInfo =
Boolean(authInfo) && (isAdmin || (!isRbacEnabled && !isAuthenticated));
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The condition for shouldFetchClusterInfo appears to have incorrect logic. When RBAC is disabled (!isRbacEnabled) and a user is NOT authenticated (!isAuthenticated), the cluster info should still be fetched. However, the current logic (isAdmin || (!isRbacEnabled && !isAuthenticated)) means:

  • If user IS authenticated but not an admin, and RBAC is disabled, the cluster fetch is skipped
  • This seems backwards - when RBAC is disabled, all users should be able to fetch cluster info regardless of authentication

The condition should likely be: isAdmin || !isRbacEnabled || !isAuthenticated or more simply: when RBAC is disabled, always fetch; when RBAC is enabled, only fetch for admins.

Suggested change
Boolean(authInfo) && (isAdmin || (!isRbacEnabled && !isAuthenticated));
Boolean(authInfo) && (!isRbacEnabled || isAdmin);

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

placement={PLACEMENT.bottom}
overrides={overrides.snackbar}
defaultDuration={DURATION.infinite}
defaultDuration={DURATION.medium}
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the snackbar duration from DURATION.infinite to DURATION.medium is a significant behavioral change. Infinite duration means snackbars stay on screen until manually dismissed, while medium duration auto-dismisses them after a few seconds. This change affects all snackbar messages throughout the application, not just auth-related ones. This should be explicitly mentioned in the PR description as it's not directly related to RBAC functionality and impacts user experience globally. Consider whether this is intentional or if auth-specific messages should have different duration settings.

Suggested change
defaultDuration={DURATION.medium}
defaultDuration={DURATION.infinite}

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reverted

groups?: unknown;
name?: string;
sub?: string;
[key: string]: unknown;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this or it is added for future additions?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was to keep the type forward‑compatible with any other custom claims coming from IdP. But I think I better make it strict claim validation - only lowercase

const res = await request('/api/auth/me', { method: 'GET' });
return res.json();
},
staleTime: 30_000,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this caching affect the logout behavior ?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it affects the logout behavior.. I set it to 0.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also was playing with front and I found another bug: on the layout breakpoint, when I logout from normal zoom, in "the after zoom" it collapsed into the hamburger menu and still kept the user data.

I fixed it as well 💪

const existingMetadata = isObjectOfStringKeyValue(ctx.grpcMetadata)
? ctx.grpcMetadata
: {};
ctx.grpcMetadata = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modifying context is anti-pattern, this is disallowed to make reasoning about ctx easier.

Use the userInfo middleware to return plain user information. Use this information in grpcMetadata middleware.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed ctx mutation and split across middlewares

import { type UserInfoMiddlewareContext } from './user-info.types';

const userInfo: MiddlewareFunction<
['userInfo', UserInfoMiddlewareContext]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mixing user information and auth details can complicate implementing different auth strategies. Would recommend keeping user info only for user information such as id/name/email etc. And have auth middleware providing auth type, isAuthEnabled, isAuthenticated etc.

Beside that having rbacEnabled, expiresAtMs in user info doesn't make sense.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good suggestion!

placement={PLACEMENT.bottom}
overrides={overrides.snackbar}
defaultDuration={DURATION.infinite}
defaultDuration={DURATION.medium}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is changing duration for all snackbars, as i recall duration can be set during invocation of the snackbar

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reverted this changes and put duration for each snaclbar seperatly

};

const parseBooleanFlag = (value: string) =>
value?.toLowerCase() === 'true' || value === '1';
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets keep it simple and only check for 'true'

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true got replaced by jwt strategy. so no need for this code

env: 'CADENCE_ADMIN_SECURITY_TOKEN',
default: '',
},
CADENCE_WEB_RBAC_ENABLED: {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing this to CADENCE_WEB_AUTH_STRATEGY with disabled & JWT provide more flexibility.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

'utf8'
);

return JSON.parse(decodedPayload) as CadenceJwtClaims;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The typing here is in correct, we need to verify the claim types before casting, can be done manually or using zod

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added zod

} catch (e) {
const message =
e instanceof Error ? e.message : 'Failed to save authentication token';
enqueue({ message });
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Green snackbar is not suitable for errors.
Use the red variant of it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added red variant

@Assem-Uber
Copy link
Copy Markdown
Contributor

@ribaraka Sorry for the delay in getting back to you! I had some time off that kept me from looking into this sooner.

I’ve shared some feedback on the approach above, but please note that this feature will require a design document and discussion for broader review. Let me know today if you would like to continue that process; otherwise, I can pick it up and kick off the design documentation."

The design document should cover the following:

  • High level design for UI auth life cycle.
  • Behavior in different setups.
  • Integration with backend and responsibilities of each.
  • Security (Security measures that was taken in account)
  • Usage (How users are going to get started with different setups)

@ribaraka
Copy link
Copy Markdown
Contributor Author

Thank you for your review, @Assem-Uber!
I'd like to proceed further! I'll prepare a design doc soon.

@ribaraka ribaraka requested a review from Assem-Uber January 28, 2026 19:09
@ribaraka
Copy link
Copy Markdown
Contributor Author

@Assem-Uber, I have added the design doc, please reveiew

@ribaraka
Copy link
Copy Markdown
Contributor Author

Pasted image Pasted image (2)

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 40 out of 40 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

src/views/domains-page/helpers/get-all-domains.ts:69

  • The failedClusters logic is incorrect. At line 68, it tries to find ANY rejected result in the results array, not the specific rejection for the current cluster. This means all clusters in failedClusters will have the same rejection (the first one found by find()), rather than their own specific rejection. The logic should match each cluster config with its corresponding result by index or cluster name. For example: rejection: results[CLUSTERS_CONFIGS.indexOf(config)] (if results are in the same order) or find by matching some cluster identifier.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +41 to +46
const isAdvancedVisibilityEnabled = useMemo(() => {
return isClusterAdvancedVisibilityEnabled(data);
}, [data]);
// Non-admin authenticated users may not be allowed to call describeCluster,
// but can still have access to advanced visibility APIs.
if (isAuthenticatedNonAdmin) {
return true;
}
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic assumes that non-admin authenticated users always have access to advanced visibility APIs (line 44-45), but this assumption might not hold in all Cadence deployments. If the backend has separate authorization for the describeCluster and advanced visibility APIs, a non-admin user might have cluster describe permissions but not advanced visibility permissions, or vice versa. This could lead to failed API calls when the UI tries to use advanced visibility features. Consider adding error handling in the advanced workflows component to gracefully fall back to basic workflows if the advanced visibility calls fail due to authorization.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

@ribaraka ribaraka Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

advanced visibility is available for view/read users, at least I confirmed it by Cadence CLI.

The advanced visibility in web is only enabled if describeCluster succeeds and reports the advancedVisibilityEnabled feature. But to describeCluster is only available for admins, here's a bit of intricate situation for non-admin user. I've come up with some query probe: web calls the workflows API and enables advanced UI only if that call succeeds, otherwise it falls back to basic view. This avoids relying on describeCluster (admin-only) call and prevents false-positive advanced rendering on clusters where this feature is not enabled

userName: 'worker',
id: 'worker',
isAuthenticated: true,
token: undefined,
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test expects token: undefined in the result object, but the PublicAuthContext type does not include a token field - only isAuthenticated. The getPublicAuthContext function correctly omits the token from the returned object, so this test assertion is checking for a field that should not exist in the result. Remove the token: undefined expectation from the test.

Suggested change
token: undefined,

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment on lines +106 to +114
return {
authEnabled,
token: effectiveToken,
groups,
isAdmin,
userName,
id,
expiresAtMs,
};
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When auth is disabled (authEnabled: false), the function still forwards the token (if present) from the cookie. This creates a security inconsistency: even though auth is marked as disabled in the UI layer (no access controls enforced), the token is still being sent to the backend in gRPC metadata via getGrpcMetadataFromAuth. This could lead to confusion where the UI shows unrestricted access but the backend still enforces authorization based on the token. Consider whether tokens should be forwarded when auth is explicitly disabled, or document this behavior clearly if it's intentional for backward compatibility.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

web should ignore the cookie and should not forward it to backend, when auth strategy is disabled

Comment on lines +177 to +197
useEffect(() => {
if (!isAuthEnabled || !isAuthenticated || expiresAtMs === undefined) return;
const timeoutMs = expiresAtMs - Date.now();
const logoutIfCurrent = () => {
if (latestExpiresAtRef.current !== expiresAtMs) return;
void logout('expired');
};

if (timeoutMs <= 0) {
const id = window.setTimeout(logoutIfCurrent, 0);
return () => {
window.clearTimeout(id);
};
}

const id = window.setTimeout(logoutIfCurrent, timeoutMs);

return () => {
window.clearTimeout(id);
};
}, [expiresAtMs, isAuthenticated, isAuthEnabled, logout]);
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a race condition in the token expiry timeout handling. If the component unmounts and remounts quickly, or if expiresAtMs changes, multiple timeout callbacks could be scheduled. While each callback checks latestExpiresAtRef.current !== expiresAtMs before logging out, there's a window between when the timeout fires and when the check happens where the ref could change. Additionally, if a logout is already in progress (logoutInFlightRef.current === true), scheduling another timeout for the same expiry time is redundant. Consider adding a cleanup ref to track the scheduled timeout ID and clearing it when a new one is scheduled, or checking logoutInFlightRef before scheduling the logout.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

@ribaraka ribaraka Feb 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

expiryTimeoutIdRef now tracks the scheduled expiry timeout and is cleared before re-scheduling. Also added guard with logoutInFlightRef both before scheduling and inside the timeout callback. This should prevent duplicate and stale auto-logout calls

Comment on lines +98 to +103
if (readGroups.length === 0 && writeGroups.length === 0) {
return {
canRead: false,
canWrite: false,
};
}
Copy link

Copilot AI Feb 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When authentication is enabled and a non-admin user has groups but a domain has no READ_GROUPS or WRITE_GROUPS configured (lines 98-103), access is denied entirely. While the developer confirmed this is intentional (per previous feedback), this creates a potential operational issue: if someone forgets to configure groups on a domain or temporarily removes them, all non-admin users lose access immediately. Consider logging a warning or providing better visibility when domains have auth enabled globally but no group configuration, as this might indicate a misconfiguration rather than an intentional restriction.

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like too much. The deny behaviour is agreed in the discussion. Besides the admin could see domains without any group. @Assem-Uber, I can add this point as a future task (in the issue), if it makse sense.


export async function setAuthToken(request: NextRequest) {
try {
const requestBody = await request.json();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Edge Case: POST /api/auth/token sets cookie for expired JWTs

The setAuthToken handler validates the JWT format (three dot-delimited base64url segments) but does not decode or check the exp claim. An already-expired token is accepted and persisted as an HttpOnly cookie. The user gets correct feedback (the subsequent /api/auth/me call returns isValidToken: false), but the round-trip is wasted and leaves a useless cookie on the client.

Consider decoding and checking exp server-side before setting the cookie to provide immediate 400 feedback.

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Assem-Uber, can I create a follow-up task for this issue?

@ribaraka ribaraka requested a review from Assem-Uber March 18, 2026 03:31
Stanislav Bychkov added 20 commits March 26, 2026 23:24
- UI RBAC aligned with Cadence JWT auth: tokens come from cookie (cadence-authorization) or env (CADENCE_WEB_JWT_TOKEN), are forwarded on all gRPC calls, and claims/groups drive what the UI shows/enables.
- Auth endpoints: POST /api/auth/token to set the HttpOnly cookie, DELETE /api/auth/token to clear it, GET /api/auth/me to expose public auth context.
- User context middleware populates gRPC metadata and user info for all route handlers.
- Domain visibility: getAllDomains filters by READ_GROUPS/WRITE_GROUPS. Redirects respect the filtered list.
- Workflow/domain actions: start/signal/terminate/etc. are disabled with “Not authorized” when the token lacks write access;
- Login/logout UI: navbar shows JWT paste modal when unauthenticated.

Signed-off-by: Stanislav Bychkov <stanislb@netapp.com>
simplified:
- deduplicate splitGroupList
- nav bar subtitle cleanup
- remove env CADENCE_WEB_JWT_TOKEN and its usage
- Adjust secure-cookie detection to be proxy/request driven and correctly parse `x-forwarded-proto`, avoiding `NODE_ENV` forcing secure cookies.
- Adopt Bearer-prefix stripping when setting the auth cookie
- Added Cache-Control: no-store headers to prevent stale auth responses:
- Treat tokens that fail JWT decoding as unauthenticated
- Avoid extra domain fetch when RBAC is off
- Replaced CADENCE_WEB_RBAC_ENABLED with CADENCE_WEB_AUTH_STRATEGY (disabled|jwt).
- Renamed internal flag from rbacEnabled to authEnabled across the entire code.
- Made JWT claims strictly lowercase and removed the index signature.
- Added Zod validation.
- Added a basic JWT format check (header.payload.signature)
- Restored global snackbar duration.
- Error snackbars render red.
- Added guards for token expiry timers, kept logout reason for correct messaging, and added a key to avoid stale data on responsive layout changes.
- Middleware Refactor (no ctx mutation)
- Added grpc-metadata middleware that derives gRPC metadata from authInfo.
- Updated default middleware order to authInfo → userInfo → grpcMetadata → grpcClusterMethods.
- Added tests for hook, middleware, authEnabled and new auth info
- red snackbars for expired token.
- non-admin users not allowed to hit DescribeCluster (admin API).
- tests
- gRPC metadata forwarding now requires authEnabled=true
- Added timeout cleanup tracking via expiryTimeoutIdRef
- map failed domain clusters by matching result index
- tests
- Removed duplicate cached export
- Made one canonical cached wrapper
- Added the queryKey fix (props.cluster instead of props) to prevent unnecessary refetches
- Drop the useMemo for a trivial boolean derivation.
- Remove redundant Boolean(authInfo)
- Simplified the isClusterAdvancedVisibilityEnabled block
- Add the auth lifecycle hook to unload the app-nav-bar
- Drop the useMemo for a trivial boolean derivation.
- Remove redundant Boolean(authInfo)
- Simplified the isClusterAdvancedVisibilityEnabled block
- Add the lifecicle hook for auth (to unload the app-nav-bar)
- Domain group parsing is now strict to plain delimited strings (no JSON).
- Added a return type to getDomainAccessForUser
- Added isAuthenticated to BaseAuthContext, made PublicAuthContext an alias, renamed UserAuthContext to PrivatAuthContext for clearer intent.
- Simplified getPublicAuthContext to a rest-spread (token strip).
- Derived CadenceJwtClaims from Zod schema
- Removed redundant optional chaining, typeof guards.
- error on providing invalid/expired token
- Added tests for authentication route handlers.
- admins skip the domain description fetch
- stable key based on authentication state only
- removing the domain hook
- removed the explicit unauthenticated fallback
- aligning auth-token route
- fix centralize workflow action auth in dynamic config
…r non-Node runtimes

The DOMAIN_ACCESS resolver imports grpc-client which depends on
Node-only modules (fs, path). Static imports in instrumentation.ts
caused webpack to trace the entire gRPC dependency tree at compile
time, failing for non-Node compilation targets.
Signed-off-by: Stanislav Bychkov <stanislb@netapp.com>
Signed-off-by: Stanislav Bychkov <stanislav.bychkov@netapp.com>
Signed-off-by: Stanislav Bychkov <stanislav.bychkov@netapp.com>
@gitar-bot
Copy link
Copy Markdown

gitar-bot bot commented Mar 30, 2026

Code Review ⚠️ Changes requested 5 resolved / 8 findings

Implements role-based access control for Cadence-web with authentication refactoring and improved test coverage, but three security concerns must be addressed: the x-forwarded-proto header is trusted without proxy validation, useUserInfo's staleTime: 0 causes excessive refetches, and POST /api/auth/token accepts expired JWTs without decoding the exp claim.

⚠️ Security: x-forwarded-proto trusted without proxy validation

📄 src/app/api/auth/token/route.ts:22-27 📄 src/components/app-nav-bar/app-nav-bar.tsx:157 📄 src/components/app-nav-bar/app-nav-bar.tsx:164 📄 src/utils/auth/schemas/cadence-jwt-claims-schema.ts:6

In getCookieSecureAttribute, the x-forwarded-proto header is unconditionally trusted to determine the cookie's Secure flag. If the app is deployed without a reverse proxy that strips or overwrites this header, a client can send x-forwarded-proto: http over an HTTPS connection to force secure: false, causing the auth cookie to be sent over plaintext connections. Conversely, x-forwarded-proto: https over HTTP would set Secure but the cookie would never be sent back.

Consider either: (a) documenting this as a deployment requirement, (b) adding a config option to trust/ignore the header, or (c) defaulting to secure: true in production and only relaxing for development.

⚠️ Performance: staleTime: 0 on useUserInfo causes excessive refetches

📄 src/views/shared/hooks/use-user-info/use-user-info.ts:15 📄 src/components/app-nav-bar/hooks/use-auth-lifecycle.ts:10 📄 src/views/domain-workflows/domain-workflows.tsx:23 📄 src/views/shared/hooks/use-user-info/use-user-info.ts:9

The useUserInfo hook sets staleTime: 0, which overrides the project-wide default of staleTime: Infinity (set in react-query-provider.tsx specifically to prevent immediate client-side refetches in Next.js). Because refetchOnWindowFocus defaults to true in React Query, every tab focus event triggers a new /api/auth/me request. Additionally, every component mount treats cached data as stale and refetches. Since AppNavBar (always mounted at layout level) and DomainWorkflows both consume this hook, this creates unnecessary duplicate requests on page loads.

Consider using a reasonable staleTime (e.g., 30-60 seconds) or Infinity to align with the project default. The expiry-based logout timer in AppNavBar already handles token expiration, and the explicit refetch() calls in saveToken/logout ensure freshness after mutations.

Suggested fix
// use-user-info.ts
export default function useUserInfo() {
  return useQuery<PublicAuthContext, RequestError>({
    queryKey: ['auth-me'],
    queryFn: async () => {
      const res = await request('/api/auth/me', { method: 'GET' });
      return res.json();
    },
    staleTime: 60_000, // 1 minute; mutations call refetch() explicitly
  });
}
💡 Edge Case: POST /api/auth/token sets cookie for expired JWTs

📄 src/route-handlers/auth-token/set-auth-token.ts:21 📄 src/route-handlers/auth-token/schemas/token-request-body-schema.ts:3

The setAuthToken handler validates the JWT format (three dot-delimited base64url segments) but does not decode or check the exp claim. An already-expired token is accepted and persisted as an HttpOnly cookie. The user gets correct feedback (the subsequent /api/auth/me call returns isValidToken: false), but the round-trip is wasted and leaves a useless cookie on the client.

Consider decoding and checking exp server-side before setting the cookie to provide immediate 400 feedback.

✅ 5 resolved
Bug: doLogout finally-block redirects even when DELETE fails

📄 src/components/app-nav-bar/use-auth-lifecycle.ts:100-111
In doLogout, the finally block unconditionally calls refetch(), router.refresh(), router.replace('/domains'), and resets logoutInFlightRef. If the DELETE request fails (catch branch), logoutReasonRef is set to null, but the redirect still fires. This means:

  1. A failed logout still redirects the user to /domains and refreshes the page, even though they're still authenticated.
  2. If refetch() itself throws inside finally, logoutInFlightRef.current = false is never reached, permanently locking out all future logout attempts.

The finally block should only execute redirect/refresh logic on success, and the logoutInFlightRef reset should be guaranteed (e.g. in its own try/finally).

Edge Case: Authenticated users get NO_ACCESS on domains with no groups

📄 src/utils/auth/auth-shared.ts:60-62
In getDomainAccessForUser, when a domain has neither READ_GROUPS nor WRITE_GROUPS configured (both empty), any authenticated non-admin user gets NO_ACCESS. This is a deny-by-default posture which may be surprising: domains that haven't been set up for RBAC become invisible to all non-admin users when auth is enabled. If the intent is that unconfigured domains should be readable by all authenticated users, this needs a change. If it's intentional, consider adding a test case to document this contract (the existing test at auth-context.test.ts:404 covers unauthenticated users but not authenticated non-admins with empty groups).

Edge Case: POST /api/auth/token accepts already-expired JWTs

📄 src/app/api/auth/token/route.ts:49-61 📄 src/utils/auth/auth-context.ts:74-76 📄 src/components/app-nav-bar/app-nav-bar.tsx:164 📄 src/components/app-nav-bar/app-nav-bar.tsx:171
The POST handler validates JWT structure (3 base64url segments) but does not decode or check the exp claim. An expired JWT is stored in the cookie, and the user only discovers it's invalid on the next page load when resolveAuthContext silently drops the token. This could be confusing UX — the user sees { ok: true } but is immediately treated as unauthenticated.

Note: The client-side saveToken in use-auth-lifecycle.ts does refetch and check isAuthenticated after saving, showing an error snackbar. So the UX impact is limited to the API contract being misleading (returns 200 for expired tokens).

Quality: Redundant try/catch that only re-throws in handleSaveToken

📄 src/components/app-nav-bar/app-nav-bar.tsx:95 📄 src/components/app-nav-bar/app-nav-bar.tsx:106
The handleSaveToken callback wraps its body in try { ... } catch (e) { throw e; }, which has no effect — the exception propagates identically without the try/catch. This adds visual noise and may confuse readers into thinking error handling was intended but left incomplete.

Quality: Grammar issue in doc heading

📄 docs/auth-strategy-design.md:223
The heading was changed to ### One of possible Authorization Model which is grammatically incorrect. It should read ### One Possible Authorization Model or ### Example Authorization Model.

🤖 Prompt for agents
Code Review: Implements role-based access control for Cadence-web with authentication refactoring and improved test coverage, but three security concerns must be addressed: the x-forwarded-proto header is trusted without proxy validation, useUserInfo's staleTime: 0 causes excessive refetches, and POST /api/auth/token accepts expired JWTs without decoding the exp claim.

1. ⚠️ Security: x-forwarded-proto trusted without proxy validation
   Files: src/app/api/auth/token/route.ts:22-27, src/components/app-nav-bar/app-nav-bar.tsx:157, src/components/app-nav-bar/app-nav-bar.tsx:164, src/utils/auth/schemas/cadence-jwt-claims-schema.ts:6

   In `getCookieSecureAttribute`, the `x-forwarded-proto` header is unconditionally trusted to determine the cookie's `Secure` flag. If the app is deployed without a reverse proxy that strips or overwrites this header, a client can send `x-forwarded-proto: http` over an HTTPS connection to force `secure: false`, causing the auth cookie to be sent over plaintext connections. Conversely, `x-forwarded-proto: https` over HTTP would set `Secure` but the cookie would never be sent back.
   
   Consider either: (a) documenting this as a deployment requirement, (b) adding a config option to trust/ignore the header, or (c) defaulting to `secure: true` in production and only relaxing for development.

2. ⚠️ Performance: staleTime: 0 on useUserInfo causes excessive refetches
   Files: src/views/shared/hooks/use-user-info/use-user-info.ts:15, src/components/app-nav-bar/hooks/use-auth-lifecycle.ts:10, src/views/domain-workflows/domain-workflows.tsx:23, src/views/shared/hooks/use-user-info/use-user-info.ts:9

   The `useUserInfo` hook sets `staleTime: 0`, which overrides the project-wide default of `staleTime: Infinity` (set in `react-query-provider.tsx` specifically to prevent immediate client-side refetches in Next.js). Because `refetchOnWindowFocus` defaults to `true` in React Query, every tab focus event triggers a new `/api/auth/me` request. Additionally, every component mount treats cached data as stale and refetches. Since `AppNavBar` (always mounted at layout level) and `DomainWorkflows` both consume this hook, this creates unnecessary duplicate requests on page loads.
   
   Consider using a reasonable staleTime (e.g., 30-60 seconds) or `Infinity` to align with the project default. The expiry-based logout timer in `AppNavBar` already handles token expiration, and the explicit `refetch()` calls in `saveToken`/`logout` ensure freshness after mutations.

   Suggested fix:
   // use-user-info.ts
   export default function useUserInfo() {
     return useQuery<PublicAuthContext, RequestError>({
       queryKey: ['auth-me'],
       queryFn: async () => {
         const res = await request('/api/auth/me', { method: 'GET' });
         return res.json();
       },
       staleTime: 60_000, // 1 minute; mutations call refetch() explicitly
     });
   }

3. 💡 Edge Case: POST /api/auth/token sets cookie for expired JWTs
   Files: src/route-handlers/auth-token/set-auth-token.ts:21, src/route-handlers/auth-token/schemas/token-request-body-schema.ts:3

   The `setAuthToken` handler validates the JWT format (three dot-delimited base64url segments) but does not decode or check the `exp` claim. An already-expired token is accepted and persisted as an HttpOnly cookie. The user gets correct feedback (the subsequent `/api/auth/me` call returns `isValidToken: false`), but the round-trip is wasted and leaves a useless cookie on the client.
   
   Consider decoding and checking `exp` server-side before setting the cookie to provide immediate 400 feedback.

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants