feat: feature flags by maskarb · Pull Request #650 · ambient-code/platform

maskarb · 2026-02-17T18:39:20Z

No description provided.

Document the decision to use Unleash for feature flags with workspace-scoped ConfigMap overrides. Includes architecture diagrams, evaluation logic, API endpoints, and implementation patterns.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add Unleash deployment script (e2e/scripts/deploy-unleash.sh)
- Add feature flags documentation (docs/feature-flags/)
- Add Makefile targets for Unleash deployment
- Update constitution with Principle XI: Feature Flag Discipline
- Update component READMEs with feature flag configuration

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Add Unleash client integration (featureflags/featureflags.go)
- Add feature flags handlers (handlers/featureflags.go, featureflags_admin.go)
- Implement workspace-scoped ConfigMap overrides
- API endpoints: list, evaluate, enable, disable, override management
- Proxy to Unleash Admin API with backend credentials

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…errides

- Add Unleash provider with environment context field
- Add feature flags admin UI in Workspace Settings tab
- Implement batch save pattern (local state, Save/Discard buttons)
- Add Next.js API routes to proxy backend feature flags endpoints
- Add useWorkspaceFlag hook for workspace-scoped flag evaluation
- Add Switch UI component and feature flags service layer

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

bobbravo2 · 2026-02-17T19:20:37Z

@maskarb once we get the blockers and 🔴 fixed from the review, this is getting really close to landing. The testing path(s) might be deferrable to future PRs too; however, given this is key platform functionality, I'm inclined to ask for at least a critical path test or two 👍🏾

Add comprehensive unit tests for the feature flag admin API handlers covering:

- Authentication requirements (401 Unauthorized)
- ConfigMap create/update/delete operations
- Unleash API error handling (503 when not configured)
- Override precedence logic (workspace > Unleash > default)
- Edge cases (nil ConfigMap data, concurrent operations)

Also includes an error handling fix to not expose the Unleash response body in error messages (security improvement).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Platform team can now control which feature flags are visible in the workspace admin UI by adding a tag (scope: workspace) in Unleash:

- Flags with the tag appear in the workspace admin UI
- Flags without the tag are platform-only (Unleash UI only)
- Tag type/value configurable via environment variables:
  UNLEASH_WORKSPACE_TAG_TYPE (default: "scope")
  UNLEASH_WORKSPACE_TAG_VALUE (default: "workspace")

Also updates ADR-0006 with flag visibility governance documentation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
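The tag-based visibility rule described in this commit could be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `Flag` and `Tag` types and the helper names are hypothetical, and only the two environment variable names and their defaults come from the commit message.

```go
package main

import (
	"fmt"
	"os"
)

// Tag mirrors the type/value pair attached to a flag in Unleash.
type Tag struct {
	Type  string
	Value string
}

// Flag is a hypothetical, trimmed-down view of a flag (not the Unleash API type).
type Flag struct {
	Name string
	Tags []Tag
}

// envOrDefault returns the variable's value, or def when it is unset or empty.
func envOrDefault(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

// workspaceTag reads the configured tag, using the defaults named in the commit.
func workspaceTag() Tag {
	return Tag{
		Type:  envOrDefault("UNLEASH_WORKSPACE_TAG_TYPE", "scope"),
		Value: envOrDefault("UNLEASH_WORKSPACE_TAG_VALUE", "workspace"),
	}
}

// filterWorkspaceFlags keeps only the flags that carry the wanted tag.
func filterWorkspaceFlags(flags []Flag, want Tag) []Flag {
	visible := []Flag{}
	for _, f := range flags {
		for _, t := range f.Tags {
			if t == want {
				visible = append(visible, f)
				break
			}
		}
	}
	return visible
}

func main() {
	flags := []Flag{
		{Name: "new-editor", Tags: []Tag{{Type: "scope", Value: "workspace"}}},
		{Name: "platform-only"}, // no tag: stays hidden from workspace admins
	}
	for _, f := range filterWorkspaceFlags(flags, workspaceTag()) {
		fmt.Println(f.Name)
	}
}
```

Untagged flags are dropped rather than shown by default, which matches the stated intent that platform-only flags stay out of the workspace admin UI.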

The Unleash SDK sends usage metrics and client registration to the proxy URL. These were missing, causing no metrics to appear in Unleash.

Added endpoints:

- POST /api/feature-flags/client/metrics - forwards usage metrics
- POST /api/feature-flags/client/register - forwards client registration

Note: "Impression data" must also be enabled on flags in Unleash UI for detailed impression tracking.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

- Rename sessionId to sessionID in IsEnabledWithContext (ST1003)
- Use type conversion instead of struct literal for EnvState (S1016)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

github-actions · 2026-02-18T15:12:52Z

Claude Code Review

Summary

This PR implements Unleash-based feature flag management with workspace-scoped ConfigMap overrides. The implementation adds comprehensive feature flag capabilities including:

Backend Unleash client integration with optional initialization
Workspace-scoped ConfigMap overrides (per-namespace feature flag control)
Tag-based filtering for workspace-configurable flags
Admin UI for workspace administrators
Frontend Unleash SDK integration with proxy endpoints
Comprehensive unit tests (1000+ lines)

Overall Assessment: ✅ Approve with Minor Suggestions

The code follows established patterns from CLAUDE.md and context files. Security, error handling, and authentication are implemented correctly. Excellent test coverage.

Issues by Severity

🚫 Blocker Issues

None - No blocking issues found.

🔴 Critical Issues

None - No critical issues found.

🟡 Major Issues

1. Backend Service Account Usage for ConfigMap Operations

Location: components/backend/handlers/featureflags_admin.go

Issue: All ConfigMap operations (lines 88, 400, 414, 432, 469, 489, 525, 539, 557, 594, 608, 626) use K8sClient (backend service account) instead of user-scoped clients.

Current Pattern:

cm, err := K8sClient.CoreV1().ConfigMaps(namespace).Get(ctx, FeatureFlagOverridesConfigMap, metav1.GetOptions{})

Expected Pattern (per CLAUDE.md:525-556):

reqK8s, _ := GetK8sClientsForRequest(c)
// Validate RBAC permissions for ConfigMap access
ssar := &authv1.SelfSubjectAccessReview{
    Spec: authv1.SelfSubjectAccessReviewSpec{
        ResourceAttributes: &authv1.ResourceAttributes{
            Resource:  "configmaps",
            Verb:      "get", // or "create", "update", "delete"
            Namespace: namespace,
            Name:      FeatureFlagOverridesConfigMap,
        },
    },
}
// Then use backend service account AFTER validation

Why this matters: Per ADR-0002 and k8s-client-usage.md, user operations should validate RBAC before using elevated permissions. The current implementation validates project access (line 140) but doesn't check ConfigMap-specific permissions.

Recommendation: Add RBAC checks for ConfigMap operations, or document why backend service account is appropriate here (similar to CR writes in sessions.go:417).

2. Potential Token Leak in Admin Proxy

Location: components/backend/handlers/featureflags_admin.go:649

Issue: Admin API token is set directly in header without redaction in logs.

req.Header.Set("Authorization", getUnleashAdminToken())

Risk: If HTTP client logs requests or errors include headers, the admin token could leak.

Recommendation: Ensure unleashAdminRequest errors don't expose the Authorization header (similar to server/server.go:22-34 token redaction pattern). Add test case verifying error messages don't contain tokens.
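One way to approach the recommendation above is to scrub the token from any error text before it is logged or returned. The sketch below is an assumption-laden illustration: `redactToken` and `sanitizeError` are hypothetical helpers, not functions from this PR or from server/server.go.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// redactToken replaces every occurrence of token in msg with a fixed marker.
// An empty token is ignored so the message is never mangled by accident.
func redactToken(msg, token string) string {
	if token == "" {
		return msg
	}
	return strings.ReplaceAll(msg, token, "[REDACTED]")
}

// sanitizeError returns a new error whose message no longer contains the token.
// Note: this flattens the error chain, so errors.Is/As on the result will not
// match the original wrapped errors.
func sanitizeError(err error, token string) error {
	if err == nil {
		return nil
	}
	return errors.New(redactToken(err.Error(), token))
}

func main() {
	token := "secret-admin-token" // stand-in for the Unleash admin token
	err := fmt.Errorf("request failed: Authorization=%s", token)
	fmt.Println(sanitizeError(err, token))
}
```

A test along the lines the review asks for would assert that the sanitized message does not contain the token substring.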

🔵 Minor Issues

1. Inconsistent Error Message Format

Location: components/backend/handlers/featureflags_admin.go:199

Issue: Log message includes response body which may contain sensitive data:

log.Printf("Unleash Admin API returned %d: %s", resp.StatusCode, string(body))

Recommendation: Don't log full response bodies. Use:

log.Printf("Unleash Admin API returned %d", resp.StatusCode)

Note: Already fixed on line 361 - just needs consistency.

2. Missing `enabled` Parameter Validation

Location: components/backend/handlers/featureflags_admin.go:280

Issue: No validation that flagName exists in Unleash before checking ConfigMap.

Current behavior: Returns 200 with enabled: false for non-existent flags.

Recommendation: Query Unleash first to verify flag exists, return 404 if not found (similar to GetFeatureFlag:362-367).

3. Frontend Type Safety

Location: components/frontend/src/components/providers/feature-flag-provider.tsx:18

Issue: Uses placeholder key when Unleash not configured:

const UNLEASH_CLIENT_KEY = process.env.NEXT_PUBLIC_UNLEASH_CLIENT_KEY || 'placeholder-not-configured';

Issue: This violates the "Zero any Types" rule (frontend-development.md:15-34) in spirit - we're passing invalid data to the SDK.

Recommendation: Conditionally render FlagProvider only when key exists, or use a more explicit fallback that won't accidentally authenticate.

4. React Query Stale Time Configuration

Location: components/frontend/src/services/queries/use-feature-flags-admin.ts:26-27

refetchInterval: 30000, // Refresh every 30s
staleTime: 10000, // Consider data stale after 10s

Issue: Data refetches every 30s but is considered stale after 10s. This means 20s windows where data shows as stale but hasn't refetched yet.

Recommendation: Align timings or increase staleTime to match refetchInterval for consistent UX.

5. Missing Loading State in Feature Flag Provider

Location: components/frontend/src/components/providers/feature-flag-provider.tsx:38-42

Issue: Renders children immediately during SSR/before mount, meaning early renders see all flags as disabled.

Current:

if (!mounted || !baseUrl) {
    return <>{children}</>;
}

Recommendation: Add loading state or document that flags default to false during hydration.

6. Code Duplication in Enable/Disable Handlers

Location: components/backend/handlers/featureflags_admin.go:504-640

Issue: EnableFeatureFlag (lines 504-571) and DisableFeatureFlag (lines 573-640) are 95% identical.

Recommendation: Extract common logic:

func setFeatureFlagOverride(c *gin.Context, enabled bool) {
    // ... shared logic ...
    cm.Data[flagName] = strconv.FormatBool(enabled)
    // ...
}

func EnableFeatureFlag(c *gin.Context) {
    setFeatureFlagOverride(c, true)
}

func DisableFeatureFlag(c *gin.Context) {
    setFeatureFlagOverride(c, false)
}

Positive Highlights

✅ Excellent Security Practices

User token authentication enforced on all endpoints (lines 140-145, 272-277, 318-323, etc.)
No token logging - proper use of len(token) pattern
Type-safe ConfigMap access - proper nil checks (lines 427, 485, 552, 621)
Proxy pattern - Admin credentials never exposed to frontend

✅ Comprehensive Testing

1078 lines of unit tests covering authentication, ConfigMap operations, error handling, edge cases
Proper test isolation using fake K8s clients
Security fix during testing - Response body redaction (commit message notes)

✅ Error Handling

Proper IsNotFound handling (lines 89, 401, 470, 526, 594)
Graceful degradation when Unleash not configured (lines 154-176)
Structured error responses with appropriate HTTP status codes

✅ Frontend Best Practices

React Query hooks for all data operations (no manual fetch)
Shadcn UI components exclusively (Switch, Table, Badge, etc.)
Proper loading/error states in feature-flags-settings.tsx
Batch save pattern - local state with Save/Discard (lines 54-80)

✅ Documentation Quality

ADR-0006 provides excellent architectural context and decision rationale
Tag-based governance for flag visibility well-documented
Constitution integration - links to Principle XI
Deployment scripts with platform detection (OpenShift vs Kubernetes)

Recommendations

Priority 1: Security & RBAC

Add ConfigMap RBAC validation or document why backend SA is appropriate
Verify no token leaks in unleashAdminRequest error messages
Remove response body from log line 199

Priority 2: Code Quality

Extract shared logic from Enable/Disable handlers
Add flag existence validation in EvaluateFeatureFlag
Align React Query timing configurations

Priority 3: Polish

Document frontend flag hydration behavior
Consider explicit fallback handling for missing Unleash key

Architecture Compliance

✅ Follows CLAUDE.md patterns:

User token authentication (Backend Development Context)
Error handling patterns (error-handling.md)
React Query usage (react-query-usage.md)
Shadcn UI components (frontend-development.md)

✅ Follows ADR-0002:

User token extraction and validation
Project context validation middleware

✅ Follows Project Structure:

Proper package organization (handlers/, featureflags/)
Service layer separation (frontend: api/ + queries/)
Colocated components (workspace-sections/)

Test Coverage Assessment

✅ Excellent coverage:

Authentication requirements (401)
ConfigMap operations (create/update/delete)
Unleash API error handling (503)
Override precedence logic
Edge cases (nil data, concurrent ops)

Suggested additions:

Integration test for full flag evaluation flow
Frontend unit tests for FeatureFlagProvider
E2E test for workspace admin toggling flags

Final Verdict

Approve with suggestions ✅

This is high-quality code that follows established patterns. The major issues are about RBAC validation consistency and potential token leakage - both addressable with small changes. The comprehensive testing and documentation demonstrate excellent engineering practices.

Before merge:

Address ConfigMap RBAC validation (Major Issue #1)
Verify no token leaks in admin proxy errors (Major Issue #2)
Remove response body logging (Minor Issue #1)

Post-merge improvements:

Refactor Enable/Disable handlers to reduce duplication
Add integration tests for full evaluation flow
Document flag hydration behavior

🤖 Generated with Claude Code

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

CLAUDE.md - Master project instructions, development standards
backend-development.md - Go backend, K8s integration patterns
frontend-development.md - NextJS, Shadcn UI, React Query patterns
security-standards.md - Auth, RBAC, token handling
k8s-client-usage.md - User token vs service account patterns
error-handling.md - Consistent error patterns
react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

- Update feature flag tests to expect "unleash" source (SDK refactor)
- Remove unused instrumentationHook from next.config.js
- Add debug logging for Unleash metrics proxy
- Remove unused refreshInterval from feature flag provider

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

github-actions · 2026-02-18T16:01:46Z

Claude Code Review

Summary

This PR introduces comprehensive feature flag management using Unleash with workspace-scoped overrides. The implementation follows established patterns with good separation between platform-wide and workspace-specific controls. Code quality is generally high with proper use of user token authentication, React Query, and Shadcn UI components.

Issues by Severity

🚫 Blocker Issues

None - No blocking issues found.

🔴 Critical Issues

Potential Token Leakage in Admin API Calls (components/backend/handlers/featureflags_admin.go:649)
- Line 649: req.Header.Set("Authorization", getUnleashAdminToken())
- Issue: Admin token is being set directly without redaction in logs
- Security Risk: If HTTP client logging is enabled, tokens could be exposed
- Fix: Add explicit log redaction for Unleash admin API calls
- Pattern Violation: Violates security-standards.md token handling rules
Using Backend Service Account Without User Permission Validation (components/backend/handlers/featureflags_admin.go:400-433)
- Lines 400-433 in SetFeatureFlagOverride, 525-558 in EnableFeatureFlag, 594-627 in DisableFeatureFlag
- Issue: Uses K8sClient (service account) to create/update ConfigMaps without RBAC check
- Security Risk: User token is retrieved (line 140-145) but not used for ConfigMap operations
- Fix Pattern: Should use user-scoped client for ConfigMap operations OR perform explicit RBAC check before using service account
- Pattern Violation: Violates k8s-client-usage.md - "User operations must use GetK8sClientsForRequest"

🟡 Major Issues

Missing Type Safety in Frontend API Responses (components/frontend/src/services/api/feature-flags-admin.ts)
- Lines 52-55: API client assumes response structure without validation
- Issue: response.features || [] assumes features exists without type guard
- Impact: Could break if API returns unexpected structure
- Fix: Add response validation or proper error handling
- Pattern Note: Frontend standards require proper type checking
Environment Variable Caching May Miss Runtime Updates (components/backend/handlers/featureflags_admin.go:654-686)
- Package-level variables cache env vars (lines 27-33)
- Getters lazy-load once (lines 654-686)
- Issue: If env vars change at runtime (K8s ConfigMap updates), changes won't be detected
- Impact: Requires pod restart to pick up Unleash config changes
- Recommendation: Document this behavior or add runtime refresh capability
Frontend Route Handlers Missing Error Response Types (components/frontend/src/app/api/feature-flags/*/route.ts)
- All Next.js route handlers (6 files)
- Issue: Error responses are generic, not following typed error pattern
- Impact: Frontend can't distinguish between error types
- Fix: Add proper error response types and status code handling
Potential Race Condition in Feature Flag State (components/frontend/src/components/workspace-sections/feature-flags-settings.tsx:63-75)
- Lines 63-75: useEffect resets local state on flagsKey change
- Issue: If flags refetch while user is editing, unsaved changes could be lost
- Impact: Poor UX - user loses changes during auto-refresh
- Fix: Add confirmation dialog or skip refresh when hasChanges=true
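The environment variable caching behavior flagged above (a getter that lazy-loads once) can be illustrated with a small sketch. The `cachedValue` type and the `EXAMPLE_UNLEASH_URL` variable name are hypothetical and only demonstrate the load-once pattern; they are not the PR's actual getters.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// cachedValue runs load at most once and returns that first result afterwards.
type cachedValue struct {
	once sync.Once
	load func() string
	val  string
}

func newCachedValue(load func() string) *cachedValue {
	return &cachedValue{load: load}
}

// Get triggers the load on first use; later calls never re-read the source.
func (c *cachedValue) Get() string {
	c.once.Do(func() { c.val = c.load() })
	return c.val
}

func main() {
	const key = "EXAMPLE_UNLEASH_URL" // hypothetical variable name
	os.Setenv(key, "http://unleash-old")
	cached := newCachedValue(func() string { return os.Getenv(key) })

	fmt.Println(cached.Get())
	os.Setenv(key, "http://unleash-new")
	fmt.Println(cached.Get())   // unchanged: the cached getter never re-reads
	fmt.Println(os.Getenv(key)) // a direct read sees the new value
}
```

If load-once is the intended behavior, a short comment next to the getters stating that a pod restart is needed would cover the documentation option.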

🔵 Minor Issues

Inconsistent Error Logging Patterns (components/backend/handlers/featureflags_admin.go)
- Lines 150, 184, 198: Some errors log, some don't before returning
- Issue: Inconsistent with error-handling.md pattern ("always log before returning")
- Fix: Ensure all error paths have contextual log statements
Missing Context Cancellation (components/backend/handlers/featureflags_admin.go:136, 267, etc.)
- Multiple functions use context.Background() instead of request context
- Issue: Can't cancel operations if HTTP request is cancelled
- Fix: Use c.Request.Context() instead of context.Background()
Hardcoded HTTP Timeout (components/backend/handlers/featureflags_admin.go:644)
- Line 644: Timeout: 10 * time.Second is hardcoded
- Issue: Not configurable for different network environments
- Fix: Make configurable via environment variable
Frontend Component Too Large (components/frontend/src/components/workspace-sections/feature-flags-settings.tsx)
- 477 lines exceeds recommended 200 line limit
- Issue: Violates frontend standards for component size
- Fix: Extract sub-components (FeatureFlagRow, SaveControls, etc.)
Missing PropTypes Documentation (components/frontend/src/components/workspace-sections/feature-flags-settings.tsx)
- Component props lack JSDoc comments
- Impact: Harder to understand component API
- Fix: Add JSDoc comments for props
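For the hardcoded HTTP timeout item in this list, one possible shape is to parse a duration from the environment and fall back to the current 10 second value. This is only a sketch: `timeoutFromEnv` and the `UNLEASH_HTTP_TIMEOUT` variable name are assumptions, not part of the PR.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
	"time"
)

// timeoutFromEnv parses a duration string such as "5s" or "1500ms".
// It falls back to def when raw is empty, malformed, or not positive.
func timeoutFromEnv(raw string, def time.Duration) time.Duration {
	d, err := time.ParseDuration(raw)
	if err != nil || d <= 0 {
		return def
	}
	return d
}

func main() {
	// UNLEASH_HTTP_TIMEOUT is a hypothetical variable name used for illustration.
	timeout := timeoutFromEnv(os.Getenv("UNLEASH_HTTP_TIMEOUT"), 10*time.Second)
	client := &http.Client{Timeout: timeout}
	fmt.Println(client.Timeout)
}
```

Rejecting non-positive values matters here because a zero `Timeout` on `http.Client` means no timeout at all.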

Positive Highlights

✅ Excellent RBAC Implementation

All endpoints properly call GetK8sClientsForRequest to verify user tokens (lines 140-145, 272-277, etc.)
Proper 401 responses for missing tokens

✅ Proper React Query Usage

Clean separation: api/ layer (pure functions) + queries/ layer (hooks)
Query keys properly namespaced: featureFlagKeys.list(projectName)
Mutations invalidate related queries correctly

✅ Shadcn UI Component Usage

100% Shadcn components (Switch, Table, Card, Badge, Alert, etc.)
No custom UI built from scratch
Follows frontend-development.md standards

✅ Type-Safe API Layer

Good TypeScript types for all API functions
Uses type over interface consistently
Proper type exports from API layer

✅ Three-Tier Override Logic

Clean architecture: ConfigMap override → Unleash default → disabled
Well-documented in code and ADR-0006
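The precedence described here (ConfigMap override, then Unleash, then disabled) might look like the sketch below. The `evaluate` function, its lookup callback, and the "workspace-override" and "default" source labels are hypothetical; only the "unleash" source name appears in the PR's commit messages.

```go
package main

import (
	"fmt"
	"strconv"
)

// evaluate resolves a flag in three tiers:
//  1. a parseable boolean in the workspace overrides (stand-in for ConfigMap data),
//  2. the value reported by Unleash, if the lookup knows the flag,
//  3. disabled.
//
// overrides may be nil (reading a nil map is safe in Go). unleash may be nil
// when Unleash is not configured; it returns ok=false for unknown flags.
func evaluate(name string, overrides map[string]string, unleash func(string) (enabled, ok bool)) (bool, string) {
	if raw, found := overrides[name]; found {
		if v, err := strconv.ParseBool(raw); err == nil {
			return v, "workspace-override"
		}
	}
	if unleash != nil {
		if v, ok := unleash(name); ok {
			return v, "unleash"
		}
	}
	return false, "default"
}

func main() {
	overrides := map[string]string{"beta-ui": "false"}
	// Pretend Unleash knows every flag and enables it.
	unleash := func(name string) (bool, bool) { return true, true }

	fmt.Println(evaluate("beta-ui", overrides, unleash)) // override wins
	fmt.Println(evaluate("other", overrides, unleash))   // falls through to Unleash
	fmt.Println(evaluate("other", overrides, nil))       // nothing configured
}
```

In this sketch an unparseable override value is skipped rather than treated as false, which is a design choice and not something the PR text specifies.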

✅ Comprehensive Tests

1080 lines of backend tests (featureflags_admin_test.go)
Good coverage of edge cases

✅ Tag-Based Filtering

Smart design: Only workspace-configurable flags shown in UI
Platform-only flags hidden from workspace admins

Recommendations

High Priority

Fix RBAC Pattern Violation (Critical #2)
- Add explicit RBAC check before ConfigMap operations OR
- Use user-scoped client: reqK8s, _ := GetK8sClientsForRequest(c) for ConfigMap ops
- Pattern: See handlers/sessions.go:250-260 for RBAC check example
Add Token Redaction (Critical #1)
- Implement custom HTTP client with redacting logger for Unleash API calls
- Pattern: See server/server.go:22-34 for token redaction formatter
Fix Frontend State Race Condition (Major #6)
- Add enabled: !hasChanges to useQuery refetchInterval
- Or add confirmation: "Refresh will discard unsaved changes"

Medium Priority

Extract Frontend Sub-Components (Minor #10)
- Create FeatureFlagRow.tsx (table row logic)
- Create FeatureFlagSaveControls.tsx (save/discard buttons)
- Keeps main component under 200 lines
Use Request Context (Minor #8)
- Replace context.Background() with c.Request.Context()
- Enables proper cancellation
Add Response Validation (Major #3)
- Validate API responses match expected types
- Use zod or similar for runtime type checking
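The "Use Request Context" item in this list comes down to whether downstream work can be interrupted. The sketch below uses only the standard library: `slowLookup` and `cancelledContext` are hypothetical stand-ins, with the cancelled context playing the role of `c.Request.Context()` after the client has disconnected.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowLookup simulates a downstream call that takes d to finish but stops
// early if ctx is cancelled first.
func slowLookup(ctx context.Context, d time.Duration) error {
	select {
	case <-time.After(d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// cancelledContext returns a context that is already cancelled, standing in
// for a request context whose client has gone away.
func cancelledContext() context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return ctx
}

func main() {
	// With a request-scoped context, the call returns as soon as it is cancelled.
	err := slowLookup(cancelledContext(), time.Second)
	fmt.Println(errors.Is(err, context.Canceled))

	// With context.Background() nothing can interrupt the call; it always runs
	// for the full duration.
	err = slowLookup(context.Background(), 10*time.Millisecond)
	fmt.Println(err)
}
```

The same idea applies to Kubernetes client calls and outbound HTTP requests, both of which accept a context argument.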

Documentation Improvements

Document Environment Variable Caching (Major #4)
- Add comment in code that pod restart is required for config changes
- Or implement hot-reload capability
Add JSDoc Comments
- Document component props and API function parameters

Code Quality Metrics

✅ Backend: Uses user token authentication correctly
✅ Backend: Proper error handling (mostly)
✅ Frontend: Zero any types
✅ Frontend: All React Query patterns followed
✅ Frontend: All Shadcn UI components
⚠️ Backend: Missing RBAC check for ConfigMap operations
⚠️ Frontend: Component exceeds size limit (477 lines)

Security Checklist

✅ User token extracted and validated
⚠️ ConfigMap operations use service account without RBAC check
⚠️ Admin token might leak in HTTP logs
✅ No tokens in application logs (except Unleash admin token handling)
✅ Input validation present (flag names, request bodies)
✅ Proper HTTP status codes

Overall Assessment

Quality Score: 7.5/10

This is a well-designed feature with good architectural patterns. The main concerns are:

RBAC gap in ConfigMap operations (violates established backend patterns)
Potential token leakage in Unleash admin API calls
Frontend state management race condition

Once these critical issues are addressed, this PR will be production-ready. The three-tier override system (workspace → platform → disabled) is elegant and the tag-based filtering is a smart design choice.

Recommendation: Request changes to address Critical #1 and #2 before merging.

maskarb and others added 4 commits February 17, 2026 13:20

maskarb changed the title ~~[RHOAI-47012] feat: feature flags~~ feat: feature flags Feb 17, 2026

maskarb and others added 4 commits February 17, 2026 14:54

fix(backend): address golangci-lint staticcheck issues

fa53952

- Rename sessionId to sessionID in IsEnabledWithContext (ST1003)
- Use type conversion instead of struct literal for EnvState (S1016)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

maskarb marked this pull request as draft February 18, 2026 16:22

maskarb wants to merge 9 commits into ambient-code:main from maskarb:rhoai-47012-feature-flags
