@MichaelClifford
Collaborator

No description provided.

@github-actions
Contributor

Code Review: Move .mcp.json into runners

Summary

This PR simplifies MCP server configuration by removing the dynamic discovery mechanism and using a single, centrally-controlled configuration file. The changes improve security by preventing user workspaces from loading arbitrary MCP servers.

✅ Strengths

Security Improvements

  • Excellent: Removes the ability for user workspaces to inject MCP server configurations, which is a significant security improvement
  • Good: Hardcoded path /app/claude-runner/.mcp.json makes the configuration location explicit and predictable
  • Good: Maintains the _filter_mcp_servers() safety mechanism to only allow http and sse types

Code Quality

  • Clean refactoring: Reduced code from ~70 lines to ~25 lines (-51 lines)
  • Clear documentation: Updated docstring clearly explains the security rationale
  • No breaking changes: Method signature remains the same (cwd_path parameter still accepted but unused)

Container Integration

  • Correct placement: The .mcp.json file at components/runners/claude-code-runner/.mcp.json will be copied to /app/claude-runner/.mcp.json per the Dockerfile (line 21)
  • Proper initialization: Empty mcpServers: {} is a safe default

🔍 Issues & Recommendations

1. Unused Parameter (Minor)

Issue: The cwd_path parameter in _load_mcp_config(self, cwd_path: str) is no longer used but still required.

Recommendation: Consider one of these options:

# Option A: Keep parameter for backward compatibility (current approach is fine)
def _load_mcp_config(self, cwd_path: str) -> dict | None:
    """..."""
    # cwd_path intentionally unused - MCP config only loaded from runner directory

# Option B: Remove parameter if nothing calls this with explicit arguments
def _load_mcp_config(self) -> dict | None:

Impact: Low - the current approach maintains API compatibility

2. Missing Test Coverage (Medium)

Issue: No tests found for MCP configuration loading in components/runners/claude-code-runner/.

Recommendation: Add unit tests covering:

  • Loading valid .mcp.json with http/sse servers
  • Handling missing .mcp.json file
  • Handling malformed JSON
  • Filtering logic for server types
  • Security: verify workspace configs are NOT loaded

Example test structure:

def test_load_mcp_config_success():
    # Test successful load from runner directory
    pass

def test_load_mcp_config_not_found():
    # Test graceful handling when file missing
    pass

def test_load_mcp_config_invalid_json():
    # Test error handling for malformed JSON
    pass

def test_filter_mcp_servers_only_allows_http_sse():
    # Test security filtering
    pass

3. Documentation Gap (Low)

Issue: No documentation found explaining MCP configuration to operators/admins.

Recommendation: Add documentation explaining:

  • Where to place .mcp.json (in the runner directory, baked into container image)
  • How to add MCP servers (rebuild container image with updated .mcp.json)
  • Security model (why workspace configs are not allowed)
  • Example configurations

Suggested location: docs/CLAUDE_CODE_RUNNER.md or a new docs/MCP_CONFIGURATION.md

4. Container Build Consideration (Low)

Issue: Adding MCP servers requires rebuilding the container image.

Observation: This is more restrictive than the previous approach but more secure. Consider if this workflow is acceptable for your use case, or if you need a ConfigMap-based approach for dynamic configuration.

Alternative approach (if dynamic config needed):

# Load from ConfigMap-mounted file instead
runner_mcp_file = Path("/etc/mcp-config/.mcp.json")  # ConfigMap mount point

🎯 Security Review

✅ Security Improvements

  1. Prevents workspace injection: User repositories can no longer load MCP servers
  2. Centralized control: Platform operators control available MCP servers
  3. Maintains filtering: Still validates server types (http/sse only)

⚠️ Security Considerations

  1. File permissions: Ensure .mcp.json in container has appropriate permissions (should be owned by root, read-only)
  2. Validation: The current filtering logic is good but consider:
    • URL validation (prevent localhost/internal IPs if needed)
    • Protocol validation (enforce HTTPS for http type servers?)
    • Rate limiting for MCP server calls
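The URL and protocol checks listed above could look roughly like the following. This is a minimal sketch and not code from the PR: `is_allowed_mcp_url` and its `require_https` flag are hypothetical names, and it only inspects the URL string itself (it does not resolve hostnames, so a DNS name that points at an internal address would still pass).

```python
import ipaddress
from urllib.parse import urlparse


def is_allowed_mcp_url(url: str, require_https: bool = True) -> bool:
    """Hypothetical helper: reject non-HTTP(S), localhost, and internal-IP URLs."""
    parsed = urlparse(url)
    allowed_schemes = {"https"} if require_https else {"http", "https"}
    if parsed.scheme not in allowed_schemes or not parsed.hostname:
        return False
    host = parsed.hostname  # already lower-cased by urlparse
    if host == "localhost" or host.endswith(".localhost"):
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. Checking what the name resolves to is out of scope here.
        return True
    return not (addr.is_private or addr.is_loopback or addr.is_link_local)
```

A filter such as `_filter_mcp_servers()` could call a helper like this per server entry, if the project decides that internal endpoints should never be reachable through MCP.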

📋 Testing Checklist

Before merging, verify:

  • Container builds successfully with new .mcp.json file
  • Runner starts without errors when .mcp.json is empty
  • Logs show "No .mcp.json file found in runner directory" OR "No valid MCP servers found after filtering" (expected with empty config)
  • No errors when MCP servers ARE configured in .mcp.json
  • User workspace .mcp.json files are ignored (if present)

🎨 Python Code Quality

✅ Follows Project Standards

  • Uses dict | None return type (modern Python 3.10+ union syntax)
  • Proper error handling with try/except
  • Good logging practices
  • Maintains existing code style

Minor Suggestions

  1. The return value of _filter_mcp_servers is already type-hinted, so no change is needed there:

    def _filter_mcp_servers(self, servers: dict) -> dict:  # ✓ Already typed
  2. The variable naming is clear and consistent

📊 Overall Assessment

Recommendation: ✅ APPROVE with minor suggestions

This is a well-executed security improvement that simplifies the codebase. The core logic is sound, and the changes align with security best practices by centralizing MCP server configuration.

Priority: Address test coverage before merging (medium priority)
Optional: Documentation and parameter cleanup can be done in follow-up PRs


Reviewed using project standards from CLAUDE.md:

  • Python formatting standards: ✅ (black-compatible, proper typing)
  • Error handling patterns: ✅ (explicit error returns, proper logging)
  • Security considerations: ✅ (improved security posture)
  • Container best practices: ✅ (file placement aligns with Dockerfile)


@github-actions
Contributor

Claude Code Review

Summary

This PR refactors MCP (Model Context Protocol) configuration loading to centralize it in the runner's own directory (/app/claude-runner/.mcp.json) instead of searching multiple locations in user workspaces. This is a security-focused change that prevents users from potentially loading arbitrary MCP servers from their repositories. However, the PR contains a critical breaking bug that will cause runtime failures.

Overall Assessment: ⚠️ BLOCK MERGE - Contains a blocker issue that must be fixed.


Issues by Severity

🚫 Blocker Issues

1. Function signature mismatch causing runtime error

Location: wrapper.py:328 and wrapper.py:1815

Issue: The function signature was changed to remove the cwd_path parameter:

# New signature (line 1815)
def _load_mcp_config(self) -> dict | None:

# But the call site still passes the parameter (line 328)
mcp_servers = self._load_mcp_config(cwd_path)  # ❌ BREAKS

Impact: This will cause a TypeError: _load_mcp_config() takes 1 positional argument but 2 were given at runtime, breaking all session executions.

Fix Required:

# Line 328 should be changed to:
mcp_servers = self._load_mcp_config()
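The failure mode can be reproduced in isolation. The sketch below uses a hypothetical `Runner` class as a stand-in for the real wrapper class (whose name is not shown in this review); only the parameterless method signature is taken from the PR.

```python
from typing import Optional


class Runner:
    """Hypothetical stand-in for the wrapper class; not the project's code."""

    def _load_mcp_config(self) -> Optional[dict]:
        # New signature: no cwd_path parameter.
        return None


def call_with_stale_argument() -> str:
    """Call the method the way the old call site does and report the outcome."""
    runner = Runner()
    try:
        runner._load_mcp_config("/workspace")  # stale call site still passes a path
    except TypeError as exc:
        return str(exc)
    return "no error"


if __name__ == "__main__":
    print(call_with_stale_argument())
```

Because Python only checks argument counts when the call executes, this kind of mismatch surfaces at session start rather than at import time, which is why a smoke test of the call path is worth running before merge.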

🟡 Major Issues

2. Hardcoded container path reduces flexibility

Location: wrapper.py:1831

Issue: The MCP config path is hardcoded to /app/claude-runner/.mcp.json:

runner_mcp_file = Path("/app/claude-runner/.mcp.json")

Concerns:

  • No way to override for testing or development
  • Breaks if Dockerfile working directory changes
  • Makes unit testing difficult without filesystem mocking

Recommendation: Use a configurable path with a sensible default:

import os
from pathlib import Path

# At class level or module level
DEFAULT_MCP_CONFIG_PATH = Path("/app/claude-runner/.mcp.json")

# In method: fall back to the default when the variable is unset
mcp_config_path = os.getenv('RUNNER_MCP_CONFIG_PATH', str(DEFAULT_MCP_CONFIG_PATH))
runner_mcp_file = Path(mcp_config_path)

This maintains security (no user-controlled paths) while improving testability.

3. Missing documentation updates

Issue: The PR removes significant functionality (environment variable controls, multi-location search) but doesn't update:

  • No README changes explaining the new simplified behavior
  • No migration guide for users who may have been using MCP_CONFIG_PATH or MCP_CONFIG_SEARCH
  • No explanation of why this security change was necessary

Impact: Operators and developers won't understand:

  • Why their environment variables no longer work
  • Where to place MCP server configurations
  • Security implications of the change

Recommendation: Add documentation covering:

### MCP Server Configuration (v2.0+)

**Security Change:** MCP servers are now only loaded from the runner's 
centrally-controlled configuration file, not from user repositories.

**Location:** `/app/claude-runner/.mcp.json` (inside container)

**Removed Features:**
- ❌ `MCP_CONFIG_PATH` environment variable (security risk)
- ❌ `MCP_CONFIG_SEARCH` toggle (no longer needed)
- ❌ Workspace/repo .mcp.json discovery (security risk)

**Rationale:** Prevents users from loading arbitrary MCP servers that could
bypass security controls or access unauthorized resources.

🔵 Minor Issues

4. Logging could be more informative

Location: wrapper.py:1845

Issue: When no MCP config is found, the log just says:

logging.info("No .mcp.json file found in runner directory")

Suggestion: Add the expected path for easier debugging:

logging.info(f"No .mcp.json file found at {runner_mcp_file}")

5. Empty .mcp.json file lacks metadata

Location: .mcp.json

Issue: The file is valid but minimal:

{
  "mcpServers": {}
}

Suggestion: Add a comment header (JSON doesn't support comments, but consider adding a _comment field):

{
  "_comment": "MCP server configuration for vTeam runner. Only 'http' and 'sse' types allowed. See docs/mcp-servers.md for examples.",
  "mcpServers": {}
}
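A quick way to see that an extra top-level key is harmless to a loader that only reads `mcpServers` is sketched below. `read_servers` is a hypothetical helper, and whether every consumer of `.mcp.json` tolerates unknown keys is an assumption that should be checked against the actual tooling.

```python
import json

# Sample file content with the suggested metadata field.
RAW = """
{
  "_comment": "Only 'http' and 'sse' types allowed.",
  "mcpServers": {}
}
"""


def read_servers(raw: str) -> dict:
    """Hypothetical reader: return the mcpServers mapping, ignoring other keys."""
    config = json.loads(raw)
    servers = config.get("mcpServers", {})
    return servers if isinstance(servers, dict) else {}
```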

6. Missing test coverage

Issue: No tests were added to verify:

  • The new simplified loading behavior
  • Error handling for malformed .mcp.json
  • Confirmation that user workspace .mcp.json files are ignored
  • The security boundary works as intended

Recommendation: Add unit tests:

def test_load_mcp_config_from_runner_dir():
    # Test successful load from /app/claude-runner/.mcp.json
    pass

def test_load_mcp_config_ignores_workspace():
    # Test that workspace .mcp.json is NOT loaded
    pass

def test_load_mcp_config_handles_missing_file():
    # Test graceful handling when file doesn't exist
    pass
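The three stubs above could be filled in along these lines. This is a sketch under stated assumptions: `load_mcp_config` is a simplified, hypothetical stand-in for `_load_mcp_config` that takes the runner file path as an argument so the test can point it at a temporary directory; the real method's internals are not shown in this review.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def load_mcp_config(runner_file: Path, cwd_path: str) -> Optional[dict]:
    """Hypothetical loader: reads only runner_file; cwd_path is ignored."""
    if not runner_file.exists():
        return None
    try:
        config = json.loads(runner_file.read_text())
    except (json.JSONDecodeError, OSError):
        return None
    if not isinstance(config, dict):
        return None
    servers = config.get("mcpServers")
    return servers if isinstance(servers, dict) and servers else None


def run_checks() -> dict:
    """Exercise the missing-file, malformed-file, and valid-file cases."""
    with tempfile.TemporaryDirectory() as tmp:
        workspace = Path(tmp) / "workspace"
        runner = Path(tmp) / "runner"
        workspace.mkdir()
        runner.mkdir()
        # A workspace config that must never be picked up.
        (workspace / ".mcp.json").write_text(json.dumps(
            {"mcpServers": {"evil": {"type": "http", "url": "https://evil.example.com"}}}
        ))
        runner_file = runner / ".mcp.json"

        missing = load_mcp_config(runner_file, str(workspace))
        runner_file.write_text("{not json")
        malformed = load_mcp_config(runner_file, str(workspace))
        runner_file.write_text(json.dumps(
            {"mcpServers": {"ok": {"type": "sse", "url": "https://ok.example.com/sse"}}}
        ))
        loaded = load_mcp_config(runner_file, str(workspace))
    return {"missing": missing, "malformed": malformed, "loaded": loaded}
```

The key assertion for the security boundary is that the `evil` entry from the workspace file never appears in any result, even when the runner file is absent.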

Positive Highlights

Excellent security improvement - Centralizing MCP config prevents users from potentially injecting malicious or unintended MCP servers via their repositories.

Simpler mental model - Reducing from 3 search locations to 1 makes the system more predictable and easier to reason about.

Good error handling preservation - The try/except blocks for JSON parsing and file errors were retained properly.

Consistent filtering - The _filter_mcp_servers method is still used, maintaining the "http" and "sse" type restrictions.

Clear intent in docstring - The updated docstring explicitly states the security reasoning.


Recommendations

Priority Order:

  1. CRITICAL - Fix the function call (line 328): Remove the cwd_path argument
  2. HIGH - Add tests: Verify the security boundary and error handling
  3. HIGH - Document the change: Update README with migration notes
  4. MEDIUM - Make path configurable: Use environment variable with default
  5. LOW - Improve logging: Include full path in log messages
  6. LOW - Add metadata to .mcp.json: Help future maintainers

Before merging:

  • Fix the TypeError at line 328
  • Test manually that sessions still start without errors
  • Verify MCP servers can still be loaded from the correct location
  • Consider if any existing deployments are using the old environment variables

Security Impact: ✅ Positive - Reduces attack surface by preventing user-controlled MCP server injection.

Breaking Change: ⚠️ Yes - Removes support for MCP_CONFIG_PATH and MCP_CONFIG_SEARCH environment variables. Deployments using these will need updates.

Backward Compatibility: ❌ Not maintained for environment variables, but most deployments likely weren't using them.


Estimated Review Time: 15 minutes
Estimated Fix Time: 5 minutes (for blocker), 30-60 minutes (for full recommendations)

cc @MichaelClifford - Great security-focused refactoring! Just need to fix that one function call and this will be ready to merge.

Signed-off-by: Michael Clifford <mcliffor@redhat.com>
@github-actions
Contributor

Claude Code Review

Summary

This PR centralizes MCP (Model Context Protocol) server configuration to a single, platform-controlled location within the runner container. The change improves security significantly by removing the ability for user repositories to inject arbitrary MCP server configurations. The refactoring reduces code complexity (-51 lines, -70%) while maintaining backward compatibility and proper error handling.

Overall Assessment: ✅ APPROVE - Well-executed security improvement with minor documentation gaps.


Issues by Severity

🟡 Major Issues

1. Missing Test Coverage for Security-Critical Code

Location: components/runners/claude-code-runner/wrapper.py:1815-1853

Issue: No unit tests found for the MCP configuration loading and filtering logic. This is security-critical code that:

  • Filters MCP server types (only allows http/sse)
  • Validates configuration structure
  • Determines what external services the runner can access

Impact:

  • Changes to MCP loading logic have no automated validation
  • Security regressions could be introduced undetected
  • Difficult to verify the security boundary works as intended

Recommendation: Add comprehensive test coverage in tests/test_wrapper_mcp.py:

def test_load_mcp_config_success():
    """Test successful load from runner directory"""
    # Mock file exists with valid config
    # Verify returns filtered servers

def test_load_mcp_config_missing_file():
    """Test graceful handling when .mcp.json doesn't exist"""
    # Verify returns None, logs info message

def test_load_mcp_config_invalid_json():
    """Test error handling for malformed JSON"""
    # Verify returns None, logs error

def test_filter_mcp_servers_allows_http_sse():
    """Test only http and sse server types are allowed"""
    # Verify stdio, other types are filtered out

def test_filter_mcp_servers_requires_url():
    """Test servers without URL are rejected"""

def test_workspace_mcp_ignored():
    """Security: Verify user workspace .mcp.json files are NOT loaded"""
    # This is the key security boundary test

Priority: HIGH - Should be added before or immediately after merge.
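For the two filtering tests, a concrete version might look like the sketch below. `filter_mcp_servers` is a hypothetical re-implementation written only so the tests are runnable on their own; the real `_filter_mcp_servers` may differ in details such as logging or how it treats malformed entries.

```python
from typing import Any, Dict

ALLOWED_TYPES = {"http", "sse"}


def filter_mcp_servers(servers: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical filter: keep only http/sse entries that carry a non-empty URL."""
    allowed: Dict[str, Any] = {}
    for name, cfg in servers.items():
        if not isinstance(cfg, dict):
            continue
        url = cfg.get("url")
        if cfg.get("type") in ALLOWED_TYPES and isinstance(url, str) and url:
            allowed[name] = cfg
    return allowed


def test_filter_allows_http_sse_only() -> None:
    servers = {
        "a": {"type": "http", "url": "https://a.example.com"},
        "b": {"type": "sse", "url": "https://b.example.com/sse"},
        "c": {"type": "stdio", "command": "run-something"},
    }
    assert sorted(filter_mcp_servers(servers)) == ["a", "b"]


def test_filter_requires_url() -> None:
    servers = {"no_url": {"type": "http"}, "empty_url": {"type": "sse", "url": ""}}
    assert filter_mcp_servers(servers) == {}
```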


2. No Documentation for MCP Configuration

Location: components/runners/claude-code-runner/

Issue: No documentation explaining:

  • Where the MCP configuration file is located
  • How to add/modify MCP servers (requires rebuilding container)
  • Security model and why workspace configs are not allowed
  • Valid MCP server types and example configurations
  • Migration notes for removed environment variables

Impact:

  • Operators don't know how to configure MCP servers
  • Users may not understand why their workspace .mcp.json files are ignored
  • Breaking change (removed MCP_CONFIG_PATH, MCP_CONFIG_SEARCH) is undocumented

Recommendation: Add documentation to components/runners/claude-code-runner/README.md:

## MCP Server Configuration

### Overview
MCP (Model Context Protocol) servers are configured via a centrally-controlled
configuration file in the runner container. User repositories cannot load MCP
servers for security reasons.

### Configuration File Location
- **Container path:** `/app/claude-runner/.mcp.json`
- **Source path:** `components/runners/claude-code-runner/.mcp.json`

### Security Model
- ✅ MCP servers loaded ONLY from runner's .mcp.json
- ❌ User workspace .mcp.json files are IGNORED
- ✅ Only 'http' and 'sse' server types allowed
- ❌ 'stdio' and other types are filtered out

### Adding MCP Servers
To add MCP servers:
1. Edit `components/runners/claude-code-runner/.mcp.json`
2. Rebuild the container image: `make build-runner`
3. Redeploy with new image

Example configuration:
```json
{
  "mcpServers": {
    "weather": {
      "type": "sse",
      "url": "https://weather-mcp.example.com/sse"
    }
  }
}
```

### Breaking Changes (v2.0+)
- ❌ `MCP_CONFIG_PATH` environment variable removed
- ❌ `MCP_CONFIG_SEARCH` environment variable removed
- ❌ Workspace/repo .mcp.json discovery removed

**Rationale:** Prevents users from loading arbitrary MCP servers that could
bypass security controls or access unauthorized resources.

Priority: MEDIUM - Important for operators and maintainers.


🔵 Minor Issues

3. Unused Parameter Creates Misleading API

Location: wrapper.py:1815

Issue: The cwd_path parameter is accepted but never used:

def _load_mcp_config(self, cwd_path: str) -> dict | None:
    # cwd_path is never referenced in the method body
    runner_mcp_file = Path("/app/claude-runner/.mcp.json")  # Hardcoded

Current approach is acceptable (maintains backward compatibility), but could be improved with a docstring note:

def _load_mcp_config(self, cwd_path: str) -> dict | None:
    """Load MCP server configuration from the vTeam runner's .mcp.json file.
    
    Args:
        cwd_path: Unused. Kept for backward compatibility with calling code.
    
    ...
    """

Priority: LOW - Current approach is fine.


4. Hardcoded Path Reduces Testability

Location: wrapper.py:1831

Issue: The path /app/claude-runner/.mcp.json is hardcoded, making unit testing require filesystem mocking.

Suggestion: Consider making the path configurable via environment variable (with secure default):

import os
from pathlib import Path

DEFAULT_MCP_CONFIG = "/app/claude-runner/.mcp.json"
runner_mcp_file = Path(os.getenv('RUNNER_MCP_CONFIG_PATH', DEFAULT_MCP_CONFIG))

Tradeoff:

  • ✅ Easier to test
  • ✅ More flexible for development
  • ⚠️ Adds configuration surface area

Current approach is acceptable for production use. Only change if testing becomes painful.

Priority: LOW - Nice-to-have.


5. Logging Could Include Full Path

Location: wrapper.py:1845

Current:

logging.info("No .mcp.json file found in runner directory")

Suggested:

logging.info(f"No .mcp.json file found at {runner_mcp_file}")

Benefit: Easier debugging, especially if path becomes configurable in future.

Priority: LOW


Positive Highlights

🎯 Security Excellence

  1. Eliminates Injection Vector: Users can no longer inject arbitrary MCP servers via repository files - this is a significant security win
  2. Defense in Depth: Maintains the _filter_mcp_servers() validation layer (http/sse types only)
  3. Principle of Least Privilege: Platform operators control MCP server access, not users
  4. Clear Security Boundary: Hardcoded path eliminates path traversal concerns

💎 Code Quality

  1. Dramatic Simplification: Reduces code from ~70 lines to ~25 lines (-51 lines, -70%)
  2. Improved Readability: Single path to check vs. 3-location search logic
  3. Maintains Error Handling: Proper try/except for JSON parsing and file I/O
  4. Backward Compatible: Method signature unchanged; calling code at line 328 works without modification
  5. Clear Documentation: Updated docstring explicitly explains security rationale
  6. Python Best Practices: Uses modern type hints (dict | None), structured logging

🏗️ Architecture Alignment

  1. Correct Container Integration: File placement aligns with Dockerfile (line 21 copies to /app/claude-runner/)
  2. Safe Default: Empty mcpServers: {} is sensible and won't break startup
  3. Follows CLAUDE.md Standards:
    • ✅ Explicit error returns with context
    • ✅ Proper logging before returning errors
    • ✅ No panic() calls
    • ✅ Clean code principles

Recommendations

Before Merge

  • OPTIONAL: Add basic test coverage for _load_mcp_config and _filter_mcp_servers

After Merge (Follow-up PRs)

  1. HIGH: Comprehensive test suite for MCP configuration (security-critical)
  2. MEDIUM: Documentation in README explaining MCP configuration
  3. LOW: Consider making path configurable for testing
  4. LOW: Improve logging to include full paths

Manual Testing Checklist

Before deploying to production:

  • Container builds successfully with .mcp.json file
  • Runner starts without errors when mcpServers is empty
  • Logs show "No valid MCP servers found after filtering" (expected with empty config)
  • When MCP servers ARE added to .mcp.json, they load correctly
  • User workspace .mcp.json files are ignored (if present)

Security Analysis

✅ Security Improvements

| Aspect | Before | After | Impact |
|---|---|---|---|
| User Control | Users could load MCP servers from repos | Platform controls all MCP servers | 🔒 Removes attack vector |
| Configuration Sources | 3 locations (env var, workspace, repo) | 1 location (runner dir) | 🔒 Reduces attack surface |
| Path Traversal Risk | Environment variable could point anywhere | Hardcoded path only | 🔒 Eliminates risk |
| Server Type Filtering | http/sse only | http/sse only | ✅ Maintained |

🔍 Remaining Security Considerations

  1. URL Validation: Consider validating MCP server URLs to prevent:
    • Localhost/internal IP access (SSRF)
    • Non-HTTPS URLs for http-type servers
  2. Rate Limiting: Consider rate limiting MCP server calls
  3. File Permissions: Ensure .mcp.json in container is read-only
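The file-permission point can be verified with a small startup or CI check. The sketch below is illustrative only: `is_read_only` is a hypothetical helper that inspects the POSIX mode bits, and it does not check ownership (a root-owned check would additionally compare `st_uid` to 0).

```python
import os
import stat


def is_read_only(path: str) -> bool:
    """Return True if no write bit is set for owner, group, or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH) == 0
```

In the container this would be called with the runner's config path, for example `is_read_only("/app/claude-runner/.mcp.json")`, and a `False` result logged as a warning.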

Breaking Changes

⚠️ This PR contains breaking changes:

| Removed Feature | Impact | Mitigation |
|---|---|---|
| MCP_CONFIG_PATH env var | Deployments using this will break | Rebuild container with .mcp.json |
| MCP_CONFIG_SEARCH env var | Search toggle no longer functional | Remove from deployment configs |
| Workspace .mcp.json discovery | User repos can't load MCP servers | Expected - security feature |

Assessment: Breaking changes are acceptable and intentional for security reasons. Impact is likely minimal as these features were probably rarely used.


Python Code Quality Assessment

✅ Follows CLAUDE.md Standards

  • Type hints: ✅ Uses dict | None (Python 3.10+ union syntax)
  • Error handling: ✅ Explicit try/except with proper logging
  • Logging: ✅ Structured logs with context
  • No panic: ✅ Returns None on errors, no exceptions raised
  • Code style: ✅ Clean, readable, follows black conventions

✅ Best Practices

  • Single Responsibility: Method does one thing (load MCP config)
  • DRY: Reuses _filter_mcp_servers() method
  • Fail-Safe: Returns None instead of raising exceptions
  • Logging: Info for normal flow, error for failures

Overall Assessment

✅ APPROVE

This is a well-executed security improvement that significantly reduces the attack surface of the MCP server loading mechanism. The code quality is high, follows project standards, and maintains backward compatibility.

Strengths:

  • Strong security posture improvement
  • Clean, simple code (-70% lines)
  • Proper error handling and logging
  • Backward compatible API

Gaps (non-blocking):

  • Test coverage (should add)
  • Documentation (should add)
  • Minor code improvements (optional)

Recommendation: Merge this PR and address test coverage + documentation in follow-up PRs. The core security improvement is valuable and should not be delayed.


Review Methodology:

  • Analyzed code changes against CLAUDE.md standards
  • Verified container integration with Dockerfile
  • Checked for security implications
  • Assessed Python code quality
  • Reviewed backward compatibility
  • Considered operational impact

Reviewed by: Claude Code
Review Standards: CLAUDE.md (Python, Security, Error Handling, Container practices)

@MichaelClifford changed the title from "move .mcp.json into runners" to "[WIP] Jira MCP integration" on Nov 18, 2025
@github-actions
Contributor

Claude Code Review

Summary

This PR centralizes MCP (Model Context Protocol) configuration by moving .mcp.json from user workspace repositories into the runner's controlled directory, and implements OAuth callback infrastructure for MCP servers (specifically Atlassian MCP). This is a security-positive change that prevents users from injecting arbitrary MCP servers while enabling OAuth-based authentication flows.

Overall Assessment: Strong architectural improvement with good security considerations. Several critical issues need addressing before merge, primarily around error handling, environment variable substitution validation, and testing coverage.


Issues by Severity

Critical Issues

1. Missing Environment Variable Substitution Implementation
Location: components/runners/claude-code-runner/.mcp.json:7

The .mcp.json file uses template syntax for VTEAM_FRONTEND_URL, but there's no visible code in the diff that performs environment variable substitution. If the Claude Code SDK or MCP client doesn't automatically expand environment variables, the redirect URI will literally be the template string instead of the actual URL, causing the OAuth flow to fail completely.
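If the substitution does have to happen in the runner, one possible shape is sketched below. Several things here are assumptions rather than facts from the diff: the `${VAR}` placeholder syntax, the helper name `expand_env`, and the `redirect_uri` field and URLs in the example config are all hypothetical. The sketch fails loudly on an unset or empty variable so that a missing VTEAM_FRONTEND_URL is caught at load time instead of during the OAuth flow.

```python
import re
from typing import Any, Mapping

# Assumed placeholder syntax: ${NAME}
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")

# Hypothetical config shape, for illustration only.
EXAMPLE_CONFIG = {
    "mcpServers": {
        "jira": {
            "type": "http",
            "url": "https://mcp.example.com",
            "redirect_uri": "${VTEAM_FRONTEND_URL}/oauth/callback",
        }
    }
}


def expand_env(value: Any, env: Mapping[str, str]) -> Any:
    """Recursively replace ${VAR} in strings; raise if VAR is unset or empty."""
    if isinstance(value, str):
        def _sub(match: "re.Match[str]") -> str:
            name = match.group(1)
            if not env.get(name):
                raise ValueError(f"environment variable {name} is not set")
            return env[name]
        return _PLACEHOLDER.sub(_sub, value)
    if isinstance(value, dict):
        return {key: expand_env(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env(item, env) for item in value]
    return value
```

In the runner this would be called as `expand_env(config, os.environ)` right after the JSON is parsed and before filtering.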

2. No Validation of VTEAM_FRONTEND_URL
Location: components/operator/internal/handlers/sessions.go:483

The operator blindly passes the result of config.DiscoverFrontendURL() (which can return an empty string) into the runner pod environment without validation. If Route/Ingress discovery fails, users will get cryptic OAuth errors with no clear indication why.

3. Insecure postMessage targetOrigin
Location: components/frontend/src/app/oauth/mcp/complete/page.tsx:47,63

Using window.location.origin as the targetOrigin in postMessage creates a potential security vulnerability. The correct approach is to send to a fixed, known origin or validate the origin matches expected patterns.

Major Issues

4. No Error Handling for DiscoverFrontendURL
The function doesn't distinguish between different failure modes (resource doesn't exist, permission denied, malformed spec, API error), making diagnosis difficult.

5. Missing Type Safety: NextRequest Param Validation
The OAuth callback endpoint extracts parameters but doesn't validate them according to the OAuth 2.0 spec. Recommendation: Use zod schemas for validation.

6. Race Condition in Auto-Close Logic
The window auto-closes after 2 seconds without confirming the message was received. If postMessage throws, the timeout is already scheduled.

7. Operator Package Standards Violation: Missing Error Context
The DiscoverFrontendURL function silently swallows errors from DynamicClient.Resource().Get(), violating CLAUDE.md standards for error handling.

Minor Issues

8. Missing Loading/Error States - New route missing loading.tsx and error.tsx files (required by DESIGN_GUIDELINES.md)

9. Hardcoded Route Names - Should be configurable via environment variable

10. Inconsistent Error Message Casing - Mix of sentence case and OAuth standard snake_case

11. Missing JSDoc for Public Component - Component has comment but not proper JSDoc format

12. No Telemetry/Metrics - No tracking of OAuth flow success/failure rates


Positive Highlights

Excellent Security Improvement: Moving MCP config from user-controlled workspace to runner directory prevents arbitrary MCP server injection

Proper Type Safety in React: Using proper TypeScript union types for status

Good Separation of Concerns: OAuth callback endpoint just redirects to completion page

Follows Shadcn UI Patterns: Uses proper Shadcn components

Type-Safe Unstructured Access: Operator code correctly uses unstructured.NestedMap/NestedSlice helpers

Graceful Degradation: Warning logs instead of failing when frontend URL can't be discovered

Simplified MCP Loading: Refactored _load_mcp_config in wrapper.py is much cleaner - removed 51 lines, added 16


Recommendations

Before Merge (Priority Order):

  1. BLOCKER: Implement and test environment variable substitution in .mcp.json
  2. BLOCKER: Validate VTEAM_FRONTEND_URL is non-empty before creating session pods
  3. HIGH: Fix postMessage targetOrigin security issue
  4. HIGH: Add structured error handling to DiscoverFrontendURL
  5. HIGH: Fix auto-close race condition in completion page
  6. MEDIUM: Add input validation to OAuth callback endpoint (zod schemas)
  7. MEDIUM: Add unit tests for environment variable discovery logic

Testing Requirements:

Unit Tests Needed:

  • wrapper.py: Test _load_mcp_config with environment variable substitution
  • config.go: Test DiscoverFrontendURL with various Route/Ingress configurations
  • Frontend: Test OAuth callback parameter validation

Integration Tests Needed:

  • End-to-end OAuth flow (mock Atlassian OAuth)
  • Frontend URL discovery in different Kubernetes/OpenShift configurations
  • Error scenarios (missing Route, permission denied, malformed specs)

Code Quality Summary

Go Code: ⭐⭐⭐⭐ (4/5) - Follows operator patterns, minor error handling improvements needed

TypeScript/React: ⭐⭐⭐ (3/5) - Good Shadcn usage, needs input validation and security fixes

Python: ⭐⭐⭐⭐⭐ (5/5) - Excellent refactoring, clear documentation

Overall: ⭐⭐⭐⭐ (4/5) - Good work, address critical issues before merge.


Summary: This PR makes meaningful security and architectural improvements. Main concerns are around environment variable substitution validation, error handling visibility, and OAuth security best practices. Great refactoring of the MCP config loading logic!

@Gkrumbach07
Collaborator

Tracked in Jira: https://issues.redhat.com/browse/RHOAIENG-39130

Signed-off-by: Michael Clifford <mcliffor@redhat.com>
@github-actions
Contributor

Claude Code Review

Summary

This PR implements MCP (Model Context Protocol) OAuth integration for Jira and other MCP servers. The implementation adds OAuth callback handling in the frontend and automatic frontend URL discovery in the operator. The changes follow a secure, centralized configuration approach for MCP servers.

Overall Assessment: The implementation is well-designed with good security practices. However, there are several critical security and error handling issues that must be addressed before merging, particularly around XSS vulnerabilities and missing input validation.


Issues by Severity

🚫 Blocker Issues

1. XSS Vulnerability in OAuth Callback Route (components/frontend/src/app/api/oauth/mcp/callback/route.ts)

Location: route.ts:10-13, 19-30

The OAuth callback route directly passes user-controlled query parameters (error, error_description, code, state) to URL search params without any validation or sanitization. This creates an XSS vulnerability where attackers could inject malicious scripts.

// VULNERABLE CODE
const error = request.nextUrl.searchParams.get('error')
const errorDescription = request.nextUrl.searchParams.get('error_description')
completionUrl.searchParams.set('error', error)
completionUrl.searchParams.set('error_description', errorDescription)

Fix Required:

  • Validate that error matches expected OAuth error codes (e.g., whitelist: access_denied, invalid_request, server_error)
  • Sanitize error_description or limit its length (max 200 chars)
  • Validate state matches expected format (alphanumeric, fixed length)
  • Validate code format (alphanumeric, expected length range)
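The validation rules above are language-neutral. The route itself is TypeScript, but the logic is sketched here in Python to match the runner-side examples in this thread. The helper name, the 200-character limit applied via slicing, and the permitted character set and length for `code` and `state` are assumptions for illustration; the error-code list is the set defined for the authorization endpoint in RFC 6749, section 4.1.2.1.

```python
import re
from typing import Dict, Optional

# Authorization-endpoint error codes from RFC 6749, section 4.1.2.1.
OAUTH_ERROR_CODES = frozenset({
    "invalid_request",
    "unauthorized_client",
    "access_denied",
    "unsupported_response_type",
    "invalid_scope",
    "server_error",
    "temporarily_unavailable",
})
MAX_DESCRIPTION_LENGTH = 200
# Assumed format: URL-safe characters only, bounded length.
_TOKEN = re.compile(r"[A-Za-z0-9._~-]{1,512}")


def sanitize_callback_params(params: Dict[str, Optional[str]]) -> Dict[str, str]:
    """Hypothetical validator: return only parameters that passed the checks."""
    error = params.get("error")
    if error is not None:
        description = params.get("error_description") or ""
        return {
            # Unknown error codes collapse to a generic one.
            "error": error if error in OAUTH_ERROR_CODES else "invalid_request",
            "error_description": description[:MAX_DESCRIPTION_LENGTH],
        }
    code, state = params.get("code"), params.get("state")
    if not (code and state and _TOKEN.fullmatch(code) and _TOKEN.fullmatch(state)):
        return {"error": "invalid_request", "error_description": "Malformed code or state"}
    return {"code": code, "state": state}
```

The same structure maps directly onto a zod schema in the route handler: an enum for `error`, a max-length string for `error_description`, and regex-constrained strings for `code` and `state`.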

2. XSS Vulnerability in Complete Page (components/frontend/src/app/oauth/mcp/complete/page.tsx)

Location: page.tsx:26, 121-122

User-controlled error parameters are directly rendered in the DOM without sanitization:

// VULNERABLE CODE
setErrorMessage(errorDescription || error)
// ...later rendered as:
<p className="text-sm text-red-600 font-medium">{errorMessage}</p>

Fix Required:

  • Sanitize error messages before rendering
  • Use a whitelist of allowed error messages
  • Escape HTML entities
  • Limit error message length

3. Missing postMessage Origin Validation (components/frontend/src/app/oauth/mcp/complete/page.tsx)

Location: page.tsx:47, 63

The code uses window.location.origin as the targetOrigin for postMessage, but doesn't validate the origin of the receiving window. An attacker could open this page and receive sensitive OAuth codes.

// INSECURE
window.opener.postMessage(messageData, window.location.origin)

Fix Required:

  • Define an explicit, hardcoded allowed origin list for postMessage
  • Validate the opener/parent origin before sending messages
  • Consider using a state parameter to verify the request authenticity
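
A minimal sketch of the explicit-origin approach is shown below. The origin `https://vteam.example.com`, the `MessageTarget` interface, and the helper names are placeholders invented for illustration; a real deployment would supply its own frontend origins from configuration.

```typescript
// Hypothetical allowlist; replace with the deployment's real frontend origins.
const ALLOWED_ORIGINS: readonly string[] = ["https://vteam.example.com"];

// Returns the origin to pass as postMessage's targetOrigin, or null when the
// configured value is not explicitly allowed. The wildcard "*" is never accepted.
function resolveTargetOrigin(
  configuredOrigin: string,
  allowed: readonly string[] = ALLOWED_ORIGINS,
): string | null {
  if (configuredOrigin === "*") return null;
  return allowed.indexOf(configuredOrigin) !== -1 ? configuredOrigin : null;
}

// Minimal shape of window.opener / window.parent needed here.
interface MessageTarget {
  postMessage(message: unknown, targetOrigin: string): void;
}

// Sends the message only when both the target window and an allowed origin
// are available; returns whether a message was sent.
function notifyOpener(
  opener: MessageTarget | null,
  message: unknown,
  configuredOrigin: string,
): boolean {
  const targetOrigin = resolveTargetOrigin(configuredOrigin);
  if (opener === null || targetOrigin === null) return false;
  opener.postMessage(message, targetOrigin);
  return true;
}
```

The browser only delivers a posted message when the receiving window's origin matches `targetOrigin`, so pinning that value to an allowlisted origin is what keeps the authorization code away from unexpected openers. On the completion page this might be called as `notifyOpener(window.opener, messageData, expectedOrigin)`, where `expectedOrigin` is a hypothetical configuration value.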

🔴 Critical Issues

4. Missing Error Handling in DiscoverFrontendURL (components/operator/internal/config/config.go)

Location: config.go:80, 103

The function silently ignores errors from Get() calls. If there's a permissions issue or API problem, the operator won't know why MCP OAuth isn't working.

// Missing error logging
route, err := DynamicClient.Resource(routeGVR).Namespace(namespace).Get(ctx, name, metav1.GetOptions{})
if err == nil {  // Only handles success case
    // ...
}

Fix Required:

  • Log errors (except IsNotFound) at appropriate levels
  • Differentiate between "not found" (info) and "permission denied" (warning)
  • Return error details for debugging

5. No Timeout on Context (components/operator/internal/config/config.go)

Location: config.go:71

Using context.TODO() without a timeout could cause the operator to hang if the API server is slow.

ctx := context.TODO()  // No timeout

Fix Required:

ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()

6. Hardcoded Resource Names (components/operator/internal/config/config.go)

Location: config.go:80, 103

The function assumes the Route/Ingress is named "frontend" with no way to configure it. This breaks if deployments use different naming conventions.

route, err := DynamicClient.Resource(routeGVR).Namespace(namespace).Get(ctx, "frontend", metav1.GetOptions{})

Fix Required:

  • Make resource name configurable via environment variable (e.g., FRONTEND_ROUTE_NAME)
  • Default to "frontend" for backwards compatibility
  • Document the configuration option

7. Type Safety Issues with Unstructured Access (components/operator/internal/config/config.go)

Location: config.go:82, 86, 106, 110

Multiple unstructured field accesses discard the error return value, violating CLAUDE.md guidelines:

// VIOLATES CLAUDE.md Rule 4
if spec, found, _ := unstructured.NestedMap(route.Object, "spec"); found {
    if host, ok := spec["host"].(string); ok && host != "" {  // found and ok are checked, but the error is discarded

Fix Required:

spec, found, err := unstructured.NestedMap(route.Object, "spec")
if !found || err != nil {
    log.Printf("Failed to get spec from Route: err=%v", err)
    return ""
}
host, found, err := unstructured.NestedString(spec, "host")
if !found || err != nil || host == "" {
    log.Printf("No host found in Route spec")
    return ""
}

🟡 Major Issues

8. Missing React Query Integration (components/frontend/src/app/oauth/mcp/complete/page.tsx)

Violation: Frontend Design Guidelines - "React Query for ALL Data Operations"

The component uses direct useEffect for what could be a side effect managed by React Query or a custom hook.

Recommendation:

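// Better pattern
import { useEffect, useState } from 'react'
import { useSearchParams } from 'next/navigation'

const useMcpOAuthComplete = () => {
  const searchParams = useSearchParams()
  const [status, setStatus] = useState<'processing' | 'success' | 'error'>('processing')
  const [errorMessage, setErrorMessage] = useState<string | null>(null)

  useEffect(() => {
    // OAuth completion logic: call setStatus / setErrorMessage here
  }, [searchParams])

  return { status, errorMessage }
}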

While not strictly data fetching, consider extracting the OAuth completion logic into a custom hook for better testability.

9. No Logging or Metrics in Route Handler (components/frontend/src/app/api/oauth/mcp/callback/route.ts)

Location: route.ts:9

The route handler doesn't log errors or track metrics. If OAuth fails, debugging will be difficult.

Recommendation:

  • Add server-side logging for all OAuth callbacks
  • Log sanitized parameters (code length, state presence, error types)
  • Consider adding metrics/telemetry
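
As a rough illustration of logging sanitized parameters, the sketch below builds a log entry that records only the shape of the callback and never the secret values themselves. The `summarizeCallback` name and the field names are assumptions made for this example.

```typescript
// Hypothetical log entry: describes the callback without including secrets.
interface CallbackLogEntry {
  hasCode: boolean;
  codeLength: number;
  hasState: boolean;
  errorType: string | null;
}

function summarizeCallback(
  code: string | null,
  state: string | null,
  error: string | null,
): CallbackLogEntry {
  return {
    hasCode: code !== null,
    codeLength: code === null ? 0 : code.length,
    hasState: state !== null,
    // Keep only lowercase letters and underscores, capped in length, so a
    // hostile value cannot inject newlines or flood the log.
    errorType:
      error === null ? null : error.replace(/[^a-z_]/g, "").slice(0, 40),
  };
}
```

The route handler could pass the result to whatever server-side logger the frontend already uses, for example as the structured payload of a single "MCP OAuth callback received" log line.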

10. Empty .mcp.json File (components/runners/claude-code-runner/.mcp.json)

Location: .mcp.json:2-3

The file is empty with no documentation or examples. The PR description mentions "Jira MCP integration" but there's no Jira configuration.

Recommendation:

  • Add commented examples of MCP server configuration
  • Document the expected structure
  • Include example for Jira MCP server
  • Add validation in wrapper.py for required fields

11. No Tests

Missing Coverage:

  • Frontend: No tests for OAuth callback route
  • Frontend: No tests for OAuth complete page (especially postMessage logic)
  • Operator: No tests for DiscoverFrontendURL function
  • Python: No tests for simplified MCP loading logic

Recommendation:
Add unit tests for:

  • OAuth parameter validation and sanitization
  • postMessage origin validation
  • URL discovery with mocked Routes/Ingresses
  • Error scenarios (missing resources, malformed data)

🔵 Minor Issues

12. Inconsistent Error Messages (components/frontend/src/app/oauth/mcp/complete/page.tsx)

Location: page.tsx:32, 57, 67, 74

Error messages have different formats and levels of detail. Standardize for better UX.

Recommendation:

const ERROR_MESSAGES = {
  NO_CODE: 'Authentication failed: No authorization code received',
  NO_PARENT: 'Authentication failed: Unable to communicate with session',
  POST_FAILED: 'Authentication failed: Communication error',
} as const

13. Missing JSDoc/Comments (components/frontend/src/app/api/oauth/mcp/callback/route.ts)

The route handler has good high-level comments but missing details on security considerations and expected query parameters.

Recommendation:
Add JSDoc with:

  • Expected query parameters and their formats
  • Security considerations
  • Error scenarios
  • Example usage

14. Auto-close Timeout is Magic Number (components/frontend/src/app/oauth/mcp/complete/page.tsx)

Location: page.tsx:52

setTimeout(() => { window.close() }, 2000)  // Why 2000?

Recommendation:

const AUTO_CLOSE_DELAY_MS = 2000 // 2 seconds - enough time for user to see success message
setTimeout(() => { window.close() }, AUTO_CLOSE_DELAY_MS)

15. Python Logging Could Be More Structured (components/runners/claude-code-runner/wrapper.py)

Location: wrapper.py:1834, 1840, 1845

The logging is good but could include more structured context for debugging.

Recommendation:

logging.info("Loading MCP config", extra={
    'config_path': str(runner_mcp_file),
    'exists': runner_mcp_file.exists()
})

16. Missing Route Pattern Validation (components/frontend/src/app/api/oauth/mcp/callback/route.ts)

The PR doesn't show the route configuration. Ensure the callback URL pattern is restricted to expected OAuth providers.

Recommendation:
Document allowed callback patterns and ensure rate limiting is in place to prevent abuse.


Positive Highlights

Excellent Security Improvement in MCP Config Loading (wrapper.py:1815-1853)

  • Moved from user-controlled workspace to centralized runner configuration
  • Prevents users from injecting arbitrary MCP servers
  • Clear documentation of security rationale
  • Simplified logic reduces attack surface

Good Separation of Concerns

  • OAuth callback → redirect → completion page flow is clean
  • Operator discovery logic is separate from session handling
  • Frontend URL discovery gracefully falls back between Route and Ingress

Proper Use of Shadcn Components (page.tsx)

  • Uses Card, CardHeader, CardTitle, CardDescription, CardContent
  • Proper icon usage (Loader2, CheckCircle2, XCircle)
  • Follows design system conventions

Good Error UX (page.tsx)

  • Three clear states (processing, success, error)
  • Appropriate visual feedback for each state
  • Auto-close on success reduces user friction

Graceful Degradation (config.go:122-123)

  • Logs warning when frontend URL can't be discovered
  • Returns empty string instead of crashing
  • Allows system to function without MCP OAuth

Environment Variable Integration (sessions.go:483)

  • Properly passes discovered URL to runner pods
  • Follows existing pattern for environment variable injection

Recommendations

High Priority (Before Merge)

  1. Fix all XSS vulnerabilities (Blockers #1, #2)

    • Implement input validation and sanitization
    • Use allowlists for error codes and states
    • Limit string lengths
  2. Fix postMessage security (Blocker #3)

    • Add origin validation
    • Use explicit target origins
    • Verify state parameter
  3. Add error handling and timeouts (Critical #4, #5)

    • Log API errors appropriately
    • Add context timeouts
    • Differentiate error types
  4. Fix type safety issues (Critical #7)

    • Use proper unstructured.Nested* patterns
    • Check found return values
    • Follow CLAUDE.md guidelines

Medium Priority

  1. Add comprehensive tests (Major #11)

    • OAuth flow tests (happy path + error cases)
    • postMessage security tests
    • URL discovery tests with mocked K8s resources
  2. Add .mcp.json documentation (Major #10)

    • Include Jira example (per PR title)
    • Document schema
    • Add validation
  3. Make resource names configurable (Critical #6)

    • Add environment variable for frontend resource name
    • Document configuration options

Low Priority

  1. Improve logging and error messages (Minor #9, #12, #15)
  2. Add JSDoc documentation (Minor #13)
  3. Extract magic numbers (Minor #14)

Testing Checklist

Before merging, verify:

  • XSS protection works with malicious error_description values
  • postMessage only works from same-origin opener/parent
  • OAuth flow works in both popup and iframe scenarios
  • Frontend URL discovery works on OpenShift (Route) and Kubernetes (Ingress)
  • System functions gracefully when frontend URL can't be discovered
  • MCP config only loads from /app/claude-runner/.mcp.json
  • Error states display properly to users

Architecture Concerns

Questions for Review:

  1. How is the OAuth state parameter validated? Is there a mechanism to prevent CSRF attacks where an attacker tricks a user into completing an OAuth flow initiated by the attacker?

  2. The .mcp.json file is currently empty. Should this PR include the actual Jira MCP configuration mentioned in the PR title, or is that planned for a follow-up PR?

  3. Should the VTEAM_FRONTEND_URL environment variable be validated/sanitized before being passed to the runner pod?

  4. Is there rate limiting on the OAuth callback endpoint to prevent abuse?


Estimated Fix Time:

  • Blocker fixes: 2-3 hours
  • Critical fixes: 2-3 hours
  • Tests: 3-4 hours
  • Total: ~7-10 hours

Recommendation: Do not merge until all Blocker and Critical issues are resolved and tests are added.

@bobbravo2
Contributor

@jeremyeder
Collaborator

"Yes, we can close 225 - that was an attempt at getting the jira cloud MCP server working, but got blocked by its need for webflow OAuth. So that will need to be addressed first before we can proceed with this one." mc

@jeremyeder jeremyeder closed this Dec 4, 2025
