feat: add ai_mcp_servers_js for dynamic MCP server selection #314

buger · 2026-02-01T15:30:22Z

Summary

Add ai_mcp_servers_js property to AI checks that allows dynamic computation of MCP servers at runtime
Implement secure sandbox evaluation using @nyariv/sandboxjs
Add environment variable resolution for ${VAR} placeholders in MCP server env configs
Enables conditional loading of expensive MCP servers (Jira, Confluence, Zendesk) based on intent routing tags
Support workflow tool entries in ai_mcp_servers_js (returns {workflow: "workflow-id", ...})
Fix parallel sandbox execution errors by passing log through scope instead of header declaration
Add detailed logging for MCP SSE communication and workflow tool execution
Make worktree IDs deterministic for reuse across multiple calls
Update @probelabs/probe to v0.6.0-rc210
Add enableTasks and MCP method filtering (allowedMethods/blockedMethods) options

Use Case

This feature supports the OEL Bot simplification effort where we want MCP servers to load conditionally based on tags from intent routing:

jira tag → loads Atlassian Jira MCP server
confluence tag → loads Atlassian Confluence MCP server
zendesk tag → loads Zendesk MCP server
No tags → no MCP servers loaded (faster startup, lower resource usage)

Example Usage

ai_mcp_servers_js: |
  const servers = {};
  const tags = outputs['route-intent']?.tags || [];
  if (tags.includes('jira')) {
    servers.jira = {
      command: "npx",
      args: ["-y", "@aashari/mcp-server-atlassian-jira"],
      env: { 
        ATLASSIAN_SITE_NAME: "${ATLASSIAN_SITE_NAME}",
        ATLASSIAN_API_TOKEN: "${ATLASSIAN_API_TOKEN}"
      }
    };
  }
  // Workflow tools can also be added dynamically
  if (tags.includes('code')) {
    servers['code-talk'] = { workflow: 'tyk-code-talk' };
  }
  return servers;

Test plan

Added unit tests for evaluateMcpServersJs() method
Tests verify sandbox evaluation with access to outputs, inputs, pr, files, env, memory
Tests verify environment variable resolution for ${VAR} placeholders
Tests verify parallel sandbox execution doesn't cause "already declared" errors
All 226 test suites pass

🤖 Generated with Claude Code

Add support for dynamically computing MCP servers in AI checks via JavaScript expressions. This enables conditional loading of MCP servers based on routing tags, outputs, or other runtime context. Key changes: - Add ai_mcp_servers_js property to CheckConfig interface - Implement evaluateMcpServersJs() in AiCheckProvider using sandbox evaluation - Add environment variable resolution for ${VAR} placeholders in MCP server env configs, enabling secure credential injection - Update config schema to include the new property - Add comprehensive tests for the feature This feature allows workflows to conditionally load expensive MCP servers (like Jira, Confluence, Zendesk) only when needed based on intent routing tags. Example usage: ```yaml ai_mcp_servers_js: | const servers = {}; const tags = outputs['route-intent']?.tags || []; if (tags.includes('jira')) { servers.jira = { command: "npx", args: ["-y", "@aashari/mcp-server-atlassian-jira"], env: { ATLASSIAN_API_TOKEN: "${ATLASSIAN_API_TOKEN}" } }; } return servers; ``` Co-Authored-By: Claude Opus 4.5 <[email protected]>

probelabs · 2026-02-01T15:32:25Z

PR Overview: Dynamic MCP Server Selection with JavaScript Expressions

Summary

This PR introduces ai_mcp_servers_js and ai_custom_tools_js properties to enable dynamic, runtime computation of MCP servers and custom tools for AI checks. The implementation uses a secure JavaScript sandbox (@nyariv/sandboxjs) to evaluate expressions with access to workflow context (outputs, inputs, PR metadata, files, environment variables, and memory).

Files Changed

Core Implementation:

src/providers/ai-check-provider.ts (+336/-8): Implemented evaluateMcpServersJs() and evaluateCustomToolsJs() methods with sandbox integration, environment variable resolution, and merge logic for dynamic/static configurations
src/types/config.ts (+68/-3): Added TypeScript interface definitions with comprehensive JSDoc examples for ai_mcp_servers_js and ai_custom_tools_js
src/generated/config-schema.ts (+60/-9): Auto-generated schema updated with new properties and detailed descriptions

Supporting Changes:

src/utils/sandbox.ts (+22/-8): Fixed parallel execution issue by passing log through scope instead of header declaration
src/providers/mcp-custom-sse-server.ts (+42/-9): Enhanced logging for MCP SSE communication and workflow tool execution
src/providers/workflow-tool-executor.ts (+11/-0): Added detailed logging for workflow tool outputs
src/utils/workspace-manager.ts (+14/-1): Made worktree IDs deterministic for reuse across multiple calls
src/utils/worktree-manager.ts (+5/-2): Added project reuse logic to prevent duplicate checkouts
src/ai-review-service.ts (+7/-0): Added enableTasks configuration option

Testing:

tests/unit/providers/ai-check-provider.test.ts (+163/-0): Comprehensive unit tests covering expression evaluation, context access, error handling, and edge cases

Dependencies:

package.json & package-lock.json: Updated @probelabs/probe to v0.6.0-rc210

Architecture & Impact Assessment

What This PR Accomplishes

Dynamic MCP Server Selection: AI checks can now conditionally load expensive MCP servers (Jira, Confluence, Zendesk) based on intent routing tags or other runtime conditions
Dynamic Custom Tools: Custom tools can be computed at runtime based on workflow outputs, PR context, or other criteria
Environment Variable Resolution: Supports ${VAR} placeholder syntax in MCP server env configs for secure credential injection
Secure Execution: Uses @nyariv/sandboxjs with hardened prototype whitelists to prevent unauthorized access
Unified MCP Entry Format: Single McpServerConfig interface supports stdio servers, SSE/HTTP endpoints, and workflow tool references

Key Technical Changes

Sandbox Integration: Leverages existing src/utils/sandbox.ts infrastructure (shared with failure conditions, routing, script provider)
Context Building: Provides access to outputs, inputs, pr, files, env, and memory to JavaScript expressions
Environment Filtering: Uses centralized buildSandboxEnv() utility from src/utils/env-exposure.ts for consistent security filtering
Merge Strategy: Dynamic configurations override static ones; dynamic and static custom tools are merged (deduplicated by name)
Error Handling: Graceful degradation on evaluation errors with detailed logging

Affected System Components

graph TD
    A[AI Check Provider] --> B[evaluateMcpServersJs]
    A --> C[evaluateCustomToolsJs]
    B --> D[Sandbox Evaluation]
    C --> D
    D --> E[Context: outputs, inputs, pr, files, env, memory]
    B --> F[Environment Variable Resolution]
    F --> G[MCP Server Config]
    C --> H[Custom Tools Array]
    G --> I[MCP Server Initialization]
    H --> J[Custom Tools SSE Server]
    
    K[Existing Sandbox Usage] --> D
    K --> L[Failure Conditions]
    K --> M[Routing Logic]
    K --> N[Script Provider]
    
    O[env-exposure.ts] --> P[buildSandboxEnv]
    P --> D

Component Relationships

Sandbox Infrastructure: Centralized in src/utils/sandbox.ts with prototype whitelists for Array, String, Object, Map, Set, Date, RegExp
Shared Patterns: Context building mirrors existing patterns in failure condition evaluator and routing logic
Security: Uses centralized buildSandboxEnv() utility for consistent environment filtering across providers
Environment Resolution: Integrates with existing EnvironmentResolver.resolveValue() for ${VAR} placeholder syntax

Scope Discovery & Context Expansion

Direct Impact

AI Check Provider: Core implementation in src/providers/ai-check-provider.ts
Configuration Schema: Auto-generated schema updated to include new properties
Type Definitions: TypeScript interfaces extended with JSDoc documentation

Related Components

Memory Store: Accessible via memory context object (get, has, list, getAll operations)
Environment Resolver: Used for ${VAR} placeholder resolution in MCP server env configs
Workflow Tool Executor: Validates workflow reference objects in custom tools
Custom Tools SSE Server: Ephemeral server for exposing custom tools to AI agents
env-exposure.ts: Centralized utility for safe environment variable filtering

Testing Coverage

Unit tests verify:

Expression evaluation with various return types
Context access (outputs, inputs, pr, files, env, memory)
Error handling (syntax errors, runtime errors, invalid return types)
Log helper injection for debugging
Filtering of invalid items from result arrays
Merge behavior when both dynamic and static tools are specified

Integration Points

Intent Routing: Primary use case is conditional loading based on outputs['route-intent']?.tags
OEL Bot: Target integration for Jira/Confluence/Zendesk MCP servers
Existing MCP Servers: Backward compatible with static ai_mcp_servers configuration

Use Case Example

checks:
  ai-review:
    type: ai
    prompt: "Review this PR"
    ai_mcp_servers_js: |
      const servers = {};
      const tags = outputs['route-intent']?.tags || [];
      if (tags.includes('jira')) {
        servers.jira = {
          command: "npx",
          args: ["-y", "@aashari/mcp-server-atlassian-jira"],
          env: { 
            ATLASSIAN_SITE_NAME: "${ATLASSIAN_SITE_NAME}",
            ATLASSIAN_API_TOKEN: "${ATLASSIAN_API_TOKEN}"
          }
        };
      }
      return servers;

Security Considerations

Sandbox Hardening: Prototype whitelists restrict access to dangerous methods
Environment Filtering: Uses centralized buildSandboxEnv() with denylist for sensitive keys (tokens, secrets, passwords)
No Side Effects: Expressions cannot modify global state or access external resources
Error Isolation: Evaluation errors do not crash the workflow; fallback to static configs

Performance Impact

Sandbox Initialization: One-time cost per provider instance (cached in this.sandbox)
Expression Evaluation: Minimal overhead for typical expressions (simple conditionals and object construction)
Resource Savings: Conditional loading avoids initializing expensive MCP servers when not needed

Migration Notes

Backward Compatible: Existing static ai_mcp_servers and ai_custom_tools configurations continue to work
Merge Behavior: Dynamic and static custom tools are merged (deduplicated by name)
Override Priority: Dynamic MCP servers completely override static configuration (not merged)

Review Focus Areas

Sandbox Security: Review prototype whitelists in src/utils/sandbox.ts for completeness
Environment Filtering: Verify buildSandboxEnv() patterns cover all sensitive key types
Error Handling: Ensure graceful degradation doesn't hide configuration errors
Test Coverage: Validate edge cases (empty results, malformed expressions, circular dependencies)
Documentation: Confirm JSDoc examples are clear and actionable
Code Duplication: Consider extracting common context building logic into shared method

Metadata

Review Effort: 3 / 5
Primary Label: feature

Powered by Visor from Probelabs

Last updated: 2026-02-02T14:11:13.786Z | Triggered by: pr_updated | Commit: 710def7

💡 TIP: You can chat with Visor using /visor ask <your question>

probelabs · 2026-02-01T15:34:53Z

Security Issues (8)

Severity	Location	Issue
🟢 Info	`src/providers/ai-check-provider.ts:1680-1772`	The evaluateMcpServersJs() method exposes extensive context to the sandbox including outputs, inputs, pr metadata, files, env, and memory. While the sandbox itself is secure, the exposure of dependency outputs and PR information could leak sensitive data if the JavaScript expression logs or captures this data. The current implementation doesn't restrict what the expression can do with this data beyond sandbox execution. 💡 Suggestion Consider adding data sanitization for sensitive fields in PR metadata and dependency outputs before passing to sandbox. Implement rate limiting or resource quotas to prevent abuse through complex expressions. Add audit logging for sandbox evaluations.
🟢 Info	`src/utils/sandbox.ts:239-268`	The log function injected into sandbox scope uses console.log with user-controlled prefix. While the prefix is sanitized, the log arguments themselves are not sanitized before being passed to console.log. This could allow log injection attacks if the sandbox code logs specially crafted strings. 💡 Suggestion Consider sanitizing log arguments to prevent log injection. Replace control characters and ANSI escape sequences in logged values. Use a dedicated logging utility that handles untrusted input safely.
🟢 Info	`src/providers/ai-check-provider.ts:1032-1040`	Error handling for ai_mcp_servers_js evaluation logs detailed error messages including the error message. If the error message contains sensitive information from the JavaScript expression (e.g., leaked credentials, internal paths), it will be logged. The catch block logs the full error message which could expose sensitive data. 💡 Suggestion Sanitize error messages before logging them. Remove or redact potentially sensitive information. Consider logging only a generic error message in production while preserving detailed errors for debugging in non-production environments.
🟡 Warning	`src/providers/ai-check-provider.ts:1057-1068`	Environment variable resolution in MCP server configs may allow command injection. The code resolves ${VAR} placeholders in env configs but doesn't validate that resolved values don't contain shell metacharacters before being passed to MCP server commands. An attacker who can control environment variables could inject malicious commands. 💡 Suggestion Validate and sanitize resolved environment variable values before passing them to MCP server commands. Consider using a whitelist of allowed characters or proper shell escaping for values that will be used in command contexts.
🟡 Warning	`src/providers/ai-check-provider.ts:1778-1799`	The buildSafeEnv() method uses regex patterns to filter sensitive environment variables, but the patterns are incomplete and may miss sensitive keys. Patterns like /auth/i will filter 'AUTH_TOKEN' but could miss variants like 'AUTHORIZATION', 'AUTH_KEY', 'APIKEY', 'SECRET_KEY', etc. Additionally, the filtering happens after the values are already accessed from process.env. 💡 Suggestion Use a whitelist approach instead of blacklist for environment variable exposure. Only expose explicitly allowed, non-sensitive environment variables to the sandbox. Consider reusing the existing buildSandboxEnv() function from env-exposure.ts which has more comprehensive filtering.
🟡 Warning	`src/providers/ai-check-provider.ts:1680-1772`	The evaluateMcpServersJs() method validates that server configs have command, url, or workflow properties, but doesn't validate the actual values. Malicious configurations could inject dangerous commands, point to malicious URLs, or reference unauthorized workflows. The validation only checks property existence, not content safety. 💡 Suggestion Add validation for command values to detect shell metacharacters (similar to mcp-check-provider.ts line 120-123). Validate URLs to ensure they use allowed protocols and don't point to internal addresses. Validate workflow references against an allowlist.
🟡 Warning	`src/types/config.ts:368-401`	The McpServerConfig interface allows optional command, url, and workflow properties without strict validation requirements. This enables configuration of MCP servers that could execute arbitrary commands (command), connect to arbitrary URLs (url), or invoke any workflow (workflow) without proper authorization checks. The interface doesn't enforce security constraints. 💡 Suggestion Add validation logic to enforce security constraints when processing McpServerConfig. Commands should be validated against an allowlist. URLs should be restricted to allowed protocols and optionally to specific domains. Workflow references should be validated against an allowlist or namespace restrictions.
🟡 Warning	`src/providers/ai-check-provider.ts:1131-1177`	Workflow entries extracted from ai_mcp_servers are not validated before being added to customToolsToLoad. The code checks for the presence of 'workflow' property but doesn't validate that the workflow ID is allowed or exists. This could allow unauthorized workflow execution if an attacker can influence the ai_mcp_servers_js expression. 💡 Suggestion Validate workflow references against an allowlist before adding them to custom tools. Ensure the workflow exists and the current context has permission to execute it. Add namespace restrictions to prevent accessing workflows from other projects or contexts.

Architecture Issues (8)

Severity	Location	Issue
🟠 Error	`src/providers/ai-check-provider.ts:1778`	buildSafeEnv() duplicates existing buildSandboxEnv() from src/utils/env-exposure.ts. The new implementation uses regex patterns to filter sensitive keys, while buildSandboxEnv() uses a denylist with VISOR_ALLOW_ENV/VISOR_DENY_ENV configuration. This creates two different security policies for environment variable exposure. 💡 Suggestion Reuse the existing buildSandboxEnv() utility from src/utils/env-exposure.ts instead of implementing a new filtering mechanism. This ensures consistent security policy across all sandbox usage points and maintains a single source of truth for environment variable exposure. 🔧 Suggested Fix `Replace buildSafeEnv() method with:` `private buildSafeEnv(): Record<string, string> { const { buildSandboxEnv } = require('../utils/env-exposure'); return buildSandboxEnv(process.env); }`
🟢 Info	`src/providers/ai-check-provider.ts:1063`	Environment variable resolution is applied to MCP server env configs but not to other configuration values that might contain ${VAR} placeholders. This creates inconsistency in how environment variables are handled throughout the configuration. 💡 Suggestion Consider applying EnvironmentResolver.resolveValue() consistently across all configuration values that support environment variable placeholders, or document why env resolution is only needed for MCP server configs. 🔧 Suggested Fix `N/A - this is a documentation/suggestion item`
🟢 Info	`src/utils/sandbox.ts:246`	The fix for parallel sandbox execution (passing log through scope instead of header) is a pragmatic solution but creates an implicit dependency between injectLog option and user code. If injectLog=false, user code can still declare its own log function, which is good, but the dual mechanism (scope vs header) is subtle. 💡 Suggestion Document the injectLog behavior clearly: when true, log is injected into scope and user code should not redeclare it; when false, user code may declare its own log. Consider adding a runtime check to detect if user code declares log when injectLog=true. 🔧 Suggested Fix `N/A - this is a documentation/suggestion item`
🟢 Info	`src/utils/worktree-manager.ts:109`	Making worktree IDs deterministic by removing Date.now() from the hash is a good change for reuse, but the comment doesn't explain the trade-off: multiple calls with the same repo+ref will now return the same worktree, which assumes the worktree content doesn't change between calls. 💡 Suggestion Document the assumption that worktree content is stable for a given repo+ref combination. If this assumption is violated (e.g., if refs can move), consider adding a timestamp or commit hash to the worktree ID. 🔧 Suggested Fix `N/A - this is a documentation/suggestion item`
🟡 Warning	`src/providers/ai-check-provider.ts:1606`	evaluateCustomToolsJs() and evaluateMcpServersJs() contain 95% identical code for building the jsContext object. The only difference is the logPrefix parameter. This violates DRY principles and makes maintenance harder. 💡 Suggestion Extract the common context-building logic into a private method buildJsContext() that takes prInfo, dependencyResults, and config as parameters. Both methods can then call this helper and add their specific logPrefix. 🔧 Suggested Fix `Extract to shared method: private buildJsContext( prInfo: PRInfo, dependencyResults: Map<string, ReviewSummary>, config: CheckProviderConfig ): Record<string, unknown> { const outputs: Record<string, unknown> = {}; for (const [checkId, result] of dependencyResults.entries()) { const summary = result as ReviewSummary & { output?: unknown }; outputs[checkId] = summary.output !== undefined ? summary.output : summary; }` `return { outputs, inputs: (config as any).inputs \|\| {}, pr: { number: prInfo.number, title: prInfo.title, description: prInfo.body, author: prInfo.author, branch: prInfo.head, base: prInfo.base, authorAssociation: prInfo.authorAssociation, }, files: prInfo.files?.map(f => ({ filename: f.filename, status: f.status, additions: f.additions, deletions: f.deletions, changes: f.changes, })) \|\| [], env: this.buildSafeEnv(), memory: (config as any).__memoryAccessor \|\| {}, }; }`
🟡 Warning	`src/providers/ai-check-provider.ts:1035`	The MCP server configuration merge logic has 6 steps (global, check-level, ai.mcpServers, ai_mcp_servers_js, env resolution, custom tools) with complex override semantics. The dynamic ai_mcp_servers_js completely overrides static configs rather than merging, which is inconsistent with how ai_custom_tools_js works (merges with static). 💡 Suggestion Consider making ai_mcp_servers_js merge behavior consistent with ai_custom_tools_js (merge instead of override), or document why override is the correct behavior. The current inconsistency makes the API harder to understand and use correctly. 🔧 Suggested Fix `Change line 1047 from: Object.assign(mcpServers, dynamicServers); To a merge that preserves static servers: for (const [name, cfg] of Object.entries(dynamicServers)) { mcpServers[name] = cfg; }`
🟡 Warning	`src/types/config.ts:371`	McpServerConfig has become a union type disguised as an interface with optional properties. The 'type detection by property presence' pattern (command=stdio, url=sse/http, workflow=workflow) is implicit and error-prone. This violates the 'explicit is better than implicit' principle. 💡 Suggestion Consider using a discriminated union type with explicit 'type' field for better type safety and clearer intent: type McpServerConfig = \| { type: 'stdio'; command: string; args?: string[]; env?: Record<string, string> } \| { type: 'sse' \| 'http'; url: string; transport?: 'sse' \| 'http' } \| { type: 'workflow'; workflow: string; inputs?: Record<string, unknown>; description?: string }; This makes the type system enforce correct usage and prevents invalid configurations. 🔧 Suggested Fix `Replace interface with discriminated union: export type McpServerConfig = \| McpStdioServer \| McpHttpServer \| McpWorkflowTool; interface McpStdioServer { type: 'stdio'; command: string; args?: string[]; env?: Record<string, string>; allowedMethods?: string[]; blockedMethods?: string[]; } interface McpHttpServer { type: 'sse' \| 'http'; url: string; transport?: 'sse' \| 'http'; allowedMethods?: string[]; blockedMethods?: string[]; }` `interface McpWorkflowTool { type: 'workflow'; workflow: string; inputs?: Record<string, unknown>; description?: string; allowedMethods?: string[]; blockedMethods?: string[]; }`
🟡 Warning	`src/providers/ai-check-provider.ts:1143`	Workflow tool extraction from ai_mcp_servers is a special case that treats workflow entries differently from other MCP server types. This creates an implicit convention where 'workflow' property triggers special handling, which is not obvious from the type definition. 💡 Suggestion Make workflow tools a first-class concept with explicit configuration (ai_workflow_tools or ai_workflow_tools_js) rather than overloading ai_mcp_servers. This would eliminate the need for extraction logic and make the API more explicit. 🔧 Suggested Fix `Consider adding separate config properties: // In CheckConfig interface: ai_workflow_tools?: Array<string \| WorkflowToolReference>; ai_workflow_tools_js?: string;` `// Then remove the workflow extraction logic from lines 1143-1173`

Performance Issues (8)

Severity	Location	Issue
🟢 Info	`src/providers/ai-check-provider.ts:1032`	The code iterates through mcpServers values twice: once for environment variable resolution and once for workflow entry extraction. These could be combined into a single pass. 💡 Suggestion Combine the environment variable resolution loop with the workflow entry extraction loop into a single iteration over mcpServers. 🔧 Suggested Fix // 5. Resolve environment variable placeholders and extract workflow entries in single pass const workflowEntriesFromMcp: WorkflowToolReference[] = []; const mcpEntriesToRemove: string[] = []; for (const [serverName, serverConfig] of Object.entries(mcpServers)) { // Resolve environment variables if (serverConfig.env) { const resolvedEnv: Record<string, string> = {}; for (const [key, value] of Object.entries(serverConfig.env)) { if (typeof value === 'string') { resolvedEnv[key] = String(EnvironmentResolver.resolveValue(value)); } else { resolvedEnv[key] = String(value); } } serverConfig.env = resolvedEnv; } // Extract workflow entries const cfg = serverConfig as unknown as Record<string, unknown>; if (cfg.workflow && typeof cfg.workflow === 'string') { workflowEntriesFromMcp.push({ workflow: cfg.workflow as string, args: cfg.inputs as Record<string, unknown> \| undefined, }); mcpEntriesToRemove.push(serverName); logger.debug( [AICheckProvider] Extracted workflow tool '${serverName}' from ai_mcp_servers ); } } `// Remove workflow entries from mcpServers for (const name of mcpEntriesToRemove) { delete mcpServers[name]; }`
🟢 Info	`src/providers/ai-check-provider.ts:1079`	The code performs multiple iterations over custom tools arrays for deduplication. When both dynamic and static tools exist, it iterates through static tools twice (once to create Set, once to merge). 💡 Suggestion Optimize the merge logic to use a single pass with a Set for O(1) lookups. 🔧 Suggested Fix // Option 2: Check for ai_custom_tools (static - backward compatible) // Merge with dynamic tools if both are specified const staticCustomTools = this.getCustomToolsForAI(config); if (staticCustomTools.length > 0) { if (customToolsToLoad.length > 0) { // Merge dynamic and static tools (avoid duplicates by name) - single pass const existingNames = new Set( customToolsToLoad.map(item => (typeof item === 'string' ? item : item.workflow)) ); for (const tool of staticCustomTools) { const name = typeof tool === 'string' ? tool : tool.workflow; if (!existingNames.has(name)) { existingNames.add(name); // Add to Set for future lookups customToolsToLoad.push(tool); } } } else { customToolsToLoad = staticCustomTools; customToolsServerName = '__custom_tools__'; } }
🟢 Info	`src/providers/ai-check-provider.ts:1167`	Similar inefficient iteration pattern when merging workflow entries with existing custom tools. Creates a Set from existing tools then iterates workflow entries without adding new names to the Set for potential future lookups. 💡 Suggestion Add workflow tool names to the existingNames Set during iteration to maintain O(1) lookup performance if more tools are added later. 🔧 Suggested Fix // Merge workflow entries with other custom tools if (workflowEntriesFromMcp.length > 0) { if (customToolsToLoad.length > 0) { // Avoid duplicates const existingNames = new Set( customToolsToLoad.map(item => (typeof item === 'string' ? item : item.workflow)) ); for (const wf of workflowEntriesFromMcp) { if (!existingNames.has(wf.workflow)) { existingNames.add(wf.workflow); // Maintain Set for consistency customToolsToLoad.push(wf); } } } else { customToolsToLoad = workflowEntriesFromMcp; } customToolsServerName = '__tools__'; }
🟢 Info	`src/utils/sandbox.ts:228`	compileAndRun performs multiple string operations on safePrefix: replace() called 3 times, then slice(). These could be combined into a single regex replace. 💡 Suggestion Combine the sanitization operations into a single regex replace for better performance. 🔧 Suggested Fix const inject = opts?.injectLog === true; let safePrefix = String(opts?.logPrefix ?? '[sandbox]'); // Sanitize prefix aggressively: drop control chars and risky tokens, limit length safePrefix = safePrefix .replace(/[\r\n\t\0`$\\]\|\$\{/g, '') // Combined regex for all patterns .slice(0, 64);
🟡 Warning	`src/providers/ai-check-provider.ts:1778`	buildSafeEnv() iterates through all process.env entries on every evaluation call, performing regex pattern matching against each key. This is O(n) where n is the number of environment variables. For frequent sandbox evaluations, this creates unnecessary overhead. 💡 Suggestion Cache the filtered environment result since process.env doesn't change during execution. Store the result in a private property and reuse it across multiple evaluateMcpServersJs/evaluateCustomToolsJs calls. 🔧 Suggested Fix private cachedSafeEnv: Record<string, string> \| null = null; private buildSafeEnv(): Record<string, string> { if (this.cachedSafeEnv) { return this.cachedSafeEnv; } const sensitivePatterns = [ /api.?key/i, /secret/i, /token/i, /password/i, /credential/i, /auth/i, /private/i, ]; const safeEnv: Record<string, string> = {}; for (const [key, value] of Object.entries(process.env)) { if (value === undefined) continue; const isSensitive = sensitivePatterns.some(pattern => pattern.test(key)); if (!isSensitive) { safeEnv[key] = value; } } `this.cachedSafeEnv = safeEnv; return safeEnv; }`
🟡 Warning	`src/providers/ai-check-provider.ts:1606`	evaluateCustomToolsJs and evaluateMcpServersJs both build identical context objects (outputs, pr, files, env, memory) independently. This duplicates work when both expressions are evaluated in the same check execution. 💡 Suggestion Extract the context building logic into a shared private method that both evaluateCustomToolsJs and evaluateMcpServersJs can call. This reduces code duplication and ensures consistent context structure. 🔧 Suggested Fix `private buildSandboxContext( prInfo: PRInfo, dependencyResults: Map<string, ReviewSummary>, config: CheckProviderConfig ): Record<string, unknown> { const outputs: Record<string, unknown> = {}; for (const [checkId, result] of dependencyResults.entries()) { const summary = result as ReviewSummary & { output?: unknown }; outputs[checkId] = summary.output !== undefined ? summary.output : summary; }` `return { outputs, inputs: (config as any).inputs \|\| {}, pr: { number: prInfo.number, title: prInfo.title, description: prInfo.body, author: prInfo.author, branch: prInfo.head, base: prInfo.base, authorAssociation: prInfo.authorAssociation, }, files: prInfo.files?.map(f => ({ filename: f.filename, status: f.status, additions: f.additions, deletions: f.deletions, changes: f.changes, })) \|\| [], env: this.buildSafeEnv(), memory: (config as any).__memoryAccessor \|\| {}, }; }`
🟡 Warning	`src/providers/ai-check-provider.ts:1680`	evaluateMcpServersJs duplicates the same context building logic as evaluateCustomToolsJs, creating outputs object and jsContext with identical structure. 💡 Suggestion Use the shared buildSandboxContext method instead of duplicating the context building logic. 🔧 Suggested Fix `private evaluateMcpServersJs( expression: string, prInfo: PRInfo, dependencyResults: Map<string, ReviewSummary>, config: CheckProviderConfig ): Record<string, import('../types/config').McpServerConfig> { if (!this.sandbox) { this.sandbox = createSecureSandbox(); }` `const jsContext = this.buildSandboxContext(prInfo, dependencyResults, config); // ... rest of method`
🟡 Warning	`src/providers/ai-check-provider.ts:1606`	evaluateCustomToolsJs duplicates the same context building logic as evaluateMcpServersJs, creating outputs object and jsContext with identical structure. 💡 Suggestion Use the shared buildSandboxContext method instead of duplicating the context building logic. 🔧 Suggested Fix `private evaluateCustomToolsJs( expression: string, prInfo: PRInfo, dependencyResults: Map<string, ReviewSummary>, config: CheckProviderConfig ): Array<string \| WorkflowToolReference> { if (!this.sandbox) { this.sandbox = createSecureSandbox(); }` `const jsContext = this.buildSandboxContext(prInfo, dependencyResults, config); // ... rest of method`

Quality Issues (13)

Severity	Location	Issue
🟠 Error	`src/providers/ai-check-provider.ts:1680-1772`	Missing unit tests for evaluateMcpServersJs() method. The PR description claims 'Added unit tests for evaluateMcpServersJs() method' but the test file only contains tests for evaluateCustomToolsJs(). Critical functionality for dynamic MCP server selection is untested. 💡 Suggestion Add comprehensive unit tests for evaluateMcpServersJs() covering: 1) Valid server config objects with command/url/workflow properties, 2) Invalid return types (array, null, primitive), 3) Invalid server config objects, 4) Context access (outputs, inputs, pr, files, env, memory), 5) Syntax and runtime error handling, 6) Log helper injection
🟢 Info	`src/utils/sandbox.ts:239-258`	Fix for parallel sandbox execution passes 'log' through scope instead of header declaration, but the comment mentions 'shared global state'. However, each call to compileAndRun() creates its own scopeWithLog object, so there's no actual shared state. The fix may be addressing a symptom rather than the root cause. 💡 Suggestion Verify the actual race condition scenario. If the issue was multiple executions sharing the same sandbox instance with header-declared 'log', document that clearly. Consider making sandbox instances immutable after compilation.
🟢 Info	`src/utils/worktree-manager.ts:100-119`	Deterministic worktree ID generation uses MD5 hash truncated to 8 characters. While MD5 is sufficient for this use case, 8 hex characters (32 bits) has a non-trivial collision probability with many repositories. The comment says 'enables worktree reuse' but doesn't address collision handling. 💡 Suggestion Consider using 12+ characters for better collision resistance. Add collision detection/handling logic if a worktree with the same ID but different repo/ref already exists.
🟢 Info	`src/providers/mcp-custom-sse-server.ts:489-510`	SSE message logging creates string previews and logs at debug level for every message. For high-frequency MCP communication, this could cause performance issues and log bloat. The preview calculation (substring operations) happens even when debug logging is disabled. 💡 Suggestion Move preview calculation inside a conditional check for debug level: if (logger.isDebugEnabled()) { ... }. Consider making preview length configurable.
🟢 Info	`src/providers/workflow-tool-executor.ts:226-250`	Workflow completion logging uses logger.info for output keys and logger.debug for output preview. This creates inconsistent visibility - users see keys but not values. The preview truncation at 500 chars may hide important information. 💡 Suggestion Consider using debug for both or info for both. Make preview length configurable. Consider adding a separate 'verbose' mode for full output logging.
🟡 Warning	`src/providers/ai-check-provider.ts:1599-1678`	Significant code duplication between evaluateCustomToolsJs() and evaluateMcpServersJs(). Both methods build identical context objects (outputs, inputs, pr, files, env, memory) and use the same sandbox evaluation pattern. This violates DRY principle and increases maintenance burden. 💡 Suggestion Extract common context building logic into a private method buildJsContext(prInfo, dependencyResults, config) and refactor both evaluation methods to use it. Consider a generic evaluateJsExpression<T>(expression, context, options) method.
🟡 Warning	`src/providers/ai-check-provider.ts:1774-1808`	buildSafeEnv() uses incomplete regex patterns for filtering sensitive environment variables. Patterns like /auth/i and /private/i are too broad and may filter legitimate non-sensitive variables (e.g., 'AUTHOR', 'PUBLISH_PRIVATE'). Conversely, sensitive variables with non-standard naming (e.g., 'KEY', 'CREDENTIALS_FILE') would leak through. 💡 Suggestion Use a whitelist approach instead of blacklist. Only expose known-safe environment variables (e.g., 'CI', 'NODE_ENV', 'PATH', 'HOME', 'USER'). Alternatively, use more specific patterns like /api[_-]?key/i, /secret[_-]?key/i, /access[_-]?token/i, and add word boundaries.
🟡 Warning	`src/providers/ai-check-provider.ts:1035-1054`	Error handling for ai_mcp_servers_js evaluation swallows errors and continues silently. While this prevents workflow crashes, it makes configuration errors invisible to users. The error is logged but the check proceeds without dynamic servers, potentially causing unexpected behavior. 💡 Suggestion Consider adding a configuration option (e.g., 'strict_js_evaluation') that allows users to fail the check on JavaScript evaluation errors. Alternatively, increment a metric or add a warning annotation to the PR comment when dynamic evaluation fails.
🟡 Warning	`src/providers/ai-check-provider.ts:1068-1080`	Similar error swallowing issue for ai_custom_tools_js evaluation. Errors are logged but the check continues with fallback to static tools, which may hide misconfigurations. 💡 Suggestion Same as above - consider adding strict mode or visibility for evaluation failures.
🟡 Warning	`src/providers/ai-check-provider.ts:1135-1180`	Complex merge strategy for custom tools across 4 different sources (ai_custom_tools_js, ai_custom_tools, MCP servers with 'tools:', workflow entries from MCP) is difficult to reason about and has potential for subtle bugs. The deduplication logic only checks workflow names, not full tool configurations. 💡 Suggestion Document the merge strategy clearly with examples. Consider simplifying to 2-3 sources max. Add tests verifying merge behavior when multiple sources specify the same tool with different configurations.
🟡 Warning	`src/utils/workspace-manager.ts:213-238`	Workspace reuse logic checks for existing repository + worktreePath combinations but doesn't verify if the worktree is still valid or up-to-date. A stale worktree from a previous run could be reused, potentially causing incorrect results. 💡 Suggestion Add validation to ensure the reused worktree exists and is valid (e.g., check git status, verify it's not corrupted). Consider adding a timestamp or checksum to detect stale worktrees.
🟡 Warning	`tests/unit/providers/ai-check-provider.test.ts:626-790`	Tests use hardcoded string values without clear semantic meaning. For example, expect(result).toEqual(['engineer-tool', 'jira-tool']) uses magic strings that appear to be reverse-engineered from the implementation rather than derived from requirements. 💡 Suggestion Define constants at the top of the test file with clear semantic meaning (e.g., const EXPECTED_ENGINEER_TOOLS = ['engineer-tool', 'jira-tool']). Add comments explaining why these specific values are expected based on the test scenario.
🟡 Warning	`tests/unit/providers/ai-check-provider.test.ts:626-790`	Test suite lacks comprehensive negative testing for evaluateCustomToolsJs(). While there are tests for syntax/runtime errors, there are no tests for: 1) Malicious code attempting to escape sandbox, 2) Infinite loops or resource exhaustion, 3) Prototype pollution attempts, 4) Very large return values, 5) Circular references in returned objects. 💡 Suggestion Add security-focused negative tests to verify sandbox hardening. Test with attempts to access process, require, __proto__, constructor, etc. Add tests for resource limits (timeout, memory).

Powered by Visor from Probelabs

Last updated: 2026-02-02T14:11:17.462Z | Triggered by: pr_updated | Commit: 710def7

💡 TIP: You can chat with Visor using /visor ask <your question>

The worktree tests (.visor/worktrees/) run against historical code versions that have singleton state leakage issues (MemoryStore). Tests pass individually but fail when run in sequence due to shared state. This change explicitly specifies test directories (defaults/, tests/, examples/) rather than running all tests recursively, ensuring pre-commit hooks are reliable. Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Update validation to accept entries with 'workflow' property - Add Option 4 to extract workflow entries from ai_mcp_servers - Update McpServerConfig type to support unified flat detection: - command → stdio MCP server - url → SSE/HTTP MCP server - workflow → workflow tool reference - empty {} → auto-detect from tools: section Co-Authored-By: Claude Opus 4.5 <[email protected]>

Workflow tools like tyk-code-talk can run complex operations (code search, AI analysis) that take several minutes. The default 30-second MCP timeout was causing these calls to fail. Set 10-minute (600000ms) timeout for the __tools__ SSE server that hosts workflow tools. Co-Authored-By: Claude Opus 4.5 <[email protected]>

- Fix sandbox parallel execution errors by passing `log` through scope instead of declaring in code header (prevents "already declared" errors) - Only inject `log` when injectLog=true, allowing callers to define their own - Fix wrapFunction default to handle partial options correctly - Add detailed logging for MCP SSE communication and workflow tool execution - Make worktree IDs deterministic for reuse across multiple calls - Reuse existing project paths in workspace manager to prevent duplicates - Update @probelabs/probe to v0.6.0-rc210 - Add enableTasks option propagation to ProbeAgent Co-Authored-By: Claude Opus 4.5 <[email protected]>

buger and others added 4 commits February 1, 2026 15:57

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add ai_mcp_servers_js for dynamic MCP server selection #314

feat: add ai_mcp_servers_js for dynamic MCP server selection #314

Uh oh!

buger commented Feb 1, 2026 •

edited

Loading

Uh oh!

probelabs bot commented Feb 1, 2026 •

edited

Loading

Uh oh!

probelabs bot commented Feb 1, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

feat: add ai_mcp_servers_js for dynamic MCP server selection #314

Are you sure you want to change the base?

feat: add ai_mcp_servers_js for dynamic MCP server selection #314

Uh oh!

Conversation

buger commented Feb 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Use Case

Example Usage

Test plan

Uh oh!

probelabs bot commented Feb 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Overview: Dynamic MCP Server Selection with JavaScript Expressions

Summary

Files Changed

Architecture & Impact Assessment

What This PR Accomplishes

Key Technical Changes

Affected System Components

Component Relationships

Scope Discovery & Context Expansion

Direct Impact

Related Components

Testing Coverage

Integration Points

Use Case Example

Security Considerations

Performance Impact

Migration Notes

Review Focus Areas

Uh oh!

probelabs bot commented Feb 1, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Security Issues (8)

Architecture Issues (8)

Performance Issues (8)

Quality Issues (13)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

buger commented Feb 1, 2026 •

edited

Loading

probelabs bot commented Feb 1, 2026 •

edited

Loading

probelabs bot commented Feb 1, 2026 •

edited

Loading