Skip to content

Conversation

@buger
Copy link
Contributor

@buger buger commented Feb 1, 2026

Summary

  • Add ai_mcp_servers_js property to AI checks that allows dynamic computation of MCP servers at runtime
  • Implement secure sandbox evaluation using @nyariv/sandboxjs
  • Add environment variable resolution for ${VAR} placeholders in MCP server env configs
  • Enables conditional loading of expensive MCP servers (Jira, Confluence, Zendesk) based on intent routing tags
  • Support workflow tool entries in ai_mcp_servers_js (returns {workflow: "workflow-id", ...})
  • Fix parallel sandbox execution errors by passing log through scope instead of header declaration
  • Add detailed logging for MCP SSE communication and workflow tool execution
  • Make worktree IDs deterministic for reuse across multiple calls
  • Update @probelabs/probe to v0.6.0-rc210
  • Add enableTasks and MCP method filtering (allowedMethods/blockedMethods) options

Use Case

This feature supports the OEL Bot simplification effort where we want MCP servers to load conditionally based on tags from intent routing:

  • jira tag → loads Atlassian Jira MCP server
  • confluence tag → loads Atlassian Confluence MCP server
  • zendesk tag → loads Zendesk MCP server
  • No tags → no MCP servers loaded (faster startup, lower resource usage)

Example Usage

ai_mcp_servers_js: |
  const servers = {};
  const tags = outputs['route-intent']?.tags || [];
  if (tags.includes('jira')) {
    servers.jira = {
      command: "npx",
      args: ["-y", "@aashari/mcp-server-atlassian-jira"],
      env: { 
        ATLASSIAN_SITE_NAME: "${ATLASSIAN_SITE_NAME}",
        ATLASSIAN_API_TOKEN: "${ATLASSIAN_API_TOKEN}"
      }
    };
  }
  // Workflow tools can also be added dynamically
  if (tags.includes('code')) {
    servers['code-talk'] = { workflow: 'tyk-code-talk' };
  }
  return servers;

Test plan

  • Added unit tests for evaluateMcpServersJs() method
  • Tests verify sandbox evaluation with access to outputs, inputs, pr, files, env, memory
  • Tests verify environment variable resolution for ${VAR} placeholders
  • Tests verify parallel sandbox execution doesn't cause "already declared" errors
  • All 226 test suites pass

🤖 Generated with Claude Code

Add support for dynamically computing MCP servers in AI checks via JavaScript
expressions. This enables conditional loading of MCP servers based on routing
tags, outputs, or other runtime context.

Key changes:
- Add ai_mcp_servers_js property to CheckConfig interface
- Implement evaluateMcpServersJs() in AiCheckProvider using sandbox evaluation
- Add environment variable resolution for ${VAR} placeholders in MCP server
  env configs, enabling secure credential injection
- Update config schema to include the new property
- Add comprehensive tests for the feature

This feature allows workflows to conditionally load expensive MCP servers
(like Jira, Confluence, Zendesk) only when needed based on intent routing tags.

Example usage:
```yaml
ai_mcp_servers_js: |
  const servers = {};
  const tags = outputs['route-intent']?.tags || [];
  if (tags.includes('jira')) {
    servers.jira = {
      command: "npx",
      args: ["-y", "@aashari/mcp-server-atlassian-jira"],
      env: { ATLASSIAN_API_TOKEN: "${ATLASSIAN_API_TOKEN}" }
    };
  }
  return servers;
```

Co-Authored-By: Claude Opus 4.5 <[email protected]>
@probelabs
Copy link
Contributor

probelabs bot commented Feb 1, 2026

PR Overview: Dynamic MCP Server Selection with JavaScript Expressions

Summary

This PR introduces ai_mcp_servers_js and ai_custom_tools_js properties to enable dynamic, runtime computation of MCP servers and custom tools for AI checks. The implementation uses a secure JavaScript sandbox (@nyariv/sandboxjs) to evaluate expressions with access to workflow context (outputs, inputs, PR metadata, files, environment variables, and memory).

Files Changed

Core Implementation:

  • src/providers/ai-check-provider.ts (+336/-8): Implemented evaluateMcpServersJs() and evaluateCustomToolsJs() methods with sandbox integration, environment variable resolution, and merge logic for dynamic/static configurations
  • src/types/config.ts (+68/-3): Added TypeScript interface definitions with comprehensive JSDoc examples for ai_mcp_servers_js and ai_custom_tools_js
  • src/generated/config-schema.ts (+60/-9): Auto-generated schema updated with new properties and detailed descriptions

Supporting Changes:

  • src/utils/sandbox.ts (+22/-8): Fixed parallel execution issue by passing log through scope instead of header declaration
  • src/providers/mcp-custom-sse-server.ts (+42/-9): Enhanced logging for MCP SSE communication and workflow tool execution
  • src/providers/workflow-tool-executor.ts (+11/-0): Added detailed logging for workflow tool outputs
  • src/utils/workspace-manager.ts (+14/-1): Made worktree IDs deterministic for reuse across multiple calls
  • src/utils/worktree-manager.ts (+5/-2): Added project reuse logic to prevent duplicate checkouts
  • src/ai-review-service.ts (+7/-0): Added enableTasks configuration option

Testing:

  • tests/unit/providers/ai-check-provider.test.ts (+163/-0): Comprehensive unit tests covering expression evaluation, context access, error handling, and edge cases

Dependencies:

  • package.json & package-lock.json: Updated @probelabs/probe to v0.6.0-rc210

Architecture & Impact Assessment

What This PR Accomplishes

  1. Dynamic MCP Server Selection: AI checks can now conditionally load expensive MCP servers (Jira, Confluence, Zendesk) based on intent routing tags or other runtime conditions
  2. Dynamic Custom Tools: Custom tools can be computed at runtime based on workflow outputs, PR context, or other criteria
  3. Environment Variable Resolution: Supports ${VAR} placeholder syntax in MCP server env configs for secure credential injection
  4. Secure Execution: Uses @nyariv/sandboxjs with hardened prototype whitelists to prevent unauthorized access
  5. Unified MCP Entry Format: Single McpServerConfig interface supports stdio servers, SSE/HTTP endpoints, and workflow tool references

Key Technical Changes

  • Sandbox Integration: Leverages existing src/utils/sandbox.ts infrastructure (shared with failure conditions, routing, script provider)
  • Context Building: Provides access to outputs, inputs, pr, files, env, and memory to JavaScript expressions
  • Environment Filtering: Uses centralized buildSandboxEnv() utility from src/utils/env-exposure.ts for consistent security filtering
  • Merge Strategy: Dynamic configurations override static ones; dynamic and static custom tools are merged (deduplicated by name)
  • Error Handling: Graceful degradation on evaluation errors with detailed logging

Affected System Components

graph TD
    A[AI Check Provider] --> B[evaluateMcpServersJs]
    A --> C[evaluateCustomToolsJs]
    B --> D[Sandbox Evaluation]
    C --> D
    D --> E[Context: outputs, inputs, pr, files, env, memory]
    B --> F[Environment Variable Resolution]
    F --> G[MCP Server Config]
    C --> H[Custom Tools Array]
    G --> I[MCP Server Initialization]
    H --> J[Custom Tools SSE Server]
    
    K[Existing Sandbox Usage] --> D
    K --> L[Failure Conditions]
    K --> M[Routing Logic]
    K --> N[Script Provider]
    
    O[env-exposure.ts] --> P[buildSandboxEnv]
    P --> D
Loading

Component Relationships

  • Sandbox Infrastructure: Centralized in src/utils/sandbox.ts with prototype whitelists for Array, String, Object, Map, Set, Date, RegExp
  • Shared Patterns: Context building mirrors existing patterns in failure condition evaluator and routing logic
  • Security: Uses centralized buildSandboxEnv() utility for consistent environment filtering across providers
  • Environment Resolution: Integrates with existing EnvironmentResolver.resolveValue() for ${VAR} placeholder syntax

Scope Discovery & Context Expansion

Direct Impact

  • AI Check Provider: Core implementation in src/providers/ai-check-provider.ts
  • Configuration Schema: Auto-generated schema updated to include new properties
  • Type Definitions: TypeScript interfaces extended with JSDoc documentation

Related Components

  • Memory Store: Accessible via memory context object (get, has, list, getAll operations)
  • Environment Resolver: Used for ${VAR} placeholder resolution in MCP server env configs
  • Workflow Tool Executor: Validates workflow reference objects in custom tools
  • Custom Tools SSE Server: Ephemeral server for exposing custom tools to AI agents
  • env-exposure.ts: Centralized utility for safe environment variable filtering

Testing Coverage

Unit tests verify:

  • Expression evaluation with various return types
  • Context access (outputs, inputs, pr, files, env, memory)
  • Error handling (syntax errors, runtime errors, invalid return types)
  • Log helper injection for debugging
  • Filtering of invalid items from result arrays
  • Merge behavior when both dynamic and static tools are specified

Integration Points

  • Intent Routing: Primary use case is conditional loading based on outputs['route-intent']?.tags
  • OEL Bot: Target integration for Jira/Confluence/Zendesk MCP servers
  • Existing MCP Servers: Backward compatible with static ai_mcp_servers configuration

Use Case Example

checks:
  ai-review:
    type: ai
    prompt: "Review this PR"
    ai_mcp_servers_js: |
      const servers = {};
      const tags = outputs['route-intent']?.tags || [];
      if (tags.includes('jira')) {
        servers.jira = {
          command: "npx",
          args: ["-y", "@aashari/mcp-server-atlassian-jira"],
          env: { 
            ATLASSIAN_SITE_NAME: "${ATLASSIAN_SITE_NAME}",
            ATLASSIAN_API_TOKEN: "${ATLASSIAN_API_TOKEN}"
          }
        };
      }
      return servers;

Security Considerations

  • Sandbox Hardening: Prototype whitelists restrict access to dangerous methods
  • Environment Filtering: Uses centralized buildSandboxEnv() with denylist for sensitive keys (tokens, secrets, passwords)
  • No Side Effects: Expressions cannot modify global state or access external resources
  • Error Isolation: Evaluation errors do not crash the workflow; fallback to static configs

Performance Impact

  • Sandbox Initialization: One-time cost per provider instance (cached in this.sandbox)
  • Expression Evaluation: Minimal overhead for typical expressions (simple conditionals and object construction)
  • Resource Savings: Conditional loading avoids initializing expensive MCP servers when not needed

Migration Notes

  • Backward Compatible: Existing static ai_mcp_servers and ai_custom_tools configurations continue to work
  • Merge Behavior: Dynamic and static custom tools are merged (deduplicated by name)
  • Override Priority: Dynamic MCP servers completely override static configuration (not merged)

Review Focus Areas

  1. Sandbox Security: Review prototype whitelists in src/utils/sandbox.ts for completeness
  2. Environment Filtering: Verify buildSandboxEnv() patterns cover all sensitive key types
  3. Error Handling: Ensure graceful degradation doesn't hide configuration errors
  4. Test Coverage: Validate edge cases (empty results, malformed expressions, circular dependencies)
  5. Documentation: Confirm JSDoc examples are clear and actionable
  6. Code Duplication: Consider extracting common context building logic into shared method
Metadata
  • Review Effort: 3 / 5
  • Primary Label: feature

Powered by Visor from Probelabs

Last updated: 2026-02-02T14:11:13.786Z | Triggered by: pr_updated | Commit: 710def7

💡 TIP: You can chat with Visor using /visor ask <your question>

@probelabs
Copy link
Contributor

probelabs bot commented Feb 1, 2026

Security Issues (8)

Severity Location Issue
🟢 Info src/providers/ai-check-provider.ts:1680-1772
The evaluateMcpServersJs() method exposes extensive context to the sandbox including outputs, inputs, pr metadata, files, env, and memory. While the sandbox itself is secure, the exposure of dependency outputs and PR information could leak sensitive data if the JavaScript expression logs or captures this data. The current implementation doesn't restrict what the expression can do with this data beyond sandbox execution.
💡 SuggestionConsider adding data sanitization for sensitive fields in PR metadata and dependency outputs before passing to sandbox. Implement rate limiting or resource quotas to prevent abuse through complex expressions. Add audit logging for sandbox evaluations.
🟢 Info src/utils/sandbox.ts:239-268
The log function injected into sandbox scope uses console.log with user-controlled prefix. While the prefix is sanitized, the log arguments themselves are not sanitized before being passed to console.log. This could allow log injection attacks if the sandbox code logs specially crafted strings.
💡 SuggestionConsider sanitizing log arguments to prevent log injection. Replace control characters and ANSI escape sequences in logged values. Use a dedicated logging utility that handles untrusted input safely.
🟢 Info src/providers/ai-check-provider.ts:1032-1040
Error handling for ai_mcp_servers_js evaluation logs detailed error messages including the error message. If the error message contains sensitive information from the JavaScript expression (e.g., leaked credentials, internal paths), it will be logged. The catch block logs the full error message which could expose sensitive data.
💡 SuggestionSanitize error messages before logging them. Remove or redact potentially sensitive information. Consider logging only a generic error message in production while preserving detailed errors for debugging in non-production environments.
🟡 Warning src/providers/ai-check-provider.ts:1057-1068
Environment variable resolution in MCP server configs may allow command injection. The code resolves ${VAR} placeholders in env configs but doesn't validate that resolved values don't contain shell metacharacters before being passed to MCP server commands. An attacker who can control environment variables could inject malicious commands.
💡 SuggestionValidate and sanitize resolved environment variable values before passing them to MCP server commands. Consider using a whitelist of allowed characters or proper shell escaping for values that will be used in command contexts.
🟡 Warning src/providers/ai-check-provider.ts:1778-1799
The buildSafeEnv() method uses regex patterns to filter sensitive environment variables, but the patterns are incomplete and may miss sensitive keys. Patterns like /auth/i will filter 'AUTH_TOKEN' but could miss variants like 'AUTHORIZATION', 'AUTH_KEY', 'APIKEY', 'SECRET_KEY', etc. Additionally, the filtering happens after the values are already accessed from process.env.
💡 SuggestionUse a whitelist approach instead of blacklist for environment variable exposure. Only expose explicitly allowed, non-sensitive environment variables to the sandbox. Consider reusing the existing buildSandboxEnv() function from env-exposure.ts which has more comprehensive filtering.
🟡 Warning src/providers/ai-check-provider.ts:1680-1772
The evaluateMcpServersJs() method validates that server configs have command, url, or workflow properties, but doesn't validate the actual values. Malicious configurations could inject dangerous commands, point to malicious URLs, or reference unauthorized workflows. The validation only checks property existence, not content safety.
💡 SuggestionAdd validation for command values to detect shell metacharacters (similar to mcp-check-provider.ts line 120-123). Validate URLs to ensure they use allowed protocols and don't point to internal addresses. Validate workflow references against an allowlist.
🟡 Warning src/types/config.ts:368-401
The McpServerConfig interface allows optional command, url, and workflow properties without strict validation requirements. This enables configuration of MCP servers that could execute arbitrary commands (command), connect to arbitrary URLs (url), or invoke any workflow (workflow) without proper authorization checks. The interface doesn't enforce security constraints.
💡 SuggestionAdd validation logic to enforce security constraints when processing McpServerConfig. Commands should be validated against an allowlist. URLs should be restricted to allowed protocols and optionally to specific domains. Workflow references should be validated against an allowlist or namespace restrictions.
🟡 Warning src/providers/ai-check-provider.ts:1131-1177
Workflow entries extracted from ai_mcp_servers are not validated before being added to customToolsToLoad. The code checks for the presence of 'workflow' property but doesn't validate that the workflow ID is allowed or exists. This could allow unauthorized workflow execution if an attacker can influence the ai_mcp_servers_js expression.
💡 SuggestionValidate workflow references against an allowlist before adding them to custom tools. Ensure the workflow exists and the current context has permission to execute it. Add namespace restrictions to prevent accessing workflows from other projects or contexts.

Architecture Issues (8)

Severity Location Issue
🟠 Error src/providers/ai-check-provider.ts:1778
buildSafeEnv() duplicates existing buildSandboxEnv() from src/utils/env-exposure.ts. The new implementation uses regex patterns to filter sensitive keys, while buildSandboxEnv() uses a denylist with VISOR_ALLOW_ENV/VISOR_DENY_ENV configuration. This creates two different security policies for environment variable exposure.
💡 SuggestionReuse the existing buildSandboxEnv() utility from src/utils/env-exposure.ts instead of implementing a new filtering mechanism. This ensures consistent security policy across all sandbox usage points and maintains a single source of truth for environment variable exposure.
🔧 Suggested Fix
Replace buildSafeEnv() method with:

private buildSafeEnv(): Record<string, string> {
const { buildSandboxEnv } = require('../utils/env-exposure');
return buildSandboxEnv(process.env);
}

🟢 Info src/providers/ai-check-provider.ts:1063
Environment variable resolution is applied to MCP server env configs but not to other configuration values that might contain ${VAR} placeholders. This creates inconsistency in how environment variables are handled throughout the configuration.
💡 SuggestionConsider applying EnvironmentResolver.resolveValue() consistently across all configuration values that support environment variable placeholders, or document why env resolution is only needed for MCP server configs.
🔧 Suggested Fix
N/A - this is a documentation/suggestion item
🟢 Info src/utils/sandbox.ts:246
The fix for parallel sandbox execution (passing log through scope instead of header) is a pragmatic solution but creates an implicit dependency between injectLog option and user code. If injectLog=false, user code can still declare its own log function, which is good, but the dual mechanism (scope vs header) is subtle.
💡 SuggestionDocument the injectLog behavior clearly: when true, log is injected into scope and user code should not redeclare it; when false, user code may declare its own log. Consider adding a runtime check to detect if user code declares log when injectLog=true.
🔧 Suggested Fix
N/A - this is a documentation/suggestion item
🟢 Info src/utils/worktree-manager.ts:109
Making worktree IDs deterministic by removing Date.now() from the hash is a good change for reuse, but the comment doesn't explain the trade-off: multiple calls with the same repo+ref will now return the same worktree, which assumes the worktree content doesn't change between calls.
💡 SuggestionDocument the assumption that worktree content is stable for a given repo+ref combination. If this assumption is violated (e.g., if refs can move), consider adding a timestamp or commit hash to the worktree ID.
🔧 Suggested Fix
N/A - this is a documentation/suggestion item
🟡 Warning src/providers/ai-check-provider.ts:1606
evaluateCustomToolsJs() and evaluateMcpServersJs() contain 95% identical code for building the jsContext object. The only difference is the logPrefix parameter. This violates DRY principles and makes maintenance harder.
💡 SuggestionExtract the common context-building logic into a private method buildJsContext() that takes prInfo, dependencyResults, and config as parameters. Both methods can then call this helper and add their specific logPrefix.
🔧 Suggested Fix
Extract to shared method:

private buildJsContext(
prInfo: PRInfo,
dependencyResults: Map<string, ReviewSummary>,
config: CheckProviderConfig
): Record<string, unknown> {
const outputs: Record<string, unknown> = {};
for (const [checkId, result] of dependencyResults.entries()) {
const summary = result as ReviewSummary & { output?: unknown };
outputs[checkId] = summary.output !== undefined ? summary.output : summary;
}

return {
outputs,
inputs: (config as any).inputs || {},
pr: {
number: prInfo.number,
title: prInfo.title,
description: prInfo.body,
author: prInfo.author,
branch: prInfo.head,
base: prInfo.base,
authorAssociation: prInfo.authorAssociation,
},
files: prInfo.files?.map(f => ({
filename: f.filename,
status: f.status,
additions: f.additions,
deletions: f.deletions,
changes: f.changes,
})) || [],
env: this.buildSafeEnv(),
memory: (config as any).__memoryAccessor || {},
};
}

🟡 Warning src/providers/ai-check-provider.ts:1035
The MCP server configuration merge logic has 6 steps (global, check-level, ai.mcpServers, ai_mcp_servers_js, env resolution, custom tools) with complex override semantics. The dynamic ai_mcp_servers_js completely overrides static configs rather than merging, which is inconsistent with how ai_custom_tools_js works (merges with static).
💡 SuggestionConsider making ai_mcp_servers_js merge behavior consistent with ai_custom_tools_js (merge instead of override), or document why override is the correct behavior. The current inconsistency makes the API harder to understand and use correctly.
🔧 Suggested Fix
Change line 1047 from:
Object.assign(mcpServers, dynamicServers);
To a merge that preserves static servers:
for (const [name, cfg] of Object.entries(dynamicServers)) {
  mcpServers[name] = cfg;
}
🟡 Warning src/types/config.ts:371
McpServerConfig has become a union type disguised as an interface with optional properties. The 'type detection by property presence' pattern (command=stdio, url=sse/http, workflow=workflow) is implicit and error-prone. This violates the 'explicit is better than implicit' principle.
💡 SuggestionConsider using a discriminated union type with explicit 'type' field for better type safety and clearer intent:

type McpServerConfig =
| { type: 'stdio'; command: string; args?: string[]; env?: Record<string, string> }
| { type: 'sse' | 'http'; url: string; transport?: 'sse' | 'http' }
| { type: 'workflow'; workflow: string; inputs?: Record<string, unknown>; description?: string };

This makes the type system enforce correct usage and prevents invalid configurations.

🔧 Suggested Fix
Replace interface with discriminated union:

export type McpServerConfig =
| McpStdioServer
| McpHttpServer
| McpWorkflowTool;

interface McpStdioServer {
type: 'stdio';
command: string;
args?: string[];
env?: Record<string, string>;
allowedMethods?: string[];
blockedMethods?: string[];
}

interface McpHttpServer {
type: 'sse' | 'http';
url: string;
transport?: 'sse' | 'http';
allowedMethods?: string[];
blockedMethods?: string[];
}

interface McpWorkflowTool {
type: 'workflow';
workflow: string;
inputs?: Record<string, unknown>;
description?: string;
allowedMethods?: string[];
blockedMethods?: string[];
}

🟡 Warning src/providers/ai-check-provider.ts:1143
Workflow tool extraction from ai_mcp_servers is a special case that treats workflow entries differently from other MCP server types. This creates an implicit convention where 'workflow' property triggers special handling, which is not obvious from the type definition.
💡 SuggestionMake workflow tools a first-class concept with explicit configuration (ai_workflow_tools or ai_workflow_tools_js) rather than overloading ai_mcp_servers. This would eliminate the need for extraction logic and make the API more explicit.
🔧 Suggested Fix
Consider adding separate config properties:

// In CheckConfig interface:
ai_workflow_tools?: Array<string | WorkflowToolReference>;
ai_workflow_tools_js?: string;

// Then remove the workflow extraction logic from lines 1143-1173

Performance Issues (8)

Severity Location Issue
🟢 Info src/providers/ai-check-provider.ts:1032
The code iterates through mcpServers values twice: once for environment variable resolution and once for workflow entry extraction. These could be combined into a single pass.
💡 SuggestionCombine the environment variable resolution loop with the workflow entry extraction loop into a single iteration over mcpServers.
🔧 Suggested Fix
// 5. Resolve environment variable placeholders and extract workflow entries in single pass
const workflowEntriesFromMcp: WorkflowToolReference[] = [];
const mcpEntriesToRemove: string[] = [];

for (const [serverName, serverConfig] of Object.entries(mcpServers)) {
// Resolve environment variables
if (serverConfig.env) {
const resolvedEnv: Record<string, string> = {};
for (const [key, value] of Object.entries(serverConfig.env)) {
if (typeof value === 'string') {
resolvedEnv[key] = String(EnvironmentResolver.resolveValue(value));
} else {
resolvedEnv[key] = String(value);
}
}
serverConfig.env = resolvedEnv;
}

// Extract workflow entries
const cfg = serverConfig as unknown as Record<string, unknown>;
if (cfg.workflow && typeof cfg.workflow === 'string') {
workflowEntriesFromMcp.push({
workflow: cfg.workflow as string,
args: cfg.inputs as Record<string, unknown> | undefined,
});
mcpEntriesToRemove.push(serverName);
logger.debug(
[AICheckProvider] Extracted workflow tool &#39;${serverName}&#39; from ai_mcp_servers
);
}
}

// Remove workflow entries from mcpServers
for (const name of mcpEntriesToRemove) {
delete mcpServers[name];
}

🟢 Info src/providers/ai-check-provider.ts:1079
The code performs multiple iterations over custom tools arrays for deduplication. When both dynamic and static tools exist, it iterates through static tools twice (once to create Set, once to merge).
💡 SuggestionOptimize the merge logic to use a single pass with a Set for O(1) lookups.
🔧 Suggested Fix
// Option 2: Check for ai_custom_tools (static - backward compatible)
// Merge with dynamic tools if both are specified
const staticCustomTools = this.getCustomToolsForAI(config);
if (staticCustomTools.length > 0) {
  if (customToolsToLoad.length > 0) {
    // Merge dynamic and static tools (avoid duplicates by name) - single pass
    const existingNames = new Set(
      customToolsToLoad.map(item => (typeof item === 'string' ? item : item.workflow))
    );
    for (const tool of staticCustomTools) {
      const name = typeof tool === 'string' ? tool : tool.workflow;
      if (!existingNames.has(name)) {
        existingNames.add(name); // Add to Set for future lookups
        customToolsToLoad.push(tool);
      }
    }
  } else {
    customToolsToLoad = staticCustomTools;
    customToolsServerName = '__custom_tools__';
  }
}
🟢 Info src/providers/ai-check-provider.ts:1167
Similar inefficient iteration pattern when merging workflow entries with existing custom tools. Creates a Set from existing tools then iterates workflow entries without adding new names to the Set for potential future lookups.
💡 SuggestionAdd workflow tool names to the existingNames Set during iteration to maintain O(1) lookup performance if more tools are added later.
🔧 Suggested Fix
// Merge workflow entries with other custom tools
if (workflowEntriesFromMcp.length > 0) {
  if (customToolsToLoad.length > 0) {
    // Avoid duplicates
    const existingNames = new Set(
      customToolsToLoad.map(item => (typeof item === 'string' ? item : item.workflow))
    );
    for (const wf of workflowEntriesFromMcp) {
      if (!existingNames.has(wf.workflow)) {
        existingNames.add(wf.workflow); // Maintain Set for consistency
        customToolsToLoad.push(wf);
      }
    }
  } else {
    customToolsToLoad = workflowEntriesFromMcp;
  }
  customToolsServerName = '__tools__';
}
🟢 Info src/utils/sandbox.ts:228
compileAndRun performs multiple string operations on safePrefix: replace() called 3 times, then slice(). These could be combined into a single regex replace.
💡 SuggestionCombine the sanitization operations into a single regex replace for better performance.
🔧 Suggested Fix
const inject = opts?.injectLog === true;
let safePrefix = String(opts?.logPrefix ?? '[sandbox]');
// Sanitize prefix aggressively: drop control chars and risky tokens, limit length
safePrefix = safePrefix
  .replace(/[\r\n\t\0`$\\]|\$\{/g, '') // Combined regex for all patterns
  .slice(0, 64);
🟡 Warning src/providers/ai-check-provider.ts:1778
buildSafeEnv() iterates through all process.env entries on every evaluation call, performing regex pattern matching against each key. This is O(n) where n is the number of environment variables. For frequent sandbox evaluations, this creates unnecessary overhead.
💡 SuggestionCache the filtered environment result since process.env doesn't change during execution. Store the result in a private property and reuse it across multiple evaluateMcpServersJs/evaluateCustomToolsJs calls.
🔧 Suggested Fix
private cachedSafeEnv: Record<string, string> | null = null;

private buildSafeEnv(): Record<string, string> {
if (this.cachedSafeEnv) {
return this.cachedSafeEnv;
}
const sensitivePatterns = [
/api.?key/i,
/secret/i,
/token/i,
/password/i,
/credential/i,
/auth/i,
/private/i,
];
const safeEnv: Record<string, string> = {};

for (const [key, value] of Object.entries(process.env)) {
if (value === undefined) continue;
const isSensitive = sensitivePatterns.some(pattern => pattern.test(key));
if (!isSensitive) {
safeEnv[key] = value;
}
}

this.cachedSafeEnv = safeEnv;
return safeEnv;
}

🟡 Warning src/providers/ai-check-provider.ts:1606
evaluateCustomToolsJs and evaluateMcpServersJs both build identical context objects (outputs, pr, files, env, memory) independently. This duplicates work when both expressions are evaluated in the same check execution.
💡 SuggestionExtract the context building logic into a shared private method that both evaluateCustomToolsJs and evaluateMcpServersJs can call. This reduces code duplication and ensures consistent context structure.
🔧 Suggested Fix
private buildSandboxContext(
  prInfo: PRInfo,
  dependencyResults: Map<string, ReviewSummary>,
  config: CheckProviderConfig
): Record<string, unknown> {
  const outputs: Record<string, unknown> = {};
  for (const [checkId, result] of dependencyResults.entries()) {
    const summary = result as ReviewSummary & { output?: unknown };
    outputs[checkId] = summary.output !== undefined ? summary.output : summary;
  }

return {
outputs,
inputs: (config as any).inputs || {},
pr: {
number: prInfo.number,
title: prInfo.title,
description: prInfo.body,
author: prInfo.author,
branch: prInfo.head,
base: prInfo.base,
authorAssociation: prInfo.authorAssociation,
},
files:
prInfo.files?.map(f => ({
filename: f.filename,
status: f.status,
additions: f.additions,
deletions: f.deletions,
changes: f.changes,
})) || [],
env: this.buildSafeEnv(),
memory: (config as any).__memoryAccessor || {},
};
}

🟡 Warning src/providers/ai-check-provider.ts:1680
evaluateMcpServersJs duplicates the same context building logic as evaluateCustomToolsJs, creating outputs object and jsContext with identical structure.
💡 SuggestionUse the shared buildSandboxContext method instead of duplicating the context building logic.
🔧 Suggested Fix
private evaluateMcpServersJs(
  expression: string,
  prInfo: PRInfo,
  dependencyResults: Map<string, ReviewSummary>,
  config: CheckProviderConfig
): Record<string, import('../types/config').McpServerConfig> {
  if (!this.sandbox) {
    this.sandbox = createSecureSandbox();
  }

const jsContext = this.buildSandboxContext(prInfo, dependencyResults, config);
// ... rest of method

🟡 Warning src/providers/ai-check-provider.ts:1606
evaluateCustomToolsJs duplicates the same context building logic as evaluateMcpServersJs, creating outputs object and jsContext with identical structure.
💡 SuggestionUse the shared buildSandboxContext method instead of duplicating the context building logic.
🔧 Suggested Fix
private evaluateCustomToolsJs(
  expression: string,
  prInfo: PRInfo,
  dependencyResults: Map<string, ReviewSummary>,
  config: CheckProviderConfig
): Array<string | WorkflowToolReference> {
  if (!this.sandbox) {
    this.sandbox = createSecureSandbox();
  }

const jsContext = this.buildSandboxContext(prInfo, dependencyResults, config);
// ... rest of method

Quality Issues (13)

Severity Location Issue
🟠 Error src/providers/ai-check-provider.ts:1680-1772
Missing unit tests for evaluateMcpServersJs() method. The PR description claims 'Added unit tests for evaluateMcpServersJs() method' but the test file only contains tests for evaluateCustomToolsJs(). Critical functionality for dynamic MCP server selection is untested.
💡 SuggestionAdd comprehensive unit tests for evaluateMcpServersJs() covering: 1) Valid server config objects with command/url/workflow properties, 2) Invalid return types (array, null, primitive), 3) Invalid server config objects, 4) Context access (outputs, inputs, pr, files, env, memory), 5) Syntax and runtime error handling, 6) Log helper injection
🟢 Info src/utils/sandbox.ts:239-258
Fix for parallel sandbox execution passes 'log' through scope instead of header declaration, but the comment mentions 'shared global state'. However, each call to compileAndRun() creates its own scopeWithLog object, so there's no actual shared state. The fix may be addressing a symptom rather than the root cause.
💡 SuggestionVerify the actual race condition scenario. If the issue was multiple executions sharing the same sandbox instance with header-declared 'log', document that clearly. Consider making sandbox instances immutable after compilation.
🟢 Info src/utils/worktree-manager.ts:100-119
Deterministic worktree ID generation uses MD5 hash truncated to 8 characters. While MD5 is sufficient for this use case, 8 hex characters (32 bits) has a non-trivial collision probability with many repositories. The comment says 'enables worktree reuse' but doesn't address collision handling.
💡 SuggestionConsider using 12+ characters for better collision resistance. Add collision detection/handling logic if a worktree with the same ID but different repo/ref already exists.
🟢 Info src/providers/mcp-custom-sse-server.ts:489-510
SSE message logging creates string previews and logs at debug level for every message. For high-frequency MCP communication, this could cause performance issues and log bloat. The preview calculation (substring operations) happens even when debug logging is disabled.
💡 SuggestionMove preview calculation inside a conditional check for debug level: if (logger.isDebugEnabled()) { ... }. Consider making preview length configurable.
🟢 Info src/providers/workflow-tool-executor.ts:226-250
Workflow completion logging uses logger.info for output keys and logger.debug for output preview. This creates inconsistent visibility - users see keys but not values. The preview truncation at 500 chars may hide important information.
💡 SuggestionConsider using debug for both or info for both. Make preview length configurable. Consider adding a separate 'verbose' mode for full output logging.
🟡 Warning src/providers/ai-check-provider.ts:1599-1678
Significant code duplication between evaluateCustomToolsJs() and evaluateMcpServersJs(). Both methods build identical context objects (outputs, inputs, pr, files, env, memory) and use the same sandbox evaluation pattern. This violates DRY principle and increases maintenance burden.
💡 SuggestionExtract common context building logic into a private method buildJsContext(prInfo, dependencyResults, config) and refactor both evaluation methods to use it. Consider a generic evaluateJsExpression<T>(expression, context, options) method.
🟡 Warning src/providers/ai-check-provider.ts:1774-1808
buildSafeEnv() uses incomplete regex patterns for filtering sensitive environment variables. Patterns like /auth/i and /private/i are too broad and may filter legitimate non-sensitive variables (e.g., 'AUTHOR', 'PUBLISH_PRIVATE'). Conversely, sensitive variables with non-standard naming (e.g., 'KEY', 'CREDENTIALS_FILE') would leak through.
💡 SuggestionUse a whitelist approach instead of blacklist. Only expose known-safe environment variables (e.g., 'CI', 'NODE_ENV', 'PATH', 'HOME', 'USER'). Alternatively, use more specific patterns like /api[_-]?key/i, /secret[_-]?key/i, /access[_-]?token/i, and add word boundaries.
🟡 Warning src/providers/ai-check-provider.ts:1035-1054
Error handling for ai_mcp_servers_js evaluation swallows errors and continues silently. While this prevents workflow crashes, it makes configuration errors invisible to users. The error is logged but the check proceeds without dynamic servers, potentially causing unexpected behavior.
💡 SuggestionConsider adding a configuration option (e.g., 'strict_js_evaluation') that allows users to fail the check on JavaScript evaluation errors. Alternatively, increment a metric or add a warning annotation to the PR comment when dynamic evaluation fails.
🟡 Warning src/providers/ai-check-provider.ts:1068-1080
Similar error swallowing issue for ai_custom_tools_js evaluation. Errors are logged but the check continues with fallback to static tools, which may hide misconfigurations.
💡 SuggestionSame as above - consider adding strict mode or visibility for evaluation failures.
🟡 Warning src/providers/ai-check-provider.ts:1135-1180
Complex merge strategy for custom tools across 4 different sources (ai_custom_tools_js, ai_custom_tools, MCP servers with 'tools:', workflow entries from MCP) is difficult to reason about and has potential for subtle bugs. The deduplication logic only checks workflow names, not full tool configurations.
💡 SuggestionDocument the merge strategy clearly with examples. Consider simplifying to 2-3 sources max. Add tests verifying merge behavior when multiple sources specify the same tool with different configurations.
🟡 Warning src/utils/workspace-manager.ts:213-238
Workspace reuse logic checks for existing repository + worktreePath combinations but doesn't verify if the worktree is still valid or up-to-date. A stale worktree from a previous run could be reused, potentially causing incorrect results.
💡 SuggestionAdd validation to ensure the reused worktree exists and is valid (e.g., check git status, verify it's not corrupted). Consider adding a timestamp or checksum to detect stale worktrees.
🟡 Warning tests/unit/providers/ai-check-provider.test.ts:626-790
Tests use hardcoded string values without clear semantic meaning. For example, expect(result).toEqual(['engineer-tool', 'jira-tool']) uses magic strings that appear to be reverse-engineered from the implementation rather than derived from requirements.
💡 SuggestionDefine constants at the top of the test file with clear semantic meaning (e.g., const EXPECTED_ENGINEER_TOOLS = ['engineer-tool', 'jira-tool']). Add comments explaining why these specific values are expected based on the test scenario.
🟡 Warning tests/unit/providers/ai-check-provider.test.ts:626-790
Test suite lacks comprehensive negative testing for evaluateCustomToolsJs(). While there are tests for syntax/runtime errors, there are no tests for: 1) Malicious code attempting to escape sandbox, 2) Infinite loops or resource exhaustion, 3) Prototype pollution attempts, 4) Very large return values, 5) Circular references in returned objects.
💡 SuggestionAdd security-focused negative tests to verify sandbox hardening. Test with attempts to access process, require, __proto__, constructor, etc. Add tests for resource limits (timeout, memory).

Powered by Visor from Probelabs

Last updated: 2026-02-02T14:11:17.462Z | Triggered by: pr_updated | Commit: 710def7

💡 TIP: You can chat with Visor using /visor ask <your question>

buger and others added 4 commits February 1, 2026 15:57
The worktree tests (.visor/worktrees/) run against historical code versions
that have singleton state leakage issues (MemoryStore). Tests pass individually
but fail when run in sequence due to shared state.

This change explicitly specifies test directories (defaults/, tests/, examples/)
rather than running all tests recursively, ensuring pre-commit hooks are reliable.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Update validation to accept entries with 'workflow' property
- Add Option 4 to extract workflow entries from ai_mcp_servers
- Update McpServerConfig type to support unified flat detection:
  - command → stdio MCP server
  - url → SSE/HTTP MCP server
  - workflow → workflow tool reference
  - empty {} → auto-detect from tools: section

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Workflow tools like tyk-code-talk can run complex operations
(code search, AI analysis) that take several minutes. The default
30-second MCP timeout was causing these calls to fail.

Set 10-minute (600000ms) timeout for the __tools__ SSE server
that hosts workflow tools.

Co-Authored-By: Claude Opus 4.5 <[email protected]>
- Fix sandbox parallel execution errors by passing `log` through scope
  instead of declaring in code header (prevents "already declared" errors)
- Only inject `log` when injectLog=true, allowing callers to define their own
- Fix wrapFunction default to handle partial options correctly
- Add detailed logging for MCP SSE communication and workflow tool execution
- Make worktree IDs deterministic for reuse across multiple calls
- Reuse existing project paths in workspace manager to prevent duplicates
- Update @probelabs/probe to v0.6.0-rc210
- Add enableTasks option propagation to ProbeAgent

Co-Authored-By: Claude Opus 4.5 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants