feat(mcp): Add monorepo submodule support with caching and security enhancements #1082
Conversation
Add 4 new MCP tools for monorepo development workflows:

- init_monorepo_config: Auto-detect project structure (Rust/TS/Go/Python)
- list_submodules: List configured submodules with dependencies
- get_submodule_context: Load submodule content with caching
- invalidate_submodule_cache: Manually invalidate cache

Key features:

- Git commit-based cache invalidation
- Support for git submodules with independent history
- Dependency graph with cycle detection
- Justfile and script generation for cache management
- Project detector for Rust workspaces, npm/pnpm workspaces, and Go workspaces

This enables loading specific submodules on demand instead of packing the entire monorepo, significantly reducing token usage for large projects.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
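For orientation, a call to the new get_submodule_context tool might look roughly like this (a hypothetical invocation; the parameter names follow the Zod schemas added in this PR, the values are made up):

```ts
// Hypothetical MCP tool call; values are illustrative only.
const request = {
  name: 'get_submodule_context',
  arguments: {
    projectRoot: '/path/to/monorepo', // validated, max 4096 chars
    submoduleName: 'core',            // min 1 / max 255 chars
    includeDependencies: true,        // also load submodules that 'core' depends on
  },
};
```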
Add security hardening to prevent path traversal attacks:

- Create pathValidator.ts with validation utilities:
  - validatePathWithinRoot(): prevents escaping the project root
  - validateProjectRoot(): normalizes and validates the project root
  - sanitizeSubmoduleName(): sanitizes names for safe file paths
  - validateSubmodulePath(): validates submodule paths
- Apply path validation to all MCP tools:
  - get_submodule_context: validate projectRoot and submodule paths
  - list_submodules: validate projectRoot
  - invalidate_submodule_cache: validate projectRoot
  - init_monorepo_config: validate projectRoot
- Add Zod schema constraints:
  - submoduleName: min(1), max(255)
  - projectRoot: max(4096)
  - submodules array: max(100) items
  - cacheDirectory: min(1), max(255)
- Add 20 unit tests for path validation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
[MOTIVATION]
Multiple repomix processes running simultaneously in the same directory can cause data races and corrupt output files. Users need protection against accidental concurrent execution.

[IMPLEMENTATION DETAILS]
- Add src/core/lock/fileLock.ts with a PID-based lock file mechanism
- Lock is acquired before task execution in defaultAction.ts
- Stale lock detection (checks if the holding process is still alive)
- Lock automatically released on completion or error (via a finally block)
- Skip locking for stdout mode (no file conflict possible)
- 30-minute timeout as a fallback for zombie processes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
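A minimal sketch of the PID-based lock pattern described above (illustrative only; the actual implementation lives in src/core/lock/fileLock.ts and differs in details such as the 30-minute timeout):

```ts
import fs from 'node:fs/promises';

interface LockInfo {
  pid: number;
  startTime: number;
}

// Signal 0 checks liveness without actually signaling the process
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch {
    return false;
  }
}

async function acquireLock(lockPath: string): Promise<void> {
  const info: LockInfo = { pid: process.pid, startTime: Date.now() };
  try {
    // 'wx' creates the file atomically and fails with EEXIST if it already exists
    await fs.writeFile(lockPath, JSON.stringify(info), { flag: 'wx' });
  } catch (error) {
    if ((error as NodeJS.ErrnoException).code !== 'EEXIST') throw error;
    const existing = JSON.parse(await fs.readFile(lockPath, 'utf-8')) as LockInfo;
    if (!isProcessAlive(existing.pid)) {
      // Stale lock: the holder is gone, so remove it and retry once
      await fs.rm(lockPath, { force: true });
      return acquireLock(lockPath);
    }
    throw new Error(`Directory is locked by PID ${existing.pid}`);
  }
}

async function withLock<T>(lockPath: string, fn: () => Promise<T>): Promise<T> {
  await acquireLock(lockPath);
  try {
    return await fn();
  } finally {
    await fs.rm(lockPath, { force: true }); // released even when fn throws
  }
}
```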
[MOTIVATION]
SEC-005: MonorepoConfigSchema had no size limit on the submodules dictionary. Malicious configuration files could define thousands of submodules, causing memory exhaustion or excessive processing time.

[IMPLEMENTATION DETAILS]
- Added a MAX_SUBMODULES constant (1000) to define the maximum allowed submodules
- Added .refine() validation to the submodules field using Zod
- Provides a clear error message when the limit is exceeded
- Maintains backward compatibility for normal usage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Add Zod schema validation when reading cache metadata to prevent security issues from malformed or malicious cache files.

[MOTIVATION]
Cache metadata files are read from disk and parsed as JSON without validation. This creates a security vulnerability where malformed or malicious cache files could cause runtime errors or unexpected behavior.

[IMPLEMENTATION DETAILS]
- Created CacheMetadataSchema in cacheTypes.ts using Zod
- Schema matches all fields in the CacheMetadata interface
- Added validation for numeric fields (fileCount, tokenCount) to be non-negative integers
- Updated the readMeta() method to use CacheMetadataSchema.parse()
- Zod validation throws meaningful errors on invalid data

Fixes security issue SEC-003

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
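As a usage sketch, parsing untrusted metadata then fails loudly at the boundary instead of propagating bad values (a simplified schema for illustration; the real one has more fields):

```ts
import { z } from 'zod';

const CacheMetadataSchema = z.object({
  submodule: z.string(),
  gitCommit: z.string(),
  fileCount: z.number().int().nonnegative(),
  tokenCount: z.number().int().nonnegative(),
});

// A tampered meta file with a negative count is rejected here
const raw: unknown = JSON.parse(
  '{"submodule":"core","gitCommit":"abc123","fileCount":-1,"tokenCount":10}',
);
const result = CacheMetadataSchema.safeParse(raw);
if (!result.success) {
  console.error(result.error.issues); // detailed, field-level validation errors
}
```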
📝 Walkthrough

Introduces a comprehensive monorepo caching and dependency management system featuring file-based locking for directory synchronization, configuration schemas for submodule organization, project type detection, dependency graph analysis with cycle detection, per-submodule caching with git-based invalidation, and four new MCP tools for monorepo operations.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant MCP as MCP Tool:<br/>getSubmoduleContext
    participant Config as monorepo<br/>config
    participant Graph as Dependency<br/>Graph
    participant Cache as Cache<br/>Manager
    participant CLI as CLI Packer
    User->>MCP: Call with submoduleName,<br/>includeDependencies
    MCP->>Config: Load monorepo config
    Config-->>MCP: Config loaded
    MCP->>Cache: Check cache for submodule
    alt Cache valid
        Cache-->>MCP: Cached content
    else Cache miss/invalid
        MCP->>CLI: Generate submodule content
        CLI-->>MCP: Packed content
        MCP->>Cache: Save to cache
        Cache-->>MCP: Cached
    end
    alt Include dependencies
        MCP->>Graph: Resolve dependencies
        Graph-->>MCP: Dependency list
        loop For each dependency
            MCP->>Cache: Load/generate dependency
            Cache-->>MCP: Dependency content
        end
    end
    MCP-->>User: Response with content +<br/>metadata + token counts
```
```mermaid
sequenceDiagram
    participant User
    participant MCP as MCP Tool:<br/>initMonorepoConfig
    participant Detector as Project<br/>Detector
    participant Config as Config<br/>Saver
    participant Generator as Justfile<br/>Generator
    participant Git as .gitignore
    User->>MCP: Call with projectRoot
    MCP->>Detector: Detect project structure
    Detector-->>MCP: Project type + submodules
    MCP->>Config: Save monorepo config
    Config-->>MCP: Config saved
    alt Generate justfile
        MCP->>Generator: Generate justfile
        Generator-->>MCP: Justfile content
        MCP->>Git: Update .gitignore
    end
    MCP-->>User: Success with config/justfile<br/>paths + detected info
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes

Reasoning: This PR introduces ~2,500 lines across multiple new modules with heterogeneous logic domains (locking, caching, dependency analysis, project detection, security, MCP tools). Each module contains distinct algorithms (cycle detection, topological sorting, path traversal prevention) and complex interactions (cache invalidation, dependency resolution). The integration points (defaultAction, mcpServer) and dependency injection patterns add further reasoning overhead. However, the changes are well-organized into cohesive modules and include comprehensive test coverage, reducing uncertainty.
🚥 Pre-merge checks: ✅ 3 checks passed
Summary of Changes

Hello @RyderFreeman4Logos, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances repomix's MCP server with monorepo submodule support, caching, and security hardening.
Actionable comments posted: 8
🤖 Fix all issues with AI agents
In @.gitignore:
- Line 42: The .gitignore entry **/*.repomix.lock does not match the lock file
created by fileLock.ts because that file is named .repomix.lock (leading dot);
update the pattern to include hidden filenames—e.g., add .repomix.lock and/or
**/.repomix.lock (or simply /.repomix.lock) so the actual .repomix.lock created
by fileLock.ts is ignored.
In @src/mcp/cache/cacheManager.ts:
- Around line 270-281: The listCached method uses the global fs.readdir instead
of the injected dependency; change listCached to call
this.deps.fsReaddir(this.cacheDir) and await its result, then filter for
'.meta.json' and map to remove the suffix as before, and keep the same try/catch
behavior (return [] on error) so tests can inject a mock fsReaddir via deps.
In @src/mcp/config/monorepoConfigLoader.ts:
- Around line 44-50: The MonorepoConfigSchema's submodules record uses
z.string() for keys which allows unsafe names; update the schema so submodule
keys are validated for path-safety (no path traversal segments, no absolute
paths, no directory separators, allowed chars only) by replacing the key schema
with a refined or transformed string (e.g., z.string().refine(...) or using
z.preprocess/transform) that enforces a safe-name regex and rejects names like
"../../../etc" before applying SubmoduleConfigSchema; keep the existing
MAX_SUBMODULES refinement and ensure the error message clearly indicates an
invalid submodule name when the key validation fails.
- Around line 90-97: The catch block in monorepoConfigLoader.ts currently maps
all errors to a null return, hiding validation issues; update the error handling
in that catch (referencing configPath and logger) so ENOENT still returns null,
but Zod validation failures (detect via error.name === 'ZodError' or instance of
ZodError) are handled separately: log at warning/error level with the validation
details and rethrow (or return a failure) so invalid configs are surfaced to
callers; keep the existing trace for other unexpected errors but ensure
validation errors are not silently swallowed.
In @src/mcp/dependency/dependencyGraph.ts:
- Around line 220-238: The current getDepth implementation uses a single visited
Set which short-circuits repeated visits and underestimates depths when nodes
are shared across multiple paths; replace that logic with a memoization Map to
cache computed depths (e.g., depthCache) and use a separate recursion stack Set
only to detect cycles; update calculateDepth to: if depthCache.has(name) return
cached value, mark name in recursion stack to detect cycles, compute max child
depth recursively, store the computed depth in depthCache before returning, and
remove name from recursion stack; keep using this.graph to read dependencies.
In @src/mcp/generation/justfileGenerator.ts:
- Around line 298-310: The loop that builds metaPath from config.submodules (see
the for (const [name] of Object.entries(config.submodules)) block and the
metaPath constant) is vulnerable to path traversal; fix it by sanitizing or
validating each submodule name before using it in path.join: add or call a
sanitizer (e.g., a sanitizeName function or existing validators in
pathValidator.ts) to strip path separators/traversal sequences and unsafe
characters, then use the sanitized name (safeName) when constructing metaPath
and any other file paths; alternatively enforce a Zod validation rule in
MonorepoConfigSchema to reject names containing '/' '\' or '..' and update the
generated script to rely on that validated/sanitized name.
In @src/mcp/tools/getSubmoduleContextTool.ts:
- Around line 196-199: createToolWorkspace() creates a temporary directory
(tempDir) that is never removed; wrap the logic that reads the output file
(using outputFilePath) inside a try/finally so you always clean up the tempDir,
and in the finally call an async removal (e.g., fs.promises.rm or fs.rm with
recursive/force equivalent) awaiting its completion and handling any removal
errors gracefully; update the function containing tempDir (e.g.,
getSubmoduleContextTool) to ensure tempDir is removed even on errors or early
returns.
🧹 Nitpick comments (22)
tests/mcp/security/pathValidator.test.ts (2)
51-55: Platform-specific assertion may fail on Windows. The test assumes absolute paths start with `/`, which is Unix-specific. On Windows, absolute paths start with a drive letter (e.g., `C:\`).

💡 Suggested fix

```diff
 it('should normalize and resolve relative paths', () => {
   const result = validateProjectRoot('./subdir');
   expect(result).toContain('subdir');
-  expect(result.startsWith('/')).toBe(true); // Should be absolute
+  expect(path.isAbsolute(result)).toBe(true); // Should be absolute
 });
```

You'll need to import `path` from `node:path` at the top of the file.
22-37: Consider adding edge case tests for security-critical path validation. The test coverage for `validatePathWithinRoot` is good for basic scenarios. For completeness, consider adding tests for:

- Null byte injection: `'crates/foo\x00/../etc/passwd'`
- Mixed path separators: `'crates\\foo/../bar'`
- Encoded traversal: paths with URL-encoded sequences

These edge cases help ensure the security utilities are robust against more sophisticated attacks; a sketch of such tests follows.
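A hedged sketch of what those tests could look like (vitest style, matching the existing suite; whether each input throws depends on how the validator is specified to handle it, and the argument order of `validatePathWithinRoot` is assumed here — root first, then the user-supplied path):

```ts
import { describe, expect, it } from 'vitest';
import { PathSecurityError, validatePathWithinRoot } from '../../../src/mcp/security/pathValidator';

describe('validatePathWithinRoot - hardening edge cases', () => {
  it('rejects null byte injection', () => {
    expect(() => validatePathWithinRoot('/project', 'crates/foo\x00/../etc/passwd')).toThrow(PathSecurityError);
  });

  it('rejects mixed path separators used for traversal', () => {
    expect(() => validatePathWithinRoot('/project', 'crates\\foo/../bar')).toThrow(PathSecurityError);
  });
});
```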
src/mcp/generation/justfileGenerator.ts (1)
221-227: Consider reusing `sanitizeSubmoduleName` from pathValidator. This `sanitizeName` method has similar logic to `sanitizeSubmoduleName` in `pathValidator.ts` but differs in behavior (lowercase conversion, different character handling). Consider either:

- Reusing the existing utility with an adapter
- Documenting why the differences are intentional (just recipe names vs filesystem paths)
src/core/lock/fileLock.ts (1)
41-54: Consider validating all `LockInfo` fields. The validation only checks that `pid` and `startTime` are numbers, but doesn't verify `cwd` is a string. While this is unlikely to cause issues in practice, complete validation would be more robust.

💡 Suggested improvement

```diff
 if (typeof parsed.pid !== 'number' || typeof parsed.startTime !== 'number') {
   return null;
 }
+if (typeof parsed.cwd !== 'string') {
+  return null;
+}
```

tests/mcp/config/monorepoConfigLoader.test.ts (1)
11-16: Consider adding tests for error scenarios. The test only covers the "file not found" case. Consider adding tests for:
- Invalid JSON in config file
- Valid JSON but invalid schema (e.g., missing required fields)
- Exceeding MAX_SUBMODULES limit
This would help catch regressions if the error handling behavior changes.
tests/cli/actions/defaultAction.tokenCountTree.test.ts (1)
16-20: Consider aligning the FileLockError mock with the actual implementation. The mock `FileLockError` class is simplified compared to the actual implementation, which accepts `lockPath` and `existingLock` parameters. While this may be sufficient for the current tests, consider including these properties to ensure consistency if future tests need to verify error details.

♻️ Suggested enhancement

```diff
-vi.mock('../../../src/core/lock/index', () => ({
-  acquireLock: vi.fn().mockResolvedValue('/mock/.repomix.lock'),
-  releaseLock: vi.fn().mockResolvedValue(undefined),
-  FileLockError: class FileLockError extends Error {},
-}));
+vi.mock('../../../src/core/lock/index', () => ({
+  acquireLock: vi.fn().mockResolvedValue('/mock/.repomix.lock'),
+  releaseLock: vi.fn().mockResolvedValue(undefined),
+  FileLockError: class FileLockError extends Error {
+    constructor(message: string, public lockPath: string, public existingLock?: unknown) {
+      super(message);
+      this.name = 'FileLockError';
+    }
+  },
+}));
```

src/mcp/cache/cacheTypes.ts (1)
10-20: Schema validation addresses the security fix objective. The Zod schema provides type-safe validation for cache metadata parsing. Consider using `z.infer<typeof CacheMetadataSchema>` to derive the `CacheMetadata` interface and eliminate the duplication between the schema (lines 10-20) and interface (lines 25-52).

♻️ Optional: Derive interface from schema

```diff
 export const CacheMetadataSchema = z.object({
   submodule: z.string().describe('Submodule name'),
   generatedAt: z.string().describe('When the cache was generated'),
   gitCommit: z.string().describe('Git commit hash when cache was generated'),
   fileCount: z.number().int().nonnegative().describe('Number of files in the submodule'),
   tokenCount: z.number().int().nonnegative().describe('Total token count of the content'),
   dependencies: z.array(z.string()).describe('List of dependencies (other submodule names)'),
   repomixVersion: z.string().describe('Repomix version used to generate'),
   compressed: z.boolean().describe('Whether compression was enabled'),
   isGitSubmodule: z.boolean().optional().describe('Whether this is a git submodule'),
 });

-/**
- * Metadata stored alongside cached content
- */
-export interface CacheMetadata {
-  /** Submodule name */
-  submodule: string;
-  // ... rest of interface
-}
+/**
+ * Metadata stored alongside cached content
+ */
+export type CacheMetadata = z.infer<typeof CacheMetadataSchema>;
```

src/mcp/tools/initMonorepoConfigTool.ts (1)
133-142: Consider providing feedback when justfile already contains repomix section. When the existing justfile already includes the repomix section marker, `justfilePath` remains `undefined`, so the success message won't indicate justfile status. Users may wonder if their justfile was checked.

♻️ Optional: Add feedback for existing repomix section

```diff
 if (existingContent) {
   // Append to existing justfile if repomix section doesn't exist
   if (!existingContent.includes('# Repomix Monorepo Cache Management')) {
     await fs.writeFile(justfileFullPath, `${existingContent}\n\n${justfileContent}`, 'utf-8');
     justfilePath = justfileFullPath;
+  } else {
+    // Section already exists, no changes needed
+    justfilePath = justfileFullPath; // or use a separate flag to indicate "already configured"
   }
 } else {
   await fs.writeFile(justfileFullPath, justfileContent, 'utf-8');
   justfilePath = justfileFullPath;
 }
```

src/mcp/tools/getSubmoduleContextTool.ts (4)
110-114: Consider logging when configured dependencies are missing. Silently skipping dependencies that aren't found in config (line 113) could mask configuration errors. A warning log would help users identify mismatches between declared dependencies and configured submodules.

♻️ Suggested improvement

```diff
 const subConfig = config.submodules[name];
 if (!subConfig) {
+  logger.warn(`Dependency "${name}" not found in monorepo config, skipping`);
   continue; // Skip if dependency not found in config
 }
```
148-148: Address the TODO: fetch version from package.json. The hardcoded `repomixVersion: '1.0.0'` should be dynamically retrieved to ensure cache metadata accurately reflects the tool version. This could affect cache invalidation logic if the tool behavior changes between versions. Would you like me to help implement dynamic version retrieval from package.json?
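One possible shape for that retrieval (a sketch; the relative path to package.json depends on where the compiled file ends up):

```ts
import { createRequire } from 'node:module';

// createRequire lets ESM code load JSON synchronously at startup
const require = createRequire(import.meta.url);
const { version } = require('../../package.json') as { version: string };

// ...then, when building cache metadata:
// repomixVersion: version,
```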
155-156: Consider using accurate token counts from pack results instead of re-estimating. The `generateSubmoduleContent` function returns `tokenCount` from the actual pack result, but this code re-estimates tokens from content length. For cached entries, `meta.tokenCount` is already available from the cache. This could provide more accurate totals.

♻️ Suggested improvement

```diff
-// 5. Calculate total tokens
-const totalTokens = Object.values(contents).reduce((sum, content) => sum + estimateTokens(content), 0);
+// 5. Calculate total tokens from cache status metadata or estimates
+let totalTokens = 0;
+for (const status of cacheStatus) {
+  if (status.fromCache) {
+    const cached = await cacheManager.get(status.submodule, config.submodules[status.submodule].path, config.submodules[status.submodule].isGitSubmodule);
+    totalTokens += cached?.meta.tokenCount ?? estimateTokens(contents[status.submodule]);
+  } else {
+    // Token count was already computed during generation - could track it
+    totalTokens += estimateTokens(contents[status.submodule]);
+  }
+}
```
222-222: Avoid dynamic import for already-available module. The dynamic `import('node:fs/promises')` is unnecessary overhead. Either import `fs` at the top of the file or reuse an existing import.

♻️ Suggested fix

Add at the top of the file:

```ts
import fs from 'node:fs/promises';
```

Then replace line 222:

```diff
-const output = await import('node:fs/promises').then((fs) => fs.readFile(outputFilePath, 'utf-8'));
+const output = await fs.readFile(outputFilePath, 'utf-8');
```

src/mcp/dependency/dependencyGraph.ts (3)
24-40: Potential inconsistency: dependencies may reference non-existent submodules. The graph construction adds reverse edges for dependencies without verifying they exist as submodules. This could lead to orphan entries in `reverseGraph` for external dependencies that aren't part of the monorepo config. While the code handles this gracefully (returning empty arrays), it might be worth validating or filtering during construction.

♻️ Optional validation during construction

```diff
 constructor(config: MonorepoConfig) {
   this.graph = new Map();
   this.reverseGraph = new Map();
+  const allSubmoduleNames = new Set(Object.keys(config.submodules));

   // Build forward and reverse dependency graphs
   for (const [name, submodule] of Object.entries(config.submodules)) {
-    this.graph.set(name, new Set(submodule.dependencies));
+    // Filter to only internal dependencies
+    const internalDeps = submodule.dependencies.filter(dep => allSubmoduleNames.has(dep));
+    this.graph.set(name, new Set(internalDeps));

     // Initialize reverse graph entry
     if (!this.reverseGraph.has(name)) {
       this.reverseGraph.set(name, new Set());
     }

     // Build reverse edges
-    for (const dep of submodule.dependencies) {
+    for (const dep of internalDeps) {
       if (!this.reverseGraph.has(dep)) {
         this.reverseGraph.set(dep, new Set());
       }
       this.reverseGraph.get(dep)?.add(name);
     }
   }
 }
```
115-150: Cycle detection may report duplicate cycles. The algorithm can report the same cycle multiple times when it's reached from different entry points. For example, a cycle A→B→C→A might be reported as both [A,B,C,A] and [B,C,A,B]. Consider deduplicating cycles by normalizing them (e.g., starting from the lexicographically smallest node).
This is a minor concern since cycles in dependency graphs are typically errors that need to be fixed regardless of exact reporting.
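A sketch of the normalization idea: rotate each detected cycle so it starts at its lexicographically smallest node, then deduplicate on the joined form:

```ts
function normalizeCycle(cycle: string[]): string {
  const nodes = cycle.slice(0, -1); // drop the repeated closing node: [A,B,C,A] -> [A,B,C]
  let minIdx = 0;
  for (let i = 1; i < nodes.length; i++) {
    if (nodes[i] < nodes[minIdx]) minIdx = i;
  }
  // Rotate so the smallest node leads; equal cycles now produce equal keys
  return [...nodes.slice(minIdx), ...nodes.slice(0, minIdx)].join('->');
}

// normalizeCycle(['A','B','C','A']) === normalizeCycle(['B','C','A','B']) // both 'A->B->C'
```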
152-166: Performance: `topologicalSort` performs cycle detection twice. `topologicalSort()` calls `hasCycles()`, which calls `detectCycles()`. For large dependency graphs, this redundant traversal could be optimized by integrating cycle detection into the topological sort itself (Kahn's algorithm naturally detects cycles; see the sketch below). This is acceptable for typical monorepo sizes but worth noting for future optimization if performance becomes a concern.
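For reference, a sketch of Kahn's algorithm over the same Map-based graph shape: it emits a topological order and detects cycles in the same pass, with no separate cycle-detection traversal:

```ts
function kahnSort(graph: Map<string, Set<string>>): string[] | null {
  // Count incoming edges for every node reachable from the graph
  const inDegree = new Map<string, number>();
  for (const node of graph.keys()) inDegree.set(node, inDegree.get(node) ?? 0);
  for (const deps of graph.values()) {
    for (const dep of deps) inDegree.set(dep, (inDegree.get(dep) ?? 0) + 1);
  }

  const queue = [...inDegree.keys()].filter((n) => inDegree.get(n) === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const node = queue.shift()!;
    order.push(node);
    for (const dep of graph.get(node) ?? []) {
      const remaining = inDegree.get(dep)! - 1;
      inDegree.set(dep, remaining);
      if (remaining === 0) queue.push(dep);
    }
  }

  // If some nodes were never emitted, they sit on a cycle
  return order.length === inDegree.size ? order : null;
}
```

Here dependents come before their dependencies; reverse the result if dependencies must come first.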
src/mcp/detection/projectDetector.ts (5)
184-190: Workspace member check may be overly permissive. The check
`member.includes(pkg.name) || member.includes(path.dirname(pkg.manifest_path))` could incorrectly match packages. For example, if `pkg.name` is "core" and a workspace member is "core-utils", this would incorrectly match. Consider more precise matching.

♻️ More precise workspace member matching
```diff
 const isWorkspaceMember =
   workspaceMembers.size === 0 ||
   Array.from(workspaceMembers).some(
-    (member) => member.includes(pkg.name) || member.includes(path.dirname(pkg.manifest_path)),
+    (member) => {
+      // workspace_members format: "package_name version (path)"
+      return member.startsWith(`${pkg.name} `) ||
+        member.includes(`(${path.dirname(pkg.manifest_path)})`);
+    },
   );
```
284-310: Consider caching package.json content from first pass to avoid re-reading. The second pass re-reads
`package.json` for each package that was already read in the first pass. This could be optimized by caching the parsed content. This is a minor performance improvement and may not be necessary for typical workspace sizes.
340-352: Silent fallback when fast-glob is unavailable may cause incomplete detection. If
`fast-glob` isn't installed and complex glob patterns like `packages/**/*` are used, the code silently skips them with only a trace log. This could result in missing submodules without clear indication to the user. Consider either:
- Adding
`fast-glob` as a dependency
- Implementing basic
**pattern handling♻️ Improve visibility of skipped patterns
} catch { // fast-glob not available, skip complex patterns - logger.trace(`Skipping complex pattern: ${pattern}`); + logger.warn(`Cannot expand glob pattern "${pattern}": fast-glob not available. Some packages may not be detected.`); }
366-408: Python package detection doesn't extract inter-package dependencies.Unlike Rust and TypeScript detection which extract workspace dependencies, Python detection always returns empty
dependencies: []. Consider parsingpyproject.tomlfor dependencies that match discovered package names.This is acceptable for initial implementation, but limits the usefulness of dependency graph features for Python monorepos.
243-256: Using a YAML parser would improve robustness, but requires adding a new dependency.The regex-based YAML parsing at lines 246-250 is fragile and can fail with comments, nested structures, or non-standard formatting. However, the suggested
yamlpackage is not currently a project dependency and would need to be added topackage.json. Consider either:
- Adding
yamlorjs-yamlas a dependency and refactoring to use it- Documenting that the current implementation assumes standard pnpm-workspace.yaml format (simple list syntax without comments)
- Adding basic validation to detect unsupported formats and log a warning
src/mcp/cache/cacheManager.ts (2)
138-149: Potential double-slash in git diff path argument.Line 147 appends
/tosubmodulePathwithout checking if it already ends with a slash. While git typically handles this, it's cleaner to normalize:♻️ Normalize path before appending
// Normal directory: check for changes in that path + const normalizedPath = submodulePath.replace(/\/+$/, ''); const result = await this.deps.execFileAsync('git', [ '-C', this.rootDir, 'diff', '--name-only', lastCommit, currentCommit, '--', - `${submodulePath}/`, + `${normalizedPath}/`, ]);
226-243: Non-atomic writes could leave cache in inconsistent state.If the process crashes between writing metadata (line 239) and content (line 240), the cache could have metadata pointing to missing or outdated content. Consider writing to temporary files first, then renaming atomically.
♻️ Atomic write pattern
async set(submoduleName: string, content: string, meta: Omit<CacheMetadata, 'generatedAt'>): Promise<void> { const metaPath = this.getMetaPath(submoduleName); const contentPath = this.getContentPath(submoduleName); + const tempMetaPath = `${metaPath}.tmp`; + const tempContentPath = `${contentPath}.tmp`; const fullMeta: CacheMetadata = { ...meta, generatedAt: new Date().toISOString(), }; // Ensure cache directory exists await this.deps.fsMkdir(this.cacheDir, { recursive: true }); - // Write files - await this.deps.fsWriteFile(metaPath, JSON.stringify(fullMeta, null, 2), 'utf-8'); - await this.deps.fsWriteFile(contentPath, content, 'utf-8'); + // Write to temp files first + await this.deps.fsWriteFile(tempContentPath, content, 'utf-8'); + await this.deps.fsWriteFile(tempMetaPath, JSON.stringify(fullMeta, null, 2), 'utf-8'); + + // Rename atomically (content first, since metadata signals validity) + await fs.rename(tempContentPath, contentPath); + await fs.rename(tempMetaPath, metaPath); logger.trace(`Cache saved for ${submoduleName}`); }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
- .gitignore
- src/cli/actions/defaultAction.ts
- src/core/lock/fileLock.ts
- src/core/lock/index.ts
- src/mcp/cache/cacheManager.ts
- src/mcp/cache/cacheTypes.ts
- src/mcp/config/monorepoConfigLoader.ts
- src/mcp/dependency/dependencyGraph.ts
- src/mcp/detection/projectDetector.ts
- src/mcp/generation/justfileGenerator.ts
- src/mcp/mcpServer.ts
- src/mcp/security/pathValidator.ts
- src/mcp/tools/getSubmoduleContextTool.ts
- src/mcp/tools/initMonorepoConfigTool.ts
- src/mcp/tools/invalidateSubmoduleCacheTool.ts
- src/mcp/tools/listSubmodulesTool.ts
- tests/cli/actions/defaultAction.test.ts
- tests/cli/actions/defaultAction.tokenCountTree.test.ts
- tests/core/lock/fileLock.test.ts
- tests/mcp/cache/cacheManager.test.ts
- tests/mcp/config/monorepoConfigLoader.test.ts
- tests/mcp/dependency/dependencyGraph.test.ts
- tests/mcp/security/pathValidator.test.ts
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-06-30T16:07:18.316Z
Learnt from: CR
Repo: yamadashy/repomix PR: 0
File: .cursorrules:0-0
Timestamp: 2025-06-30T16:07:18.316Z
Learning: Applies to .agents/rules/base.md : Check the rules written in `.agents/rules/base.md` as they contain important project-specific guidelines and instructions.
Applied to files:
.gitignore
📚 Learning: 2025-08-22T05:04:47.803Z
Learnt from: CR
Repo: yamadashy/repomix PR: 0
File: .github/instructions/base.instructions.md:0-0
Timestamp: 2025-08-22T05:04:47.803Z
Learning: Consult and adhere to the project-specific rules documented in .agents/rules/base.md during development and review
Applied to files:
.gitignore
📚 Learning: 2025-07-18T15:12:57.179Z
Learnt from: CR
Repo: yamadashy/repomix PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-18T15:12:57.179Z
Learning: .agents/rules/base.md
Applied to files:
.gitignore
🧬 Code graph analysis (15)
tests/mcp/security/pathValidator.test.ts (1)
src/mcp/security/pathValidator.ts (5)
validatePathWithinRoot(22-38)PathSecurityError(6-11)validateProjectRoot(47-53)sanitizeSubmoduleName(63-81)validateSubmodulePath(92-100)
tests/cli/actions/defaultAction.test.ts (2)
src/core/lock/fileLock.ts (1)
FileLockError(14-23)src/core/lock/index.ts (1)
FileLockError(2-2)
src/mcp/security/pathValidator.ts (1)
website/server/scripts/bundle.mjs (1)
rootDir(19-19)
src/mcp/tools/listSubmodulesTool.ts (4)
src/mcp/security/pathValidator.ts (1)
validateProjectRoot(47-53)src/mcp/config/monorepoConfigLoader.ts (2)
loadMonorepoConfig(83-98)getSubmoduleNames(121-123)src/mcp/tools/mcpToolRuntime.ts (3)
buildMcpToolErrorResponse(234-248)buildMcpToolSuccessResponse(215-227)convertErrorToJson(166-208)src/mcp/dependency/dependencyGraph.ts (1)
DependencyGraph(16-280)
tests/core/lock/fileLock.test.ts (3)
src/core/lock/fileLock.ts (4)
acquireLock(83-133)releaseLock(140-158)FileLockError(14-23)withLock(167-174)src/core/lock/index.ts (4)
acquireLock(2-2)releaseLock(2-2)FileLockError(2-2)withLock(2-2)src/shared/logger.ts (1)
error(34-38)
tests/mcp/cache/cacheManager.test.ts (2)
src/mcp/cache/cacheManager.ts (2)
CacheManagerDeps(15-23)CacheManager(38-282)src/mcp/cache/cacheTypes.ts (1)
CacheMetadata(25-52)
src/mcp/tools/getSubmoduleContextTool.ts (5)
src/mcp/security/pathValidator.ts (2)
validateProjectRoot(47-53)validateSubmodulePath(92-100)src/mcp/config/monorepoConfigLoader.ts (2)
loadMonorepoConfig(83-98)getSubmodule(114-116)src/mcp/tools/mcpToolRuntime.ts (4)
buildMcpToolErrorResponse(234-248)buildMcpToolSuccessResponse(215-227)convertErrorToJson(166-208)createToolWorkspace(49-59)src/mcp/dependency/dependencyGraph.ts (1)
DependencyGraph(16-280)src/mcp/cache/cacheManager.ts (1)
CacheManager(38-282)
src/mcp/config/monorepoConfigLoader.ts (1)
src/shared/logger.ts (1)
error(34-38)
src/mcp/tools/invalidateSubmoduleCacheTool.ts (4)
src/mcp/security/pathValidator.ts (1)
validateProjectRoot(47-53)src/mcp/config/monorepoConfigLoader.ts (2)
loadMonorepoConfig(83-98)getSubmoduleNames(121-123)src/mcp/tools/mcpToolRuntime.ts (3)
buildMcpToolErrorResponse(234-248)buildMcpToolSuccessResponse(215-227)convertErrorToJson(166-208)src/mcp/cache/cacheManager.ts (1)
CacheManager(38-282)
src/mcp/generation/justfileGenerator.ts (2)
src/mcp/detection/projectDetector.ts (1)
DetectedSubmodule(22-29)src/mcp/config/monorepoConfigLoader.ts (1)
MonorepoConfig(71-71)
src/mcp/mcpServer.ts (4)
src/mcp/tools/initMonorepoConfigTool.ts (1)
registerInitMonorepoConfigTool(46-193)src/mcp/tools/listSubmodulesTool.ts (1)
registerListSubmodulesTool(45-133)src/mcp/tools/getSubmoduleContextTool.ts (1)
registerGetSubmoduleContextTool(57-179)src/mcp/tools/invalidateSubmoduleCacheTool.ts (1)
registerInvalidateSubmoduleCacheTool(32-104)
tests/cli/actions/defaultAction.tokenCountTree.test.ts (1)
src/core/lock/fileLock.ts (1)
FileLockError(14-23)
src/mcp/dependency/dependencyGraph.ts (1)
src/mcp/config/monorepoConfigLoader.ts (1)
MonorepoConfig(71-71)
tests/mcp/config/monorepoConfigLoader.test.ts (1)
src/mcp/config/monorepoConfigLoader.ts (5)
loadMonorepoConfig(83-98)MonorepoConfig(71-71)getSubmodule(114-116)getSubmoduleNames(121-123)MONOREPO_CONFIG_FILE(76-76)
src/mcp/cache/cacheManager.ts (2)
src/mcp/security/pathValidator.ts (2)
sanitizeSubmoduleName(63-81)validatePathWithinRoot(22-38)src/mcp/cache/cacheTypes.ts (4)
CacheMetadata(25-52)CacheMetadataSchema(10-20)CacheCheckResult(68-80)CachedContent(57-63)
🔇 Additional comments (54)
tests/mcp/security/pathValidator.test.ts (1)
1-126: Good test coverage for security-critical path validation. The test suite comprehensively covers the path validation utilities including traversal attacks, name sanitization, and boundary conditions. The tests verify that `PathSecurityError` is thrown appropriately for invalid inputs.

src/mcp/security/pathValidator.ts (3)
22-38: Absolute `userPath` bypasses root resolution - verify this is intentional. When `userPath` is an absolute path (e.g., `/etc/passwd`), `path.resolve(resolvedRoot, userPath)` ignores `resolvedRoot` entirely and returns the absolute path. The subsequent `path.relative` check should catch this via the `path.isAbsolute(relative)` condition, but this behavior is subtle. The docstring mentions handling absolute paths, and the logic does reject them. This is correct behavior, but consider adding a comment clarifying why `path.isAbsolute(relative)` is needed (to catch an absolute `userPath` that doesn't share the root prefix).
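To illustrate why the extra condition matters (this demonstrates node:path semantics, not project code):

```ts
import path from 'node:path';

const root = path.resolve('/project');
const resolved = path.resolve(root, '/etc/passwd'); // absolute userPath wins: '/etc/passwd'
const relative = path.relative(root, resolved);     // '../etc/passwd' on POSIX

// On Windows, a cross-drive userPath makes path.relative return an absolute
// path (e.g. 'D:\\etc'), which startsWith('..') alone would not catch.
const escapes = relative.startsWith('..') || path.isAbsolute(relative);
console.log(escapes); // true -> reject
```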
63-81: LGTM - thorough sanitization logic.The function properly handles path separators, invalid characters, dot sequences, and length limits. The regex chain is well-ordered to prevent bypass attempts.
92-99: Good defensive layering with explicit absolute path rejection. The explicit check for absolute paths before calling `validatePathWithinRoot` provides a clearer error message to users and is a good defensive practice, even though `validatePathWithinRoot` would also catch this case.

src/core/lock/fileLock.ts (3)
91-132: TOCTOU race condition is properly handled via atomic file creation. The check-then-act pattern between `readLockFile` (line 92) and `fs.open` (line 116) has a potential race condition, but it's correctly mitigated by using the `'wx'` flag, which fails atomically if the file already exists. The `EEXIST` handler (lines 122-130) properly re-reads the lock to get the actual owner's info. Good implementation of the file locking pattern.
140-158: Minor TOCTOU in release - acceptable for this use case. There's a theoretical race between reading the lock (line 143) and unlinking it (line 145) where another process could delete and recreate the lock. The current implementation is acceptable since:
- This is a best-effort ownership check
- The scenario is unlikely in practice
- The worst case is removing another process's fresh lock, which would just cause a retry
167-174: Clean implementation of the lock wrapper pattern. The `withLock` helper properly ensures lock release in the `finally` block, even when the wrapped function throws.
105-123: LGTM - Clean accessor functions.The
saveMonorepoConfig,getSubmodule, andgetSubmoduleNamesfunctions are straightforward and correctly implemented.tests/mcp/config/monorepoConfigLoader.test.ts (1)
18-87: Good coverage of accessor function behavior.The tests for
getSubmoduleandgetSubmoduleNamesthoroughly cover both successful lookups and edge cases (unknown submodule, empty config). The test fixtures correctly include all required fields per the schema..gitignore (1)
27-27: LGTM - Appropriate patterns for new artifacts.The broadened
**/repomix-output.*pattern and the new.claude/rulesentry align with the new monorepo tooling that may generate outputs in nested directories.Also applies to: 47-47
src/mcp/mcpServer.ts (3)
10-14: LGTM!The monorepo tool imports are well-organized and follow the established naming convention.
28-30: LGTM!The updated instructions clearly describe the monorepo tool workflow and provide helpful guidance for users.
58-62: LGTM!The monorepo tool registrations are properly grouped and follow the established pattern.
src/cli/actions/defaultAction.ts (3)
11-11: LGTM!The lock module import is correctly structured.
108-121: LGTM!The lock acquisition logic is well-structured:
- Correctly skips locking in stdout mode
- Properly handles
FileLockErrorby converting it toRepomixErrorfor user-facing messages- Only sets
lockPathon successful acquisition
156-160: LGTM!The lock release logic is correct:
- Placed in the
finallyblock to ensure cleanup- Conditional check prevents attempting to release when no lock was acquired
- Proper cleanup order (worker pool cleanup before lock release)
src/mcp/tools/invalidateSubmoduleCacheTool.ts (4)
10-21: LGTM!The input schema has appropriate validation constraints:
- Submodule name length limit (255) is reasonable
- Array size limit (100) aligns with the PR's MAX_SUBMODULES security fix
- Path length limit (4096) is appropriate for filesystem paths
- Optional
submodulesparameter allows invalidating all caches
23-27: LGTM!The output schema provides a clear structure with both machine-readable data (invalidated/skipped arrays) and human-readable summary (message).
48-67: LGTM!The handler initialization is well-structured:
- Uses
validateProjectRootto prevent path traversal attacks- Provides helpful error message when configuration is missing
- Correctly determines whether to invalidate all or specific submodules
69-103: LGTM!The invalidation logic and response handling are well-implemented:
- Properly validates submodule existence before attempting invalidation
- Builds comprehensive messages for all scenarios
- Returns structured response with optional
skippedfield- Uses consistent error handling with
convertErrorToJsonsrc/core/lock/index.ts (1)
1-2: LGTM!The re-export module provides a clean public API for the locking functionality.
tests/cli/actions/defaultAction.test.ts (1)
20-24: LGTM!The file lock mock is appropriately scoped for these tests. The simplified
FileLockErrorclass is sufficient since this test suite doesn't exercise lock error handling paths directly. The mock enables testing thedefaultActionflow without actual file system locking.tests/mcp/dependency/dependencyGraph.test.ts (2)
6-15: LGTM!The
createConfighelper function is well-designed for test readability. It cleanly transforms a simplified input format into the fullMonorepoConfigstructure with sensible defaults for cache and repomix settings.
17-192: Excellent test coverage for the DependencyGraph class.The test suite comprehensively validates:
- Direct and transitive dependency resolution
- Reverse dependency lookups (dependents)
- Cycle detection for both direct and indirect cycles
- Topological sorting with proper null handling for cyclic graphs
- Leaf/root node identification
- Graph statistics computation
The diamond dependency test case (lines 57-72) is particularly valuable for ensuring the deduplication logic in
getAllDependenciesworks correctly.src/mcp/tools/listSubmodulesTool.ts (2)
9-17: LGTM!The input schema has appropriate validation constraints. The
max(4096)limit onprojectRootprovides reasonable protection against oversized inputs, and the boolean defaults are sensibly conservative.
61-131: Well-structured tool implementation with good performance considerations.Key strengths:
- Lazy initialization of
DependencyGraph(lines 78-81) avoids unnecessary computation when dependency info isn't requested- Proper error handling with structured responses
- Path traversal protection via
validateProjectRootThe mapping of
CycleInfoobjects to theirdescriptionstrings (line 123) correctly aligns with the output schema'sz.array(z.string())definition.tests/mcp/cache/cacheManager.test.ts (3)
5-18: LGTM!The mock dependency injection pattern enables isolated unit testing without real filesystem or git interactions. The
CacheManagerDepsinterface provides a clean seam for testing.
20-83: Thorough test coverage for cache validity checking.The three test cases properly cover the key scenarios:
- Cache files missing →
exists: false, valid: false- Cache exists with matching commit →
exists: true, valid: true- Cache exists but commit changed →
exists: true, valid: falseThe sequential mock setup on lines 72-74 documents the expected call order for commit comparison logic.
85-160: LGTM!The remaining tests adequately cover:
set(): Directory creation and dual file writes (meta + content)invalidate(): Force removal of both cache filesinvalidateAll(): Recursive cache directory removalgetSubmoduleGitCommit(): Correct git command construction for both regular directories and git submodulessrc/mcp/cache/cacheTypes.ts (1)
54-91: LGTM!The
CachedContent,CacheCheckResult, andCacheOptionsinterfaces are well-documented and provide a clean API surface for cache operations. The optional fields appropriately handle varying cache states.tests/core/lock/fileLock.test.ts (4)
7-22: LGTM!The test setup properly isolates each test with a unique temporary directory and ensures cleanup. Using
os.tmpdir()withfs.mkdtempis the correct pattern for cross-platform test isolation.
24-82: Comprehensive acquireLock tests.Good coverage of:
- Lock file creation and path validation
- Lock metadata (PID, startTime) verification
- Re-acquisition prevention (same process)
- Stale lock detection and removal
Minor note: The PID `999999999` (line 65) is used to simulate a non-existent process. While this could theoretically be a valid PID on some systems, it's an acceptable assumption for testing purposes.
84-171: LGTM!The
releaseLockandwithLocktests appropriately verify:
- Lock file removal on release
- Graceful handling of non-existent lock files
- Protection against releasing locks owned by other processes
- Lock lifecycle within
withLock(acquire → execute → release)- Proper cleanup even when the wrapped function throws
173-190: LGTM!The
FileLockErrortest validates that the error object contains the expected diagnostic information (lockPath,existingLock.pid), which is essential for debugging lock contention issues.src/mcp/tools/initMonorepoConfigTool.ts (3)
13-25: LGTM!The input schema has appropriate validation constraints:
cacheDirectorywithmin(1).max(255)prevents empty or excessively long paths- The
styleenum restricts to valid output formats- Sensible defaults for all optional parameters
198-213: LGTM!The
updateGitignorehelper correctly:
- Handles missing
.gitignorefiles gracefully- Normalizes the cache pattern with a leading slash
- Checks for existing entries before appending
- Uses a descriptive comment section for the addition
71-109: Well-structured tool implementation.The initialization flow follows a clear sequence:
- Path validation for security
- Project detection with helpful error messaging
- Configuration building from detected structure
- Persistence with ancillary tooling generation
The path traversal protection via
validateProjectRootaligns with the PR's security enhancement objectives.src/mcp/tools/getSubmoduleContextTool.ts (4)
1-18: LGTM!Imports are well-organized and include appropriate security utilities (
validateProjectRoot,validateSubmodulePath) for path validation.
19-33: LGTM!Input schema includes reasonable constraints (max 255 chars for submoduleName, max 4096 for projectRoot) and sensible defaults for optional parameters.
35-52: LGTM!Output schema is well-structured with clear descriptions and provides useful metadata about cache status for debugging and monitoring.
231-237: LGTM! The token estimation is clearly documented as a rough estimate. The 4 characters per token ratio is a reasonable approximation for general text.
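For reference, the documented heuristic reduces to something like this (a sketch; the exact rounding in the implementation may differ):

```ts
// ~4 characters per token is a common rough approximation for English text
function estimateTokens(content: string): number {
  return Math.ceil(content.length / 4);
}
```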
src/mcp/dependency/dependencyGraph.ts (3)
1-11: LGTM!The
CycleInfointerface is well-documented with clear field descriptions.
55-76: LGTM!The
getAllDependenciesmethod correctly implements a post-order DFS traversal, ensuring dependencies appear before dependents in the result. Thevisitedset prevents infinite loops in cyclic graphs.
257-279: LGTM!The
getStatsmethod provides comprehensive graph statistics. The O(V²) complexity from callinggetDepthper node is acceptable for typical monorepo sizes, but could be optimized with memoization if needed for very large graphs.src/mcp/detection/projectDetector.ts (6)
1-8: LGTM!Standard Node.js imports with appropriate promisification of
execFilefor async usage.
40-57: LGTM!Good use of dependency injection pattern for testability. The
ProjectDetectorDepsinterface allows mocking all external dependencies.
74-109: LGTM!The
detect()method has a clear four-step workflow with appropriate separation of concerns. Git submodule detection is correctly applied as a post-processing step.
114-135: LGTM! The `.gitmodules` parsing handles the common format correctly. The error handling appropriately treats a missing `.gitmodules` as "no submodules" rather than an error.
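A sketch of what parsing that common format amounts to (illustrative; the actual parser in projectDetector.ts may track section names as well):

```ts
// Extracts the `path = ...` entries from a .gitmodules file:
//   [submodule "name"]
//       path = some/dir
//       url = https://example.com/repo.git
function parseGitmodulePaths(content: string): string[] {
  const paths: string[] = [];
  for (const line of content.split('\n')) {
    const match = line.trim().match(/^path\s*=\s*(.+)$/);
    if (match) paths.push(match[1].trim());
  }
  return paths;
}
```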
413-491: LGTM!Go module detection handles both
go.workworkspaces and common directory patterns. Thego.modparsing correctly extracts module names. Similar to Python, dependency extraction is not implemented but acceptable for initial version.
496-560: LGTM!The mixed project detection and helper methods are well-implemented. The submodule type detection has a clear priority order for projects with multiple manifest files.
src/mcp/cache/cacheManager.ts (4)
1-33: LGTM!Good dependency injection pattern with
CacheManagerDepsfor testability. The imports include appropriate security utilities.
77-84: LGTM!The
readMetamethod correctly uses Zod schema validation (CacheMetadataSchema.parse) to ensure cache metadata integrity. This aligns with the PR's security enhancement for schema validation.
91-107: LGTM!The method correctly uses
validatePathWithinRootto prevent path traversal attacks andexecFileAsync(notexec) to avoid shell injection. Returning an empty string on failure allows callers to treat it as "unknown" state.
159-221: LGTM!The
checkandgetmethods have comprehensive validation with clear return types. Error handling appropriately returns structured results rather than throwing.
```
 .aider*

 # repomix runner
 **/*.repomix.lock
```
Gitignore pattern may not match the actual lock file.

The pattern `**/*.repomix.lock` matches files ending in `.repomix.lock` with at least one character before the dot (e.g., `foo.repomix.lock`). However, the lock file created by fileLock.ts is named `.repomix.lock` (with a leading dot), which this pattern won't match.

💡 Suggested fix

```diff
 # repomix runner
-**/*.repomix.lock
+**/.repomix.lock
 .repomix/
```
```ts
/**
 * List all cached submodules
 */
async listCached(): Promise<string[]> {
  try {
    const files = await fs.readdir(this.cacheDir);
    const metaFiles = files.filter((f) => f.endsWith('.meta.json'));
    return metaFiles.map((f) => f.replace('.meta.json', ''));
  } catch {
    return [];
  }
}
```
Inconsistent dependency usage: `fs.readdir` should use injected deps.

`listCached` uses `fs.readdir` directly (line 275) instead of `this.deps.fsReaddir`, breaking the dependency injection pattern and making this method harder to test.

🐛 Use injected dependency

```diff
 async listCached(): Promise<string[]> {
   try {
-    const files = await fs.readdir(this.cacheDir);
+    const files = await this.deps.fsReaddir(this.cacheDir);
     const metaFiles = files.filter((f) => f.endsWith('.meta.json'));
     return metaFiles.map((f) => f.replace('.meta.json', ''));
   } catch {
     return [];
   }
 }
```
```ts
export const MonorepoConfigSchema = z.object({
  submodules: z
    .record(z.string(), SubmoduleConfigSchema)
    .refine((obj) => Object.keys(obj).length <= MAX_SUBMODULES, {
      message: `Too many submodules defined (maximum: ${MAX_SUBMODULES})`,
    })
    .describe('Map of submodule name to configuration'),
```
Submodule name keys lack path-safety validation.

The `submodules` record uses `z.string()` for keys without any validation. Malicious submodule names like `../../../etc` would pass schema validation but could enable path traversal in downstream code (e.g., the generated cache script, cache file paths).

Consider using a Zod transform or refinement to validate submodule names:

🔒 Suggested fix

```diff
+// Safe submodule name pattern
+const SafeSubmoduleNameSchema = z.string()
+  .min(1)
+  .max(255)
+  .regex(/^[a-zA-Z0-9._-]+$/, 'Submodule name contains invalid characters');
 export const MonorepoConfigSchema = z.object({
   submodules: z
-    .record(z.string(), SubmoduleConfigSchema)
+    .record(SafeSubmoduleNameSchema, SubmoduleConfigSchema)
     .refine((obj) => Object.keys(obj).length <= MAX_SUBMODULES, {
       message: `Too many submodules defined (maximum: ${MAX_SUBMODULES})`,
     })
     .describe('Map of submodule name to configuration'),
```
```ts
} catch (error) {
  if ((error as NodeJS.ErrnoException).code === 'ENOENT') {
    logger.trace(`Monorepo config not found at ${configPath}`);
    return null;
  }
  logger.trace(`Failed to load monorepo config: ${(error as Error).message}`);
  return null;
}
```
Validation errors are silently swallowed, returning null.

When the config file exists but fails Zod validation (e.g., invalid schema), the function returns null with only a trace-level log. This makes it appear as if no config exists, hiding configuration errors from users.

Consider differentiating between "file not found" (return null) and "invalid config" (throw or log at warning level):

💡 Suggested improvement

```diff
 } catch (error) {
   if ((error as NodeJS.ErrnoException).code === 'ENOENT') {
     logger.trace(`Monorepo config not found at ${configPath}`);
     return null;
   }
-  logger.trace(`Failed to load monorepo config: ${(error as Error).message}`);
-  return null;
+  // Config exists but is invalid - this is likely a user error
+  logger.warn(`Invalid monorepo config at ${configPath}: ${(error as Error).message}`);
+  throw error;
 }
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } catch (error) { | |
| if ((error as NodeJS.ErrnoException).code === 'ENOENT') { | |
| logger.trace(`Monorepo config not found at ${configPath}`); | |
| return null; | |
| } | |
| logger.trace(`Failed to load monorepo config: ${(error as Error).message}`); | |
| return null; | |
| } | |
| } catch (error) { | |
| if ((error as NodeJS.ErrnoException).code === 'ENOENT') { | |
| logger.trace(`Monorepo config not found at ${configPath}`); | |
| return null; | |
| } | |
| // Config exists but is invalid - this is likely a user error | |
| logger.warn(`Invalid monorepo config at ${configPath}: ${(error as Error).message}`); | |
| throw error; | |
| } |
🤖 Prompt for AI Agents
In @src/mcp/config/monorepoConfigLoader.ts around lines 90-97: the catch block currently maps all errors to a null return, hiding validation issues. Update the error handling (referencing configPath and logger) so ENOENT still returns null, but Zod validation failures (detected via error.name === 'ZodError' or instanceof ZodError) are handled separately: log at warning/error level with the validation details and rethrow (or return a failure) so invalid configs are surfaced to callers. Keep the existing trace for other unexpected errors, but ensure validation errors are not silently swallowed.
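As a rough sketch of what that differentiated handling could look like (the `loadRawConfig` helper and `logger` shape here are stand-ins for the loader's actual internals, not the PR's real names):

```ts
import { ZodError } from 'zod';

// Stand-ins for the loader's real dependencies (assumptions for this sketch).
type MonorepoConfig = Record<string, unknown>;
declare const logger: { trace(msg: string): void; warn(msg: string): void };
declare function loadRawConfig(configPath: string): Promise<MonorepoConfig>; // read + JSON.parse + schema.parse

async function loadMonorepoConfig(configPath: string): Promise<MonorepoConfig | null> {
  try {
    return await loadRawConfig(configPath);
  } catch (error) {
    if ((error as NodeJS.ErrnoException).code === 'ENOENT') {
      logger.trace(`Monorepo config not found at ${configPath}`);
      return null; // missing config is a normal state
    }
    if (error instanceof ZodError) {
      // Config exists but fails validation: surface it to the user
      logger.warn(`Invalid monorepo config at ${configPath}: ${error.message}`);
      throw error;
    }
    logger.trace(`Failed to load monorepo config: ${(error as Error).message}`);
    return null;
  }
}
```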
```ts
getDepth(submoduleName: string): number {
  const visited = new Set<string>();

  const calculateDepth = (name: string): number => {
    if (visited.has(name)) return 0;
    visited.add(name);

    const deps = this.graph.get(name) || new Set();
    if (deps.size === 0) return 0;

    let maxDepth = 0;
    for (const dep of deps) {
      maxDepth = Math.max(maxDepth, calculateDepth(dep) + 1);
    }
    return maxDepth;
  };

  return calculateDepth(submoduleName);
}
```
Bug: getDepth may return incorrect values for DAGs with shared dependencies.
The visited set prevents re-computation but also causes incorrect depth calculations when a node is reachable through multiple paths. For example, in a graph where A→B→D and A→C→D, if D is visited via B first, visiting via C returns 0 for D's depth, causing C's depth to be underestimated.
🐛 Suggested fix using memoization instead of simple visited tracking

```diff
 getDepth(submoduleName: string): number {
-  const visited = new Set<string>();
+  const memo = new Map<string, number>();
   const calculateDepth = (name: string): number => {
-    if (visited.has(name)) return 0;
-    visited.add(name);
+    if (memo.has(name)) return memo.get(name)!;
     const deps = this.graph.get(name) || new Set();
-    if (deps.size === 0) return 0;
+    if (deps.size === 0) {
+      memo.set(name, 0);
+      return 0;
+    }
     let maxDepth = 0;
     for (const dep of deps) {
       maxDepth = Math.max(maxDepth, calculateDepth(dep) + 1);
     }
+    memo.set(name, maxDepth);
     return maxDepth;
   };
   return calculateDepth(submoduleName);
 }
```
🤖 Prompt for AI Agents
In @src/mcp/dependency/dependencyGraph.ts around lines 220-238: the current getDepth implementation uses a single visited Set, which short-circuits repeated visits and underestimates depths when nodes are shared across multiple paths. Replace that logic with a memoization Map (e.g., depthCache) that caches computed depths, and use a separate recursion-stack Set only to detect cycles. Update calculateDepth to: return the cached value if depthCache.has(name); mark name in the recursion stack for cycle detection; compute the max child depth recursively; store the computed depth in depthCache before returning; and remove name from the recursion stack. Keep using this.graph to read dependencies.
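A small self-contained example of why the memoized version matters; the graph below is illustrative only:

```ts
// A -> {C, B}, B -> C, C -> D: true depths are D=0, C=1, B=2, A=3.
const graph = new Map<string, Set<string>>([
  ['A', new Set(['C', 'B'])],
  ['B', new Set(['C'])],
  ['C', new Set(['D'])],
  ['D', new Set<string>()],
]);

function getDepth(start: string): number {
  const memo = new Map<string, number>();
  const calc = (name: string): number => {
    const cached = memo.get(name);
    if (cached !== undefined) return cached;
    let depth = 0;
    for (const dep of graph.get(name) ?? new Set<string>()) {
      depth = Math.max(depth, calc(dep) + 1);
    }
    memo.set(name, depth);
    return depth;
  };
  return calc(start);
}

// Memoized: B reads C's cached depth (1), so A = max(C+1, B+1) = 3.
// The visited-set version explores C first, then B sees C as "visited"
// and scores it 0, yielding A = 2, an underestimate.
console.log(getDepth('A')); // 3
```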
```ts
// Install git hooks
lines.push('# Install git post-commit hook for automatic cache invalidation');
lines.push('install-hooks:');
lines.push(' #!/usr/bin/env bash');
lines.push(' mkdir -p .git/hooks');
lines.push(" cat > .git/hooks/post-commit << 'HOOK'");
lines.push(' #!/bin/bash');
lines.push(' # Repomix cache invalidation hook');
lines.push(' (node {{script_dir}}/repomix-cache.mjs --changed-only &) 2>/dev/null');
lines.push(' HOOK');
lines.push(' chmod +x .git/hooks/post-commit');
lines.push(' @echo "Git hook installed"');
lines.push('');
```
Git hook installation may have shell injection risk.
The install-hooks recipe interpolates {{script_dir}} directly into a shell script written to .git/hooks/post-commit. If scriptDir contains shell metacharacters, this could lead to unexpected behavior.
Additionally, the heredoc approach with cat > .git/hooks/post-commit << 'HOOK' uses a quoted delimiter which prevents variable expansion, but the {{script_dir}} is a just variable that gets expanded before the shell sees it.
💡 Suggested improvement

Consider validating scriptDir to contain only safe characters, or escape it properly:

```diff
 lines.push('# Install git post-commit hook for automatic cache invalidation');
 lines.push('install-hooks:');
 lines.push(' #!/usr/bin/env bash');
+lines.push(' script_dir="{{script_dir}}"');
 lines.push(' mkdir -p .git/hooks');
-lines.push(" cat > .git/hooks/post-commit << 'HOOK'");
+lines.push(' cat > .git/hooks/post-commit << HOOK');
 lines.push(' #!/bin/bash');
 lines.push(' # Repomix cache invalidation hook');
-lines.push(' (node {{script_dir}}/repomix-cache.mjs --changed-only &) 2>/dev/null');
+lines.push(' (node "$script_dir/repomix-cache.mjs" --changed-only &) 2>/dev/null');
 lines.push(' HOOK');
```

```ts
for (const [name] of Object.entries(config.submodules)) {
  const metaPath = path.join(cacheDir, \`\${name}.meta.json\`);
  try {
    const metaContent = await fs.readFile(metaPath, 'utf-8');
    const meta = JSON.parse(metaContent);
    const lastUpdated = new Date(meta.generatedAt).toLocaleString();
    console.log(name.padEnd(30), 'Cached'.padEnd(15), lastUpdated);
  } catch {
    console.log(name.padEnd(30), 'Not cached'.padEnd(15), '-');
  }
}
return;
}
```
Generated script is vulnerable to path traversal via malicious submodule names.
The generated cache script constructs file paths using submodule names directly from the config file (lines 299, 330-331) without sanitization. A malicious .repomix-monorepo.json with a submodule named ../../../etc/passwd would allow path traversal.
While the MonorepoConfigSchema in monorepoConfigLoader.ts validates the config structure, it doesn't sanitize submodule names for path safety. The path validation utilities from pathValidator.ts exist but aren't used in the generated script.
🔒 Suggested fix

Add path sanitization in the generated script:

```js
// Add near the top of the generated script
function sanitizeName(name) {
  return name
    .replace(/[/\\]/g, '-')
    .replace(/[^a-zA-Z0-9._-]/g, '')
    .replace(/\\.{2,}/g, '.')
    .replace(/^\\.+|\\.+$/g, '');
}

// Then use it when constructing paths:
const safeName = sanitizeName(name);
const metaPath = path.join(cacheDir, \`\${safeName}.meta.json\`);
```

Alternatively, add Zod validation for submodule names in MonorepoConfigSchema to reject names with path separators or traversal sequences.
🤖 Prompt for AI Agents
In @src/mcp/generation/justfileGenerator.ts around lines 298-310: the loop that builds metaPath from config.submodules (the for (const [name] of Object.entries(config.submodules)) block) is vulnerable to path traversal. Fix it by sanitizing or validating each submodule name before using it in path.join: add or call a sanitizer (e.g., a sanitizeName function, or the existing validators in pathValidator.ts) that strips path separators, traversal sequences, and unsafe characters, then use the sanitized name (safeName) when constructing metaPath and any other file paths. Alternatively, enforce a Zod validation rule in MonorepoConfigSchema that rejects names containing '/', '\', or '..', and have the generated script rely on that validated name.
```ts
const tempDir = await createToolWorkspace();
const styleKey = (options.style || 'xml') as keyof typeof defaultFilePathMap;
const outputFileName = defaultFilePathMap[styleKey];
const outputFilePath = path.join(tempDir, outputFileName);
```
Temporary directory is created but not cleaned up.
The createToolWorkspace() creates a temp directory, but there's no cleanup after reading the output file. This could lead to disk space accumulation over time, especially with frequent tool usage.
🧹 Suggested cleanup

```diff
+import fs from 'node:fs/promises';
+
 async function generateSubmoduleContent(
   rootDir: string,
   submodulePath: string,
   options: {
     compress?: boolean;
     style?: string;
     includePatterns?: string;
     ignorePatterns?: string;
     showLineNumbers?: boolean;
     removeComments?: boolean;
   },
 ): Promise<{ output: string; fileCount: number; tokenCount: number }> {
   const tempDir = await createToolWorkspace();
-  const styleKey = (options.style || 'xml') as keyof typeof defaultFilePathMap;
-  const outputFileName = defaultFilePathMap[styleKey];
-  const outputFilePath = path.join(tempDir, outputFileName);
-
-  // Validate submodule path to prevent path traversal
-  const fullPath = validateSubmodulePath(submodulePath, rootDir);
-
-  const cliOptions: CliOptions = {
-    compress: options.compress,
-    include: options.includePatterns,
-    ignore: options.ignorePatterns,
-    output: outputFilePath,
-    style: styleKey,
-    securityCheck: true,
-    quiet: true,
-    outputShowLineNumbers: options.showLineNumbers,
-    removeComments: options.removeComments,
-  };
-
-  const result = await runCli(['.'], fullPath, cliOptions);
-  if (!result) {
-    throw new Error(`Failed to pack submodule: ${submodulePath}`);
-  }
-
-  const { packResult } = result;
-  const output = await import('node:fs/promises').then((fs) => fs.readFile(outputFilePath, 'utf-8'));
-
-  return {
-    output,
-    fileCount: packResult.totalFiles,
-    tokenCount: packResult.totalTokens,
-  };
+  try {
+    const styleKey = (options.style || 'xml') as keyof typeof defaultFilePathMap;
+    const outputFileName = defaultFilePathMap[styleKey];
+    const outputFilePath = path.join(tempDir, outputFileName);
+
+    // Validate submodule path to prevent path traversal
+    const fullPath = validateSubmodulePath(submodulePath, rootDir);
+
+    const cliOptions: CliOptions = {
+      compress: options.compress,
+      include: options.includePatterns,
+      ignore: options.ignorePatterns,
+      output: outputFilePath,
+      style: styleKey,
+      securityCheck: true,
+      quiet: true,
+      outputShowLineNumbers: options.showLineNumbers,
+      removeComments: options.removeComments,
+    };
+
+    const result = await runCli(['.'], fullPath, cliOptions);
+    if (!result) {
+      throw new Error(`Failed to pack submodule: ${submodulePath}`);
+    }
+
+    const { packResult } = result;
+    const output = await fs.readFile(outputFilePath, 'utf-8');
+
+    return {
+      output,
+      fileCount: packResult.totalFiles,
+      tokenCount: packResult.totalTokens,
+    };
+  } finally {
+    await fs.rm(tempDir, { recursive: true, force: true }).catch(() => {});
+  }
 }
```
🤖 Prompt for AI Agents
In @src/mcp/tools/getSubmoduleContextTool.ts around lines 196-199: createToolWorkspace() creates a temporary directory (tempDir) that is never removed. Wrap the logic that reads the output file (via outputFilePath) in a try/finally so tempDir is always cleaned up: in the finally block, await an async removal (e.g., fs.promises.rm with recursive and force options) and handle any removal errors gracefully. Update the containing function so tempDir is removed even on errors or early returns.
Code Review
This pull request introduces comprehensive monorepo support, including intelligent caching, a file-locking mechanism to prevent race conditions, and several security enhancements like path traversal protection and schema validation. The implementation is well-structured, introducing modular components for caching, dependency graph management, and project structure detection. The addition of new MCP tools for monorepos and automated justfile generation significantly improves the developer experience. My review identified a few issues: a high-severity bug in the generated cache script where --changed-only functionality is not correctly implemented, and a couple of medium-severity items related to a hardcoded version string in the cache metadata and fragile parsing of .gitmodules files. Overall, this is a strong feature addition that will greatly benefit users working with monorepos.
```ts
export function generateCacheScript(): string {
  return `#!/usr/bin/env node
/**
 * Repomix Monorepo Cache Management Script
 * Auto-generated by repomix init_monorepo_config
 */

import fs from 'node:fs/promises';
import path from 'node:path';
import { parseArgs } from 'node:util';

const CONFIG_FILE = '.repomix-monorepo.json';

async function main() {
  const { values } = parseArgs({
    options: {
      all: { type: 'boolean', default: false },
      module: { type: 'string' },
      'changed-only': { type: 'boolean', default: false },
      regenerate: { type: 'boolean', default: false },
      invalidate: { type: 'boolean', default: false },
      status: { type: 'boolean', default: false },
      list: { type: 'boolean', default: false },
      dependencies: { type: 'boolean', default: false },
    },
  });

  // Load config
  let config;
  try {
    const content = await fs.readFile(CONFIG_FILE, 'utf-8');
    config = JSON.parse(content);
  } catch (error) {
    console.error('Error: Could not load', CONFIG_FILE);
    console.error('Run the init_monorepo_config MCP tool first.');
    process.exit(1);
  }

  const cacheDir = config.cache?.directory || '.repomix-cache';

  if (values.list) {
    console.log('Configured submodules:');
    for (const [name, sub] of Object.entries(config.submodules)) {
      const submodule = sub;
      console.log(\` - \${name}: \${submodule.path}\${submodule.isGitSubmodule ? ' (git submodule)' : ''}\`);
    }
    return;
  }

  if (values.dependencies) {
    console.log('Dependency graph:');
    for (const [name, sub] of Object.entries(config.submodules)) {
      const submodule = sub;
      const deps = submodule.dependencies || [];
      console.log(\`  \${name}: \${deps.length > 0 ? deps.join(', ') : '(no dependencies)'}\`);
    }
    return;
  }

  if (values.status) {
    console.log('Cache status:');
    console.log('');
    console.log('Submodule'.padEnd(30), 'Status'.padEnd(15), 'Last Updated');
    console.log('-'.repeat(70));

    for (const [name] of Object.entries(config.submodules)) {
      const metaPath = path.join(cacheDir, \`\${name}.meta.json\`);
      try {
        const metaContent = await fs.readFile(metaPath, 'utf-8');
        const meta = JSON.parse(metaContent);
        const lastUpdated = new Date(meta.generatedAt).toLocaleString();
        console.log(name.padEnd(30), 'Cached'.padEnd(15), lastUpdated);
      } catch {
        console.log(name.padEnd(30), 'Not cached'.padEnd(15), '-');
      }
    }
    return;
  }

  // Determine which submodules to process
  let submodulesToProcess = [];
  if (values.module) {
    if (!config.submodules[values.module]) {
      console.error(\`Error: Submodule "\${values.module}" not found in configuration\`);
      process.exit(1);
    }
    submodulesToProcess = [values.module];
  } else if (values.all || values['changed-only']) {
    submodulesToProcess = Object.keys(config.submodules);
  } else {
    console.error('Error: Specify --all, --changed-only, or --module=<name>');
    process.exit(1);
  }

  // Process submodules
  for (const name of submodulesToProcess) {
    if (values.invalidate) {
      const metaPath = path.join(cacheDir, \`\${name}.meta.json\`);
      const contentPath = path.join(cacheDir, \`\${name}.xml\`);
      await fs.rm(metaPath, { force: true });
      await fs.rm(contentPath, { force: true });
      console.log(\`Invalidated: \${name}\`);
    } else if (values.regenerate) {
      console.log(\`Regenerating: \${name} (use MCP tool for actual regeneration)\`);
      // Note: Actual regeneration happens through MCP tool
      // This script just manages cache files
    }
  }
}

main().catch(console.error);
`;
}
```
The generated repomix-cache.mjs script has a functional bug related to the --changed-only flag. The justfile recipes update-changed-caches and the post-commit git hook both use this flag, with comments indicating they will only process changed submodules. However, the generated script's logic (e.g., lines 320-321 in the template) treats --changed-only identically to --all, preparing to process all submodules regardless of their status. Furthermore, the script doesn't actually perform any update; it only logs a "Regenerating" message. This is misleading and renders the 'intelligent caching' aspect of the generated tooling non-functional. The script should implement proper change detection to fulfill its documented purpose.
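One way the generated script could implement genuine change detection is to record the commit each cache entry was generated from and compare it against the latest commit touching the submodule's path. A rough sketch of that idea (the `lastCommit` meta field and these helper names are assumptions, not part of the PR):

```ts
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Latest commit that touched a submodule's path (assumes cwd is the repo root).
async function headCommitFor(submodulePath: string): Promise<string> {
  const { stdout } = await execFileAsync('git', ['log', '-1', '--format=%H', '--', submodulePath]);
  return stdout.trim();
}

// A cache entry is stale if it predates the current commit for its path;
// `meta.lastCommit` is a hypothetical field the cache writer would store.
async function isChanged(meta: { lastCommit?: string } | null, submodulePath: string): Promise<boolean> {
  if (!meta?.lastCommit) return true; // never cached -> treat as changed
  const current = await headCommitFor(submodulePath);
  return current !== meta.lastCommit;
}
```

With something like this, `--changed-only` could filter `submodulesToProcess` to entries where `isChanged(...)` resolves true, instead of mirroring `--all`.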
```ts
const content = await this.deps.fsReadFile(gitmodulesPath, 'utf-8');

// Parse [submodule "xxx"] path = xxx
const pathRegex = /path\s*=\s*(.+)/g;
```
The regex for parsing .gitmodules is a bit too simple and may not correctly handle all valid git-config formats, especially paths that contain spaces and are quoted. For example, if a path is defined as path = "my libs/lib a", this regex will capture the path with quotes included. A more robust parsing approach would improve compatibility.
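For comparison, a section-aware parse that also unquotes values might look like the following sketch (a simplification; full git-config syntax with escapes and includes is out of scope):

```ts
// Map each [submodule "name"] section to its (unquoted) path value.
function parseGitmodulePaths(content: string): Map<string, string> {
  const paths = new Map<string, string>();
  let currentSection: string | null = null;
  for (const rawLine of content.split('\n')) {
    const line = rawLine.trim();
    const section = line.match(/^\[submodule\s+"(.+)"\]$/);
    if (section) {
      currentSection = section[1];
      continue;
    }
    const kv = line.match(/^path\s*=\s*(.+)$/);
    if (kv && currentSection !== null) {
      let value = kv[1].trim();
      // Handle quoted paths such as: path = "my libs/lib a"
      if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
        value = value.slice(1, -1);
      }
      paths.set(currentSection, value);
    }
  }
  return paths;
}
```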
```ts
fileCount: content.fileCount,
tokenCount: content.tokenCount,
dependencies: subConfig.dependencies,
repomixVersion: '1.0.0', // TODO: Get from package.json
```
The repomixVersion is hardcoded as '1.0.0'. This should be retrieved dynamically from package.json to ensure cache integrity across different versions of repomix, as the TODO comment indicates.
You can achieve this by:
- Importing `getVersion` at the top of the file: `import { getVersion } from '../../core/file/packageJsonParse.js';`
- Fetching the version once before the loop inside the tool handler: `const repomixVersion = await getVersion();`
- Using this `repomixVersion` variable in the `cacheManager.set` call.
Hi @RyderFreeman4Logos, thank you for your interest in contributing to Repomix!

For significant feature additions like this, I'd appreciate if you could open an issue first to discuss the proposal before implementation. This helps us understand the use case and explore whether it aligns with Repomix's direction.

I'm curious about the specific problem this solves. What's the motivation for monorepo submodule support in the MCP server? Is there existing demand from users?

The scope is quite large, and I want to be mindful of the maintenance burden. I'd love to hear more about the use case before moving forward. Thanks for understanding!
Hi @yamadashy,

Thank you for taking the time to review this PR and for your thoughtful feedback. You're absolutely right that we should have opened an issue first to discuss the feature before implementation. I apologize for not following that process.

Context and Motivation

I'm working on a nested Rust monorepo (mandate-monorepo) with 13 workspace members across multiple sub-repositories (~59,000 lines of Rust code). The typical workflow involves using Claude Code with Repomix MCP for AI-assisted development. However, I ran into a practical problem:

The Problem: When using …

Current Workaround: Manually packing individual submodules (e.g., …)

The Solution

The monorepo submodule support provides: …

Token Savings Example: …

Use Cases

This feature is designed for: …

Maintenance Considerations

I completely understand your concern about maintenance burden. Here's what I've tried to address: …

However, I acknowledge this is a significant feature addition (2,760+ lines). If you feel the maintenance cost outweighs the benefit, I completely respect that decision.

User Demand

Regarding existing user demand: I don't have data on how many other users need this feature. This was built to solve my own workflow pain point. If there's insufficient community interest, it may not be worth the maintenance burden.

Next Steps

I'm happy to: …

Please let me know how you'd like to proceed. I genuinely appreciate the project and want to contribute in a way that helps rather than adds unwanted complexity. Thank you again for your time and for building such a valuable tool!

Best regards,
Summary

Checklist

- `npm run test`
- `npm run lint`

🤖 Generated with Claude Code