feat(mcp): Add monorepo submodule support with caching and security enhancements #1082
Conversation
Add 4 new MCP tools for monorepo development workflows:

- init_monorepo_config: Auto-detect project structure (Rust/TS/Go/Python)
- list_submodules: List configured submodules with dependencies
- get_submodule_context: Load submodule content with caching
- invalidate_submodule_cache: Manually invalidate cache

Key features:

- Git commit-based cache invalidation
- Support for git submodules with independent history
- Dependency graph with cycle detection
- Justfile and script generation for cache management
- Project detector for Rust workspaces, npm/pnpm workspaces, and Go workspaces

This enables loading specific submodules on demand instead of packing the entire monorepo, significantly reducing token usage for large projects.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
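For orientation, a call to the new get_submodule_context tool might look roughly like this (a hypothetical invocation; the parameter names follow the Zod schemas added in this PR, the values are made up):

```ts
// Hypothetical MCP tool call; values are illustrative only.
const request = {
  name: 'get_submodule_context',
  arguments: {
    projectRoot: '/path/to/monorepo', // validated, max 4096 chars
    submoduleName: 'core',            // min 1 / max 255 chars
    includeDependencies: true,        // also load submodules that 'core' depends on
  },
};
```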
Add security hardening to prevent path traversal attacks:

- Create pathValidator.ts with validation utilities:
  - validatePathWithinRoot(): prevents escaping the project root
  - validateProjectRoot(): normalizes and validates the project root
  - sanitizeSubmoduleName(): sanitizes names for safe file paths
  - validateSubmodulePath(): validates submodule paths
- Apply path validation to all MCP tools:
  - get_submodule_context: validate projectRoot and submodule paths
  - list_submodules: validate projectRoot
  - invalidate_submodule_cache: validate projectRoot
  - init_monorepo_config: validate projectRoot
- Add Zod schema constraints:
  - submoduleName: min(1), max(255)
  - projectRoot: max(4096)
  - submodules array: max(100) items
  - cacheDirectory: min(1), max(255)
- Add 20 unit tests for path validation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
[MOTIVATION]
Multiple repomix processes running simultaneously in the same directory can cause data races and corrupt output files. Users need protection against accidental concurrent execution.

[IMPLEMENTATION DETAILS]
- Add src/core/lock/fileLock.ts with a PID-based lock file mechanism
- Lock is acquired before task execution in defaultAction.ts
- Stale lock detection (checks if the holding process is still alive)
- Lock automatically released on completion or error (via a finally block)
- Skip locking for stdout mode (no file conflict possible)
- 30-minute timeout as a fallback for zombie processes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
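A minimal sketch of the PID-based lock pattern described above (illustrative only; the actual implementation lives in src/core/lock/fileLock.ts and differs in details such as the 30-minute timeout):

```ts
import fs from 'node:fs/promises';

interface LockInfo {
  pid: number;
  startTime: number;
}

// Signal 0 checks liveness without actually signaling the process
function isProcessAlive(pid: number): boolean {
  try {
    process.kill(pid, 0);
    return true;
  } catch {
    return false;
  }
}

async function acquireLock(lockPath: string): Promise<void> {
  const info: LockInfo = { pid: process.pid, startTime: Date.now() };
  try {
    // 'wx' creates the file atomically and fails with EEXIST if it already exists
    await fs.writeFile(lockPath, JSON.stringify(info), { flag: 'wx' });
  } catch (error) {
    if ((error as NodeJS.ErrnoException).code !== 'EEXIST') throw error;
    const existing = JSON.parse(await fs.readFile(lockPath, 'utf-8')) as LockInfo;
    if (!isProcessAlive(existing.pid)) {
      // Stale lock: the holder is gone, so remove it and retry once
      await fs.rm(lockPath, { force: true });
      return acquireLock(lockPath);
    }
    throw new Error(`Directory is locked by PID ${existing.pid}`);
  }
}

async function withLock<T>(lockPath: string, fn: () => Promise<T>): Promise<T> {
  await acquireLock(lockPath);
  try {
    return await fn();
  } finally {
    await fs.rm(lockPath, { force: true }); // released even when fn throws
  }
}
```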
[MOTIVATION]
SEC-005: MonorepoConfigSchema had no size limit on the submodules dictionary. Malicious configuration files could define thousands of submodules, causing memory exhaustion or excessive processing time.

[IMPLEMENTATION DETAILS]
- Added a MAX_SUBMODULES constant (1000) to define the maximum allowed submodules
- Added .refine() validation to the submodules field using Zod
- Provides a clear error message when the limit is exceeded
- Maintains backward compatibility for normal usage

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Add Zod schema validation when reading cache metadata to prevent security issues from malformed or malicious cache files.

[MOTIVATION]
Cache metadata files are read from disk and parsed as JSON without validation. This creates a security vulnerability where malformed or malicious cache files could cause runtime errors or unexpected behavior.

[IMPLEMENTATION DETAILS]
- Created CacheMetadataSchema in cacheTypes.ts using Zod
- Schema matches all fields in the CacheMetadata interface
- Added validation for numeric fields (fileCount, tokenCount) to be non-negative integers
- Updated the readMeta() method to use CacheMetadataSchema.parse()
- Zod validation throws meaningful errors on invalid data

Fixes security issue SEC-003

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
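As a usage sketch, parsing untrusted metadata then fails loudly at the boundary instead of propagating bad values (a simplified schema for illustration; the real one has more fields):

```ts
import { z } from 'zod';

const CacheMetadataSchema = z.object({
  submodule: z.string(),
  gitCommit: z.string(),
  fileCount: z.number().int().nonnegative(),
  tokenCount: z.number().int().nonnegative(),
});

// A tampered meta file with a negative count is rejected here
const raw: unknown = JSON.parse(
  '{"submodule":"core","gitCommit":"abc123","fileCount":-1,"tokenCount":10}',
);
const result = CacheMetadataSchema.safeParse(raw);
if (!result.success) {
  console.error(result.error.issues); // detailed, field-level validation errors
}
```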
📝 Walkthrough

Introduces a comprehensive monorepo caching and dependency management system featuring file-based locking for directory synchronization, configuration schemas for submodule organization, project type detection, dependency graph analysis with cycle detection, per-submodule caching with git-based invalidation, and four new MCP tools for monorepo operations.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant MCP as MCP Tool:<br/>getSubmoduleContext
    participant Config as monorepo<br/>config
    participant Graph as Dependency<br/>Graph
    participant Cache as Cache<br/>Manager
    participant CLI as CLI Packer
    User->>MCP: Call with submoduleName,<br/>includeDependencies
    MCP->>Config: Load monorepo config
    Config-->>MCP: Config loaded
    MCP->>Cache: Check cache for submodule
    alt Cache valid
        Cache-->>MCP: Cached content
    else Cache miss/invalid
        MCP->>CLI: Generate submodule content
        CLI-->>MCP: Packed content
        MCP->>Cache: Save to cache
        Cache-->>MCP: Cached
    end
    alt Include dependencies
        MCP->>Graph: Resolve dependencies
        Graph-->>MCP: Dependency list
        loop For each dependency
            MCP->>Cache: Load/generate dependency
            Cache-->>MCP: Dependency content
        end
    end
    MCP-->>User: Response with content +<br/>metadata + token counts
```
```mermaid
sequenceDiagram
    participant User
    participant MCP as MCP Tool:<br/>initMonorepoConfig
    participant Detector as Project<br/>Detector
    participant Config as Config<br/>Saver
    participant Generator as Justfile<br/>Generator
    participant Git as .gitignore
    User->>MCP: Call with projectRoot
    MCP->>Detector: Detect project structure
    Detector-->>MCP: Project type + submodules
    MCP->>Config: Save monorepo config
    Config-->>MCP: Config saved
    alt Generate justfile
        MCP->>Generator: Generate justfile
        Generator-->>MCP: Justfile content
        MCP->>Git: Update .gitignore
    end
    MCP-->>User: Success with config/justfile<br/>paths + detected info
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~65 minutes

Reasoning: This PR introduces ~2,500 lines across multiple new modules with heterogeneous logic domains (locking, caching, dependency analysis, project detection, security, MCP tools). Each module contains distinct algorithms (cycle detection, topological sorting, path traversal prevention) and complex interactions (cache invalidation, dependency resolution). The integration points (defaultAction, mcpServer) and dependency injection patterns add further reasoning overhead. However, the changes are well-organized into cohesive modules and include comprehensive test coverage, reducing uncertainty.
🚥 Pre-merge checks: ✅ 3 checks passed
Summary of Changes

Hello @RyderFreeman4Logos, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances repomix's MCP server with monorepo submodule support, caching, and security hardening.
Actionable comments posted: 8
🤖 Fix all issues with AI agents
In @.gitignore:
- Line 42: The .gitignore entry **/*.repomix.lock does not match the lock file
created by fileLock.ts because that file is named .repomix.lock (leading dot);
update the pattern to include hidden filenames—e.g., add .repomix.lock and/or
**/.repomix.lock (or simply /.repomix.lock) so the actual .repomix.lock created
by fileLock.ts is ignored.
In @src/mcp/cache/cacheManager.ts:
- Around line 270-281: The listCached method uses the global fs.readdir instead
of the injected dependency; change listCached to call
this.deps.fsReaddir(this.cacheDir) and await its result, then filter for
'.meta.json' and map to remove the suffix as before, and keep the same try/catch
behavior (return [] on error) so tests can inject a mock fsReaddir via deps.
In @src/mcp/config/monorepoConfigLoader.ts:
- Around line 44-50: The MonorepoConfigSchema's submodules record uses
z.string() for keys which allows unsafe names; update the schema so submodule
keys are validated for path-safety (no path traversal segments, no absolute
paths, no directory separators, allowed chars only) by replacing the key schema
with a refined or transformed string (e.g., z.string().refine(...) or using
z.preprocess/transform) that enforces a safe-name regex and rejects names like
"../../../etc" before applying SubmoduleConfigSchema; keep the existing
MAX_SUBMODULES refinement and ensure the error message clearly indicates an
invalid submodule name when the key validation fails.
- Around line 90-97: The catch block in monorepoConfigLoader.ts currently maps
all errors to a null return, hiding validation issues; update the error handling
in that catch (referencing configPath and logger) so ENOENT still returns null,
but Zod validation failures (detect via error.name === 'ZodError' or instance of
ZodError) are handled separately: log at warning/error level with the validation
details and rethrow (or return a failure) so invalid configs are surfaced to
callers; keep the existing trace for other unexpected errors but ensure
validation errors are not silently swallowed.
In @src/mcp/dependency/dependencyGraph.ts:
- Around line 220-238: The current getDepth implementation uses a single visited
Set which short-circuits repeated visits and underestimates depths when nodes
are shared across multiple paths; replace that logic with a memoization Map to
cache computed depths (e.g., depthCache) and use a separate recursion stack Set
only to detect cycles; update calculateDepth to: if depthCache.has(name) return
cached value, mark name in recursion stack to detect cycles, compute max child
depth recursively, store the computed depth in depthCache before returning, and
remove name from recursion stack; keep using this.graph to read dependencies.
In @src/mcp/generation/justfileGenerator.ts:
- Around line 298-310: The loop that builds metaPath from config.submodules (see
the for (const [name] of Object.entries(config.submodules)) block and the
metaPath constant) is vulnerable to path traversal; fix it by sanitizing or
validating each submodule name before using it in path.join: add or call a
sanitizer (e.g., a sanitizeName function or existing validators in
pathValidator.ts) to strip path separators/traversal sequences and unsafe
characters, then use the sanitized name (safeName) when constructing metaPath
and any other file paths; alternatively enforce a Zod validation rule in
MonorepoConfigSchema to reject names containing '/' '\' or '..' and update the
generated script to rely on that validated/sanitized name.
In @src/mcp/tools/getSubmoduleContextTool.ts:
- Around line 196-199: createToolWorkspace() creates a temporary directory
(tempDir) that is never removed; wrap the logic that reads the output file
(using outputFilePath) inside a try/finally so you always clean up the tempDir,
and in the finally call an async removal (e.g., fs.promises.rm or fs.rm with
recursive/force equivalent) awaiting its completion and handling any removal
errors gracefully; update the function containing tempDir (e.g.,
getSubmoduleContextTool) to ensure tempDir is removed even on errors or early
returns.
🧹 Nitpick comments (22)
tests/mcp/security/pathValidator.test.ts (2)
51-55: Platform-specific assertion may fail on Windows. The test assumes absolute paths start with `/`, which is Unix-specific. On Windows, absolute paths start with a drive letter (e.g., `C:\`).

💡 Suggested fix

```diff
 it('should normalize and resolve relative paths', () => {
   const result = validateProjectRoot('./subdir');
   expect(result).toContain('subdir');
-  expect(result.startsWith('/')).toBe(true); // Should be absolute
+  expect(path.isAbsolute(result)).toBe(true); // Should be absolute
 });
```

You'll need to import `path` from `node:path` at the top of the file.
22-37: Consider adding edge case tests for security-critical path validation. The test coverage for `validatePathWithinRoot` is good for basic scenarios. For completeness, consider adding tests for:

- Null byte injection: `'crates/foo\x00/../etc/passwd'`
- Mixed path separators: `'crates\\foo/../bar'`
- Encoded traversal: paths with URL-encoded sequences

These edge cases help ensure the security utilities are robust against more sophisticated attacks; a sketch of such tests follows.
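A hedged sketch of what those tests could look like (vitest style, matching the existing suite; whether each input throws depends on how the validator is specified to handle it, and the argument order of `validatePathWithinRoot` is assumed here — root first, then the user-supplied path):

```ts
import { describe, expect, it } from 'vitest';
import { PathSecurityError, validatePathWithinRoot } from '../../../src/mcp/security/pathValidator';

describe('validatePathWithinRoot - hardening edge cases', () => {
  it('rejects null byte injection', () => {
    expect(() => validatePathWithinRoot('/project', 'crates/foo\x00/../etc/passwd')).toThrow(PathSecurityError);
  });

  it('rejects mixed path separators used for traversal', () => {
    expect(() => validatePathWithinRoot('/project', 'crates\\foo/../bar')).toThrow(PathSecurityError);
  });
});
```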
src/mcp/generation/justfileGenerator.ts (1)
221-227: Consider reusing `sanitizeSubmoduleName` from pathValidator. This `sanitizeName` method has similar logic to `sanitizeSubmoduleName` in `pathValidator.ts` but differs in behavior (lowercase conversion, different character handling). Consider either:

- Reusing the existing utility with an adapter
- Documenting why the differences are intentional (just recipe names vs filesystem paths)
src/core/lock/fileLock.ts (1)
41-54: Consider validating all `LockInfo` fields. The validation only checks that `pid` and `startTime` are numbers, but doesn't verify `cwd` is a string. While this is unlikely to cause issues in practice, complete validation would be more robust.

💡 Suggested improvement

```diff
 if (typeof parsed.pid !== 'number' || typeof parsed.startTime !== 'number') {
   return null;
 }
+if (typeof parsed.cwd !== 'string') {
+  return null;
+}
```

tests/mcp/config/monorepoConfigLoader.test.ts (1)
11-16: Consider adding tests for error scenarios. The test only covers the "file not found" case. Consider adding tests for:
- Invalid JSON in config file
- Valid JSON but invalid schema (e.g., missing required fields)
- Exceeding MAX_SUBMODULES limit
This would help catch regressions if the error handling behavior changes.
tests/cli/actions/defaultAction.tokenCountTree.test.ts (1)
16-20: Consider aligning the FileLockError mock with the actual implementation. The mock `FileLockError` class is simplified compared to the actual implementation, which accepts `lockPath` and `existingLock` parameters. While this may be sufficient for the current tests, consider including these properties to ensure consistency if future tests need to verify error details.

♻️ Suggested enhancement

```diff
-vi.mock('../../../src/core/lock/index', () => ({
-  acquireLock: vi.fn().mockResolvedValue('/mock/.repomix.lock'),
-  releaseLock: vi.fn().mockResolvedValue(undefined),
-  FileLockError: class FileLockError extends Error {},
-}));
+vi.mock('../../../src/core/lock/index', () => ({
+  acquireLock: vi.fn().mockResolvedValue('/mock/.repomix.lock'),
+  releaseLock: vi.fn().mockResolvedValue(undefined),
+  FileLockError: class FileLockError extends Error {
+    constructor(message: string, public lockPath: string, public existingLock?: unknown) {
+      super(message);
+      this.name = 'FileLockError';
+    }
+  },
+}));
```

src/mcp/cache/cacheTypes.ts (1)
10-20: Schema validation addresses the security fix objective. The Zod schema provides type-safe validation for cache metadata parsing. Consider using `z.infer<typeof CacheMetadataSchema>` to derive the `CacheMetadata` interface and eliminate the duplication between the schema (lines 10-20) and interface (lines 25-52).

♻️ Optional: Derive interface from schema

```diff
 export const CacheMetadataSchema = z.object({
   submodule: z.string().describe('Submodule name'),
   generatedAt: z.string().describe('When the cache was generated'),
   gitCommit: z.string().describe('Git commit hash when cache was generated'),
   fileCount: z.number().int().nonnegative().describe('Number of files in the submodule'),
   tokenCount: z.number().int().nonnegative().describe('Total token count of the content'),
   dependencies: z.array(z.string()).describe('List of dependencies (other submodule names)'),
   repomixVersion: z.string().describe('Repomix version used to generate'),
   compressed: z.boolean().describe('Whether compression was enabled'),
   isGitSubmodule: z.boolean().optional().describe('Whether this is a git submodule'),
 });

-/**
- * Metadata stored alongside cached content
- */
-export interface CacheMetadata {
-  /** Submodule name */
-  submodule: string;
-  // ... rest of interface
-}
+/**
+ * Metadata stored alongside cached content
+ */
+export type CacheMetadata = z.infer<typeof CacheMetadataSchema>;
```

src/mcp/tools/initMonorepoConfigTool.ts (1)
133-142: Consider providing feedback when justfile already contains repomix section. When the existing justfile already includes the repomix section marker, `justfilePath` remains `undefined`, so the success message won't indicate justfile status. Users may wonder if their justfile was checked.

♻️ Optional: Add feedback for existing repomix section

```diff
 if (existingContent) {
   // Append to existing justfile if repomix section doesn't exist
   if (!existingContent.includes('# Repomix Monorepo Cache Management')) {
     await fs.writeFile(justfileFullPath, `${existingContent}\n\n${justfileContent}`, 'utf-8');
     justfilePath = justfileFullPath;
+  } else {
+    // Section already exists, no changes needed
+    justfilePath = justfileFullPath; // or use a separate flag to indicate "already configured"
   }
 } else {
   await fs.writeFile(justfileFullPath, justfileContent, 'utf-8');
   justfilePath = justfileFullPath;
 }
```

src/mcp/tools/getSubmoduleContextTool.ts (4)
110-114: Consider logging when configured dependencies are missing. Silently skipping dependencies that aren't found in config (line 113) could mask configuration errors. A warning log would help users identify mismatches between declared dependencies and configured submodules.

♻️ Suggested improvement

```diff
 const subConfig = config.submodules[name];
 if (!subConfig) {
+  logger.warn(`Dependency "${name}" not found in monorepo config, skipping`);
   continue; // Skip if dependency not found in config
 }
```
148-148: Address the TODO: fetch version from package.json. The hardcoded `repomixVersion: '1.0.0'` should be dynamically retrieved to ensure cache metadata accurately reflects the tool version. This could affect cache invalidation logic if the tool behavior changes between versions. Would you like me to help implement dynamic version retrieval from package.json?
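One possible shape for that retrieval (a sketch; the relative path to package.json depends on where the compiled file ends up):

```ts
import { createRequire } from 'node:module';

// createRequire lets ESM code load JSON synchronously at startup
const require = createRequire(import.meta.url);
const { version } = require('../../package.json') as { version: string };

// ...then, when building cache metadata:
// repomixVersion: version,
```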
155-156: Consider using accurate token counts from pack results instead of re-estimating. The `generateSubmoduleContent` function returns `tokenCount` from the actual pack result, but this code re-estimates tokens from content length. For cached entries, `meta.tokenCount` is already available from the cache. This could provide more accurate totals.

♻️ Suggested improvement

```diff
-// 5. Calculate total tokens
-const totalTokens = Object.values(contents).reduce((sum, content) => sum + estimateTokens(content), 0);
+// 5. Calculate total tokens from cache status metadata or estimates
+let totalTokens = 0;
+for (const status of cacheStatus) {
+  if (status.fromCache) {
+    const cached = await cacheManager.get(status.submodule, config.submodules[status.submodule].path, config.submodules[status.submodule].isGitSubmodule);
+    totalTokens += cached?.meta.tokenCount ?? estimateTokens(contents[status.submodule]);
+  } else {
+    // Token count was already computed during generation - could track it
+    totalTokens += estimateTokens(contents[status.submodule]);
+  }
+}
```
222-222: Avoid dynamic import for already-available module. The dynamic `import('node:fs/promises')` is unnecessary overhead. Either import `fs` at the top of the file or reuse an existing import.

♻️ Suggested fix

Add at the top of the file:

```ts
import fs from 'node:fs/promises';
```

Then replace line 222:

```diff
-const output = await import('node:fs/promises').then((fs) => fs.readFile(outputFilePath, 'utf-8'));
+const output = await fs.readFile(outputFilePath, 'utf-8');
```

src/mcp/dependency/dependencyGraph.ts (3)
24-40: Potential inconsistency: dependencies may reference non-existent submodules. The graph construction adds reverse edges for dependencies without verifying they exist as submodules. This could lead to orphan entries in `reverseGraph` for external dependencies that aren't part of the monorepo config. While the code handles this gracefully (returning empty arrays), it might be worth validating or filtering during construction.

♻️ Optional validation during construction

```diff
 constructor(config: MonorepoConfig) {
   this.graph = new Map();
   this.reverseGraph = new Map();
+  const allSubmoduleNames = new Set(Object.keys(config.submodules));

   // Build forward and reverse dependency graphs
   for (const [name, submodule] of Object.entries(config.submodules)) {
-    this.graph.set(name, new Set(submodule.dependencies));
+    // Filter to only internal dependencies
+    const internalDeps = submodule.dependencies.filter(dep => allSubmoduleNames.has(dep));
+    this.graph.set(name, new Set(internalDeps));

     // Initialize reverse graph entry
     if (!this.reverseGraph.has(name)) {
       this.reverseGraph.set(name, new Set());
     }

     // Build reverse edges
-    for (const dep of submodule.dependencies) {
+    for (const dep of internalDeps) {
       if (!this.reverseGraph.has(dep)) {
         this.reverseGraph.set(dep, new Set());
       }
       this.reverseGraph.get(dep)?.add(name);
     }
   }
 }
```
115-150: Cycle detection may report duplicate cycles. The algorithm can report the same cycle multiple times when it's reached from different entry points. For example, a cycle A→B→C→A might be reported as both [A,B,C,A] and [B,C,A,B]. Consider deduplicating cycles by normalizing them (e.g., starting from the lexicographically smallest node).
This is a minor concern since cycles in dependency graphs are typically errors that need to be fixed regardless of exact reporting.
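A sketch of the normalization idea: rotate each detected cycle so it starts at its lexicographically smallest node, then deduplicate on the joined form:

```ts
function normalizeCycle(cycle: string[]): string {
  const nodes = cycle.slice(0, -1); // drop the repeated closing node: [A,B,C,A] -> [A,B,C]
  let minIdx = 0;
  for (let i = 1; i < nodes.length; i++) {
    if (nodes[i] < nodes[minIdx]) minIdx = i;
  }
  // Rotate so the smallest node leads; equal cycles now produce equal keys
  return [...nodes.slice(minIdx), ...nodes.slice(0, minIdx)].join('->');
}

// normalizeCycle(['A','B','C','A']) === normalizeCycle(['B','C','A','B']) // both 'A->B->C'
```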
152-166: Performance: `topologicalSort` performs cycle detection twice. `topologicalSort()` calls `hasCycles()`, which calls `detectCycles()`. For large dependency graphs, this redundant traversal could be optimized by integrating cycle detection into the topological sort itself (Kahn's algorithm naturally detects cycles; see the sketch below). This is acceptable for typical monorepo sizes but worth noting for future optimization if performance becomes a concern.
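For reference, a sketch of Kahn's algorithm over the same Map-based graph shape: it emits a topological order and detects cycles in the same pass, with no separate cycle-detection traversal:

```ts
function kahnSort(graph: Map<string, Set<string>>): string[] | null {
  // Count incoming edges for every node reachable from the graph
  const inDegree = new Map<string, number>();
  for (const node of graph.keys()) inDegree.set(node, inDegree.get(node) ?? 0);
  for (const deps of graph.values()) {
    for (const dep of deps) inDegree.set(dep, (inDegree.get(dep) ?? 0) + 1);
  }

  const queue = [...inDegree.keys()].filter((n) => inDegree.get(n) === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const node = queue.shift()!;
    order.push(node);
    for (const dep of graph.get(node) ?? []) {
      const remaining = inDegree.get(dep)! - 1;
      inDegree.set(dep, remaining);
      if (remaining === 0) queue.push(dep);
    }
  }

  // If some nodes were never emitted, they sit on a cycle
  return order.length === inDegree.size ? order : null;
}
```

Here dependents come before their dependencies; reverse the result if dependencies must come first.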
src/mcp/detection/projectDetector.ts (5)
184-190: Workspace member check may be overly permissive. The check
`member.includes(pkg.name) || member.includes(path.dirname(pkg.manifest_path))` could incorrectly match packages. For example, if `pkg.name` is "core" and a workspace member is "core-utils", this would incorrectly match. Consider more precise matching.

♻️ More precise workspace member matching
```diff
 const isWorkspaceMember =
   workspaceMembers.size === 0 ||
   Array.from(workspaceMembers).some(
-    (member) => member.includes(pkg.name) || member.includes(path.dirname(pkg.manifest_path)),
+    (member) => {
+      // workspace_members format: "package_name version (path)"
+      return member.startsWith(`${pkg.name} `) ||
+        member.includes(`(${path.dirname(pkg.manifest_path)})`);
+    },
   );
```
284-310: Consider caching package.json content from first pass to avoid re-reading. The second pass re-reads
`package.json` for each package that was already read in the first pass. This could be optimized by caching the parsed content. This is a minor performance improvement and may not be necessary for typical workspace sizes.
340-352: Silent fallback when fast-glob is unavailable may cause incomplete detection. If
`fast-glob` isn't installed and complex glob patterns like `packages/**/*` are used, the code silently skips them with only a trace log. This could result in missing submodules without clear indication to the user. Consider either:
- Adding
`fast-glob` as a dependency
- Implementing basic
**pattern handling♻️ Improve visibility of skipped patterns
} catch { // fast-glob not available, skip complex patterns - logger.trace(`Skipping complex pattern: ${pattern}`); + logger.warn(`Cannot expand glob pattern "${pattern}": fast-glob not available. Some packages may not be detected.`); }
366-408: Python package detection doesn't extract inter-package dependencies.Unlike Rust and TypeScript detection which extract workspace dependencies, Python detection always returns empty
dependencies: []. Consider parsingpyproject.tomlfor dependencies that match discovered package names.This is acceptable for initial implementation, but limits the usefulness of dependency graph features for Python monorepos.
243-256: Using a YAML parser would improve robustness, but requires adding a new dependency.The regex-based YAML parsing at lines 246-250 is fragile and can fail with comments, nested structures, or non-standard formatting. However, the suggested
yamlpackage is not currently a project dependency and would need to be added topackage.json. Consider either:
- Adding
yamlorjs-yamlas a dependency and refactoring to use it- Documenting that the current implementation assumes standard pnpm-workspace.yaml format (simple list syntax without comments)
- Adding basic validation to detect unsupported formats and log a warning
src/mcp/cache/cacheManager.ts (2)
138-149: Potential double-slash in git diff path argument.Line 147 appends
/tosubmodulePathwithout checking if it already ends with a slash. While git typically handles this, it's cleaner to normalize:♻️ Normalize path before appending
// Normal directory: check for changes in that path + const normalizedPath = submodulePath.replace(/\/+$/, ''); const result = await this.deps.execFileAsync('git', [ '-C', this.rootDir, 'diff', '--name-only', lastCommit, currentCommit, '--', - `${submodulePath}/`, + `${normalizedPath}/`, ]);
226-243: Non-atomic writes could leave cache in inconsistent state.If the process crashes between writing metadata (line 239) and content (line 240), the cache could have metadata pointing to missing or outdated content. Consider writing to temporary files first, then renaming atomically.
♻️ Atomic write pattern
async set(submoduleName: string, content: string, meta: Omit<CacheMetadata, 'generatedAt'>): Promise<void> { const metaPath = this.getMetaPath(submoduleName); const contentPath = this.getContentPath(submoduleName); + const tempMetaPath = `${metaPath}.tmp`; + const tempContentPath = `${contentPath}.tmp`; const fullMeta: CacheMetadata = { ...meta, generatedAt: new Date().toISOString(), }; // Ensure cache directory exists await this.deps.fsMkdir(this.cacheDir, { recursive: true }); - // Write files - await this.deps.fsWriteFile(metaPath, JSON.stringify(fullMeta, null, 2), 'utf-8'); - await this.deps.fsWriteFile(contentPath, content, 'utf-8'); + // Write to temp files first + await this.deps.fsWriteFile(tempContentPath, content, 'utf-8'); + await this.deps.fsWriteFile(tempMetaPath, JSON.stringify(fullMeta, null, 2), 'utf-8'); + + // Rename atomically (content first, since metadata signals validity) + await fs.rename(tempContentPath, contentPath); + await fs.rename(tempMetaPath, metaPath); logger.trace(`Cache saved for ${submoduleName}`); }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
- .gitignore
- src/cli/actions/defaultAction.ts
- src/core/lock/fileLock.ts
- src/core/lock/index.ts
- src/mcp/cache/cacheManager.ts
- src/mcp/cache/cacheTypes.ts
- src/mcp/config/monorepoConfigLoader.ts
- src/mcp/dependency/dependencyGraph.ts
- src/mcp/detection/projectDetector.ts
- src/mcp/generation/justfileGenerator.ts
- src/mcp/mcpServer.ts
- src/mcp/security/pathValidator.ts
- src/mcp/tools/getSubmoduleContextTool.ts
- src/mcp/tools/initMonorepoConfigTool.ts
- src/mcp/tools/invalidateSubmoduleCacheTool.ts
- src/mcp/tools/listSubmodulesTool.ts
- tests/cli/actions/defaultAction.test.ts
- tests/cli/actions/defaultAction.tokenCountTree.test.ts
- tests/core/lock/fileLock.test.ts
- tests/mcp/cache/cacheManager.test.ts
- tests/mcp/config/monorepoConfigLoader.test.ts
- tests/mcp/dependency/dependencyGraph.test.ts
- tests/mcp/security/pathValidator.test.ts
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-06-30T16:07:18.316Z
Learnt from: CR
Repo: yamadashy/repomix PR: 0
File: .cursorrules:0-0
Timestamp: 2025-06-30T16:07:18.316Z
Learning: Applies to .agents/rules/base.md : Check the rules written in `.agents/rules/base.md` as they contain important project-specific guidelines and instructions.
Applied to files:
.gitignore
📚 Learning: 2025-08-22T05:04:47.803Z
Learnt from: CR
Repo: yamadashy/repomix PR: 0
File: .github/instructions/base.instructions.md:0-0
Timestamp: 2025-08-22T05:04:47.803Z
Learning: Consult and adhere to the project-specific rules documented in .agents/rules/base.md during development and review
Applied to files:
.gitignore
📚 Learning: 2025-07-18T15:12:57.179Z
Learnt from: CR
Repo: yamadashy/repomix PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-07-18T15:12:57.179Z
Learning: .agents/rules/base.md
Applied to files:
.gitignore
🧬 Code graph analysis (15)
tests/mcp/security/pathValidator.test.ts (1)
src/mcp/security/pathValidator.ts (5)
validatePathWithinRoot(22-38)PathSecurityError(6-11)validateProjectRoot(47-53)sanitizeSubmoduleName(63-81)validateSubmodulePath(92-100)
tests/cli/actions/defaultAction.test.ts (2)
src/core/lock/fileLock.ts (1)
FileLockError(14-23)src/core/lock/index.ts (1)
FileLockError(2-2)
src/mcp/security/pathValidator.ts (1)
website/server/scripts/bundle.mjs (1)
rootDir(19-19)
src/mcp/tools/listSubmodulesTool.ts (4)
src/mcp/security/pathValidator.ts (1)
validateProjectRoot(47-53)src/mcp/config/monorepoConfigLoader.ts (2)
loadMonorepoConfig(83-98)getSubmoduleNames(121-123)src/mcp/tools/mcpToolRuntime.ts (3)
buildMcpToolErrorResponse(234-248)buildMcpToolSuccessResponse(215-227)convertErrorToJson(166-208)src/mcp/dependency/dependencyGraph.ts (1)
DependencyGraph(16-280)
tests/core/lock/fileLock.test.ts (3)
src/core/lock/fileLock.ts (4)
acquireLock(83-133)releaseLock(140-158)FileLockError(14-23)withLock(167-174)src/core/lock/index.ts (4)
acquireLock(2-2)releaseLock(2-2)FileLockError(2-2)withLock(2-2)src/shared/logger.ts (1)
error(34-38)
tests/mcp/cache/cacheManager.test.ts (2)
src/mcp/cache/cacheManager.ts (2)
CacheManagerDeps(15-23)CacheManager(38-282)src/mcp/cache/cacheTypes.ts (1)
CacheMetadata(25-52)
src/mcp/tools/getSubmoduleContextTool.ts (5)
src/mcp/security/pathValidator.ts (2)
validateProjectRoot(47-53)validateSubmodulePath(92-100)src/mcp/config/monorepoConfigLoader.ts (2)
loadMonorepoConfig(83-98)getSubmodule(114-116)src/mcp/tools/mcpToolRuntime.ts (4)
buildMcpToolErrorResponse(234-248)buildMcpToolSuccessResponse(215-227)convertErrorToJson(166-208)createToolWorkspace(49-59)src/mcp/dependency/dependencyGraph.ts (1)
DependencyGraph(16-280)src/mcp/cache/cacheManager.ts (1)
CacheManager(38-282)
src/mcp/config/monorepoConfigLoader.ts (1)
src/shared/logger.ts (1)
error(34-38)
src/mcp/tools/invalidateSubmoduleCacheTool.ts (4)
src/mcp/security/pathValidator.ts (1)
validateProjectRoot(47-53)src/mcp/config/monorepoConfigLoader.ts (2)
loadMonorepoConfig(83-98)getSubmoduleNames(121-123)src/mcp/tools/mcpToolRuntime.ts (3)
buildMcpToolErrorResponse(234-248)buildMcpToolSuccessResponse(215-227)convertErrorToJson(166-208)src/mcp/cache/cacheManager.ts (1)
CacheManager(38-282)
src/mcp/generation/justfileGenerator.ts (2)
src/mcp/detection/projectDetector.ts (1)
DetectedSubmodule(22-29)src/mcp/config/monorepoConfigLoader.ts (1)
MonorepoConfig(71-71)
src/mcp/mcpServer.ts (4)
src/mcp/tools/initMonorepoConfigTool.ts (1)
registerInitMonorepoConfigTool(46-193)src/mcp/tools/listSubmodulesTool.ts (1)
registerListSubmodulesTool(45-133)src/mcp/tools/getSubmoduleContextTool.ts (1)
registerGetSubmoduleContextTool(57-179)src/mcp/tools/invalidateSubmoduleCacheTool.ts (1)
registerInvalidateSubmoduleCacheTool(32-104)
tests/cli/actions/defaultAction.tokenCountTree.test.ts (1)
src/core/lock/fileLock.ts (1)
FileLockError(14-23)
src/mcp/dependency/dependencyGraph.ts (1)
src/mcp/config/monorepoConfigLoader.ts (1)
MonorepoConfig(71-71)
tests/mcp/config/monorepoConfigLoader.test.ts (1)
src/mcp/config/monorepoConfigLoader.ts (5)
loadMonorepoConfig(83-98)MonorepoConfig(71-71)getSubmodule(114-116)getSubmoduleNames(121-123)MONOREPO_CONFIG_FILE(76-76)
src/mcp/cache/cacheManager.ts (2)
src/mcp/security/pathValidator.ts (2)
sanitizeSubmoduleName(63-81)validatePathWithinRoot(22-38)src/mcp/cache/cacheTypes.ts (4)
CacheMetadata(25-52)CacheMetadataSchema(10-20)CacheCheckResult(68-80)CachedContent(57-63)
🔇 Additional comments (54)
tests/mcp/security/pathValidator.test.ts (1)
1-126: Good test coverage for security-critical path validation. The test suite comprehensively covers the path validation utilities including traversal attacks, name sanitization, and boundary conditions. The tests verify that `PathSecurityError` is thrown appropriately for invalid inputs.

src/mcp/security/pathValidator.ts (3)
22-38: Absolute `userPath` bypasses root resolution - verify this is intentional. When `userPath` is an absolute path (e.g., `/etc/passwd`), `path.resolve(resolvedRoot, userPath)` ignores `resolvedRoot` entirely and returns the absolute path. The subsequent `path.relative` check should catch this via the `path.isAbsolute(relative)` condition, but this behavior is subtle. The docstring mentions handling absolute paths, and the logic does reject them. This is correct behavior, but consider adding a comment clarifying why `path.isAbsolute(relative)` is needed (to catch an absolute `userPath` that doesn't share the root prefix).
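To illustrate why the extra condition matters (this demonstrates node:path semantics, not project code):

```ts
import path from 'node:path';

const root = path.resolve('/project');
const resolved = path.resolve(root, '/etc/passwd'); // absolute userPath wins: '/etc/passwd'
const relative = path.relative(root, resolved);     // '../etc/passwd' on POSIX

// On Windows, a cross-drive userPath makes path.relative return an absolute
// path (e.g. 'D:\\etc'), which startsWith('..') alone would not catch.
const escapes = relative.startsWith('..') || path.isAbsolute(relative);
console.log(escapes); // true -> reject
```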
63-81: LGTM - thorough sanitization logic.The function properly handles path separators, invalid characters, dot sequences, and length limits. The regex chain is well-ordered to prevent bypass attempts.
92-99: Good defensive layering with explicit absolute path rejection. The explicit check for absolute paths before calling `validatePathWithinRoot` provides a clearer error message to users and is a good defensive practice, even though `validatePathWithinRoot` would also catch this case.

src/core/lock/fileLock.ts (3)
91-132: TOCTOU race condition is properly handled via atomic file creation. The check-then-act pattern between `readLockFile` (line 92) and `fs.open` (line 116) has a potential race condition, but it's correctly mitigated by using the `'wx'` flag, which fails atomically if the file already exists. The `EEXIST` handler (lines 122-130) properly re-reads the lock to get the actual owner's info. Good implementation of the file locking pattern.
140-158: Minor TOCTOU in release - acceptable for this use case. There's a theoretical race between reading the lock (line 143) and unlinking it (line 145) where another process could delete and recreate the lock. The current implementation is acceptable since:
- This is a best-effort ownership check
- The scenario is unlikely in practice
- The worst case is removing another process's fresh lock, which would just cause a retry
167-174: Clean implementation of the lock wrapper pattern. The `withLock` helper properly ensures lock release in the `finally` block, even when the wrapped function throws.
105-123: LGTM - Clean accessor functions.The
saveMonorepoConfig,getSubmodule, andgetSubmoduleNamesfunctions are straightforward and correctly implemented.tests/mcp/config/monorepoConfigLoader.test.ts (1)
18-87: Good coverage of accessor function behavior.The tests for
getSubmoduleandgetSubmoduleNamesthoroughly cover both successful lookups and edge cases (unknown submodule, empty config). The test fixtures correctly include all required fields per the schema..gitignore (1)
27-27: LGTM - Appropriate patterns for new artifacts.The broadened
**/repomix-output.*pattern and the new.claude/rulesentry align with the new monorepo tooling that may generate outputs in nested directories.Also applies to: 47-47
src/mcp/mcpServer.ts (3)
10-14: LGTM!The monorepo tool imports are well-organized and follow the established naming convention.
28-30: LGTM!The updated instructions clearly describe the monorepo tool workflow and provide helpful guidance for users.
58-62: LGTM!The monorepo tool registrations are properly grouped and follow the established pattern.
src/cli/actions/defaultAction.ts (3)
11-11: LGTM!The lock module import is correctly structured.
108-121: LGTM!The lock acquisition logic is well-structured:
- Correctly skips locking in stdout mode
- Properly handles
FileLockErrorby converting it toRepomixErrorfor user-facing messages- Only sets
lockPathon successful acquisition
156-160: LGTM!The lock release logic is correct:
- Placed in the
finallyblock to ensure cleanup- Conditional check prevents attempting to release when no lock was acquired
- Proper cleanup order (worker pool cleanup before lock release)
src/mcp/tools/invalidateSubmoduleCacheTool.ts (4)
10-21: LGTM!The input schema has appropriate validation constraints:
- Submodule name length limit (255) is reasonable
- Array size limit (100) aligns with the PR's MAX_SUBMODULES security fix
- Path length limit (4096) is appropriate for filesystem paths
- Optional
submodulesparameter allows invalidating all caches
23-27: LGTM!The output schema provides a clear structure with both machine-readable data (invalidated/skipped arrays) and human-readable summary (message).
48-67: LGTM!The handler initialization is well-structured:
- Uses
validateProjectRootto prevent path traversal attacks- Provides helpful error message when configuration is missing
- Correctly determines whether to invalidate all or specific submodules
69-103: LGTM!The invalidation logic and response handling are well-implemented:
- Properly validates submodule existence before attempting invalidation
- Builds comprehensive messages for all scenarios
- Returns structured response with optional
skippedfield- Uses consistent error handling with
convertErrorToJsonsrc/core/lock/index.ts (1)
1-2: LGTM!The re-export module provides a clean public API for the locking functionality.
tests/cli/actions/defaultAction.test.ts (1)
20-24: LGTM!The file lock mock is appropriately scoped for these tests. The simplified
FileLockErrorclass is sufficient since this test suite doesn't exercise lock error handling paths directly. The mock enables testing thedefaultActionflow without actual file system locking.tests/mcp/dependency/dependencyGraph.test.ts (2)
6-15: LGTM!The
createConfighelper function is well-designed for test readability. It cleanly transforms a simplified input format into the fullMonorepoConfigstructure with sensible defaults for cache and repomix settings.
17-192: Excellent test coverage for the DependencyGraph class.The test suite comprehensively validates:
- Direct and transitive dependency resolution
- Reverse dependency lookups (dependents)
- Cycle detection for both direct and indirect cycles
- Topological sorting with proper null handling for cyclic graphs
- Leaf/root node identification
- Graph statistics computation
The diamond dependency test case (lines 57-72) is particularly valuable for ensuring the deduplication logic in
getAllDependenciesworks correctly.src/mcp/tools/listSubmodulesTool.ts (2)
9-17: LGTM!The input schema has appropriate validation constraints. The
max(4096)limit onprojectRootprovides reasonable protection against oversized inputs, and the boolean defaults are sensibly conservative.
61-131: Well-structured tool implementation with good performance considerations.Key strengths:
- Lazy initialization of
DependencyGraph(lines 78-81) avoids unnecessary computation when dependency info isn't requested- Proper error handling with structured responses
- Path traversal protection via
validateProjectRootThe mapping of
CycleInfoobjects to theirdescriptionstrings (line 123) correctly aligns with the output schema'sz.array(z.string())definition.tests/mcp/cache/cacheManager.test.ts (3)
5-18: LGTM!The mock dependency injection pattern enables isolated unit testing without real filesystem or git interactions. The
CacheManagerDepsinterface provides a clean seam for testing.
20-83: Thorough test coverage for cache validity checking.The three test cases properly cover the key scenarios:
- Cache files missing →
exists: false, valid: false- Cache exists with matching commit →
exists: true, valid: true- Cache exists but commit changed →
exists: true, valid: falseThe sequential mock setup on lines 72-74 documents the expected call order for commit comparison logic.
85-160: LGTM!The remaining tests adequately cover:
set(): Directory creation and dual file writes (meta + content)invalidate(): Force removal of both cache filesinvalidateAll(): Recursive cache directory removalgetSubmoduleGitCommit(): Correct git command construction for both regular directories and git submodulessrc/mcp/cache/cacheTypes.ts (1)
54-91: LGTM!The
CachedContent,CacheCheckResult, andCacheOptionsinterfaces are well-documented and provide a clean API surface for cache operations. The optional fields appropriately handle varying cache states.tests/core/lock/fileLock.test.ts (4)
7-22: LGTM!The test setup properly isolates each test with a unique temporary directory and ensures cleanup. Using
os.tmpdir()withfs.mkdtempis the correct pattern for cross-platform test isolation.
24-82: Comprehensive acquireLock tests.Good coverage of:
- Lock file creation and path validation
- Lock metadata (PID, startTime) verification
- Re-acquisition prevention (same process)
- Stale lock detection and removal
Minor note: The PID `999999999` (line 65) is used to simulate a non-existent process. While this could theoretically be a valid PID on some systems, it's an acceptable assumption for testing purposes.
84-171: LGTM!The
releaseLockandwithLocktests appropriately verify:
- Lock file removal on release
- Graceful handling of non-existent lock files
- Protection against releasing locks owned by other processes
- Lock lifecycle within
withLock(acquire → execute → release)- Proper cleanup even when the wrapped function throws
173-190: LGTM!The
FileLockErrortest validates that the error object contains the expected diagnostic information (lockPath,existingLock.pid), which is essential for debugging lock contention issues.src/mcp/tools/initMonorepoConfigTool.ts (3)
13-25: LGTM!The input schema has appropriate validation constraints:
cacheDirectorywithmin(1).max(255)prevents empty or excessively long paths- The
styleenum restricts to valid output formats- Sensible defaults for all optional parameters
198-213: LGTM!The
updateGitignorehelper correctly:
- Handles missing
.gitignorefiles gracefully- Normalizes the cache pattern with a leading slash
- Checks for existing entries before appending
- Uses a descriptive comment section for the addition
71-109: Well-structured tool implementation.The initialization flow follows a clear sequence:
- Path validation for security
- Project detection with helpful error messaging
- Configuration building from detected structure
- Persistence with ancillary tooling generation
The path traversal protection via
validateProjectRootaligns with the PR's security enhancement objectives.src/mcp/tools/getSubmoduleContextTool.ts (4)
1-18: LGTM!Imports are well-organized and include appropriate security utilities (
validateProjectRoot,validateSubmodulePath) for path validation.
19-33: LGTM!Input schema includes reasonable constraints (max 255 chars for submoduleName, max 4096 for projectRoot) and sensible defaults for optional parameters.
35-52: LGTM!Output schema is well-structured with clear descriptions and provides useful metadata about cache status for debugging and monitoring.
231-237: LGTM! The token estimation is clearly documented as a rough estimate. The 4 characters per token ratio is a reasonable approximation for general text.
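For reference, the documented heuristic reduces to something like this (a sketch; the exact rounding in the implementation may differ):

```ts
// ~4 characters per token is a common rough approximation for English text
function estimateTokens(content: string): number {
  return Math.ceil(content.length / 4);
}
```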
src/mcp/dependency/dependencyGraph.ts (3)
1-11: LGTM!The
CycleInfointerface is well-documented with clear field descriptions.
55-76: LGTM!The
getAllDependenciesmethod correctly implements a post-order DFS traversal, ensuring dependencies appear before dependents in the result. Thevisitedset prevents infinite loops in cyclic graphs.
257-279: LGTM!The
getStatsmethod provides comprehensive graph statistics. The O(V²) complexity from callinggetDepthper node is acceptable for typical monorepo sizes, but could be optimized with memoization if needed for very large graphs.src/mcp/detection/projectDetector.ts (6)
1-8: LGTM!Standard Node.js imports with appropriate promisification of
execFilefor async usage.
40-57: LGTM!Good use of dependency injection pattern for testability. The
ProjectDetectorDepsinterface allows mocking all external dependencies.
74-109: LGTM!The
detect()method has a clear four-step workflow with appropriate separation of concerns. Git submodule detection is correctly applied as a post-processing step.
114-135: LGTM! The `.gitmodules` parsing handles the common format correctly. The error handling appropriately treats a missing `.gitmodules` as "no submodules" rather than an error.
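A sketch of what parsing that common format amounts to (illustrative; the actual parser in projectDetector.ts may track section names as well):

```ts
// Extracts the `path = ...` entries from a .gitmodules file:
//   [submodule "name"]
//       path = some/dir
//       url = https://example.com/repo.git
function parseGitmodulePaths(content: string): string[] {
  const paths: string[] = [];
  for (const line of content.split('\n')) {
    const match = line.trim().match(/^path\s*=\s*(.+)$/);
    if (match) paths.push(match[1].trim());
  }
  return paths;
}
```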
413-491: LGTM!Go module detection handles both
go.workworkspaces and common directory patterns. Thego.modparsing correctly extracts module names. Similar to Python, dependency extraction is not implemented but acceptable for initial version.
496-560: LGTM!The mixed project detection and helper methods are well-implemented. The submodule type detection has a clear priority order for projects with multiple manifest files.
src/mcp/cache/cacheManager.ts (4)
1-33: LGTM!Good dependency injection pattern with
CacheManagerDepsfor testability. The imports include appropriate security utilities.
77-84: LGTM!The
readMetamethod correctly uses Zod schema validation (CacheMetadataSchema.parse) to ensure cache metadata integrity. This aligns with the PR's security enhancement for schema validation.
91-107: LGTM!The method correctly uses
validatePathWithinRootto prevent path traversal attacks andexecFileAsync(notexec) to avoid shell injection. Returning an empty string on failure allows callers to treat it as "unknown" state.
159-221: LGTM!The
checkandgetmethods have comprehensive validation with clear return types. Error handling appropriately returns structured results rather than throwing.
```
 .aider*

 # repomix runner
 **/*.repomix.lock
```
Gitignore pattern may not match the actual lock file.

The pattern `**/*.repomix.lock` matches files ending in `.repomix.lock` with at least one character before the dot (e.g., `foo.repomix.lock`). However, the lock file created by fileLock.ts is named `.repomix.lock` (with a leading dot), which this pattern won't match.

💡 Suggested fix

```diff
 # repomix runner
-**/*.repomix.lock
+**/.repomix.lock
 .repomix/
```
```ts
/**
 * List all cached submodules
 */
async listCached(): Promise<string[]> {
  try {
    const files = await fs.readdir(this.cacheDir);
    const metaFiles = files.filter((f) => f.endsWith('.meta.json'));
    return metaFiles.map((f) => f.replace('.meta.json', ''));
  } catch {
    return [];
  }
}
```
Inconsistent dependency usage: `fs.readdir` should use injected deps.

`listCached` uses `fs.readdir` directly (line 275) instead of `this.deps.fsReaddir`, breaking the dependency injection pattern and making this method harder to test.

🐛 Use injected dependency

```diff
 async listCached(): Promise<string[]> {
   try {
-    const files = await fs.readdir(this.cacheDir);
+    const files = await this.deps.fsReaddir(this.cacheDir);
     const metaFiles = files.filter((f) => f.endsWith('.meta.json'));
     return metaFiles.map((f) => f.replace('.meta.json', ''));
   } catch {
     return [];
   }
 }
```
```ts
export const MonorepoConfigSchema = z.object({
  submodules: z
    .record(z.string(), SubmoduleConfigSchema)
    .refine((obj) => Object.keys(obj).length <= MAX_SUBMODULES, {
      message: `Too many submodules defined (maximum: ${MAX_SUBMODULES})`,
    })
    .describe('Map of submodule name to configuration'),
```
Submodule name keys lack path-safety validation.

The `submodules` record uses `z.string()` for keys without any validation. Malicious submodule names like `../../../etc` would pass schema validation but could enable path traversal in downstream code (e.g., the generated cache script, cache file paths).

Consider using a Zod transform or refinement to validate submodule names:

🔒 Suggested fix

```diff
+// Safe submodule name pattern
+const SafeSubmoduleNameSchema = z.string()
+  .min(1)
+  .max(255)
+  .regex(/^[a-zA-Z0-9._-]+$/, 'Submodule name contains invalid characters');
 export const MonorepoConfigSchema = z.object({
   submodules: z
-    .record(z.string(), SubmoduleConfigSchema)
+    .record(SafeSubmoduleNameSchema, SubmoduleConfigSchema)
     .refine((obj) => Object.keys(obj).length <= MAX_SUBMODULES, {
       message: `Too many submodules defined (maximum: ${MAX_SUBMODULES})`,
     })
     .describe('Map of submodule name to configuration'),
```
```ts
} catch (error) {
  if ((error as NodeJS.ErrnoException).code === 'ENOENT') {
    logger.trace(`Monorepo config not found at ${configPath}`);
    return null;
  }
  logger.trace(`Failed to load monorepo config: ${(error as Error).message}`);
  return null;
}
```
Validation errors are silently swallowed, returning null.

When the config file exists but fails Zod validation (e.g., invalid schema), the function returns null with only a trace-level log. This makes it appear as if no config exists, hiding configuration errors from users.

Consider differentiating between "file not found" (return null) and "invalid config" (throw or log at warning level):

💡 Suggested improvement

```diff
 } catch (error) {
   if ((error as NodeJS.ErrnoException).code === 'ENOENT') {
     logger.trace(`Monorepo config not found at ${configPath}`);
     return null;
   }
-  logger.trace(`Failed to load monorepo config: ${(error as Error).message}`);
-  return null;
+  // Config exists but is invalid - this is likely a user error
+  logger.warn(`Invalid monorepo config at ${configPath}: ${(error as Error).message}`);
+  throw error;
 }
```
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| } catch (error) { | |
| if ((error as NodeJS.ErrnoException).code === 'ENOENT') { | |
| logger.trace(`Monorepo config not found at ${configPath}`); | |
| return null; | |
| } | |
| logger.trace(`Failed to load monorepo config: ${(error as Error).message}`); | |
| return null; | |
| } | |
| } catch (error) { | |
| if ((error as NodeJS.ErrnoException).code === 'ENOENT') { | |
| logger.trace(`Monorepo config not found at ${configPath}`); | |
| return null; | |
| } | |
| // Config exists but is invalid - this is likely a user error | |
| logger.warn(`Invalid monorepo config at ${configPath}: ${(error as Error).message}`); | |
| throw error; | |
| } |
🤖 Prompt for AI Agents
In @src/mcp/config/monorepoConfigLoader.ts around lines 90-97: the catch block currently maps all errors to a null return, hiding validation issues. Update the error handling (referencing configPath and logger) so ENOENT still returns null, but Zod validation failures (detected via error.name === 'ZodError' or instanceof ZodError) are handled separately: log at warning/error level with the validation details and rethrow (or return a failure) so invalid configs are surfaced to callers. Keep the existing trace for other unexpected errors, but ensure validation errors are not silently swallowed.
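As a rough sketch of what that differentiated handling could look like (the `loadRawConfig` helper and `logger` shape here are stand-ins for the loader's actual internals, not the PR's real names):

```ts
import { ZodError } from 'zod';

// Stand-ins for the loader's real dependencies (assumptions for this sketch).
type MonorepoConfig = Record<string, unknown>;
declare const logger: { trace(msg: string): void; warn(msg: string): void };
declare function loadRawConfig(configPath: string): Promise<MonorepoConfig>; // read + JSON.parse + schema.parse

async function loadMonorepoConfig(configPath: string): Promise<MonorepoConfig | null> {
  try {
    return await loadRawConfig(configPath);
  } catch (error) {
    if ((error as NodeJS.ErrnoException).code === 'ENOENT') {
      logger.trace(`Monorepo config not found at ${configPath}`);
      return null; // missing config is a normal state
    }
    if (error instanceof ZodError) {
      // Config exists but fails validation: surface it to the user
      logger.warn(`Invalid monorepo config at ${configPath}: ${error.message}`);
      throw error;
    }
    logger.trace(`Failed to load monorepo config: ${(error as Error).message}`);
    return null;
  }
}
```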
```ts
getDepth(submoduleName: string): number {
  const visited = new Set<string>();

  const calculateDepth = (name: string): number => {
    if (visited.has(name)) return 0;
    visited.add(name);

    const deps = this.graph.get(name) || new Set();
    if (deps.size === 0) return 0;

    let maxDepth = 0;
    for (const dep of deps) {
      maxDepth = Math.max(maxDepth, calculateDepth(dep) + 1);
    }
    return maxDepth;
  };

  return calculateDepth(submoduleName);
}
```
Bug: getDepth may return incorrect values for DAGs with shared dependencies.
The visited set prevents re-computation but also causes incorrect depth calculations when a node is reachable through multiple paths. For example, in a graph where A→B→D and A→C→D, if D is visited via B first, visiting via C returns 0 for D's depth, causing C's depth to be underestimated.
🐛 Suggested fix using memoization instead of simple visited tracking

```diff
 getDepth(submoduleName: string): number {
-  const visited = new Set<string>();
+  const memo = new Map<string, number>();
   const calculateDepth = (name: string): number => {
-    if (visited.has(name)) return 0;
-    visited.add(name);
+    if (memo.has(name)) return memo.get(name)!;
     const deps = this.graph.get(name) || new Set();
-    if (deps.size === 0) return 0;
+    if (deps.size === 0) {
+      memo.set(name, 0);
+      return 0;
+    }
     let maxDepth = 0;
     for (const dep of deps) {
       maxDepth = Math.max(maxDepth, calculateDepth(dep) + 1);
     }
+    memo.set(name, maxDepth);
     return maxDepth;
   };
   return calculateDepth(submoduleName);
 }
```
🤖 Prompt for AI Agents
In @src/mcp/dependency/dependencyGraph.ts around lines 220-238: the current getDepth implementation uses a single visited Set, which short-circuits repeated visits and underestimates depths when nodes are shared across multiple paths. Replace that logic with a memoization Map (e.g., depthCache) that caches computed depths, and use a separate recursion-stack Set only to detect cycles. Update calculateDepth to: return the cached value if depthCache.has(name); mark name in the recursion stack for cycle detection; compute the max child depth recursively; store the computed depth in depthCache before returning; and remove name from the recursion stack. Keep using this.graph to read dependencies.
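A small self-contained example of why the memoized version matters; the graph below is illustrative only:

```ts
// A -> {C, B}, B -> C, C -> D: true depths are D=0, C=1, B=2, A=3.
const graph = new Map<string, Set<string>>([
  ['A', new Set(['C', 'B'])],
  ['B', new Set(['C'])],
  ['C', new Set(['D'])],
  ['D', new Set<string>()],
]);

function getDepth(start: string): number {
  const memo = new Map<string, number>();
  const calc = (name: string): number => {
    const cached = memo.get(name);
    if (cached !== undefined) return cached;
    let depth = 0;
    for (const dep of graph.get(name) ?? new Set<string>()) {
      depth = Math.max(depth, calc(dep) + 1);
    }
    memo.set(name, depth);
    return depth;
  };
  return calc(start);
}

// Memoized: B reads C's cached depth (1), so A = max(C+1, B+1) = 3.
// The visited-set version explores C first, then B sees C as "visited"
// and scores it 0, yielding A = 2, an underestimate.
console.log(getDepth('A')); // 3
```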
```ts
// Install git hooks
lines.push('# Install git post-commit hook for automatic cache invalidation');
lines.push('install-hooks:');
lines.push(' #!/usr/bin/env bash');
lines.push(' mkdir -p .git/hooks');
lines.push(" cat > .git/hooks/post-commit << 'HOOK'");
lines.push(' #!/bin/bash');
lines.push(' # Repomix cache invalidation hook');
lines.push(' (node {{script_dir}}/repomix-cache.mjs --changed-only &) 2>/dev/null');
lines.push(' HOOK');
lines.push(' chmod +x .git/hooks/post-commit');
lines.push(' @echo "Git hook installed"');
lines.push('');
```
Git hook installation may have shell injection risk.
The install-hooks recipe interpolates {{script_dir}} directly into a shell script written to .git/hooks/post-commit. If scriptDir contains shell metacharacters, this could lead to unexpected behavior.
Additionally, the heredoc approach with cat > .git/hooks/post-commit << 'HOOK' uses a quoted delimiter which prevents variable expansion, but the {{script_dir}} is a just variable that gets expanded before the shell sees it.
💡 Suggested improvement

Consider validating scriptDir to contain only safe characters, or escape it properly:

```diff
 lines.push('# Install git post-commit hook for automatic cache invalidation');
 lines.push('install-hooks:');
 lines.push(' #!/usr/bin/env bash');
+lines.push(' script_dir="{{script_dir}}"');
 lines.push(' mkdir -p .git/hooks');
-lines.push(" cat > .git/hooks/post-commit << 'HOOK'");
+lines.push(' cat > .git/hooks/post-commit << HOOK');
 lines.push(' #!/bin/bash');
 lines.push(' # Repomix cache invalidation hook');
-lines.push(' (node {{script_dir}}/repomix-cache.mjs --changed-only &) 2>/dev/null');
+lines.push(' (node "$script_dir/repomix-cache.mjs" --changed-only &) 2>/dev/null');
 lines.push(' HOOK');
```

```ts
for (const [name] of Object.entries(config.submodules)) {
  const metaPath = path.join(cacheDir, \`\${name}.meta.json\`);
  try {
    const metaContent = await fs.readFile(metaPath, 'utf-8');
    const meta = JSON.parse(metaContent);
    const lastUpdated = new Date(meta.generatedAt).toLocaleString();
    console.log(name.padEnd(30), 'Cached'.padEnd(15), lastUpdated);
  } catch {
    console.log(name.padEnd(30), 'Not cached'.padEnd(15), '-');
  }
}
return;
}
```
Generated script is vulnerable to path traversal via malicious submodule names.
The generated cache script constructs file paths using submodule names directly from the config file (lines 299, 330-331) without sanitization. A malicious .repomix-monorepo.json with a submodule named ../../../etc/passwd would allow path traversal.
While the MonorepoConfigSchema in monorepoConfigLoader.ts validates the config structure, it doesn't sanitize submodule names for path safety. The path validation utilities from pathValidator.ts exist but aren't used in the generated script.
🔒 Suggested fix

Add path sanitization in the generated script:

```js
// Add near the top of the generated script
function sanitizeName(name) {
  return name
    .replace(/[/\\]/g, '-')
    .replace(/[^a-zA-Z0-9._-]/g, '')
    .replace(/\\.{2,}/g, '.')
    .replace(/^\\.+|\\.+$/g, '');
}

// Then use it when constructing paths:
const safeName = sanitizeName(name);
const metaPath = path.join(cacheDir, \`\${safeName}.meta.json\`);
```

Alternatively, add Zod validation for submodule names in MonorepoConfigSchema to reject names with path separators or traversal sequences.
🤖 Prompt for AI Agents
In @src/mcp/generation/justfileGenerator.ts around lines 298-310: the loop that builds metaPath from config.submodules (the for (const [name] of Object.entries(config.submodules)) block) is vulnerable to path traversal. Fix it by sanitizing or validating each submodule name before using it in path.join: add or call a sanitizer (e.g., a sanitizeName function, or the existing validators in pathValidator.ts) that strips path separators, traversal sequences, and unsafe characters, then use the sanitized name (safeName) when constructing metaPath and any other file paths. Alternatively, enforce a Zod validation rule in MonorepoConfigSchema that rejects names containing '/', '\', or '..', and have the generated script rely on that validated name.
```ts
const tempDir = await createToolWorkspace();
const styleKey = (options.style || 'xml') as keyof typeof defaultFilePathMap;
const outputFileName = defaultFilePathMap[styleKey];
const outputFilePath = path.join(tempDir, outputFileName);
```
Temporary directory is created but not cleaned up.
The createToolWorkspace() creates a temp directory, but there's no cleanup after reading the output file. This could lead to disk space accumulation over time, especially with frequent tool usage.
🧹 Suggested cleanup

```diff
+import fs from 'node:fs/promises';
+
 async function generateSubmoduleContent(
   rootDir: string,
   submodulePath: string,
   options: {
     compress?: boolean;
     style?: string;
     includePatterns?: string;
     ignorePatterns?: string;
     showLineNumbers?: boolean;
     removeComments?: boolean;
   },
 ): Promise<{ output: string; fileCount: number; tokenCount: number }> {
   const tempDir = await createToolWorkspace();
-  const styleKey = (options.style || 'xml') as keyof typeof defaultFilePathMap;
-  const outputFileName = defaultFilePathMap[styleKey];
-  const outputFilePath = path.join(tempDir, outputFileName);
-
-  // Validate submodule path to prevent path traversal
-  const fullPath = validateSubmodulePath(submodulePath, rootDir);
-
-  const cliOptions: CliOptions = {
-    compress: options.compress,
-    include: options.includePatterns,
-    ignore: options.ignorePatterns,
-    output: outputFilePath,
-    style: styleKey,
-    securityCheck: true,
-    quiet: true,
-    outputShowLineNumbers: options.showLineNumbers,
-    removeComments: options.removeComments,
-  };
-
-  const result = await runCli(['.'], fullPath, cliOptions);
-  if (!result) {
-    throw new Error(`Failed to pack submodule: ${submodulePath}`);
-  }
-
-  const { packResult } = result;
-  const output = await import('node:fs/promises').then((fs) => fs.readFile(outputFilePath, 'utf-8'));
-
-  return {
-    output,
-    fileCount: packResult.totalFiles,
-    tokenCount: packResult.totalTokens,
-  };
+  try {
+    const styleKey = (options.style || 'xml') as keyof typeof defaultFilePathMap;
+    const outputFileName = defaultFilePathMap[styleKey];
+    const outputFilePath = path.join(tempDir, outputFileName);
+
+    // Validate submodule path to prevent path traversal
+    const fullPath = validateSubmodulePath(submodulePath, rootDir);
+
+    const cliOptions: CliOptions = {
+      compress: options.compress,
+      include: options.includePatterns,
+      ignore: options.ignorePatterns,
+      output: outputFilePath,
+      style: styleKey,
+      securityCheck: true,
+      quiet: true,
+      outputShowLineNumbers: options.showLineNumbers,
+      removeComments: options.removeComments,
+    };
+
+    const result = await runCli(['.'], fullPath, cliOptions);
+    if (!result) {
+      throw new Error(`Failed to pack submodule: ${submodulePath}`);
+    }
+
+    const { packResult } = result;
+    const output = await fs.readFile(outputFilePath, 'utf-8');
+
+    return {
+      output,
+      fileCount: packResult.totalFiles,
+      tokenCount: packResult.totalTokens,
+    };
+  } finally {
+    await fs.rm(tempDir, { recursive: true, force: true }).catch(() => {});
+  }
 }
```
🤖 Prompt for AI Agents
In @src/mcp/tools/getSubmoduleContextTool.ts around lines 196-199: createToolWorkspace() creates a temporary directory (tempDir) that is never removed. Wrap the logic that reads the output file (via outputFilePath) in a try/finally so tempDir is always cleaned up: in the finally block, await an async removal (e.g., fs.promises.rm with recursive and force options) and handle any removal errors gracefully. Update the containing function so tempDir is removed even on errors or early returns.
Code Review
This pull request introduces comprehensive monorepo support, including intelligent caching, a file-locking mechanism to prevent race conditions, and several security enhancements like path traversal protection and schema validation. The implementation is well-structured, introducing modular components for caching, dependency graph management, and project structure detection. The addition of new MCP tools for monorepos and automated justfile generation significantly improves the developer experience. My review identified a few issues: a high-severity bug in the generated cache script where --changed-only functionality is not correctly implemented, and a couple of medium-severity items related to a hardcoded version string in the cache metadata and fragile parsing of .gitmodules files. Overall, this is a strong feature addition that will greatly benefit users working with monorepos.
```ts
export function generateCacheScript(): string {
  return `#!/usr/bin/env node
/**
 * Repomix Monorepo Cache Management Script
 * Auto-generated by repomix init_monorepo_config
 */

import fs from 'node:fs/promises';
import path from 'node:path';
import { parseArgs } from 'node:util';

const CONFIG_FILE = '.repomix-monorepo.json';

async function main() {
  const { values } = parseArgs({
    options: {
      all: { type: 'boolean', default: false },
      module: { type: 'string' },
      'changed-only': { type: 'boolean', default: false },
      regenerate: { type: 'boolean', default: false },
      invalidate: { type: 'boolean', default: false },
      status: { type: 'boolean', default: false },
      list: { type: 'boolean', default: false },
      dependencies: { type: 'boolean', default: false },
    },
  });

  // Load config
  let config;
  try {
    const content = await fs.readFile(CONFIG_FILE, 'utf-8');
    config = JSON.parse(content);
  } catch (error) {
    console.error('Error: Could not load', CONFIG_FILE);
    console.error('Run the init_monorepo_config MCP tool first.');
    process.exit(1);
  }

  const cacheDir = config.cache?.directory || '.repomix-cache';

  if (values.list) {
    console.log('Configured submodules:');
    for (const [name, sub] of Object.entries(config.submodules)) {
      const submodule = sub;
      console.log(\` - \${name}: \${submodule.path}\${submodule.isGitSubmodule ? ' (git submodule)' : ''}\`);
    }
    return;
  }

  if (values.dependencies) {
    console.log('Dependency graph:');
    for (const [name, sub] of Object.entries(config.submodules)) {
      const submodule = sub;
      const deps = submodule.dependencies || [];
      console.log(\`  \${name}: \${deps.length > 0 ? deps.join(', ') : '(no dependencies)'}\`);
    }
    return;
  }

  if (values.status) {
    console.log('Cache status:');
    console.log('');
    console.log('Submodule'.padEnd(30), 'Status'.padEnd(15), 'Last Updated');
    console.log('-'.repeat(70));

    for (const [name] of Object.entries(config.submodules)) {
      const metaPath = path.join(cacheDir, \`\${name}.meta.json\`);
      try {
        const metaContent = await fs.readFile(metaPath, 'utf-8');
        const meta = JSON.parse(metaContent);
        const lastUpdated = new Date(meta.generatedAt).toLocaleString();
        console.log(name.padEnd(30), 'Cached'.padEnd(15), lastUpdated);
      } catch {
        console.log(name.padEnd(30), 'Not cached'.padEnd(15), '-');
      }
    }
    return;
  }

  // Determine which submodules to process
  let submodulesToProcess = [];
  if (values.module) {
    if (!config.submodules[values.module]) {
      console.error(\`Error: Submodule "\${values.module}" not found in configuration\`);
      process.exit(1);
    }
    submodulesToProcess = [values.module];
  } else if (values.all || values['changed-only']) {
    submodulesToProcess = Object.keys(config.submodules);
  } else {
    console.error('Error: Specify --all, --changed-only, or --module=<name>');
    process.exit(1);
  }

  // Process submodules
  for (const name of submodulesToProcess) {
    if (values.invalidate) {
      const metaPath = path.join(cacheDir, \`\${name}.meta.json\`);
      const contentPath = path.join(cacheDir, \`\${name}.xml\`);
      await fs.rm(metaPath, { force: true });
      await fs.rm(contentPath, { force: true });
      console.log(\`Invalidated: \${name}\`);
    } else if (values.regenerate) {
      console.log(\`Regenerating: \${name} (use MCP tool for actual regeneration)\`);
      // Note: Actual regeneration happens through MCP tool
      // This script just manages cache files
    }
  }
}

main().catch(console.error);
`;
}
```
The generated repomix-cache.mjs script has a functional bug related to the --changed-only flag. The justfile recipes update-changed-caches and the post-commit git hook both use this flag, with comments indicating they will only process changed submodules. However, the generated script's logic (e.g., lines 320-321 in the template) treats --changed-only identically to --all, preparing to process all submodules regardless of their status. Furthermore, the script doesn't actually perform any update; it only logs a "Regenerating" message. This is misleading and renders the 'intelligent caching' aspect of the generated tooling non-functional. The script should implement proper change detection to fulfill its documented purpose.
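One way the generated script could implement genuine change detection is to record the commit each cache entry was generated from and compare it against the latest commit touching the submodule's path. A rough sketch of that idea (the `lastCommit` meta field and these helper names are assumptions, not part of the PR):

```ts
import { execFile } from 'node:child_process';
import { promisify } from 'node:util';

const execFileAsync = promisify(execFile);

// Latest commit that touched a submodule's path (assumes cwd is the repo root).
async function headCommitFor(submodulePath: string): Promise<string> {
  const { stdout } = await execFileAsync('git', ['log', '-1', '--format=%H', '--', submodulePath]);
  return stdout.trim();
}

// A cache entry is stale if it predates the current commit for its path;
// `meta.lastCommit` is a hypothetical field the cache writer would store.
async function isChanged(meta: { lastCommit?: string } | null, submodulePath: string): Promise<boolean> {
  if (!meta?.lastCommit) return true; // never cached -> treat as changed
  const current = await headCommitFor(submodulePath);
  return current !== meta.lastCommit;
}
```

With something like this, `--changed-only` could filter `submodulesToProcess` to entries where `isChanged(...)` resolves true, instead of mirroring `--all`.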
```ts
const content = await this.deps.fsReadFile(gitmodulesPath, 'utf-8');

// Parse [submodule "xxx"] path = xxx
const pathRegex = /path\s*=\s*(.+)/g;
```
The regex for parsing .gitmodules is a bit too simple and may not correctly handle all valid git-config formats, especially paths that contain spaces and are quoted. For example, if a path is defined as path = "my libs/lib a", this regex will capture the path with quotes included. A more robust parsing approach would improve compatibility.
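For comparison, a section-aware parse that also unquotes values might look like the following sketch (a simplification; full git-config syntax with escapes and includes is out of scope):

```ts
// Map each [submodule "name"] section to its (unquoted) path value.
function parseGitmodulePaths(content: string): Map<string, string> {
  const paths = new Map<string, string>();
  let currentSection: string | null = null;
  for (const rawLine of content.split('\n')) {
    const line = rawLine.trim();
    const section = line.match(/^\[submodule\s+"(.+)"\]$/);
    if (section) {
      currentSection = section[1];
      continue;
    }
    const kv = line.match(/^path\s*=\s*(.+)$/);
    if (kv && currentSection !== null) {
      let value = kv[1].trim();
      // Handle quoted paths such as: path = "my libs/lib a"
      if (value.length >= 2 && value.startsWith('"') && value.endsWith('"')) {
        value = value.slice(1, -1);
      }
      paths.set(currentSection, value);
    }
  }
  return paths;
}
```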
```ts
fileCount: content.fileCount,
tokenCount: content.tokenCount,
dependencies: subConfig.dependencies,
repomixVersion: '1.0.0', // TODO: Get from package.json
```
The repomixVersion is hardcoded as '1.0.0'. This should be retrieved dynamically from package.json to ensure cache integrity across different versions of repomix, as the TODO comment indicates.
You can achieve this by:
- Importing `getVersion` at the top of the file: `import { getVersion } from '../../core/file/packageJsonParse.js';`
- Fetching the version once before the loop inside the tool handler: `const repomixVersion = await getVersion();`
- Using this `repomixVersion` variable in the `cacheManager.set` call.
Hi @RyderFreeman4Logos, thank you for your interest in contributing to Repomix!

For significant feature additions like this, I'd appreciate if you could open an issue first to discuss the proposal before implementation. This helps us understand the use case and explore whether it aligns with Repomix's direction.

I'm curious about the specific problem this solves. What's the motivation for monorepo submodule support in the MCP server? Is there existing demand from users?

The scope is quite large, and I want to be mindful of the maintenance burden. I'd love to hear more about the use case before moving forward. Thanks for understanding!
Hi @yamadashy,

Thank you for taking the time to review this PR and for your thoughtful feedback. You're absolutely right that we should have opened an issue first to discuss the feature before implementation. I apologize for not following that process.

Context and Motivation

I'm working on a nested Rust monorepo (mandate-monorepo) with 13 workspace members across multiple sub-repositories (~59,000 lines of Rust code). The typical workflow involves using Claude Code with Repomix MCP for AI-assisted development. However, I ran into a practical problem:

The Problem: When using …

Current Workaround: Manually packing individual submodules (e.g., …)

The Solution

The monorepo submodule support provides: …

Token Savings Example: …

Use Cases

This feature is designed for: …

Maintenance Considerations

I completely understand your concern about maintenance burden. Here's what I've tried to address: …

However, I acknowledge this is a significant feature addition (2,760+ lines). If you feel the maintenance cost outweighs the benefit, I completely respect that decision.

User Demand

Regarding existing user demand: I don't have data on how many other users need this feature. This was built to solve my own workflow pain point. If there's insufficient community interest, it may not be worth the maintenance burden.

Next Steps

I'm happy to: …

Please let me know how you'd like to proceed. I genuinely appreciate the project and want to contribute in a way that helps rather than adds unwanted complexity. Thank you again for your time and for building such a valuable tool!

Best regards,
Summary

Checklist

- `npm run test`
- `npm run lint`

🤖 Generated with Claude Code