Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
158 changes: 124 additions & 34 deletions src/core/ignore/RooIgnoreController.ts
Original file line number Diff line number Diff line change
Expand Up @@ -114,44 +114,134 @@ export class RooIgnoreController {
return undefined
}

// Split command into parts and get the base command
const parts = command.trim().split(/\s+/)
const baseCommand = parts[0].toLowerCase()

// Commands that read file contents
const fileReadingCommands = [
// Unix commands
"cat",
"less",
"more",
"head",
"tail",
"grep",
"awk",
"sed",
// PowerShell commands and aliases
"get-content",
"gc",
"type",
"select-string",
"sls",
// First, check for shell redirections and command substitutions that could read files
// These patterns can bypass simple command parsing
const dangerousPatterns = [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performance consideration: These regex patterns are evaluated for every command validation. Since validateCommand() might be called frequently, consider:

  1. Pre-compiling these patterns as class constants
  2. Using a single combined pattern where possible
  3. Short-circuiting on common non-matching cases

// Input redirection: < file, <file
/<\s*([^\s<>|;&]+)/g,
// Command substitution: $(cat file), `cat file`
/\$\([^)]*\b(cat|head|tail|less|more|grep|awk|sed|type|gc|get-content)\s+([^\s)]+)[^)]*\)/gi,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this regex approach robust enough for all shell escaping scenarios? Complex shell syntax like nested quotes, escaped characters, or multi-line commands might bypass these patterns. Consider using a proper shell parser library for more comprehensive detection, or document the known limitations.

/`[^`]*\b(cat|head|tail|less|more|grep|awk|sed|type|gc|get-content)\s+([^\s`]+)[^`]*`/gi,
// Process substitution: <(cat file)
/<\([^)]*\b(cat|head|tail|less|more|grep|awk|sed|type|gc|get-content)\s+([^\s)]+)[^)]*\)/gi,
// Here documents/strings that might reference files
/<<<?\s*([^\s<>|;&]+)/g,
]

if (fileReadingCommands.includes(baseCommand)) {
// Check each argument that could be a file path
for (let i = 1; i < parts.length; i++) {
const arg = parts[i]
// Skip command flags/options (both Unix and PowerShell style)
if (arg.startsWith("-") || arg.startsWith("/")) {
continue
for (const pattern of dangerousPatterns) {
const matches = command.matchAll(pattern)
for (const match of matches) {
// Get the potential file path from the match
// Different patterns have the file path at different indices
const potentialPaths = [match[1], match[2], match[3]].filter(Boolean)
for (const filePath of potentialPaths) {
if (filePath && !this.validateAccess(filePath)) {
return filePath
}
}
// Ignore PowerShell parameter names
if (arg.includes(":")) {
continue
}
}

// Check for piped commands that might expose file contents
// e.g., echo "$(cat file)" or echo `cat file`
const pipelineCommands = command.split(/[|;&]/).map((cmd) => cmd.trim())

for (const pipeCmd of pipelineCommands) {
// Split command into parts and get the base command
const parts = pipeCmd.split(/\s+/)
if (parts.length === 0) continue

const baseCommand = parts[0].toLowerCase()

// Commands that read file contents
const fileReadingCommands = [
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fileReadingCommands array is recreated for each pipeline command in the loop. Consider moving it outside the loop or making it a class constant to avoid repeated array creation.

// Unix commands
"cat",
"less",
"more",
"head",
"tail",
"grep",
"awk",
"sed",
"nl",
"tac",
"rev",
"cut",
"paste",
"sort",
"uniq",
"comm",
"diff",
"cmp",
"od",
"hexdump",
"xxd",
"strings",
"file",
// Additional Unix utilities
"zcat",
"zless",
"zmore",
"bzcat",
"xzcat",
"view",
// PowerShell commands and aliases
"get-content",
"gc",
"type",
"select-string",
"sls",
// Windows commands
"findstr",
"find",
"fc",
]

if (fileReadingCommands.includes(baseCommand)) {
// Check each argument that could be a file path
for (let i = 1; i < parts.length; i++) {
const arg = parts[i]
// Skip command flags/options (both Unix and PowerShell style)
if (arg.startsWith("-") || arg.startsWith("/")) {
continue
}
// Ignore PowerShell parameter names
if (arg.includes(":") && i > 0 && parts[i - 1].startsWith("-")) {
continue
}
// Skip empty arguments
if (!arg) {
continue
}
// Remove quotes if present
const cleanArg = arg.replace(/^["']|["']$/g, "")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The quote removal logic only handles quotes at the beginning and end. What about escaped quotes or mixed quote types within the string? Consider a more robust quote parsing approach or document this as a known limitation.

// Validate file access
if (!this.validateAccess(cleanArg)) {
return cleanArg
}
}
// Validate file access
if (!this.validateAccess(arg)) {
return arg
}

// Also check for commands that might read files indirectly
// e.g., xargs cat, find -exec cat, etc.
if (baseCommand === "xargs" || baseCommand === "find") {
// Look for file-reading commands in the arguments
const argsStr = parts.slice(1).join(" ")
for (const readCmd of fileReadingCommands) {
if (argsStr.includes(readCmd)) {
// Try to extract file paths from find patterns or xargs input
// This is complex, so we'll check common patterns
const filePatterns = argsStr.match(/(?:name|path)\s+["']?([^"'\s]+)["']?/gi)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The handling for xargs and find commands is limited. These commands can execute file operations in complex ways that might not be caught by the current pattern matching. Could we either enhance the detection or document these as potential bypass vectors that users should be aware of?

if (filePatterns) {
for (const pattern of filePatterns) {
const filePath = pattern.replace(/(?:name|path)\s+["']?([^"'\s]+)["']?/i, "$1")
if (!this.validateAccess(filePath)) {
return filePath
}
}
}
}
}
}
}
Expand Down
96 changes: 96 additions & 0 deletions src/core/ignore/__tests__/RooIgnoreController.spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -275,6 +275,100 @@ describe("RooIgnoreController", () => {
expect(controller.validateCommand("npm install")).toBeUndefined()
})

/**
* Tests validation of shell redirections and command substitutions
*/
it("should block shell redirections that read ignored files", () => {
// Input redirection
expect(controller.validateCommand("wc -l < node_modules/package.json")).toBe("node_modules/package.json")
expect(controller.validateCommand("sort < secrets/api-keys.json")).toBe("secrets/api-keys.json")
expect(controller.validateCommand("grep pattern <.git/config")).toBe(".git/config")

// Should allow non-ignored files
expect(controller.validateCommand("wc -l < README.md")).toBeUndefined()
})

it("should block command substitutions that read ignored files", () => {
// $() command substitution
expect(controller.validateCommand("echo $(cat node_modules/package.json)")).toBe(
"node_modules/package.json",
)
expect(controller.validateCommand("result=$(head secrets/api-keys.json)")).toBe("secrets/api-keys.json")

// Backtick command substitution
expect(controller.validateCommand("echo `cat .git/config`")).toBe(".git/config")
expect(controller.validateCommand("data=`tail error.log`")).toBe("error.log")

// Process substitution
expect(controller.validateCommand("diff <(cat node_modules/index.js) file2")).toBe("node_modules/index.js")

// Should allow non-ignored files
expect(controller.validateCommand("echo $(cat README.md)")).toBeUndefined()
})

it("should block piped commands that read ignored files", () => {
// Commands in pipelines
expect(controller.validateCommand("cat node_modules/package.json | grep version")).toBe(
"node_modules/package.json",
)
expect(controller.validateCommand("echo test | tee secrets/output.log; cat secrets/output.log")).toBe(
"secrets/output.log",
)
expect(controller.validateCommand("ls && head .git/config")).toBe(".git/config")

// Should allow non-ignored files in pipelines
expect(controller.validateCommand("cat README.md | grep title")).toBeUndefined()
})

it("should detect additional file reading commands", () => {
// Additional Unix utilities
expect(controller.validateCommand("nl node_modules/package.json")).toBe("node_modules/package.json")
expect(controller.validateCommand("tac .git/config")).toBe(".git/config")
expect(controller.validateCommand("strings secrets/binary.dat")).toBe("secrets/binary.dat")
expect(controller.validateCommand("hexdump error.log")).toBe("error.log")
expect(controller.validateCommand("od -c node_modules/index.js")).toBe("node_modules/index.js")

// Compressed file readers
expect(controller.validateCommand("zcat secrets/data.gz")).toBe("secrets/data.gz")
expect(controller.validateCommand("bzcat node_modules/archive.bz2")).toBe("node_modules/archive.bz2")

// File comparison utilities
expect(controller.validateCommand("diff .git/config file2")).toBe(".git/config")
expect(controller.validateCommand("cmp secrets/key1 secrets/key2")).toBe("secrets/key1")

// Windows/PowerShell commands
expect(controller.validateCommand("findstr pattern node_modules/package.json")).toBe(
"node_modules/package.json",
)
expect(controller.validateCommand("fc .git/config file2")).toBe(".git/config")
})

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good test coverage! Consider adding a few more edge cases:

  1. Commands with environment variables: cat $HOME/secrets/file
  2. Commands with glob patterns: cat secrets/*.json
  3. Commands with command aliases: alias mycat=cat; mycat secrets/file
  4. Multi-line commands with line continuations

it("should handle complex command patterns", () => {
// Commands with quotes
expect(controller.validateCommand('cat "node_modules/package.json"')).toBe("node_modules/package.json")
expect(controller.validateCommand("cat 'secrets/api-keys.json'")).toBe("secrets/api-keys.json")

// Multiple files in one command
expect(controller.validateCommand("cat README.md node_modules/index.js")).toBe("node_modules/index.js")

// Nested command substitutions
expect(controller.validateCommand("echo $(echo $(cat .git/config))")).toBe(".git/config")

// Mixed patterns
expect(controller.validateCommand("cat < node_modules/data.txt | grep pattern")).toBe(
"node_modules/data.txt",
)
})

it("should handle xargs and find commands that might read files", () => {
// xargs with file reading commands
expect(controller.validateCommand("find . -name '*.json' | xargs cat")).toBeUndefined() // Can't determine specific files
expect(controller.validateCommand("echo node_modules/package.json | xargs cat")).toBeUndefined() // File path is in stdin, not command

// find with -exec
expect(controller.validateCommand("find . -name 'package.json' -exec cat {} \\;")).toBeUndefined() // Can't determine specific files
})

/**
* Tests behavior when no .rooignore exists
*/
Expand All @@ -287,6 +381,8 @@ describe("RooIgnoreController", () => {
// All commands should be allowed
expect(emptyController.validateCommand("cat node_modules/package.json")).toBeUndefined()
expect(emptyController.validateCommand("grep pattern .git/config")).toBeUndefined()
expect(emptyController.validateCommand("echo $(cat secrets/file)")).toBeUndefined()
expect(emptyController.validateCommand("wc < .git/config")).toBeUndefined()
})
})

Expand Down
Loading