Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
81 changes: 72 additions & 9 deletions src/shared/__tests__/context-mentions.spec.ts
Original file line number Diff line number Diff line change
Expand Up @@ -3,52 +3,71 @@ import { mentionRegex, mentionRegexGlobal } from "../context-mentions"
describe("mentionRegex and mentionRegexGlobal", () => {
// Test cases for various mention types
const testCases = [
// Basic file paths
// Basic file paths at line start
{ input: "@/path/to/file.txt", expected: ["@/path/to/file.txt"] },
{ input: "@/file.js", expected: ["@/file.js"] },
{ input: "@/folder/", expected: ["@/folder/"] },

// File paths with escaped spaces
// File paths with escaped spaces at line start
{ input: "@/path/to/file\\ with\\ spaces.txt", expected: ["@/path/to/file\\ with\\ spaces.txt"] },
{ input: "@/users/my\\ project/report\\ final.pdf", expected: ["@/users/my\\ project/report\\ final.pdf"] },
{ input: "@/folder\\ with\\ spaces/", expected: ["@/folder\\ with\\ spaces/"] },
{ input: "@/a\\ b\\ c.txt", expected: ["@/a\\ b\\ c.txt"] },

// URLs
// URLs at line start
{ input: "@http://example.com", expected: ["@http://example.com"] },
{ input: "@https://example.com/path?query=1", expected: ["@https://example.com/path?query=1"] },

// Other mentions
// Other mentions at line start
{ input: "@problems", expected: ["@problems"] },
{ input: "@git-changes", expected: ["@git-changes"] },
{ input: "@terminal", expected: ["@terminal"] },
{ input: "@a1b2c3d", expected: ["@a1b2c3d"] }, // Git commit hash (short)
{ input: "@a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0", expected: ["@a1b2c3d4e5f6a7b8c9d0e1f2a3b4c5d6e7f8a9b0"] }, // Git commit hash (long)

// Mentions within text
// Mentions after whitespace (valid)
{
input: "Check file @/path/to/file\\ with\\ spaces.txt for details.",
expected: ["@/path/to/file\\ with\\ spaces.txt"],
},
{ input: "See @problems and @terminal output.", expected: ["@problems", "@terminal"] },
{ input: "URL: @https://example.com.", expected: ["@https://example.com"] }, // Trailing punctuation
{ input: "URL: @https://example.com.", expected: ["@https://example.com"] }, // After colon and space
{ input: "Commit @a1b2c3d, then check @/file.txt", expected: ["@a1b2c3d", "@/file.txt"] },

// NEW: Test cases for mentions mid-line without whitespace (should NOT match)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a test case for mentions after punctuation with a space? For example, 'Hello! @problems' should work but 'Hello!@problems' shouldn't. This would explicitly document the expected behavior for this edge case.

{ input: "error@https://example.com/path", expected: null }, // @ mid-word before URL
{ input: "log:@/var/log/system.log", expected: null }, // @ after colon without space
{ input: "email@problems", expected: null }, // @ mid-word before "problems"
{ input: "path=@/usr/local/bin", expected: null }, // @ after equals without space
{ input: "Error at line 42@terminal output", expected: null }, // @ mid-line before "terminal"
{ input: "commit@a1b2c3d", expected: null }, // @ mid-word before git hash

// NEW: Test cases for pasted logs (should NOT match)
{ input: "Failed to fetch@https://api.example.com/endpoint", expected: null },
{ input: "Error loading resource@/assets/image.png", expected: null },
{ input: "Stack trace:@/home/user/project/file.js:42", expected: null },

// NEW: Valid mentions with various whitespace
{ input: "Check\t@/path/to/file.txt", expected: ["@/path/to/file.txt"] }, // Tab before @
{ input: "Multiple @problems here", expected: ["@problems"] }, // Multiple spaces
{ input: "Newline\n@terminal output", expected: ["@terminal"] }, // After newline

// Negative cases (should not match or match partially)
{ input: "@/path/with unescaped space.txt", expected: ["@/path/with"] }, // Unescaped space
{ input: "@ /path/leading-space.txt", expected: null }, // Space after @
{ input: "[email protected]", expected: null }, // Email address
{ input: "mention@", expected: null }, // Trailing @
{ input: "@/path/trailing\\", expected: null }, // Trailing backslash (invalid escape)
{ input: "@/path/to/file\\not-a-space", expected: null }, // Backslash not followed by space

// Escaped mentions (should not match due to negative lookbehind)
{ input: "This is not a mention: \\@/path/to/file.txt", expected: null },
{ input: "Escaped \\@problems word", expected: null },
{ input: "Text with \\@https://example.com", expected: null },
{ input: "Another \\@a1b2c3d hash", expected: null },
{ input: "Not escaped @terminal", expected: ["@terminal"] }, // Ensure non-escaped still works nearby
{ input: "Double escape \\\\@/should/match", expected: null }, // Double backslash escapes the backslash, currently incorrectly fails to match
{ input: "Text with \\@/escaped/path\\ with\\ spaces.txt", expected: null }, // Escaped mention with escaped spaces within the path part
{ input: "Not escaped @terminal", expected: ["@terminal"] }, // After space, should work
{ input: "Double escape \\\\@/should/match", expected: null }, // Double backslash escapes the backslash
{ input: "Text with \\@/escaped/path\\ with\\ spaces.txt", expected: null }, // Escaped mention with escaped spaces
]
testCases.forEach(({ input, expected }) => {
it(`should handle input: "${input}"`, () => {
Expand Down Expand Up @@ -83,4 +102,48 @@ describe("mentionRegex and mentionRegexGlobal", () => {
expect(matches[0][1]).toBe("/path/to/escaped\\ file.txt") // Group 1 should not include '@'
expect(matches[1][1]).toBe("problems")
})

// NEW: Additional tests for the boundary restriction
describe("boundary restrictions", () => {
it("should match mentions at the start of a line", () => {
const input = "@/path/to/file.txt"
const match = input.match(mentionRegex)
expect(match).not.toBeNull()
expect(match?.[0]).toBe("@/path/to/file.txt")
})

it("should match mentions after whitespace", () => {
const input = "Check @/path/to/file.txt"
const match = input.match(mentionRegex)
expect(match).not.toBeNull()
expect(match?.[0]).toBe("@/path/to/file.txt")
})

it("should NOT match mentions mid-word or after non-whitespace", () => {
const testCases = ["error@https://example.com", "path:@/var/log", "email@problems", "42@terminal"]

testCases.forEach((input) => {
const match = input.match(mentionRegex)
expect(match).toBeNull()
})
})

it("should handle multiline text correctly", () => {
const input = `First line
@/path/on/newline.txt
Mid-line@/should/not/match.txt
After space @/should/match.txt`

const matches = Array.from(input.matchAll(mentionRegexGlobal))
expect(matches.length).toBe(2)
expect(matches[0][0]).toBe("@/path/on/newline.txt")
expect(matches[1][0]).toBe("@/should/match.txt")
})

it("should not match mentions in pasted log entries", () => {
const logEntry = "Error: Failed to load resource@https://api.example.com/data status:404"
const matches = Array.from(logEntry.matchAll(mentionRegexGlobal))
expect(matches.length).toBe(0)
})
})
})
25 changes: 16 additions & 9 deletions src/shared/context-mentions.ts
Original file line number Diff line number Diff line change
@@ -1,19 +1,25 @@
/*
Mention regex:
- **Purpose**:
- To identify and highlight specific mentions in text that start with '@'.
- **Purpose**:
- To identify and highlight specific mentions in text that start with '@'.
- These mentions can be file paths, URLs, or the exact word 'problems'.
- Ensures that trailing punctuation marks (like commas, periods, etc.) are not included in the match, allowing punctuation to follow the mention without being part of it.
- Restricts @ parsing to line-start or after whitespace to avoid accidental loading from pasted logs.

- **Regex Breakdown**:
- `/@`:
- `(?:^|\s)`:
- **Non-Capturing Group (`(?:...)`)**: Groups the alternatives without capturing them.
- **Line Start or Whitespace (`^|\s`)**: The @ must be at the start of a line or preceded by whitespace.

- `(?<!\\)@`:
- **Negative Lookbehind (`(?<!\\)`)**: Ensures the @ is not escaped with a backslash.
- **@**: The mention must start with the '@' symbol.

- `((?:\/|\w+:\/\/)[^\s]+?|problems\b|git-changes\b)`:
- **Capturing Group (`(...)`)**: Captures the part of the string that matches one of the specified patterns.
- `(?:\/|\w+:\/\/)`:
- `(?:\/|\w+:\/\/)`:
- **Non-Capturing Group (`(?:...)`)**: Groups the alternatives without capturing them for back-referencing.
- `\/`:
- `\/`:
- **Slash (`/`)**: Indicates that the mention is a file or folder path starting with a '/'.
- `|`: Logical OR.
- `\w+:\/\/`:
Expand All @@ -25,7 +31,7 @@ Mention regex:
- **Escaped Space (`\\ `)**: Matches a backslash followed by a space (an escaped space).
- **Non-Greedy (`+?`)**: Ensures the smallest possible match, preventing the inclusion of trailing punctuation.
- `|`: Logical OR.
- `problems\b`:
- `problems\b`:
- **Exact Word ('problems')**: Matches the exact word 'problems'.
- **Word Boundary (`\b`)**: Ensures that 'problems' is matched as a whole word and not as part of another word (e.g., 'problematic').
- `|`: Logical OR.
Expand All @@ -34,9 +40,9 @@ Mention regex:
- **Word Boundary (`\b`)**: Ensures that 'terminal' is matched as a whole word and not as part of another word (e.g., 'terminals').
- `(?=[.,;:!?]?(?=[\s\r\n]|$))`:
- **Positive Lookahead (`(?=...)`)**: Ensures that the match is followed by specific patterns without including them in the match.
- `[.,;:!?]?`:
- `[.,;:!?]?`:
- **Optional Punctuation (`[.,;:!?]?`)**: Matches zero or one of the specified punctuation marks.
- `(?=[\s\r\n]|$)`:
- `(?=[\s\r\n]|$)`:
- **Nested Positive Lookahead (`(?=[\s\r\n]|$)`)**: Ensures that the punctuation (if present) is followed by a whitespace character, a line break, or the end of the string.

- **Summary**:
Expand All @@ -48,13 +54,14 @@ Mention regex:
- The exact word 'git-changes'.
- The exact word 'terminal'.
- It ensures that any trailing punctuation marks (such as ',', '.', '!', etc.) are not included in the matched mention, allowing the punctuation to follow the mention naturally in the text.
- **NEW**: The @ symbol must be at the start of a line or preceded by whitespace to prevent accidental matches in pasted logs.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding examples of valid vs invalid patterns in the comments to make it immediately clear for future developers. For instance, showing that '@/path/to/file.txt' at line start is valid, 'Check @problems here' after space is valid, but 'error@https://api.com' mid-word is invalid.


- **Global Regex**:
- `mentionRegexGlobal`: Creates a global version of the `mentionRegex` to find all matches within a given string.

*/
export const mentionRegex =
/(?<!\\)@((?:\/|\w+:\/\/)(?:[^\s\\]|\\ )+?|[a-f0-9]{7,40}\b|problems\b|git-changes\b|terminal\b)(?=[.,;:!?]?(?=[\s\r\n]|$))/
/(?:^|(?<=\s))(?<!\\)@((?:\/|\w+:\/\/)(?:[^\s\\]|\\ )+?|[a-f0-9]{7,40}\b|problems\b|git-changes\b|terminal\b)(?=[.,;:!?]?(?=[\s\r\n]|$))/
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The regex pattern has become quite complex with the addition of (?:^|(?<=\s)). Have you considered adding a helper function that validates mention boundaries separately? This could improve readability and make the logic easier to maintain.

export const mentionRegexGlobal = new RegExp(mentionRegex.source, "g")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it intentional that the regex doesn't use the multiline flag? Without it, ^ only matches the absolute start of the string. Mentions at the beginning of lines after newlines currently work because \n counts as whitespace in (?<=\s), but this might be worth clarifying in the documentation.


// Regex to match command mentions like /command-name anywhere in text
Expand Down
Loading