fix: add Unicode support for file path mentions in slash commands
- Added Unicode flag (u) to mentionRegex and mentionRegexGlobal to properly match Unicode characters
- Added comprehensive tests for various Unicode scripts (Chinese, Japanese, Korean, Arabic, Cyrillic, etc.)
- Updated documentation to clarify Unicode support in file paths
- Fixes #7240
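As a rough illustration of what the `u` flag changes for a character class like the `[^\s\\]` used in the mention regex, here is a minimal sketch. It tests a single character in isolation and is not the project's code; the variable names are hypothetical. Under the `u` flag the engine steps through the input by code point instead of by UTF-16 code unit, so the difference shows up for characters outside the Basic Multilingual Plane.

```typescript
// The same class the comment documents, anchored to exactly one character.
const oneCharSource = /^[^\s\\]$/.source;

const withoutU = new RegExp(oneCharSource); // matches by UTF-16 code unit
const withU = new RegExp(oneCharSource, "u"); // matches by code point

const bmpChar = "中"; // U+4E2D, a single UTF-16 code unit
const astralChar = "\uD842\uDFB7"; // U+20BB7, stored as a surrogate pair

const bmpWithoutU = withoutU.test(bmpChar); // true
const bmpWithU = withU.test(bmpChar); // true
const astralWithoutU = withoutU.test(astralChar); // false: two code units, not one
const astralWithU = withU.test(astralChar); // true: one code point

console.log({ bmpWithoutU, bmpWithU, astralWithoutU, astralWithU });
```

Characters such as Chinese, Japanese kana, Korean, Arabic, and Cyrillic letters mostly sit in the Basic Multilingual Plane, so they behave like `bmpChar` here.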
src/shared/context-mentions.ts
Lines changed: 16 additions & 15 deletions
@@ -1,31 +1,31 @@
  /*
  Mention regex:
- - **Purpose**:
- - To identify and highlight specific mentions in text that start with '@'.
- - These mentions can be file paths, URLs, or the exact word 'problems'.
+ - **Purpose**:
+ - To identify and highlight specific mentions in text that start with '@'.
+ - These mentions can be file paths (including Unicode characters), URLs, or specific keywords.
  - Ensures that trailing punctuation marks (like commas, periods, etc.) are not included in the match, allowing punctuation to follow the mention without being part of it.

  - **Regex Breakdown**:
- - `/@`:
+ - `/@`:
  - **@**: The mention must start with the '@' symbol.
  - **Capturing Group (`(...)`)**: Captures the part of the string that matches one of the specified patterns.
- - `(?:\/|\w+:\/\/)`:
+ - `(?:\/|\w+:\/\/)`:
  - **Non-Capturing Group (`(?:...)`)**: Groups the alternatives without capturing them for back-referencing.
- - `\/`:
+ - `\/`:
  - **Slash (`/`)**: Indicates that the mention is a file or folder path starting with a '/'.
  - `|`: Logical OR.
  - `\w+:\/\/`:
  - **Protocol (`\w+://`)**: Matches URLs that start with a word character sequence followed by '://', such as 'http://', 'https://', 'ftp://', etc.
  - `(?:[^\s\\]|\\ )+?`:
  - **Non-Capturing Group (`(?:...)`)**: Groups the alternatives without capturing them.
- - **Non-Whitespace and Non-Backslash (`[^\s\\]`)**: Matches any character that is not whitespace or a backslash.
+ - **Non-Whitespace and Non-Backslash (`[^\s\\]`)**: Matches any character that is not whitespace or a backslash, including Unicode characters.
  - **OR (`|`)**: Logical OR.
  - **Escaped Space (`\\ `)**: Matches a backslash followed by a space (an escaped space).
  - **Non-Greedy (`+?`)**: Ensures the smallest possible match, preventing the inclusion of trailing punctuation.
  - `|`: Logical OR.
- - `problems\b`:
+ - `problems\b`:
  - **Exact Word ('problems')**: Matches the exact word 'problems'.
  - **Word Boundary (`\b`)**: Ensures that 'problems' is matched as a whole word and not as part of another word (e.g., 'problematic').
  - `|`: Logical OR.
@@ -34,28 +34,29 @@ Mention regex:
  - **Word Boundary (`\b`)**: Ensures that 'terminal' is matched as a whole word and not as part of another word (e.g., 'terminals').
  - `(?=[.,;:!?]?(?=[\s\r\n]|$))`:
  - **Positive Lookahead (`(?=...)`)**: Ensures that the match is followed by specific patterns without including them in the match.
- - `[.,;:!?]?`:
+ - `[.,;:!?]?`:
  - **Optional Punctuation (`[.,;:!?]?`)**: Matches zero or one of the specified punctuation marks.
- - `(?=[\s\r\n]|$)`:
+ - `(?=[\s\r\n]|$)`:
  - **Nested Positive Lookahead (`(?=[\s\r\n]|$)`)**: Ensures that the punctuation (if present) is followed by a whitespace character, a line break, or the end of the string.

  - **Summary**:
  - The regex effectively matches:
- - Mentions that are file or folder paths starting with '/' and containing any non-whitespace characters (including periods within the path).
- - File paths can include spaces if they are escaped with a backslash (e.g., `@/path/to/file\ with\ spaces.txt`).
+ - Mentions that are file or folder paths starting with '/' and containing any non-whitespace characters, including Unicode characters like Chinese, Japanese, Korean, etc.
+ - File paths can include spaces if they are escaped with a backslash (e.g., `@/path/to/file\ with\ spaces.txt` or `@/路径/中文文件.txt`).
  - URLs that start with a protocol (like 'http://') followed by any non-whitespace characters (including query parameters).
  - The exact word 'problems'.
  - The exact word 'git-changes'.
  - The exact word 'terminal'.
  - It ensures that any trailing punctuation marks (such as ',', '.', '!', etc.) are not included in the matched mention, allowing the punctuation to follow the mention naturally in the text.
+ - The 'u' flag enables full Unicode support, allowing the regex to properly match Unicode characters in file paths.

  - **Global Regex**:
- - `mentionRegexGlobal`: Creates a global version of the `mentionRegex` to find all matches within a given string.
+ - `mentionRegexGlobal`: Creates a global version of the `mentionRegex` with Unicode support to find all matches within a given string.
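
To make the comment's breakdown concrete, here is a minimal sketch that assembles the pieces it describes into one pattern and runs it with the `u` and `gu` flags. The pattern is reconstructed from the comment only, so it is an assumption and may differ from the actual definition in `src/shared/context-mentions.ts`, which is not part of this diff. The names `mentionRegexSketch`, `mentionRegexGlobalSketch`, and `findMentions` are hypothetical.

```typescript
// Pattern reconstructed from the comment's breakdown (an assumption; the
// file's real definition is not shown in this diff).
const sketchSource =
  /@((?:\/|\w+:\/\/)(?:[^\s\\]|\\ )+?|problems\b|git-changes\b|terminal\b)(?=[.,;:!?]?(?=[\s\r\n]|$))/
    .source;

// Hypothetical stand-ins for mentionRegex and mentionRegexGlobal.
const mentionRegexSketch = new RegExp(sketchSource, "u");
const mentionRegexGlobalSketch = new RegExp(sketchSource, "gu");

// Collect the captured mention text (without the leading '@') for every match.
function findMentions(text: string): string[] {
  const found: string[] = [];
  mentionRegexGlobalSketch.lastIndex = 0; // global regexes keep state between calls
  let match: RegExpExecArray | null;
  while ((match = mentionRegexGlobalSketch.exec(text)) !== null) {
    const captured = match[1];
    if (captured !== undefined) {
      found.push(captured);
    }
  }
  return found;
}

// Trailing ',' and '.' stay outside the match because of the lookahead.
const unicodeMentions = findMentions("Open @/路径/中文文件.txt, then check @problems.");
// unicodeMentions: ["/路径/中文文件.txt", "problems"]

// An escaped space keeps the path in one piece; the trailing '!' is excluded.
const escapedSpaceMentions = findMentions("See @/my\\ docs/メモ.md!");
// escapedSpaceMentions: ["/my\\ docs/メモ.md"] (one backslash in the actual string)

// The non-global sketch returns only the first mention.
const firstMatch = mentionRegexSketch.exec("Compare @/a.ts and @/b.ts");
const firstMention = firstMatch ? firstMatch[1] : undefined; // "/a.ts"

console.log(unicodeMentions, escapedSpaceMentions, firstMention);
```

The sketch builds both regexes from one source string so the single-match and global variants cannot drift apart, and it resets `lastIndex` before each scan because a global `RegExp` object carries its position between `exec` calls.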