Closed
59 changes: 59 additions & 0 deletions src/__tests__/command-mentions.spec.ts
@@ -280,6 +280,65 @@ npm install
}
})
})

it("should match commands with Chinese characters", () => {
const commandRegex =
/(?:^|\s)\/([a-zA-Z0-9_\.\-\u4e00-\u9fff\u3040-\u309f\u30a0-\u30ff\uac00-\ud7af]+)(?=\s|$)/g
Contributor Author

Is this intentional? These new Unicode tests are using a hardcoded regex pattern instead of importing the actual commandRegexGlobal from the source file. This could lead to tests passing even if the actual implementation differs.
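
A minimal sketch of what this comment suggests, assuming the spec can import `commandRegexGlobal` from `../shared/context-mentions` (the relative path is an assumption based on the file locations in this diff). The pattern is redeclared locally only so the snippet runs on its own, and `findCommands` is a hypothetical helper, not part of the codebase.

```typescript
// In the real spec this would be an import rather than a copy, e.g.
//   import { commandRegexGlobal } from "../shared/context-mentions"
// (path assumed from the diff). Redeclared here so the sketch runs standalone.
const commandRegexGlobal =
	/(?:^|\s)\/([a-zA-Z0-9_\.\-\u4e00-\u9fff\u3040-\u309f\u30a0-\u30ff\uac00-\ud7af]+)(?=\s|$)/g

// Hypothetical helper: returns the command names found in `text`.
function findCommands(text: string): string[] {
	// String.prototype.match resets lastIndex for global regexes,
	// so one shared instance can be reused across test cases.
	const matches = text.match(commandRegexGlobal) ?? []
	// A match may include the leading whitespace; strip it and the slash.
	return matches.map((m) => m.trim().slice(1))
}

const chineseCommands = findCommands("Use /中文 command")
const repeated = findCommands("/测试文件")
```

Sharing one global regex is safe with `match`, but `test()` and `exec()` advance `lastIndex` between calls, so tests that switch to those methods would need to reset it or build a fresh `RegExp` from `commandRegexGlobal.source`.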


const chinesePatterns = [
"/中文命令",
"/测试文件",
"/配置文件",
"/部署脚本",
"Use /中文 command",
"Run /测试 now",
]

chinesePatterns.forEach((pattern) => {
const matches = pattern.match(commandRegex)
expect(matches).toBeTruthy()
expect(matches?.length).toBeGreaterThan(0)
})
})

it("should match commands with Japanese characters", () => {
const commandRegex =
/(?:^|\s)\/([a-zA-Z0-9_\.\-\u4e00-\u9fff\u3040-\u309f\u30a0-\u30ff\uac00-\ud7af]+)(?=\s|$)/g
Contributor Author

Same concern here - hardcoding the regex pattern in tests rather than using the exported constant. Consider importing commandRegexGlobal to ensure tests match the actual implementation.


const japanesePatterns = ["/テスト", "/ファイル", "/デプロイ", "Use /日本語 command", "Run /テスト now"]

japanesePatterns.forEach((pattern) => {
const matches = pattern.match(commandRegex)
expect(matches).toBeTruthy()
expect(matches?.length).toBeGreaterThan(0)
})
})

it("should match commands with Korean characters", () => {
const commandRegex =
/(?:^|\s)\/([a-zA-Z0-9_\.\-\u4e00-\u9fff\u3040-\u309f\u30a0-\u30ff\uac00-\ud7af]+)(?=\s|$)/g
Contributor Author

Another hardcoded regex. All these test cases should ideally import and use the actual commandRegexGlobal to ensure consistency.


const koreanPatterns = ["/테스트", "/파일", "/배포", "Use /한국어 command", "Run /테스트 now"]

koreanPatterns.forEach((pattern) => {
const matches = pattern.match(commandRegex)
expect(matches).toBeTruthy()
expect(matches?.length).toBeGreaterThan(0)
})
})

it("should match commands with mixed characters", () => {
const commandRegex =
/(?:^|\s)\/([a-zA-Z0-9_\.\-\u4e00-\u9fff\u3040-\u309f\u30a0-\u30ff\uac00-\ud7af]+)(?=\s|$)/g
Contributor Author

Should we add edge case tests for commands with emoji, very long Unicode names, or Unicode combining characters? These edge cases could help ensure robustness.
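
As a rough illustration of those edge cases, the sketch below probes the pattern from this diff with an emoji, a decomposed (NFD) Hangul string, and a very long name. The pattern is copied locally so the snippet is self-contained; `cjkCommandRegex` and `hasCommand` are hypothetical names introduced for the example.

```typescript
// Copy of the pattern from this diff, kept local so the sketch runs standalone.
const cjkCommandRegex =
	/(?:^|\s)\/([a-zA-Z0-9_\.\-\u4e00-\u9fff\u3040-\u309f\u30a0-\u30ff\uac00-\ud7af]+)(?=\s|$)/g

// Hypothetical helper: true when the pattern finds at least one command.
function hasCommand(text: string): boolean {
	return text.match(cjkCommandRegex) !== null
}

// Emoji are outside every listed range, so the trailing lookahead fails.
const emojiMatches = hasCommand("/deploy🚀")

// Precomposed Hangul syllables fall inside U+AC00-U+D7AF and match...
const composedKorean = hasCommand("/한국어".normalize("NFC"))
// ...but NFD splits them into conjoining jamo (U+1100-U+11FF), which are not listed.
const decomposedKorean = hasCommand("/한국어".normalize("NFD"))

// The character class has no length limit, so a long name still matches.
const longName = hasCommand("/" + "测".repeat(500))
```

Under these assumptions the emoji and NFD inputs are rejected while the NFC and long-name inputs match, which suggests that input normalization is worth a dedicated test.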


const mixedPatterns = ["/test中文", "/deploy部署", "/file文件123", "/テストtest", "/한국어korean"]

mixedPatterns.forEach((pattern) => {
const matches = pattern.match(commandRegex)
expect(matches).toBeTruthy()
expect(matches?.length).toBe(1)
})
})
})

describe("command mention text transformation", () => {
4 changes: 3 additions & 1 deletion src/shared/context-mentions.ts
@@ -58,7 +58,9 @@ export const mentionRegex =
export const mentionRegexGlobal = new RegExp(mentionRegex.source, "g")

// Regex to match command mentions like /command-name anywhere in text
export const commandRegexGlobal = /(?:^|\s)\/([a-zA-Z0-9_\.-]+)(?=\s|$)/g
// Updated to support CJK characters: Chinese ideographs, Japanese kana, and Korean Hangul syllables.
export const commandRegexGlobal =
Contributor Author

Could we consider expanding Unicode support to include other scripts like Arabic (U+0600-U+06FF), Cyrillic (U+0400-U+04FF), Hebrew (U+0590-U+05FF), or Thai (U+0E00-U+0E7F)? Currently only CJK characters are supported.

/(?:^|\s)\/([a-zA-Z0-9_\.\-\u4e00-\u9fff\u3040-\u309f\u30a0-\u30ff\uac00-\ud7af]+)(?=\s|$)/g
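
One possible way to address the review comment about Arabic, Cyrillic, Hebrew, and Thai is to replace the explicit ranges with Unicode property escapes. The sketch below is a hypothetical alternative, not the pattern proposed in this diff; `unicodeCommandRegex` is an illustrative name, and the approach assumes a runtime that supports `\p{...}` with the `u` flag (ES2018 or later).

```typescript
// Hypothetical alternative, not the pattern proposed in this diff.
// Built with the RegExp constructor; the "u" flag enables \p{...} escapes.
// \p{L} = letters, \p{M} = combining marks, \p{N} = digits, in any script.
const unicodeCommandRegex = new RegExp("(?:^|\\s)/([\\p{L}\\p{M}\\p{N}_.\\-]+)(?=\\s|$)", "gu")

const samples = ["/привет", "/مرحبا", "/שלום", "/สวัสดี", "/中文命令", "/deploy-v1.2"]

// String.prototype.match resets lastIndex, so the shared global regex is safe here.
const allMatch = samples.every((s) => s.match(unicodeCommandRegex) !== null)

// Characters outside the class, such as "!", still block the match.
const rejectsPunctuation = "/bad!name".match(unicodeCommandRegex) === null
```

Including `\p{M}` matters for scripts such as Thai, where vowel signs and tone marks are combining characters rather than letters. The trade-off is a much broader accepted set than the explicit CJK ranges, which may or may not be desirable for command names.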

export interface MentionSuggestion {
type: "file" | "folder" | "git" | "problems"