Commit f3d2a75

Add regexp/unicode-escape rule (#166)
1 parent d841adc commit f3d2a75

8 files changed: 299 additions, 0 deletions

README.md

Lines changed: 1 addition & 0 deletions
@@ -133,6 +133,7 @@ The rules with the following star :star: are included in the `plugin:regexp/reco
 | [regexp/prefer-unicode-codepoint-escapes](https://ota-meshi.github.io/eslint-plugin-regexp/rules/prefer-unicode-codepoint-escapes.html) | enforce use of unicode codepoint escapes | :wrench: |
 | [regexp/prefer-w](https://ota-meshi.github.io/eslint-plugin-regexp/rules/prefer-w.html) | enforce using `\w` | :star::wrench: |
 | [regexp/sort-flags](https://ota-meshi.github.io/eslint-plugin-regexp/rules/sort-flags.html) | require regex flags to be sorted | :wrench: |
+| [regexp/unicode-escape](https://ota-meshi.github.io/eslint-plugin-regexp/rules/unicode-escape.html) | enforce consistent usage of unicode escape or unicode codepoint escape | :wrench: |
 
 <!--RULES_TABLE_END-->
 <!--RULES_SECTION_END-->

docs/rules/README.md

Lines changed: 1 addition & 0 deletions
@@ -61,3 +61,4 @@ The rules with the following star :star: are included in the `plugin:regexp/reco
 | [regexp/prefer-unicode-codepoint-escapes](./prefer-unicode-codepoint-escapes.md) | enforce use of unicode codepoint escapes | :wrench: |
 | [regexp/prefer-w](./prefer-w.md) | enforce using `\w` | :star::wrench: |
 | [regexp/sort-flags](./sort-flags.md) | require regex flags to be sorted | :wrench: |
+| [regexp/unicode-escape](./unicode-escape.md) | enforce consistent usage of unicode escape or unicode codepoint escape | :wrench: |

docs/rules/hexadecimal-escape.md

Lines changed: 4 additions & 0 deletions
@@ -63,6 +63,10 @@ var foo = /\x0a/;
 
 </eslint-code-block>
 
+## :couple: Related rules
+
+- [regexp/unicode-escape](./unicode-escape.md)
+
 ## :mag: Implementation
 
 - [Rule source](https://github.com/ota-meshi/eslint-plugin-regexp/blob/master/lib/rules/hexadecimal-escape.ts)

docs/rules/prefer-unicode-codepoint-escapes.md

Lines changed: 8 additions & 0 deletions
@@ -15,6 +15,8 @@ since: "v0.3.0"
 
 This rule enforces the use of Unicode codepoint escapes instead of Unicode escapes using surrogate pairs.
 
+To enforce a consistent style (unicode escapes or unicode code point escapes) for characters that do not require surrogate pairs, use the [regexp/unicode-escape] rule.
+
 <eslint-code-block fix>
 
 ```js
@@ -34,6 +36,12 @@ var foo = /\ud83d\ude00/u
 
 Nothing.
 
+## :couple: Related rules
+
+- [regexp/unicode-escape]
+
+[regexp/unicode-escape]: ./unicode-escape.md
+
 ## :rocket: Version
 
 This rule was introduced in eslint-plugin-regexp v0.3.0

docs/rules/unicode-escape.md

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
---
pageClass: "rule-details"
sidebarDepth: 0
title: "regexp/unicode-escape"
description: "enforce consistent usage of unicode escape or unicode codepoint escape"
---
# regexp/unicode-escape

> enforce consistent usage of unicode escape or unicode codepoint escape

- :exclamation: <badge text="This rule has not been released yet." vertical="middle" type="error"> ***This rule has not been released yet.*** </badge>
- :wrench: The `--fix` option on the [command line](https://eslint.org/docs/user-guide/command-line-interface#fixing-problems) can automatically fix some of the problems reported by this rule.

## :book: Rule Details

This rule aims to enforce the consistent use of unicode escapes or unicode code point escapes.

This rule does not check characters that require surrogate pairs (e.g. `\ud83d\ude00`, `\u{1f600}`), nor patterns that do not have the `u` flag.

To enforce unicode code point escapes for characters that require a surrogate pair, use the [regexp/prefer-unicode-codepoint-escapes] rule.

<eslint-code-block fix>

```js
/* eslint regexp/unicode-escape: "error" */

/* ✓ GOOD */
var foo = /\u{41}/u;
var foo = /\u0041/; // does not have the `u` flag
var foo = /\ud83d\ude00/u; // surrogate pair

/* ✗ BAD */
var foo = /\u0041/u;
```

</eslint-code-block>
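
As a rough illustration of why patterns without the `u` flag are left alone, the sketch below builds the patterns with the `RegExp` constructor. It is not part of the plugin, and all variable names are illustrative. It assumes an engine that implements the Annex B regular expression grammar, such as Node.js or a browser: with `u`, `\u0041` and `\u{41}` both match `A`, while without `u`, `\u{41}` is read as the letter `u` followed by the quantifier `{41}`, so rewriting one form into the other would change what the pattern matches.

```typescript
// Illustrative only: these names are not part of eslint-plugin-regexp.
const unicodeEscapeWithU = new RegExp("^\\u0041$", "u")
const codePointEscapeWithU = new RegExp("^\\u{41}$", "u")
const codePointEscapeWithoutU = new RegExp("^\\u{41}$") // no `u` flag

// 41 repetitions of "u": joining 42 empty slots yields 41 separators.
const fortyOneUs = new Array(42).join("u")

const escapeDemo = {
    unicodeMatchesA: unicodeEscapeWithU.test("A"),
    codePointMatchesA: codePointEscapeWithU.test("A"),
    noFlagMatchesA: codePointEscapeWithoutU.test("A"),
    noFlagMatchesUs: codePointEscapeWithoutU.test(fortyOneUs),
}

console.log(escapeDemo)
```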

## :wrench: Options

```json5
{
  "regexp/unicode-escape": [
    "error",
    "unicodeCodePointEscape" // or "unicodeEscape"
  ]
}
```
48+
49+
- `"unicodeCodePointEscape"` ... Unicode escape characters must always use unicode code point escapes. This is default.
50+
- `"unicodeEscape"` ... Unicode code point escape characters must always use unicode escapes.
51+
52+
### `"unicodeEscape"`
53+
54+
<eslint-code-block fix>
55+
56+
```js
57+
/* eslint regexp/unicode-escape: ["error", "unicodeEscape"] */
58+
59+
/* ✓ GOOD */
60+
var foo = /\u0041/u;
61+
62+
/* ✗ BAD */
63+
var foo = /\u{41}/u;
64+
```
65+
66+
</eslint-code-block>
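
Characters at or above U+10000 are left alone by this rule because a single `\uXXXX` escape holds only four hexadecimal digits, so such a character needs two escapes (a surrogate pair). A minimal sketch of the standard UTF-16 surrogate pair computation follows; the helpers `hex4` and `toSurrogatePairEscape` are hypothetical names for this example and do not exist in the plugin.

```typescript
// Hypothetical helpers for illustration; not part of eslint-plugin-regexp.

/** Formats a 16-bit value as exactly four lowercase hex digits. */
function hex4(value: number): string {
    return ("0000" + value.toString(16)).slice(-4)
}

/** Returns the two `\uXXXX` escapes that encode a supplementary code point. */
function toSurrogatePairEscape(codePoint: number): string {
    if (codePoint < 0x10000 || codePoint > 0x10ffff) {
        throw new RangeError("expected a code point in U+10000..U+10FFFF")
    }
    const offset = codePoint - 0x10000
    const high = 0xd800 + (offset >> 10) // top 10 bits
    const low = 0xdc00 + (offset & 0x3ff) // bottom 10 bits
    return "\\u" + hex4(high) + "\\u" + hex4(low)
}

console.log(toSurrogatePairEscape(0x1f600)) // \ud83d\ude00
```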

## :couple: Related rules

- [regexp/hexadecimal-escape]
- [regexp/prefer-unicode-codepoint-escapes]
- [require-unicode-regexp]

[regexp/hexadecimal-escape]: ./hexadecimal-escape.md
[regexp/prefer-unicode-codepoint-escapes]: ./prefer-unicode-codepoint-escapes.md
[require-unicode-regexp]: https://eslint.org/docs/rules/require-unicode-regexp

## :mag: Implementation

- [Rule source](https://github.com/ota-meshi/eslint-plugin-regexp/blob/master/lib/rules/unicode-escape.ts)
- [Test source](https://github.com/ota-meshi/eslint-plugin-regexp/blob/master/tests/lib/rules/unicode-escape.ts)

lib/rules/unicode-escape.ts

Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
import type { RegExpVisitor } from "regexpp/visitor"
import type { Character } from "regexpp/ast"
import type { RegExpContext } from "../utils"
import {
    createRule,
    defineRegexpVisitor,
    getEscapeSequenceKind,
    EscapeSequenceKind,
} from "../utils"

export default createRule("unicode-escape", {
    meta: {
        docs: {
            description:
                "enforce consistent usage of unicode escape or unicode codepoint escape",
            recommended: false,
        },
        fixable: "code",
        schema: [
            {
                enum: ["unicodeCodePointEscape", "unicodeEscape"], // default unicodeCodePointEscape
            },
        ],
        messages: {
            expectedUnicodeCodePointEscape:
                "Expected unicode code point escape ('{{unicodeCodePointEscape}}'), but unicode escape ('{{unicodeEscape}}') is used.",
            expectedUnicodeEscape:
                "Expected unicode escape ('{{unicodeEscape}}'), but unicode code point escape ('{{unicodeCodePointEscape}}') is used.",
        },
        type: "suggestion", // "problem",
    },
    create(context) {
        const preferUnicodeCodePointEscape =
            context.options[0] !== "unicodeEscape"

        /**
         * Verify for unicodeCodePointEscape
         */
        function verifyForUnicodeCodePointEscape(
            { node, getRegexpLocation, fixReplaceNode }: RegExpContext,
            kind: EscapeSequenceKind,
            cNode: Character,
        ) {
            if (kind !== EscapeSequenceKind.unicode) {
                return
            }

            const unicodeCodePointEscape = `\\u{${cNode.value.toString(16)}}`

            context.report({
                node,
                loc: getRegexpLocation(cNode),
                messageId: "expectedUnicodeCodePointEscape",
                data: {
                    unicodeCodePointEscape,
                    unicodeEscape: cNode.raw,
                },
                fix: fixReplaceNode(cNode, unicodeCodePointEscape),
            })
        }

        /**
         * Verify for unicodeEscape
         */
        function verifyForUnicodeEscape(
            { node, getRegexpLocation, fixReplaceNode }: RegExpContext,
            kind: EscapeSequenceKind,
            cNode: Character,
        ) {
            if (kind !== EscapeSequenceKind.unicodeCodePoint) {
                return
            }
            const unicodeEscape = `\\u${cNode.value
                .toString(16)
                .padStart(4, "0")}`
            context.report({
                node,
                loc: getRegexpLocation(cNode),
                messageId: "expectedUnicodeEscape",
                data: {
                    unicodeEscape,
                    unicodeCodePointEscape: cNode.raw,
                },
                fix: fixReplaceNode(cNode, unicodeEscape),
            })
        }

        const verify = preferUnicodeCodePointEscape
            ? verifyForUnicodeCodePointEscape
            : verifyForUnicodeEscape

        /**
         * Create visitor
         */
        function createVisitor(
            regexpContext: RegExpContext,
        ): RegExpVisitor.Handlers {
            const { flags } = regexpContext
            if (!flags.unicode) {
                return {}
            }
            return {
                onCharacterEnter(cNode) {
                    if (cNode.value >= 0x10000) {
                        return
                    }
                    const kind = getEscapeSequenceKind(cNode.raw)
                    if (!kind) {
                        return
                    }

                    verify(regexpContext, kind, cNode)
                },
            }
        }

        return defineRegexpVisitor(context, {
            createVisitor,
        })
    },
})
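
The two replacement strings the rule builds can be tried in isolation. The sketch below mirrors that formatting as standalone functions; the function names are made up for this example, and `("0000" + hex).slice(-4)` stands in for the rule's `padStart(4, "0")` so the snippet does not depend on a newer library target. Like the rule, it only deals with values below `0x10000`.

```typescript
// Standalone sketch; the function names are hypothetical, not plugin exports.

/** Formats a code point as a unicode code point escape, e.g. 0x100 -> \u{100}. */
function formatCodePointEscape(value: number): string {
    return "\\u{" + value.toString(16) + "}"
}

/** Formats a BMP code point as a four-digit unicode escape, e.g. 0xff -> \u00ff. */
function formatUnicodeEscape(value: number): string {
    if (value < 0 || value > 0xffff) {
        throw new RangeError("a single \\uXXXX escape only covers U+0000..U+FFFF")
    }
    return "\\u" + ("0000" + value.toString(16)).slice(-4)
}

console.log(formatCodePointEscape(0x0100)) // \u{100}
console.log(formatUnicodeEscape(0xff)) // \u00ff
```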

lib/utils/rules.ts

Lines changed: 2 additions & 0 deletions
@@ -49,6 +49,7 @@ import preferT from "../rules/prefer-t"
 import preferUnicodeCodepointEscapes from "../rules/prefer-unicode-codepoint-escapes"
 import preferW from "../rules/prefer-w"
 import sortFlags from "../rules/sort-flags"
+import unicodeEscape from "../rules/unicode-escape"
 
 export const rules = [
     confusingQuantifier,
@@ -101,4 +102,5 @@ export const rules = [
     preferUnicodeCodepointEscapes,
     preferW,
     sortFlags,
+    unicodeEscape,
 ] as RuleModule[]

tests/lib/rules/unicode-escape.ts

Lines changed: 81 additions & 0 deletions
@@ -0,0 +1,81 @@
import { RuleTester } from "eslint"
import rule from "../../../lib/rules/unicode-escape"

const tester = new RuleTester({
    parserOptions: {
        ecmaVersion: 2020,
        sourceType: "module",
    },
})

tester.run("unicode-escape", rule as any, {
    valid: [
        String.raw`/a \x0a \cM \0 \u{ff} \u{100} \ud83d\ude00 \u{1f600}/u`,
        {
            code: String.raw`/a \x0a \cM \0 \u{ff} \u{100} \ud83d\ude00 \u{1f600}/u`,
            options: ["unicodeCodePointEscape"],
        },
        {
            code: String.raw`/a \x0a \cM \0 \u0100 \u00ff \ud83d\ude00 \u{1f600}/u`,
            options: ["unicodeEscape"],
        },

        // no u flag
        {
            code: String.raw`/a \x0a \cM \0 \u0100 \u00ff \ud83d\ude00 \u{1f600}/`,
            options: ["unicodeCodePointEscape"],
        },
    ],
    invalid: [
        {
            code: String.raw`/a \x0a \cM \0 \u0100 \u00ff \ud83d\ude00 \u{1f600}/u`,
            output: String.raw`/a \x0a \cM \0 \u{100} \u{ff} \ud83d\ude00 \u{1f600}/u`,
            errors: [
                {
                    message:
                        "Expected unicode code point escape ('\\u{100}'), but unicode escape ('\\u0100') is used.",
                    column: 16,
                },
                {
                    message:
                        "Expected unicode code point escape ('\\u{ff}'), but unicode escape ('\\u00ff') is used.",
                    column: 23,
                },
            ],
        },
        {
            code: String.raw`/a \x0a \cM \0 \u0100 \u00ff \ud83d\ude00 \u{1f600}/u`,
            output: String.raw`/a \x0a \cM \0 \u{100} \u{ff} \ud83d\ude00 \u{1f600}/u`,
            options: ["unicodeCodePointEscape"],
            errors: [
                {
                    message:
                        "Expected unicode code point escape ('\\u{100}'), but unicode escape ('\\u0100') is used.",
                    column: 16,
                },
                {
                    message:
                        "Expected unicode code point escape ('\\u{ff}'), but unicode escape ('\\u00ff') is used.",
                    column: 23,
                },
            ],
        },
        {
            code: String.raw`/a \x0a \cM \0 \u{ff} \u{100} \ud83d\ude00 \u{1f600}/u`,
            output: String.raw`/a \x0a \cM \0 \u00ff \u0100 \ud83d\ude00 \u{1f600}/u`,
            options: ["unicodeEscape"],
            errors: [
                {
                    message:
                        "Expected unicode escape ('\\u00ff'), but unicode code point escape ('\\u{ff}') is used.",
                    column: 16,
                },
                {
                    message:
                        "Expected unicode escape ('\\u0100'), but unicode code point escape ('\\u{100}') is used.",
                    column: 23,
                },
            ],
        },
    ],
})
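
The `column` values asserted in these tests are 1-based offsets of each reported escape within the single-line source. A small sketch, independent of ESLint, reproduces the two numbers for the first invalid case; the helper `columnOf` and the constant `testedSource` are hypothetical names used only here.

```typescript
// The pattern from the first invalid case, written with escaped backslashes
// instead of String.raw so the snippet has no template-literal dependency.
const testedSource =
    "/a \\x0a \\cM \\0 \\u0100 \\u00ff \\ud83d\\ude00 \\u{1f600}/u"

/** Returns the 1-based column at which `escape` first appears in `source`. */
function columnOf(source: string, escape: string): number {
    const index = source.indexOf(escape)
    if (index < 0) {
        throw new Error("escape not found: " + escape)
    }
    return index + 1
}

console.log(columnOf(testedSource, "\\u0100")) // 16
console.log(columnOf(testedSource, "\\u00ff")) // 23
```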
