Commit 694ff66

Replace unicode with Unicode all over the book
1 parent 2af5586 commit 694ff66

3 files changed, with 9 additions and 9 deletions

01-regexp-introduction/article.md

Lines changed: 1 addition & 1 deletion
@@ -56,7 +56,7 @@ There are only 6 of them in JavaScript:
 : Enables "dotall" mode, that allows a dot `pattern:.` to match newline character `\n` (covered in the chapter <info:regexp-character-classes>).
 
 `pattern:u`
-: Enables full unicode support. The flag enables correct processing of surrogate pairs. More about that in the chapter <info:regexp-unicode>.
+: Enables full Unicode support. The flag enables correct processing of surrogate pairs. More about that in the chapter <info:regexp-unicode>.
 
 `pattern:y`
 : "Sticky" mode: searching at the exact position in the text (covered in the chapter <info:regexp-sticky>)
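
For readers who want to see what the changed `pattern:u` line refers to, here is a minimal sketch that is not part of the commit. It assumes an engine that supports the `u` flag; the variable names are made up for illustration. Without the flag, a dot matches each half of a surrogate pair separately; with it, the pair is treated as one character.

```javascript
// Illustration only: variable names are hypothetical, not from the article.
const smile = "😄"; // U+1F604, stored as a surrogate pair (two 16-bit units)

// Without the u flag the dot matches each 16-bit half on its own.
const withoutU = smile.match(/./g);

// With the u flag the surrogate pair is matched as a single character.
const withU = smile.match(/./gu);

console.log(withoutU.length); // 2
console.log(withU.length);    // 1
```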

03-regexp-unicode/article.md

Lines changed: 4 additions & 4 deletions
@@ -4,9 +4,9 @@ JavaScript uses [Unicode encoding](https://en.wikipedia.org/wiki/Unicode) for st
 
 That range is not big enough to encode all possible characters, that's why some rare characters are encoded with 4 bytes, for instance like `𝒳` (mathematical X) or `😄` (a smile), some hieroglyphs and so on.
 
-Here are the unicode values of some characters:
+Here are the Unicode values of some characters:
 
-| Character | Unicode | Bytes count in unicode |
+| Character | Unicode | Bytes count in Unicode |
 |------------|---------|--------|
 | a | `0x0061` | 2 |
 | ≈ | `0x2248` | 2 |
@@ -121,7 +121,7 @@ alert("number: xAF".match(regexp)); // xAF
 
 Let's look for Chinese hieroglyphs.
 
-There's a unicode property `Script` (a writing system), that may have a value: `Cyrillic`, `Greek`, `Arabic`, `Han` (Chinese) and so on, [here's the full list](https://en.wikipedia.org/wiki/Script_(Unicode)).
+There's a Unicode property `Script` (a writing system), that may have a value: `Cyrillic`, `Greek`, `Arabic`, `Han` (Chinese) and so on, [here's the full list](https://en.wikipedia.org/wiki/Script_(Unicode)).
 
 To look for characters in a given writing system we should use `pattern:Script=<value>`, e.g. for Cyrillic letters: `pattern:\p{sc=Cyrillic}`, for Chinese hieroglyphs: `pattern:\p{sc=Han}`, and so on:
 
@@ -135,7 +135,7 @@ alert( str.match(regexp) ); // 你,好
 
 ### Example: currency
 
-Characters that denote a currency, such as `$`, `€`, `¥`, have unicode property `pattern:\p{Currency_Symbol}`, the short alias: `pattern:\p{Sc}`.
+Characters that denote a currency, such as `$`, `€`, `¥`, have Unicode property `pattern:\p{Currency_Symbol}`, the short alias: `pattern:\p{Sc}`.
 
 Let's use it to look for prices in the format "currency, followed by a digit":
 

08-regexp-character-sets-and-ranges/article.md

Lines changed: 4 additions & 4 deletions
@@ -57,16 +57,16 @@ For instance:
 
 - **\d** -- is the same as `pattern:[0-9]`,
 - **\w** -- is the same as `pattern:[a-zA-Z0-9_]`,
-- **\s** -- is the same as `pattern:[\t\n\v\f\r ]`, plus few other rare unicode space characters.
+- **\s** -- is the same as `pattern:[\t\n\v\f\r ]`, plus few other rare Unicode space characters.
 ```
 
 ### Example: multi-language \w
 
 As the character class `pattern:\w` is a shorthand for `pattern:[a-zA-Z0-9_]`, it can't find Chinese hieroglyphs, Cyrillic letters, etc.
 
-We can write a more universal pattern, that looks for wordly characters in any language. That's easy with unicode properties: `pattern:[\p{Alpha}\p{M}\p{Nd}\p{Pc}\p{Join_C}]`.
+We can write a more universal pattern, that looks for wordly characters in any language. That's easy with Unicode properties: `pattern:[\p{Alpha}\p{M}\p{Nd}\p{Pc}\p{Join_C}]`.
 
-Let's decipher it. Similar to `pattern:\w`, we're making a set of our own that includes characters with following unicode properties:
+Let's decipher it. Similar to `pattern:\w`, we're making a set of our own that includes characters with following Unicode properties:
 
 - `Alphabetic` (`Alpha`) - for letters,
 - `Mark` (`M`) - for accents,
@@ -85,7 +85,7 @@ let str = `Hi 你好 12`;
 alert( str.match(regexp) ); // H,i,你,好,1,2
 ```
 
-Of course, we can edit this pattern: add unicode properties or remove them. Unicode properties are covered in more details in the article <info:regexp-unicode>.
+Of course, we can edit this pattern: add Unicode properties or remove them. Unicode properties are covered in more details in the article <info:regexp-unicode>.
 
 ```warn header="Unicode properties aren't supported in Edge and Firefox"
 Unicode properties `pattern:p{…}` are not yet implemented in Edge and Firefox. If we really need them, we can use library [XRegExp](http://xregexp.com/).
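
Because the warning above says some engines lack `\p{…}`, a feature check can be useful before relying on it. The following is only a sketch and not part of the commit: `supportsUnicodeProperties` is a hypothetical helper name, and the fallback to plain `\w` is an assumption about what a page might want. It uses the `RegExp` constructor so that an unsupported pattern throws at run time, where it can be caught, instead of failing when the script is parsed.

```javascript
// Hypothetical helper: returns true if the engine accepts \p{…} with the u flag.
function supportsUnicodeProperties() {
  try {
    // An engine without Unicode property escapes throws a SyntaxError here.
    new RegExp("\\p{L}", "u");
    return true;
  } catch (err) {
    return false;
  }
}

// Use the multi-language pattern from the article when possible,
// otherwise fall back to the ASCII-only \w (an assumed fallback).
const wordly = supportsUnicodeProperties()
  ? new RegExp("[\\p{Alpha}\\p{M}\\p{Nd}\\p{Pc}\\p{Join_C}]", "gu")
  : /\w/g;

const found = "Hi 你好 12".match(wordly);
console.log(found);
```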
