Commit ce5b2f2

clarify the extended mode syntax
binary, octal, decimal, and hexadecimal notations needed examples to show how they are used; change the unicode example to not be the dagger (since the dagger already had a meaning to refer to the note for that section)
1 parent 4a9c10a commit ce5b2f2

1 file changed: +17 -11 lines changed

content/docs/searching.md

Lines changed: 17 additions & 11 deletions
@@ -364,18 +364,24 @@ If your caret is on word word2, "Find Next (Volatile)" will search the next word
 
 In extended mode, these escape sequences (a backslash followed by a single character and optional material) have special meaning, and will not be interpreted literally.
 
-* `\n`: the Line Feed control character LF (ASCII 0x0A)
+* `\n`: The Line Feed control character LF (ASCII 0x0A)
 * `\r`: The Carriage Return control character CR (ASCII 0x0D)
-* `\t`: the TAB control character (ASCII 0x09)
-* `\0`: the NUL control character (ASCII 0x00)
-* `\\`: the literal backslash character (ASCII 0x5C)
-* `\b`: the binary representation of a byte, made of 8 digits which are either 1's or 0's. †
-* `\o`: the octal representation of a byte, made of 3 digits in the 0-7 range
-* `\d`: the decimal representation of a byte, made of 3 digits in the 0-9 range
-* `\x`: the hexadecimal representation of a byte, made of 2 digits in the 0-9, A-F/a-f range.
-* `\u`: The hexadecimal representation of a two-byte character, made of 4 digits in the 0-9, A-F/a-f range. In Unicode builds, finds a Unicode character (for instance, `\u2020` matches the `†` char, in an UTF-8 encoded file). In ANSI builds, finds characters requiring two bytes, like in the Shift-JIS encoding. †
-
-†NOTE: While some of these Extended Search Mode escape sequences look like regular expression escape sequences, they are not identical. Ones marked with † are different from or not available in regular expressions.
+* `\t`: The TAB control character (ASCII 0x09)
+* `\0`: The NUL control character (ASCII 0x00) †
+* `\\`: The literal backslash character (ASCII 0x5C)
+* `\b`: The binary representation of a byte, made of 8 digits which are either 1's or 0's. †
+    - `\b00100000` will match the SPACE character (ASCII 32 is "00100000" in 8-bit binary)
+* `\o`: The octal representation of a byte, made of 3 digits in the 0-7 range. †
+    - `\o040` will match the SPACE character (ASCII 32 is "040" in 3-digit octal)
+* `\d`: The decimal representation of a byte, made of 3 digits in the 0-9 range. †
+    - `\d032` will match the SPACE character (ASCII 32 is "032" in 3-digit decimal)
+* `\x`: The hexadecimal representation of a byte, made of 2 digits in the 0-9, A-F/a-f range.
+    - `\x20` will match the SPACE character (ASCII 32 is "20" in 2-digit hexadecimal)
+* `\u`: The hexadecimal representation of a two-byte character, made of 4 digits in the 0-9, A-F/a-f range. †
+    - In Unicode builds, finds a Unicode character: for example, `\u263A` matches the `☺` char, in an UTF-8 encoded file.
+    - In ANSI builds, finds characters requiring two bytes, like in the Shift-JIS encoding.
+
+† NOTE: While some of these Extended Search Mode escape sequences look like regular expression escape sequences, they are not identical. Ones marked with † are different from or not available in regular expressions.
 
 ## Regular Expressions
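
To make the extended-mode escape sequences in the diff above concrete, here is a minimal Python sketch that expands them into the characters they denote. It is not Notepad++'s implementation: the name `decode_extended` and both lookup tables are hypothetical, and two behaviours are assumptions rather than documented facts, namely that malformed sequences are left as literal text and that `\b`, `\o`, `\d` and `\x` values above 255 are rejected because each is described as one byte.

```python
# Hypothetical helper for illustration only; not taken from Notepad++.
HEX_DIGITS = "0123456789abcdef"

# Escape letter -> (numeric base, required digit count), per the documented list.
NUMERIC = {"b": (2, 8), "o": (8, 3), "d": (10, 3), "x": (16, 2), "u": (16, 4)}

# Escape letter -> the single character it stands for.
SIMPLE = {"n": "\n", "r": "\r", "t": "\t", "0": "\0", "\\": "\\"}


def decode_extended(pattern: str) -> str:
    """Expand extended-mode escapes in `pattern`; leave anything else untouched."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "\\" and i + 1 < len(pattern):
            kind = pattern[i + 1]
            if kind in SIMPLE:
                out.append(SIMPLE[kind])
                i += 2
                continue
            if kind in NUMERIC:
                base, width = NUMERIC[kind]
                digits = pattern[i + 2 : i + 2 + width]
                allowed = HEX_DIGITS[:base]
                # Check the digits by hand: int() alone would also accept
                # signs, underscores and prefixes such as "0b".
                if len(digits) == width and all(c in allowed for c in digits.lower()):
                    value = int(digits, base)
                    # Assumption: single-byte escapes must fit in one byte.
                    if kind == "u" or value <= 0xFF:
                        out.append(chr(value))
                        i += 2 + width
                        continue
        # Not a recognised escape (assumption: keep it as literal text).
        out.append(pattern[i])
        i += 1
    return "".join(out)
```

Under these assumptions, `decode_extended(r"\b00100000")`, `decode_extended(r"\o040")`, `decode_extended(r"\d032")` and `decode_extended(r"\x20")` all return a single SPACE character, matching the four SPACE examples added in this commit.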
