Skip to content

Commit 5b9663e

Browse files
Use numbered list in regex section
Longer lists are best numbered, so it's to track and understand. Signed-off-by: David A. Wheeler <[email protected]>
1 parent a99e4f9 commit 5b9663e

File tree

1 file changed

+7
-7
lines changed

1 file changed

+7
-7
lines changed

secure_software_development_fundamentals.md

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1707,19 +1707,19 @@ There are three major families of regex language notations:
17071707

17081708
Here are some important things that vary:
17091709

1710-
* Sometimes there is an option or alternative method to match the entire input; if available, you can use that instead of the anchoring symbols. Make sure it matches the whole thing, though; some methods only check the beginning.
1710+
1. Sometimes there is an option or alternative method to match the entire input; if available, you can use that instead of the anchoring symbols. Make sure it matches the whole thing, though; some methods only check the beginning.
17111711

1712-
* Sometimes “**^**” matches the beginning of all the data, while in others it represents the beginning of any line in the data. This is often controlled by a *multiline* option.
1712+
2. Sometimes “**^**” matches the beginning of all the data, while in others it represents the beginning of any line in the data. This is often controlled by a *multiline* option.
17131713

1714-
* Sometimes “**$**” matches the end of all the data, while in others it represents the end of any line in the data. In some systems, an optional newline character (or similar) is also always accepted. In some systems you must use "**\z**" to match the end of the data, but in Python you must use "**\Z**".
1714+
3. Sometimes “**$**” matches the end of all the data, while in others it represents the end of any line in the data. In some systems, an optional newline character (or similar) is also always accepted. In some systems you must use "**\z**" to match the end of the data, but in Python you must use "**\Z**".
17151715

1716-
* The “**.**” for representing *“any character”* doesn’t always match the newline character (**\n**); often there is an option to turn this on or off.
1716+
4. The “**.**” for representing *“any character”* doesn’t always match the newline character (**\n**); often there is an option to turn this on or off.
17171717

1718-
* Does it properly support Unicode and the encoding you are using?
1718+
5. Some properly support Unicode and the encoding you are using; others do not.
17191719

1720-
* Can it handle data with the **NUL** character (byte value 0) within the data? If not, and your input data could have an embedded **NUL** character, you will need to validate the data first to make sure there are no **NUL** characters before passing the data to the regex implementation.
1720+
6. Some can handle data with the **NUL** character (byte value 0) within the data; others do not. If not, and your input data could have an embedded **NUL** character, you will need to validate the data first to make sure there are no **NUL** characters before passing the data to the regex implementation.
17211721

1722-
* Is matching case-sensitive? Usually it is case-sensitive by default, and there is a trivial way to make it case-insensitive. If it is case-insensitive, remember that exactly what characters have case-insensitive matches depends on the locale. For example, “**I**” and “**i**” match in the English (“**en**”) and the C locale (“**C**”), but not in the Turkish (“**tr**”). In the Turkish locale, the Unicode LATIN CAPITAL LETTER I matches the LATIN SMALL LETTER DOTLESS I - not a lowercase “**i**”.
1722+
7. Some do case-sensitive matching by default; others do not. Usually it is case-sensitive by default, and there is a trivial way to make it case-insensitive. If it is case-insensitive, remember that exactly what characters have case-insensitive matches depends on the locale. For example, “**I**” and “**i**” match in the English (“**en**”) and the C locale (“**C**”), but not in the Turkish (“**tr**”). In the Turkish locale, the Unicode LATIN CAPITAL LETTER I matches the LATIN SMALL LETTER DOTLESS I - not a lowercase “**i**”.
17231723

17241724
The following table shows how to create a regex pattern that matches an entire input string for some common platforms, as provided by [Correctly Using Regular Expressions for Secure Input Validation](https://best.openssf.org/Correctly-Using-Regular-Expressions). There's no need to memorize this; the point to understand is to make sure you use the correct symbols for the platform you're using:
17251725

0 commit comments

Comments
 (0)