Commit 213e00b

Proofread some code/docs
* `escapeable` should be replaced by `escapable`, but it is part of a pub fn
1 parent e7bd19d commit 213e00b

58 files changed: 147 additions, 147 deletions

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ on:
 # `schedule` event. By specifying any permission explicitly all others are set
 # to none. By using the principle of least privilege the damage a compromised
 # workflow can do (because of an injection or compromised third party tool or
-# action) is restricted. Currently the worklow doesn't need any additional
+# action) is restricted. Currently, the workflow doesn't need any additional
 # permission except for pulling the code. Adding labels to issues, commenting
 # on pull-requests, etc. may need additional permissions:
 #

CHANGELOG.md

Lines changed: 14 additions & 14 deletions
@@ -25,7 +25,7 @@ The new word boundary assertions are:
 * `\<` or `\b{start}`: a Unicode start-of-word boundary (`\W|\A` on the left,
 `\w` on the right).
 * `\>` or `\b{end}`: a Unicode end-of-word boundary (`\w` on the left, `\W|\z`
-on the right)).
+on the right).
 * `\b{start-half}`: half of a Unicode start-of-word boundary (`\W|\A` on the
 left).
 * `\b{end-half}`: half of a Unicode end-of-word boundary (`\W|\z` on the
@@ -139,7 +139,7 @@ Bug fixes:
 
 * [BUG #934](https://github.com/rust-lang/regex/issues/934):
 Fix a performance bug where high contention on a single regex led to massive
-slow downs.
+slow-downs.
 
 
 1.9.4 (2023-08-26)
@@ -382,14 +382,14 @@ New features:
 Permit many more characters to be escaped, even if they have no significance.
 More specifically, any ASCII character except for `[0-9A-Za-z<>]` can now be
 escaped. Also, a new routine, `is_escapeable_character`, has been added to
-`regex-syntax` to query whether a character is escapeable or not.
+`regex-syntax` to query whether a character is escapable or not.
 * [FEATURE #547](https://github.com/rust-lang/regex/issues/547):
 Add `Regex::captures_at`. This fills a hole in the API, but doesn't otherwise
 introduce any new expressive power.
 * [FEATURE #595](https://github.com/rust-lang/regex/issues/595):
 Capture group names are now Unicode-aware. They can now begin with either a `_`
 or any "alphabetic" codepoint. After the first codepoint, subsequent codepoints
-can be any sequence of alpha-numeric codepoints, along with `_`, `.`, `[` and
+can be any sequence of alphanumeric codepoints, along with `_`, `.`, `[` and
 `]`. Note that replacement syntax has not changed.
 * [FEATURE #810](https://github.com/rust-lang/regex/issues/810):
 Add `Match::is_empty` and `Match::len` APIs.
@@ -433,7 +433,7 @@ Fix a number of issues with printing `Hir` values as regex patterns.
 * [BUG #610](https://github.com/rust-lang/regex/issues/610):
 Add explicit example of `foo|bar` in the regex syntax docs.
 * [BUG #625](https://github.com/rust-lang/regex/issues/625):
-Clarify that `SetMatches::len` does not (regretably) refer to the number of
+Clarify that `SetMatches::len` does not (regrettably) refer to the number of
 matches in the set.
 * [BUG #660](https://github.com/rust-lang/regex/issues/660):
 Clarify "verbose mode" in regex syntax documentation.
@@ -820,7 +820,7 @@ Bug fixes:
 
 1.3.1 (2019-09-04)
 ==================
-This is a maintenance release with no changes in order to try to work-around
+This is a maintenance release with no changes in order to try to work around
 a [docs.rs/Cargo issue](https://github.com/rust-lang/docs.rs/issues/400).
 
 
@@ -855,15 +855,15 @@ This release does a bit of house cleaning. Namely:
 Rust project.
 * Teddy has been removed from the `regex` crate, and is now part of the
 `aho-corasick` crate.
-[See `aho-corasick`'s new `packed` sub-module for details](https://docs.rs/aho-corasick/0.7.6/aho_corasick/packed/index.html).
+[See `aho-corasick`'s new `packed` submodule for details](https://docs.rs/aho-corasick/0.7.6/aho_corasick/packed/index.html).
 * The `utf8-ranges` crate has been deprecated, with its functionality moving
 into the
 [`utf8` sub-module of `regex-syntax`](https://docs.rs/regex-syntax/0.6.11/regex_syntax/utf8/index.html).
 * The `ucd-util` dependency has been dropped, in favor of implementing what
 little we need inside of `regex-syntax` itself.
 
 In general, this is part of an ongoing (long term) effort to make optimizations
-in the regex engine easier to reason about. The current code is too convoluted
+in the regex engine easier to reason about. The current code is too convoluted,
 and thus it is very easy to introduce new bugs. This simplification effort is
 the primary motivation behind re-working the `aho-corasick` crate to not only
 bundle algorithms like Teddy, but to also provide regex-like match semantics
@@ -1065,7 +1065,7 @@ need or want to use these APIs.
 New features:
 
 * [FEATURE #493](https://github.com/rust-lang/regex/pull/493):
-Add a few lower level APIs for amortizing allocation and more fine grained
+Add a few lower level APIs for amortizing allocation and more fine-grained
 searching.
 
 Bug fixes:
@@ -1111,7 +1111,7 @@ of the regex library should be able to migrate to 1.0 by simply bumping the
 version number. The important changes are as follows:
 
 * We adopt Rust 1.20 as the new minimum supported version of Rust for regex.
-We also tentativley adopt a policy that permits bumping the minimum supported
+We also tentatively adopt a policy that permits bumping the minimum supported
 version of Rust in minor version releases of regex, but no patch releases.
 That is, with respect to semver, we do not strictly consider bumping the
 minimum version of Rust to be a breaking change, but adopt a conservative
@@ -1198,7 +1198,7 @@ Bug fixes:
 
 0.2.8 (2018-03-12)
 ==================
-Bug gixes:
+Bug fixes:
 
 * [BUG #454](https://github.com/rust-lang/regex/pull/454):
 Fix a bug in the nest limit checker being too aggressive.
@@ -1219,7 +1219,7 @@ New features:
 * Full support for intersection, difference and symmetric difference of
 character classes. These can be used via the `&&`, `--` and `~~` binary
 operators within classes.
-* A Unicode Level 1 conformat implementation of `\p{..}` character classes.
+* A Unicode Level 1 conformant implementation of `\p{..}` character classes.
 Things like `\p{scx:Hira}`, `\p{age:3.2}` or `\p{Changes_When_Casefolded}`
 now work. All property name and value aliases are supported, and properties
 are selected via loose matching. e.g., `\p{Greek}` is the same as
@@ -1342,7 +1342,7 @@ Bug fixes:
 0.2.1
 =====
 One major bug with `replace_all` has been fixed along with a couple of other
-touchups.
+touch-ups.
 
 * [BUG #312](https://github.com/rust-lang/regex/issues/312):
 Fix documentation for `NoExpand` to reference correct lifetime parameter.
@@ -1491,7 +1491,7 @@ A number of bugs have been fixed:
 * Fix bug #277.
 * [PR #270](https://github.com/rust-lang/regex/pull/270):
 Fixes bugs #264, #268 and an unreported where the DFA cache size could be
-drastically under estimated in some cases (leading to high unexpected memory
+drastically underestimated in some cases (leading to high unexpected memory
 usage).
 
 0.1.73

README.md

Lines changed: 3 additions & 3 deletions
@@ -171,7 +171,7 @@ assert!(matches.matched(6));
 ### Usage: regex internals as a library
 
 The [`regex-automata` directory](./regex-automata/) contains a crate that
-exposes all of the internal matching engines used by the `regex` crate. The
+exposes all the internal matching engines used by the `regex` crate. The
 idea is that the `regex` crate exposes a simple API for 99% of use cases, but
 `regex-automata` exposes oodles of customizable behaviors.
 
@@ -192,7 +192,7 @@ recommended for general use.
 
 ### Crate features
 
-This crate comes with several features that permit tweaking the trade off
+This crate comes with several features that permit tweaking the trade-off
 between binary size, compilation time and runtime performance. Users of this
 crate can selectively disable Unicode tables, or choose from a variety of
 optimizations performed by this crate to disable.
@@ -230,7 +230,7 @@ searches are "fast" in practice.
 
 While the first interpretation is pretty unambiguous, the second one remains
 nebulous. While nebulous, it guides this crate's architecture and the sorts of
-the trade offs it makes. For example, here are some general architectural
+the trade-offs it makes. For example, here are some general architectural
 statements that follow as a result of the goal to be "fast":
 
 * When given the choice between faster regex searches and faster _Rust compile

UNICODE.md

Lines changed: 5 additions & 5 deletions
@@ -207,21 +207,21 @@ Finally, Unicode word boundaries can be disabled, which will cause ASCII word
 boundaries to be used instead. That is, `\b` is a Unicode word boundary while
 `(?-u)\b` is an ASCII-only word boundary. This can occasionally be beneficial
 if performance is important, since the implementation of Unicode word
-boundaries is currently sub-optimal on non-ASCII text.
+boundaries is currently suboptimal on non-ASCII text.
 
 
 ## RL1.5 Simple Loose Matches
 
 [UTS#18 RL1.5](https://unicode.org/reports/tr18/#Simple_Loose_Matches)
 
-The regex crate provides full support for case insensitive matching in
+The regex crate provides full support for case-insensitive matching in
 accordance with RL1.5. That is, it uses the "simple" case folding mapping. The
 "simple" mapping was chosen because of a key convenient property: every
 "simple" mapping is a mapping from exactly one code point to exactly one other
-code point. This makes case insensitive matching of character classes, for
+code point. This makes case-insensitive matching of character classes, for
 example, straight-forward to implement.
 
-When case insensitive mode is enabled (e.g., `(?i)[a]` is equivalent to `a|A`),
+When case-insensitive mode is enabled (e.g., `(?i)[a]` is equivalent to `a|A`),
 then all characters classes are case folded as well.
 
 
@@ -248,7 +248,7 @@ Given Rust's strong ties to UTF-8, the following guarantees are also provided:
 * All matches are reported on valid UTF-8 code unit boundaries. That is, any
 match range returned by the public regex API is guaranteed to successfully
 slice the string that was searched.
-* By consequence of the above, it is impossible to match surrogode code points.
+* By consequence of the above, it is impossible to match surrogate code points.
 No support for UTF-16 is provided, so this is never necessary.
 
 Note that when Unicode mode is disabled, the fundamental atom of matching is
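
The RL1.5 text in the first UNICODE.md hunk says the "simple" mapping is convenient because each entry maps one code point to one code point, so a character class can be closed under case by adding single characters. The sketch below illustrates that idea only; it is not the crate's implementation. It assumes the standard library's `char::to_lowercase` and `char::to_uppercase` (case conversion, not Unicode case folding) as a stand-in, and the helper names `one_to_one` and `fold_class` are hypothetical.

```rust
use std::collections::BTreeSet;

/// Returns the single code point produced by a case-mapping iterator, or
/// `None` when the mapping expands to more than one code point.
/// (Hypothetical helper, for illustration only.)
fn one_to_one(mut it: impl Iterator<Item = char>) -> Option<char> {
    match (it.next(), it.next()) {
        (Some(c), None) => Some(c),
        _ => None,
    }
}

/// Closes a character class under one-to-one lower/upper case mappings.
/// Mappings that expand to several code points are skipped, because they
/// cannot be expressed as extra members of a class of single characters.
/// This only approximates "simple" case folding: std exposes case
/// conversion, not the Unicode case folding tables.
fn fold_class(class: &[char]) -> Vec<char> {
    let mut set = BTreeSet::new();
    for &c in class {
        set.insert(c);
        if let Some(lower) = one_to_one(c.to_lowercase()) {
            set.insert(lower);
        }
        if let Some(upper) = one_to_one(c.to_uppercase()) {
            set.insert(upper);
        }
    }
    // BTreeSet iterates in code point order, so the result is sorted.
    set.into_iter().collect()
}
```

Under these assumptions, `fold_class(&['a'])` yields the class `{A, a}`, mirroring the `(?i)[a]` example, while a character such as `ß`, whose uppercase conversion is the two-character string `SS`, gains no extra class member.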

record/compile-test/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 This directory contains the results of compilation tests. Specifically,
-the results are from testing both the from scratch compilation time and
+the results are from testing both the from-scratch compilation time and
 relative binary size increases of various features for both the `regex` and
 `regex-automata` crates.
 
55

regex-automata/README.md

Lines changed: 2 additions & 2 deletions
@@ -66,7 +66,7 @@ Below is an outline of how `unsafe` is used in this crate.
 
 * `util::pool::Pool` makes use of `unsafe` to implement a fast path for
 accessing an element of the pool. The fast path applies to the first thread
-that uses the pool. In effect, the fast path is fast because it avoid a mutex
+that uses the pool. In effect, the fast path is fast because it avoids a mutex
 lock. `unsafe` is also used in the no-std version of `Pool` to implement a spin
 lock for synchronization.
 * `util::lazy::Lazy` uses `unsafe` to implement a variant of
@@ -112,6 +112,6 @@ In the end, I do still somewhat consider this crate an experiment. It is
 unclear whether the strong boundaries between components will be an impediment
 to ongoing development or not. De-coupling tends to lead to slower development
 in my experience, and when you mix in the added cost of not introducing
-breaking changes all of the time, things can get quite complicated. But, I
+breaking changes all the time, things can get quite complicated. But, I
 don't think anyone has ever release the internals of a regex engine as a
 library before. So it will be interesting to see how it plays out!

regex-automata/src/dfa/automaton.rs

Lines changed: 1 addition & 1 deletion
@@ -2202,7 +2202,7 @@ where
 ///
 /// Specifically, this tries to succinctly distinguish the different types of
 /// states: dead states, quit states, accelerated states, start states and
-/// match states. It even accounts for the possible overlappings of different
+/// match states. It even accounts for the possible overlapping of different
 /// state types.
 pub(crate) fn fmt_state_indicator<A: Automaton>(
 f: &mut core::fmt::Formatter<'_>,

regex-automata/src/dfa/dense.rs

Lines changed: 4 additions & 4 deletions
@@ -2810,7 +2810,7 @@ impl OwnedDFA {
 }
 
 // Collect all our non-DEAD start states into a convenient set and
-// confirm there is no overlap with match states. In the classicl DFA
+// confirm there is no overlap with match states. In the classical DFA
 // construction, start states can be match states. But because of
 // look-around, we delay all matches by a byte, which prevents start
 // states from being match states.
@@ -3461,7 +3461,7 @@ impl TransitionTable<Vec<u32>> {
 // Normally, to get a fresh state identifier, we would just
 // take the index of the next state added to the transition
 // table. However, we actually perform an optimization here
-// that premultiplies state IDs by the stride, such that they
+// that pre-multiplies state IDs by the stride, such that they
 // point immediately at the beginning of their transitions in
 // the transition table. This avoids an extra multiplication
 // instruction for state lookup at search time.
@@ -4509,7 +4509,7 @@ impl<T: AsRef<[u32]>> MatchStates<T> {
 + (self.pattern_ids().len() * PatternID::SIZE)
 }
 
-/// Valides that the match state info is itself internally consistent and
+/// Validates that the match state info is itself internally consistent and
 /// consistent with the recorded match state region in the given DFA.
 fn validate(&self, dfa: &DFA<T>) -> Result<(), DeserializeError> {
 if self.len() != dfa.special.match_len(dfa.stride()) {
@@ -4767,7 +4767,7 @@ impl<'a, T: AsRef<[u32]>> Iterator for StateIter<'a, T> {
 
 /// An immutable representation of a single DFA state.
 ///
-/// `'a` correspondings to the lifetime of a DFA's transition table.
+/// `'a` corresponding to the lifetime of a DFA's transition table.
 pub(crate) struct State<'a> {
 id: StateID,
 stride2: usize,
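
The comment touched in the `TransitionTable` hunk of this file describes state IDs that are pre-multiplied by the stride, so that an ID is already the offset of the state's row in the transition table. The toy table below is a minimal sketch of that idea under simplifying assumptions (a fixed alphabet of four symbols, `usize` IDs); the names `STRIDE`, `Table`, and `run` are hypothetical and do not reflect the crate's actual types.

```rust
/// Number of symbols in the toy alphabet (hypothetical; symbols must be
/// in the range 0..STRIDE).
const STRIDE: usize = 4;

/// A toy row-major transition table. Every stored entry is a premultiplied
/// state ID, i.e. the offset of the target state's row in `transitions`.
struct Table {
    transitions: Vec<usize>,
}

impl Table {
    fn new() -> Table {
        Table { transitions: Vec::new() }
    }

    /// Adds a state whose transitions all point at offset 0 and returns
    /// its premultiplied ID. The ID is the current table length, which
    /// equals (state index * STRIDE) without an explicit multiplication.
    fn add_state(&mut self) -> usize {
        let id = self.transitions.len();
        self.transitions.extend(std::iter::repeat(0).take(STRIDE));
        id
    }

    /// Sets the transition from `from` on `symbol` to `to`. Both state
    /// arguments are premultiplied IDs.
    fn set(&mut self, from: usize, symbol: usize, to: usize) {
        self.transitions[from + symbol] = to;
    }

    /// Looks up the next state with one addition and one load; no
    /// multiplication by the stride is needed at search time.
    fn next(&self, state: usize, symbol: usize) -> usize {
        self.transitions[state + symbol]
    }
}

/// Runs the table over `input`, starting from the premultiplied ID `start`,
/// and returns the premultiplied ID of the final state.
fn run(table: &Table, start: usize, input: &[usize]) -> usize {
    input.iter().fold(start, |state, &symbol| table.next(state, symbol))
}
```

With three states added in order, the returned IDs are `0`, `4`, and `8`. A table that stored plain indices would instead compute `index * STRIDE + symbol` on every step.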

regex-automata/src/dfa/determinize.rs

Lines changed: 1 addition & 1 deletion
@@ -466,7 +466,7 @@ impl<'a> Runner<'a> {
 ) -> Result<(StateID, bool), BuildError> {
 // Compute the look-behind assertions that are true in this starting
 // configuration, and the determine the epsilon closure. While
-// computing the epsilon closure, we only follow condiional epsilon
+// computing the epsilon closure, we only follow conditional epsilon
 // transitions that satisfy the look-behind assertions in 'look_have'.
 let mut builder_matches = self.get_state_builder().into_matches();
 util::determinize::set_lookbehind_from_start(

regex-automata/src/dfa/mod.rs

Lines changed: 1 addition & 1 deletion
@@ -271,7 +271,7 @@ memory.) Conversely, compiling the same regex without Unicode support, e.g.,
 `(?-u)\w{50}`, takes under 1 millisecond and about 15KB of memory. For this
 reason, you should only use Unicode character classes if you absolutely need
 them! (They are enabled by default though.)
-* This module does not support Unicode word boundaries. ASCII word bondaries
+* This module does not support Unicode word boundaries. ASCII word boundaries
 may be used though by disabling Unicode or selectively doing so in the syntax,
 e.g., `(?-u:\b)`. There is also an option to
 [heuristically enable Unicode word boundaries](crate::dfa::dense::Config::unicode_word_boundary),
