Commit 929f371

Proofread some code/docs
* `escapeable` should be replaced by `escapable`, but it is part of a pub fn
1 parent ab88aa5 commit 929f371

58 files changed: 147 additions and 147 deletions.

.github/workflows/ci.yml

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ on:
 # `schedule` event. By specifying any permission explicitly all others are set
 # to none. By using the principle of least privilege the damage a compromised
 # workflow can do (because of an injection or compromised third party tool or
-# action) is restricted. Currently the worklow doesn't need any additional
+# action) is restricted. Currently, the workflow doesn't need any additional
 # permission except for pulling the code. Adding labels to issues, commenting
 # on pull-requests, etc. may need additional permissions:
 #

CHANGELOG.md

Lines changed: 14 additions & 14 deletions
@@ -81,7 +81,7 @@ The new word boundary assertions are:
 * `\<` or `\b{start}`: a Unicode start-of-word boundary (`\W|\A` on the left,
 `\w` on the right).
 * `\>` or `\b{end}`: a Unicode end-of-word boundary (`\w` on the left, `\W|\z`
-on the right)).
+on the right).
 * `\b{start-half}`: half of a Unicode start-of-word boundary (`\W|\A` on the
 left).
 * `\b{end-half}`: half of a Unicode end-of-word boundary (`\W|\z` on the
@@ -195,7 +195,7 @@ Bug fixes:
 
 * [BUG #934](https://github.com/rust-lang/regex/issues/934):
 Fix a performance bug where high contention on a single regex led to massive
-slow downs.
+slow-downs.
 
 
 1.9.4 (2023-08-26)
@@ -438,14 +438,14 @@ New features:
 Permit many more characters to be escaped, even if they have no significance.
 More specifically, any ASCII character except for `[0-9A-Za-z<>]` can now be
 escaped. Also, a new routine, `is_escapeable_character`, has been added to
-`regex-syntax` to query whether a character is escapeable or not.
+`regex-syntax` to query whether a character is escapable or not.
 * [FEATURE #547](https://github.com/rust-lang/regex/issues/547):
 Add `Regex::captures_at`. This fills a hole in the API, but doesn't otherwise
 introduce any new expressive power.
 * [FEATURE #595](https://github.com/rust-lang/regex/issues/595):
 Capture group names are now Unicode-aware. They can now begin with either a `_`
 or any "alphabetic" codepoint. After the first codepoint, subsequent codepoints
-can be any sequence of alpha-numeric codepoints, along with `_`, `.`, `[` and
+can be any sequence of alphanumeric codepoints, along with `_`, `.`, `[` and
 `]`. Note that replacement syntax has not changed.
 * [FEATURE #810](https://github.com/rust-lang/regex/issues/810):
 Add `Match::is_empty` and `Match::len` APIs.
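The two rules quoted in the hunk above (which ASCII characters may be escaped, and which codepoints a capture group name may contain) can be sketched with the standard library alone. This is a minimal illustration, not the `regex-syntax` implementation: both function names are hypothetical, and `char::is_alphabetic` / `char::is_alphanumeric` are assumed to be a close enough stand-in for the crate's own notion of "alphabetic" and "alphanumeric" codepoints.

```rust
/// Hypothetical helper (not part of `regex-syntax`): true if `c` may be
/// escaped under the changelog's rule, i.e. any ASCII character except
/// `[0-9A-Za-z<>]`. Non-ASCII characters are treated as not escapable here.
fn is_escapable_ascii(c: char) -> bool {
    c.is_ascii() && !c.is_ascii_alphanumeric() && c != '<' && c != '>'
}

/// Hypothetical helper: true if `name` fits the capture group naming rule
/// quoted in the changelog.
fn is_valid_capture_name(name: &str) -> bool {
    let mut chars = name.chars();
    match chars.next() {
        // The first codepoint must be `_` or alphabetic.
        Some(first) if first == '_' || first.is_alphabetic() => {}
        // Empty names and any other first codepoint are rejected.
        _ => return false,
    }
    // Every later codepoint must be alphanumeric or one of `_`, `.`, `[`, `]`.
    chars.all(|c| c.is_alphanumeric() || matches!(c, '_' | '.' | '[' | ']'))
}
```

Under these assumptions, `%` is escapable while `a`, `7` and `<` are not, and a name such as `_a.b[0]` is accepted while `1abc` is rejected.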
@@ -489,7 +489,7 @@ Fix a number of issues with printing `Hir` values as regex patterns.
 * [BUG #610](https://github.com/rust-lang/regex/issues/610):
 Add explicit example of `foo|bar` in the regex syntax docs.
 * [BUG #625](https://github.com/rust-lang/regex/issues/625):
-Clarify that `SetMatches::len` does not (regretably) refer to the number of
+Clarify that `SetMatches::len` does not (regrettably) refer to the number of
 matches in the set.
 * [BUG #660](https://github.com/rust-lang/regex/issues/660):
 Clarify "verbose mode" in regex syntax documentation.
@@ -876,7 +876,7 @@ Bug fixes:
 
 1.3.1 (2019-09-04)
 ==================
-This is a maintenance release with no changes in order to try to work-around
+This is a maintenance release with no changes in order to try to work around
 a [docs.rs/Cargo issue](https://github.com/rust-lang/docs.rs/issues/400).
 
 
@@ -911,15 +911,15 @@ This release does a bit of house cleaning. Namely:
 Rust project.
 * Teddy has been removed from the `regex` crate, and is now part of the
 `aho-corasick` crate.
-[See `aho-corasick`'s new `packed` sub-module for details](https://docs.rs/aho-corasick/0.7.6/aho_corasick/packed/index.html).
+[See `aho-corasick`'s new `packed` submodule for details](https://docs.rs/aho-corasick/0.7.6/aho_corasick/packed/index.html).
 * The `utf8-ranges` crate has been deprecated, with its functionality moving
 into the
 [`utf8` sub-module of `regex-syntax`](https://docs.rs/regex-syntax/0.6.11/regex_syntax/utf8/index.html).
 * The `ucd-util` dependency has been dropped, in favor of implementing what
 little we need inside of `regex-syntax` itself.
 
 In general, this is part of an ongoing (long term) effort to make optimizations
-in the regex engine easier to reason about. The current code is too convoluted
+in the regex engine easier to reason about. The current code is too convoluted,
 and thus it is very easy to introduce new bugs. This simplification effort is
 the primary motivation behind re-working the `aho-corasick` crate to not only
 bundle algorithms like Teddy, but to also provide regex-like match semantics
@@ -1121,7 +1121,7 @@ need or want to use these APIs.
 New features:
 
 * [FEATURE #493](https://github.com/rust-lang/regex/pull/493):
-Add a few lower level APIs for amortizing allocation and more fine grained
+Add a few lower level APIs for amortizing allocation and more fine-grained
 searching.
 
 Bug fixes:
@@ -1167,7 +1167,7 @@ of the regex library should be able to migrate to 1.0 by simply bumping the
 version number. The important changes are as follows:
 
 * We adopt Rust 1.20 as the new minimum supported version of Rust for regex.
-We also tentativley adopt a policy that permits bumping the minimum supported
+We also tentatively adopt a policy that permits bumping the minimum supported
 version of Rust in minor version releases of regex, but no patch releases.
 That is, with respect to semver, we do not strictly consider bumping the
 minimum version of Rust to be a breaking change, but adopt a conservative
@@ -1254,7 +1254,7 @@ Bug fixes:
 
 0.2.8 (2018-03-12)
 ==================
-Bug gixes:
+Bug fixes:
 
 * [BUG #454](https://github.com/rust-lang/regex/pull/454):
 Fix a bug in the nest limit checker being too aggressive.
@@ -1275,7 +1275,7 @@ New features:
 * Full support for intersection, difference and symmetric difference of
 character classes. These can be used via the `&&`, `--` and `~~` binary
 operators within classes.
-* A Unicode Level 1 conformat implementation of `\p{..}` character classes.
+* A Unicode Level 1 conformant implementation of `\p{..}` character classes.
 Things like `\p{scx:Hira}`, `\p{age:3.2}` or `\p{Changes_When_Casefolded}`
 now work. All property name and value aliases are supported, and properties
 are selected via loose matching. e.g., `\p{Greek}` is the same as
@@ -1398,7 +1398,7 @@ Bug fixes:
 0.2.1
 =====
 One major bug with `replace_all` has been fixed along with a couple of other
-touchups.
+touch-ups.
 
 * [BUG #312](https://github.com/rust-lang/regex/issues/312):
 Fix documentation for `NoExpand` to reference correct lifetime parameter.
@@ -1547,7 +1547,7 @@ A number of bugs have been fixed:
 * Fix bug #277.
 * [PR #270](https://github.com/rust-lang/regex/pull/270):
 Fixes bugs #264, #268 and an unreported where the DFA cache size could be
-drastically under estimated in some cases (leading to high unexpected memory
+drastically underestimated in some cases (leading to high unexpected memory
 usage).
 
 0.1.73

README.md

Lines changed: 3 additions & 3 deletions
@@ -171,7 +171,7 @@ assert!(matches.matched(6));
 ### Usage: regex internals as a library
 
 The [`regex-automata` directory](./regex-automata/) contains a crate that
-exposes all of the internal matching engines used by the `regex` crate. The
+exposes all the internal matching engines used by the `regex` crate. The
 idea is that the `regex` crate exposes a simple API for 99% of use cases, but
 `regex-automata` exposes oodles of customizable behaviors.
 
@@ -192,7 +192,7 @@ recommended for general use.
 
 ### Crate features
 
-This crate comes with several features that permit tweaking the trade off
+This crate comes with several features that permit tweaking the trade-off
 between binary size, compilation time and runtime performance. Users of this
 crate can selectively disable Unicode tables, or choose from a variety of
 optimizations performed by this crate to disable.
@@ -230,7 +230,7 @@ searches are "fast" in practice.
 
 While the first interpretation is pretty unambiguous, the second one remains
 nebulous. While nebulous, it guides this crate's architecture and the sorts of
-the trade offs it makes. For example, here are some general architectural
+the trade-offs it makes. For example, here are some general architectural
 statements that follow as a result of the goal to be "fast":
 
 * When given the choice between faster regex searches and faster _Rust compile

UNICODE.md

Lines changed: 5 additions & 5 deletions
@@ -207,21 +207,21 @@ Finally, Unicode word boundaries can be disabled, which will cause ASCII word
 boundaries to be used instead. That is, `\b` is a Unicode word boundary while
 `(?-u)\b` is an ASCII-only word boundary. This can occasionally be beneficial
 if performance is important, since the implementation of Unicode word
-boundaries is currently sub-optimal on non-ASCII text.
+boundaries is currently suboptimal on non-ASCII text.
 
 
 ## RL1.5 Simple Loose Matches
 
 [UTS#18 RL1.5](https://unicode.org/reports/tr18/#Simple_Loose_Matches)
 
-The regex crate provides full support for case insensitive matching in
+The regex crate provides full support for case-insensitive matching in
 accordance with RL1.5. That is, it uses the "simple" case folding mapping. The
 "simple" mapping was chosen because of a key convenient property: every
 "simple" mapping is a mapping from exactly one code point to exactly one other
-code point. This makes case insensitive matching of character classes, for
+code point. This makes case-insensitive matching of character classes, for
 example, straight-forward to implement.
 
-When case insensitive mode is enabled (e.g., `(?i)[a]` is equivalent to `a|A`),
+When case-insensitive mode is enabled (e.g., `(?i)[a]` is equivalent to `a|A`),
 then all characters classes are case folded as well.
 
 
@@ -248,7 +248,7 @@ Given Rust's strong ties to UTF-8, the following guarantees are also provided:
 * All matches are reported on valid UTF-8 code unit boundaries. That is, any
 match range returned by the public regex API is guaranteed to successfully
 slice the string that was searched.
-* By consequence of the above, it is impossible to match surrogode code points.
+* By consequence of the above, it is impossible to match surrogate code points.
 No support for UTF-16 is provided, so this is never necessary.
 
 Note that when Unicode mode is disabled, the fundamental atom of matching is
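The boundary guarantee in the hunk above says that any reported match range can be used to slice the searched string. A minimal sketch of what that means in terms of the standard library follows. It does not use the `regex` crate; the helper name `slice_on_boundaries` is hypothetical, and the offsets are supplied by hand and do not come from a real search.

```rust
/// Hypothetical helper: returns the slice for `start..end` only when both
/// offsets fall on UTF-8 character boundaries of `haystack`. Slicing a `&str`
/// at any other offset would panic.
fn slice_on_boundaries(haystack: &str, start: usize, end: usize) -> Option<&str> {
    if start <= end
        && haystack.is_char_boundary(start)
        && haystack.is_char_boundary(end)
    {
        Some(&haystack[start..end])
    } else {
        // Out-of-range offsets also land here, because `is_char_boundary`
        // returns false for any index greater than the string length.
        None
    }
}
```

With the haystack `"aé b"`, where `é` occupies bytes 1 and 2, the range `1..3` yields `"é"` while `2..3` starts inside the encoded character and yields `None`. A range that satisfies the documented guarantee would always fall in the first case.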

record/compile-test/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 This directory contains the results of compilation tests. Specifically,
-the results are from testing both the from scratch compilation time and
+the results are from testing both the from-scratch compilation time and
 relative binary size increases of various features for both the `regex` and
 `regex-automata` crates.
 

regex-automata/README.md

Lines changed: 2 additions & 2 deletions
@@ -66,7 +66,7 @@ Below is an outline of how `unsafe` is used in this crate.
 
 * `util::pool::Pool` makes use of `unsafe` to implement a fast path for
 accessing an element of the pool. The fast path applies to the first thread
-that uses the pool. In effect, the fast path is fast because it avoid a mutex
+that uses the pool. In effect, the fast path is fast because it avoids a mutex
 lock. `unsafe` is also used in the no-std version of `Pool` to implement a spin
 lock for synchronization.
 * `util::lazy::Lazy` uses `unsafe` to implement a variant of
@@ -112,6 +112,6 @@ In the end, I do still somewhat consider this crate an experiment. It is
 unclear whether the strong boundaries between components will be an impediment
 to ongoing development or not. De-coupling tends to lead to slower development
 in my experience, and when you mix in the added cost of not introducing
-breaking changes all of the time, things can get quite complicated. But, I
+breaking changes all the time, things can get quite complicated. But, I
 don't think anyone has ever release the internals of a regex engine as a
 library before. So it will be interesting to see how it plays out!

regex-automata/src/dfa/automaton.rs

Lines changed: 1 addition & 1 deletion
@@ -2202,7 +2202,7 @@ where
 ///
 /// Specifically, this tries to succinctly distinguish the different types of
 /// states: dead states, quit states, accelerated states, start states and
-/// match states. It even accounts for the possible overlappings of different
+/// match states. It even accounts for the possible overlapping of different
 /// state types.
 pub(crate) fn fmt_state_indicator<A: Automaton>(
 f: &mut core::fmt::Formatter<'_>,

regex-automata/src/dfa/dense.rs

Lines changed: 4 additions & 4 deletions
@@ -2810,7 +2810,7 @@ impl OwnedDFA {
 }
 
 // Collect all our non-DEAD start states into a convenient set and
-// confirm there is no overlap with match states. In the classicl DFA
+// confirm there is no overlap with match states. In the classical DFA
 // construction, start states can be match states. But because of
 // look-around, we delay all matches by a byte, which prevents start
 // states from being match states.
@@ -3461,7 +3461,7 @@ impl TransitionTable<Vec<u32>> {
 // Normally, to get a fresh state identifier, we would just
 // take the index of the next state added to the transition
 // table. However, we actually perform an optimization here
-// that premultiplies state IDs by the stride, such that they
+// that pre-multiplies state IDs by the stride, such that they
 // point immediately at the beginning of their transitions in
 // the transition table. This avoids an extra multiplication
 // instruction for state lookup at search time.
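The comment in the hunk above describes premultiplying state IDs by the stride so that an ID is already an offset into the transition table. The toy sketch below illustrates the idea with plain `usize` values. It is not taken from `regex-automata`: the function names, the flat `Vec<usize>` layout, and the use of state 0 as the start state are all assumptions made for illustration.

```rust
/// Converts a table whose entries are state indices into one whose entries
/// are premultiplied IDs (index * stride). `stride` is the number of
/// transitions stored per state.
fn premultiply(indexed: &[usize], stride: usize) -> Vec<usize> {
    indexed.iter().map(|&state| state * stride).collect()
}

/// Walks a table of state indices. Each step multiplies by the stride to find
/// the row of the current state. Returns the final state index.
fn run_indexed(table: &[usize], stride: usize, input: &[usize]) -> usize {
    let mut state = 0; // assumed start state
    for &class in input {
        state = table[state * stride + class];
    }
    state
}

/// Walks a table of premultiplied IDs. The ID already points at the start of
/// the state's row, so each step is a single addition. Returns the final ID.
fn run_premultiplied(table: &[usize], input: &[usize]) -> usize {
    let mut id = 0; // assumed start state; index 0 premultiplies to 0
    for &class in input {
        id = table[id + class];
    }
    id
}
```

For a two-state table `[0, 1, 1, 0]` with a stride of 2, the premultiplied table is `[0, 2, 2, 0]`, and both walks reach the same state: the premultiplied walk returns that state's index times the stride.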
@@ -4515,7 +4515,7 @@ impl<T: AsRef<[u32]>> MatchStates<T> {
 + (self.pattern_ids().len() * PatternID::SIZE)
 }
 
-/// Valides that the match state info is itself internally consistent and
+/// Validates that the match state info is itself internally consistent and
 /// consistent with the recorded match state region in the given DFA.
 fn validate(&self, dfa: &DFA<T>) -> Result<(), DeserializeError> {
 if self.len() != dfa.special.match_len(dfa.stride()) {
@@ -4773,7 +4773,7 @@ impl<'a, T: AsRef<[u32]>> Iterator for StateIter<'a, T> {
 
 /// An immutable representation of a single DFA state.
 ///
-/// `'a` correspondings to the lifetime of a DFA's transition table.
+/// `'a` corresponding to the lifetime of a DFA's transition table.
 pub(crate) struct State<'a> {
 id: StateID,
 stride2: usize,

regex-automata/src/dfa/determinize.rs

Lines changed: 1 addition & 1 deletion
@@ -466,7 +466,7 @@ impl<'a> Runner<'a> {
 ) -> Result<(StateID, bool), BuildError> {
 // Compute the look-behind assertions that are true in this starting
 // configuration, and the determine the epsilon closure. While
-// computing the epsilon closure, we only follow condiional epsilon
+// computing the epsilon closure, we only follow conditional epsilon
 // transitions that satisfy the look-behind assertions in 'look_have'.
 let mut builder_matches = self.get_state_builder().into_matches();
 util::determinize::set_lookbehind_from_start(

regex-automata/src/dfa/mod.rs

Lines changed: 1 addition & 1 deletion
@@ -271,7 +271,7 @@ memory.) Conversely, compiling the same regex without Unicode support, e.g.,
 `(?-u)\w{50}`, takes under 1 millisecond and about 15KB of memory. For this
 reason, you should only use Unicode character classes if you absolutely need
 them! (They are enabled by default though.)
-* This module does not support Unicode word boundaries. ASCII word bondaries
+* This module does not support Unicode word boundaries. ASCII word boundaries
 may be used though by disabling Unicode or selectively doing so in the syntax,
 e.g., `(?-u:\b)`. There is also an option to
 [heuristically enable Unicode word boundaries](crate::dfa::dense::Config::unicode_word_boundary),
