byte_pattern: share the TwoWaySearcher between byte and str
#135931
-            SearchStep::Reject(a, mut b) => {
+            byte_pattern::SearchStep::Reject(a, mut b) => {
                 // skip to next char boundary
                 while !self.haystack.is_char_boundary(b) {
                     b += 1;
                 }
                 searcher.position = cmp::max(b, searcher.position);
                 SearchStep::Reject(a, b)
             }
-            otherwise => otherwise,
+            byte_pattern::SearchStep::Match(a, b) => SearchStep::Match(a, b),
+            byte_pattern::SearchStep::Done => SearchStep::Done,
I duplicated `SearchStep` because it is a public type and its documentation refers to the `Searcher` trait. The `byte_pattern` module will have its own `Searcher` trait (or maybe `ByteSearcher`), so that documentation would be misleading one way or the other.
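For readers following the thread, here is a minimal standalone sketch of the conversion the diff above performs between the two step types. `ByteStep`, `StrStep`, and `adapt_step` are hypothetical names chosen for illustration; they are not the identifiers used in the PR, and the `position` parameter only models the searcher's resume offset.

```rust
/// Hypothetical stand-in for the byte-level step type
/// (`byte_pattern::SearchStep` in the diff).
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum ByteStep {
    Match(usize, usize),
    Reject(usize, usize),
    Done,
}

/// Hypothetical stand-in for the string-level step type.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum StrStep {
    Match(usize, usize),
    Reject(usize, usize),
    Done,
}

/// Convert a byte-level step into a string-level step.
/// `b` must not exceed `haystack.len()`.
pub fn adapt_step(haystack: &str, position: &mut usize, step: ByteStep) -> StrStep {
    match step {
        ByteStep::Reject(a, mut b) => {
            // A byte searcher may stop in the middle of a multi-byte
            // character, so advance to the next char boundary.
            while !haystack.is_char_boundary(b) {
                b += 1;
            }
            // Never move the resume position backwards.
            *position = (*position).max(b);
            StrStep::Reject(a, b)
        }
        // Match and Done carry over unchanged; they are spelled out
        // because the two enums are distinct types.
        ByteStep::Match(a, b) => StrStep::Match(a, b),
        ByteStep::Done => StrStep::Done,
    }
}
```

With two distinct enums, a catch-all arm such as `otherwise => otherwise` no longer type-checks, which is why each variant is mapped explicitly.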
library/core/src/str/pattern.rs
Outdated
-        if let Some(result) = simd_contains(self, haystack) {
+        if let Some(result) = simd_contains(self.as_bytes(), haystack.as_bytes()) {
This function just takes a slice of bytes now. From what I can see, the implementation does not rely on the input being UTF-8 at all.
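To illustrate why a byte-level search can back a string-level one, here is a minimal sketch. It uses a naive window scan as a stand-in, not the Two-Way algorithm or `simd_contains`, and `find_bytes` and `find_str` are hypothetical names that do not appear in the PR.

```rust
/// Naive byte-level substring search (illustrative stand-in only).
/// Returns the byte offset of the first occurrence of `needle`.
pub fn find_bytes(haystack: &[u8], needle: &[u8]) -> Option<usize> {
    // `windows(0)` panics, so handle the empty needle up front.
    if needle.is_empty() {
        return Some(0);
    }
    haystack.windows(needle.len()).position(|w| w == needle)
}

/// String-level search that simply delegates to the byte-level one.
pub fn find_str(haystack: &str, needle: &str) -> Option<usize> {
    find_bytes(haystack.as_bytes(), needle.as_bytes())
}
```

Because UTF-8 is self-synchronizing, a non-empty valid UTF-8 needle can only match a valid UTF-8 haystack at a char boundary, so the offset returned by `find_str` is safe to use for slicing the original `&str`. The same `find_bytes` also accepts arbitrary bytes that are not valid UTF-8.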
ea951c7 to 0b23d41
0b23d41 to f3cb4ca
f3cb4ca to c631191
☔ The latest upstream changes (presumably #144393) made this pull request unmergeable. Please resolve the merge conflicts.
Tracking issue: #134149
An attempt to break up #134350 into more manageable pieces.
From what I can see, the `TwoWaySearcher` implementation does not have special logic for UTF-8 boundaries, so it should work just as well on any `&[u8]`. So this PR just moves the `TwoWaySearcher` implementation to `slice/byte_pattern.rs`, and then uses it from `str/pattern.rs`. No functional changes, no additional API surface.
r? @BurntSushi