Skip to content

Conversation

@folkertdev
Copy link
Contributor

tracking issue: #134149

An attempt to break up #134350 into more manageable pieces.

From what I can see, the TwoWaySearcher implementation does not have special logic for UTF8 boundaries, so it should work just as well on any &[u8]. So this PR just moves the TwoWaySearcher implementation to slice/byte_pattern.rs, and then uses it from str/pattern.rs. No functional changes, no additional API surface.

r? @BurntSushi

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Jan 23, 2025
Comment on lines -1152 to +1164
SearchStep::Reject(a, mut b) => {
byte_pattern::SearchStep::Reject(a, mut b) => {
// skip to next char boundary
while !self.haystack.is_char_boundary(b) {
b += 1;
}
searcher.position = cmp::max(b, searcher.position);
SearchStep::Reject(a, b)
}
otherwise => otherwise,
byte_pattern::SearchStep::Match(a, b) => SearchStep::Match(a, b),
byte_pattern::SearchStep::Done => SearchStep::Done,
Copy link
Contributor Author

@folkertdev folkertdev Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I duplicated SearchStep because it is a public type and its documentation refers to the Searcher trait. The byte_pattern module will have it's own Searcher trait (or ByteSearcher maybe) and so that documentation would be misleading one way or the other.

Comment on lines 1000 to 1001
if let Some(result) = simd_contains(self, haystack) {
if let Some(result) = simd_contains(self.as_bytes(), haystack.as_bytes()) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function just takes a slice of bytes now. From what I can see the implementation does not rely on the input being UTF8 at all.

@rust-log-analyzer

This comment has been minimized.

@folkertdev folkertdev force-pushed the byte-pattern-two-way-searcher branch from ea951c7 to 0b23d41 Compare January 23, 2025 12:54
@rust-log-analyzer

This comment has been minimized.

@folkertdev folkertdev force-pushed the byte-pattern-two-way-searcher branch from 0b23d41 to f3cb4ca Compare January 23, 2025 13:58
@rust-log-analyzer

This comment has been minimized.

@folkertdev folkertdev force-pushed the byte-pattern-two-way-searcher branch from f3cb4ca to c631191 Compare January 23, 2025 20:00
@bors
Copy link
Collaborator

bors commented Jul 30, 2025

☔ The latest upstream changes (presumably #144393) made this pull request unmergeable. Please resolve the merge conflicts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants