Handle all Unicode whitespace for UnicodeBreakProperties.find_words#575
phy1729 wants to merge 1 commit into mgeisler:master
Hi @phy1729, Thanks for the PR and happy new year!
I see... I think the API I made is a bit clumsy or misleading. Textwrap is generally meant to wrap a single paragraph of text at a time. This is reflected in how

However, there is of course nothing about this in the documentation for

In general, I was hoping that

textwrap/examples/wasm/src/lib.rs Lines 44 to 51 in c9bd8b0

However, I realize that this idea isn't how people want to use the current API and I would like to improve on this.
|
Looking at this some more, I think the output between the

use textwrap::core::Word;
use textwrap::WordSeparator::{AsciiSpace, UnicodeBreakProperties};

fn main() {
    // A space followed by a newline sits between the two words.
    let text = "foo \nbar";
    // Print the words found by each separator for comparison.
    dbg!(UnicodeBreakProperties.find_words(text).collect::<Vec<_>>());
    dbg!(AsciiSpace.find_words(text).collect::<Vec<_>>());
}

gives me
|
Can I ask what your use-case is for constructing words yourself? Also, have you tried the Custom variant? I recently realized that the constructor for
|
I'm using the result of I don't think the words for
Currently UnicodeBreakProperties.find_words("foo \nbar") results in two Words where the first has word: "foo \n", whitespace: "". I think it makes more sense, when following the Unicode line break algorithm, to include all whitespace, not just spaces, as UAX 14 breaks after a newline.

I left Word::from_unicode pub(crate) to not change the public API.
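
To illustrate the difference described above, here is a minimal standard-library sketch. It does not use textwrap and is not the code in this PR. The helper names `split_space_only` and `split_all_whitespace` are hypothetical. They only show how a segment such as "foo \n" splits into content and trailing whitespace when just U+0020 is trimmed, versus when every character accepted by `char::is_whitespace` is trimmed.

```rust
/// Hypothetical helper (not part of textwrap): split a segment into
/// (content, trailing whitespace), treating only ' ' (U+0020) as whitespace.
fn split_space_only(segment: &str) -> (&str, &str) {
    let content = segment.trim_end_matches(' ');
    // `content` is a prefix of `segment`, so its length is a valid byte index.
    (content, &segment[content.len()..])
}

/// Hypothetical helper (not part of textwrap): same split, but every char
/// with the Unicode White_Space property counts, including '\n'.
fn split_all_whitespace(segment: &str) -> (&str, &str) {
    let content = segment.trim_end_matches(char::is_whitespace);
    (content, &segment[content.len()..])
}
```

With the segment "foo \n", the space-only split finds no trailing whitespace because the segment ends in a newline, which matches the word: "foo \n", whitespace: "" shape described above. The all-whitespace split moves both the space and the newline into the whitespace part.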