Releases: bminixhofer/nlprule
Release 0.6.4
Internal improvements
- Decrease the time it takes to load the `Tokenizer` by ~40% (#70).
- Tag lookup is now backed by a vector instead of a hashmap.
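To illustrate why a vector-backed lookup can be cheaper than a hashmap, here is a minimal sketch. It assumes tag IDs are dense integers starting at zero, so that an ID can index a `Vec` directly without hashing. `TagTable`, `from_map` and `tags` are hypothetical names for this illustration and are not nlprule's actual internals.

```rust
use std::collections::HashMap;

/// Hypothetical tag store (not nlprule's real type).
/// Assumes IDs are dense, so an ID can be used as a vector index.
struct TagTable {
    by_id: Vec<Vec<String>>,
}

impl TagTable {
    /// Converts a hashmap keyed by ID into a vector indexed by ID.
    /// IDs that are absent from the map end up with an empty tag list.
    fn from_map(map: HashMap<u32, Vec<String>>) -> Self {
        let len = map.keys().max().map_or(0, |&max| max as usize + 1);
        let mut by_id: Vec<Vec<String>> = vec![Vec::new(); len];
        for (id, tags) in map {
            by_id[id as usize] = tags;
        }
        TagTable { by_id }
    }

    /// Looks up tags with a bounds-checked index instead of a hash computation.
    fn tags(&self, id: u32) -> &[String] {
        self.by_id
            .get(id as usize)
            .map(|tags| tags.as_slice())
            .unwrap_or(&[])
    }
}
```

The trade-off in this sketch is memory: sparse IDs would leave many empty slots, which is why the dense-ID assumption matters.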
Breaking changes
- The tagger now returns iterators over tags instead of allocating a vector.
- Remove the `get_group_members` function.
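The difference between returning a vector and returning an iterator over tags can be sketched as follows. `ToyTagger`, `tags_vec` and `tags_iter` are hypothetical names invented for this example; the real tagger API in nlprule may look different.

```rust
/// Hypothetical tagger holding (word, tag) pairs; not nlprule's real type.
struct ToyTagger {
    entries: Vec<(String, String)>,
}

impl ToyTagger {
    /// Older style: collects the matching tags into a freshly allocated vector.
    fn tags_vec<'a>(&'a self, word: &'a str) -> Vec<&'a str> {
        self.tags_iter(word).collect()
    }

    /// Newer style: lazily yields matching tags without allocating.
    fn tags_iter<'a>(&'a self, word: &'a str) -> impl Iterator<Item = &'a str> + 'a {
        self.entries
            .iter()
            .filter(move |(w, _)| w == word)
            .map(|(_, tag)| tag.as_str())
    }
}
```

Callers that only need to check for one tag (for example with `.any(...)`) can stop early with the iterator version, while callers that do need a vector can still call `.collect()` themselves.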
Release 0.6.3
Release 0.6.2
Internal improvements
- Speed up loading the `Tokenizer` by ~25% (#66).
Release 0.6.1
Release 0.6.0
- Fix a significant bug where text with multiple sentences would sometimes cause an error if one of the later sentences matched some pattern (#61, #63, thanks @drahnr!).
Breaking changes
- Remove `multiword_tags` on tokens (now part of the regular tags).
- Make the fields of `Word` private and add getter methods.
- `Word` constructor is now called `new` instead of `new_with_tags`.
New features
- Adds `as_str` convenience method to multiple structs (`WordId`, `PosId`, `Word`).
Release 0.5.3
- CI failed for Release 0.5.2
Release 0.5.2
Release 0.5.1
Breaking changes
- Changes the focus from `Vec<Token>` to `Sentence` (#54). `pipe` and `sentencize` now return iterators over `Sentence` / `IncompleteSentence`.
- Removes the special `SENT_START` token (now only used internally). Each token now corresponds to at least one character in the input text.
- Makes the fields of `Token` and `IncompleteToken` private and adds getter methods (#54).
- `char_span` and `byte_span` are replaced by a `Span` struct which keeps track of char and byte indices at the same time (#54). For example, to get the byte range, use `token.span().byte()`.
- Spans are now relative to the input text, no longer to sentence boundaries (#53, thanks @drahnr!).
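As a rough illustration of a span that tracks char and byte indices together, consider the sketch below. `ToySpan` and `from_byte_range` are hypothetical and written only for this example; only the `byte()` accessor name is taken from the note above, and the real `Span` in nlprule may be defined differently.

```rust
use std::ops::Range;

/// Hypothetical span type tracking both index kinds; not nlprule's definition.
#[derive(Debug, Clone, PartialEq)]
struct ToySpan {
    byte: Range<usize>,
    char: Range<usize>,
}

impl ToySpan {
    /// Builds a span from a byte range in `text`, deriving char offsets by counting.
    /// Assumes both ends of `byte` fall on UTF-8 character boundaries;
    /// slicing panics otherwise.
    fn from_byte_range(text: &str, byte: Range<usize>) -> Self {
        let char_start = text[..byte.start].chars().count();
        let char_end = char_start + text[byte.clone()].chars().count();
        ToySpan {
            byte,
            char: char_start..char_end,
        }
    }

    /// Byte offsets, relative to the start of the whole input text.
    fn byte(&self) -> &Range<usize> {
        &self.byte
    }

    /// Char offsets, relative to the start of the whole input text.
    fn char(&self) -> &Range<usize> {
        &self.char
    }
}
```

The two ranges differ whenever the text before or inside the span contains multi-byte characters, which is why keeping both avoids recounting on every access.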
New features
- The regex backend can now be chosen from Oniguruma or fancy-regex with the features `regex-onig` and `regex-fancy`. `regex-onig` is the default.
- nlprule now compiles to WebAssembly. WebAssembly support is guaranteed for future versions and tested in CI.
- A new selector API to select individual rules (details documented in `nlprule::rule::id`). For example:
```rust
use nlprule::{rule::id::Category, Rules};
use std::convert::TryInto;

// wrapped in `main` so that the `?` operator has a `Result` to return into
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut rules = Rules::new("path/to/en_rules.bin")?;

    // disable rules named "confusion_due_do" in category "confused_words"
    rules
        .select_mut(
            &Category::new("confused_words")
                .join("confusion_due_do")
                .into(),
        )
        .for_each(|rule| rule.disable());

    // disable all grammar rules
    rules
        .select_mut(&Category::new("grammar").into())
        .for_each(|rule| rule.disable());

    // a string syntax where slashes are the separator is also supported
    rules
        .select_mut(&"confused_words/confusion_due_do".try_into()?)
        .for_each(|rule| rule.enable());

    Ok(())
}
```

Release 0.5.0
- Superseded by 0.5.1. The release script for 0.5.0 did not finish.
Release 0.4.6
Breaking changes
- `.validate()` in `nlprule-build` now returns a `Result<()>` to encourage calling it after `.postprocess()`.
Fixes
- Fixes an error where `Cursor` position in `nlprule-build` was not reset appropriately.
- Use `fs_err` everywhere for better error messages.