* This update decouples the creation of is_required_phrase rules from
the updating of existing rules, using a separate CLI. This makes it
easier to control which rules are used as required phrases.
* This now skips more rules when adding required phrases to existing
rules: any rule that cannot be matched approximately is skipped, which
covers not only tiny rules but also many other rules.
* This checks that no rule gets a required phrase added in the middle
of a URL, email, or copyright. This is done by checking that no
required phrase injection changes the set of ignorables of a rule: an
injection that would split a URL so that it is no longer a proper URL
is rejected. The same applies to emails and copyrights.
* This extends "skipping" the collection of required phrases to skip
a rule from both required phrases collection for generating new rules
AND injection of new required phrases in rule text. This makes it
easier to handle exceptions.
* The creation of "is_required_phrase" rules now uses improved
content: the case and punctuation of the phrase text are preserved,
and the rule is created as "is_license_reference", which is going to
be correct in the vast majority of cases.
* When matched, the "is_required_phrase" rules are treated the same
as continuous rules and can only be matched exactly.
* The "is_required_phrase" rules are now validated extensively to
ensure that there is no conflict with other rule flags.
* The code to "trace" the source of a required phrase injection now
uses the new standard "source" rule field, and the code related to
handling this field has been simplified.
* Required phrases injection has not yet been tested as working.
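
The ignorables check described in the list above can be sketched
roughly as follows. This is a minimal illustration only, not the code
in this commit: the regexes, the function names, and the {{ }} marker
syntax are all assumptions made for the example, and copyrights are
left out for brevity.

```python
import re

# Hypothetical, simplified patterns: NOT the project's actual detectors.
URL_RE = re.compile(r"https?://[^\s{}]+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def collect_ignorables(text):
    """Return the set of URLs and emails found in text."""
    return set(URL_RE.findall(text)) | set(EMAIL_RE.findall(text))


def inject_required_phrase(text, phrase):
    """
    Wrap the first occurrence of phrase in {{ }} markers (assumed syntax).
    Return text unchanged when the phrase is absent or when the injection
    would change the set of ignorables, e.g. by splitting a URL.
    """
    start = text.find(phrase)
    if start < 0:
        return text
    end = start + len(phrase)
    candidate = text[:start] + "{{" + phrase + "}}" + text[end:]
    if collect_ignorables(candidate) != collect_ignorables(text):
        # The markers landed inside a URL or email: skip this injection.
        return text
    return candidate
```

Comparing the ignorables before and after, rather than checking marker
positions against each URL span, keeps the guard independent of how the
ignorables are detected.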
Signed-off-by: Philippe Ombredanne <[email protected]>