Feature request: exclusion sets #18

@johnhtodd

For each rule subset (basically: everywhere a matching string can be specified) we need to have an exclusion capability.

Example: matching for "paypal" often ends up with "people" as a matching element, whether via NYSIIS, Soundex, Metaphone, or possibly other methods such as Levenshtein distance if they're loose enough. We should exclude any matches for "people", but that becomes a manual tuning process.
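To make the collision concrete, here is a minimal sketch of classic American Soundex. It is written from the textbook algorithm, not taken from this project's matcher, and the function name `soundex` is only illustrative. Under this sketch, "paypal" and "people" reduce to the same code.

```python
def soundex(word: str) -> str:
    """Classic American Soundex: first letter plus three digits (illustrative sketch)."""
    # Consonant groups and their Soundex digits.
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in letters:
            codes[ch] = digit

    word = "".join(c for c in word.lower() if c.isalpha())
    if not word:
        return ""

    result = word[0].upper()
    prev = codes.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        digit = codes.get(ch, "")  # vowels and y map to "" and reset prev
        if digit and digit != prev:
            result += digit
        prev = digit
    return (result + "000")[:4]  # pad or truncate to four characters


if __name__ == "__main__":
    print(soundex("paypal"), soundex("people"))
```

Both words share the consonant skeleton p-p-l, so any algorithm that discards vowels is likely to treat them as equivalent.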

  1. We need a way to specify strings that never trigger a match even though a rule might match on them, either in the file itself (an array) or via a call-out to an external file that bulk-loads the strings on startup or on HUP of the process.

     This would be done after the matching process but before counters.

  2. There also needs to be a metric that counts how many times matching elements have been dropped due to the exclusion strings, so we can understand when load is dramatic but output is small.
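A rough sketch of how the two points above could fit together, assuming a single-threaded matching loop. Every name here (`ExclusionSet`, `filter`, `dropped`, `install_hup_reload`) is hypothetical and not part of the existing code; the external file format (one string per line, `#` for comments) is likewise an assumption.

```python
import signal
from pathlib import Path


class ExclusionSet:
    """Strings that must never be reported as matches (hypothetical helper)."""

    def __init__(self, inline=(), path=None):
        # Strings listed directly in the rule file (the "array" option).
        self._inline = {s.strip().lower() for s in inline if s.strip()}
        # Optional external file with one exclusion string per line.
        self._path = Path(path) if path else None
        self._strings = set(self._inline)
        self.dropped = 0  # metric: matches suppressed by the exclusion set
        self.reload()

    def reload(self):
        """Rebuild the set from inline strings plus the external file."""
        strings = set(self._inline)
        if self._path is not None and self._path.exists():
            for line in self._path.read_text(encoding="utf-8").splitlines():
                entry = line.strip().lower()
                if entry and not entry.startswith("#"):
                    strings.add(entry)
        # Swap the whole set in one assignment so a reload never
        # exposes a half-built set to filter().
        self._strings = strings

    def filter(self, matches):
        """Run after matching, before counters: drop excluded strings."""
        strings = self._strings
        kept = []
        for match in matches:
            if match.strip().lower() in strings:
                self.dropped += 1
            else:
                kept.append(match)
        return kept


def install_hup_reload(exclusions):
    """Reload the external file on SIGHUP (not available on Windows)."""
    if hasattr(signal, "SIGHUP"):
        signal.signal(signal.SIGHUP, lambda signum, frame: exclusions.reload())
```

With `ExclusionSet(inline=["people"])`, calling `filter(["paypal", "people"])` would return only `"paypal"` and raise `dropped` to 1. Comparing `dropped` against the number of reported matches gives the "heavy load, small output" signal described in point 2.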
