Configure to ignore words that are split by underscores/hyphens #29

@sdankel

Motivation

I am getting a lot of false positives where a word is split up by _ or -. I understand that evasion detection is a feature, but for my use case I want to allow evasive words. So far I haven't found a way to configure this. Please let me know if there is one.

For example, I want test_fun to not be censored. I tried:

    // Assumes `use rustrict::{Censor, Type as RustrictType};`
    let (censored, analysis) = Censor::from_str("test_fun")
        .with_censor_threshold(
            !RustrictType::EVASIVE & RustrictType::MODERATE_OR_HIGHER
        )
        .censor_and_analyze();

But the match does not seem to be marked as evasive: the output is still censored to tes****n.

Proposed solution

It would be nice to have an option to only censor the word if the full word is a profanity, so I can split by _ or whatever I want and pass each word into the profanity filter myself. This would also prevent false positives like Lifshitz.

Something like this:

    let (censored, analysis) = Censor::from_str("test_fun")
        .with_censor_threshold(RustrictType::MODERATE_OR_HIGHER)
        .with_only_full_words(true) // defaults to false
        .censor_and_analyze();
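
In the meantime, the "split it myself" approach described above could look roughly like the sketch below. It uses only the standard library and does not call rustrict at all: `is_blocked` and its two-word list are hypothetical stand-ins for whatever whole-word check ends up being used, and `censor_full_words` and `push_token` are illustrative names, not part of any crate.

```rust
/// Hypothetical whole-word check. A real filter or word list would go here.
fn is_blocked(word: &str) -> bool {
    matches!(word.to_ascii_lowercase().as_str(), "darn" | "heck")
}

/// Append `token` to `out`, masking it only if the entire token is blocked.
fn push_token(out: &mut String, token: &str, is_blocked: &impl Fn(&str) -> bool) {
    if !token.is_empty() && is_blocked(token) {
        // Replace every character of the blocked token with '*'.
        out.extend(token.chars().map(|_| '*'));
    } else {
        out.push_str(token);
    }
}

/// Split `input` on '_', '-', and whitespace, check each token as a whole
/// word, and reassemble the string with the separators left in place.
fn censor_full_words(input: &str, is_blocked: impl Fn(&str) -> bool) -> String {
    let mut out = String::with_capacity(input.len());
    let mut start = 0;
    for (i, c) in input.char_indices() {
        if c == '_' || c == '-' || c.is_whitespace() {
            push_token(&mut out, &input[start..i], &is_blocked);
            out.push(c);
            start = i + c.len_utf8();
        }
    }
    // Handle the final token after the last separator.
    push_token(&mut out, &input[start..], &is_blocked);
    out
}

fn main() {
    // Neither "test" nor "fun" is blocked on its own, so nothing is masked.
    println!("{}", censor_full_words("test_fun", is_blocked));
    // Only the token that matches a blocked word exactly is masked.
    println!("{}", censor_full_words("what_the-heck", is_blocked));
}
```

Because tokens are only compared as whole words, a blocked word embedded in a longer token is left alone, which is the behavior I'm after for names like Lifshitz. The obvious trade-off is that any evasion inside a single token goes undetected.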

Context

I am using rustrict version 0.7.33.
