Commit 5066fda
committed
feat: Implement custom profanity lists and file loading
This commit introduces several enhancements to the profanity filtering
capabilities of the ValX library:
1. **Custom Profanity Lists**:
* The `load_profanity_words`, `detect_profanity`, and
`remove_profanity` functions now accept an optional
`custom_words_list` parameter (a Python list of strings) to specify
custom profanity words.
* Users can now set `language=None` in these functions to use *only*
the `custom_words_list`, bypassing all built-in profanity lists.
* If a `language` is specified alongside a `custom_words_list`,
the functions will use the union of both lists.
2. **Load Custom Profanity from File**:
* A new helper function, `load_custom_profanity_from_file(filepath)`,
has been added. This function reads a text file (one word per line,
'#' for comments) and returns a list of strings suitable for use
with `custom_words_list`.
3. **Enhanced Detection Reporting**:
* The "Language" field in the output of `detect_profanity` has been
updated to be more descriptive when custom lists are used:
* "Custom": If `language=None` and a custom list is used.
* "Custom + <Language>": If a built-in language and a custom list
are combined (e.g., "Custom + English", "Custom + All").
4. **Testing**:
* Comprehensive tests have been added to `test.py` to cover all new
functionalities, including various combinations of custom lists,
file loading, and language settings.
* Existing tests have been verified. The AI hate speech test was
adjusted to reflect the current model's behavior in the test
environment.
5. **Documentation**:
* `README.md` has been updated extensively to document these new
features, providing clear examples and usage instructions.
These changes provide users with significantly more flexibility in tailoring
the profanity filtering to their specific needs.1 parent 2d7709d commit 5066fda
4 files changed
+432
-47
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
113 | 113 | | |
114 | 114 | | |
115 | 115 | | |
116 | | - | |
| 116 | + | |
| 117 | + | |
| 118 | + | |
| 119 | + | |
| 120 | + | |
117 | 121 | | |
118 | 122 | | |
119 | 123 | | |
120 | 124 | | |
121 | | - | |
| 125 | + | |
| 126 | + | |
122 | 127 | | |
123 | | - | |
| 128 | + | |
| 129 | + | |
| 130 | + | |
| 131 | + | |
| 132 | + | |
| 133 | + | |
| 134 | + | |
| 135 | + | |
| 136 | + | |
| 137 | + | |
| 138 | + | |
| 139 | + | |
| 140 | + | |
| 141 | + | |
| 142 | + | |
| 143 | + | |
| 144 | + | |
| 145 | + | |
| 146 | + | |
| 147 | + | |
| 148 | + | |
| 149 | + | |
| 150 | + | |
| 151 | + | |
| 152 | + | |
| 153 | + | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
124 | 161 | | |
125 | 162 | | |
126 | | - | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
127 | 166 | | |
128 | 167 | | |
129 | | - | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
| 182 | + | |
| 183 | + | |
| 184 | + | |
| 185 | + | |
| 186 | + | |
| 187 | + | |
| 188 | + | |
| 189 | + | |
| 190 | + | |
| 191 | + | |
| 192 | + | |
| 193 | + | |
| 194 | + | |
| 195 | + | |
| 196 | + | |
| 197 | + | |
| 198 | + | |
| 199 | + | |
| 200 | + | |
| 201 | + | |
| 202 | + | |
| 203 | + | |
| 204 | + | |
| 205 | + | |
| 206 | + | |
| 207 | + | |
| 208 | + | |
| 209 | + | |
| 210 | + | |
130 | 211 | | |
131 | | - | |
132 | | - | |
| 212 | + | |
| 213 | + | |
| 214 | + | |
| 215 | + | |
| 216 | + | |
| 217 | + | |
| 218 | + | |
| 219 | + | |
| 220 | + | |
| 221 | + | |
| 222 | + | |
| 223 | + | |
| 224 | + | |
| 225 | + | |
| 226 | + | |
| 227 | + | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
133 | 235 | | |
134 | 236 | | |
| 237 | + | |
| 238 | + | |
135 | 239 | | |
136 | 240 | | |
137 | 241 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
| 1 | + | |
| 2 | + | |
| 3 | + | |
| 4 | + | |
| 5 | + | |
| 6 | + | |
| 7 | + | |
| 8 | + | |
| 9 | + | |
| 10 | + | |
0 commit comments