Updates README.md to include a new "Changes in 0.2.5" section.
This section details the recently added features for custom profanity
lists, including:
- Use of `custom_words_list` parameter.
- Standalone vs. combined custom list usage.
- `load_custom_profanity_from_file()` helper and file format.
- Updated language reporting in detection results.
README.md: 126 additions & 7 deletions
@@ -16,6 +16,21 @@ An open-source Python library for data cleaning tasks. It includes functions for
 > [!NOTE]
 > ValX will automatically install a version of `scikit-learn` that is compatible with your device if you don't have one already.

+## Changes in 0.2.5
+
+ValX v0.2.5 introduces enhanced flexibility for profanity filtering by adding support for custom profanity lists:
+
+- **Custom Profanity Word Lists**: Users can now provide their own lists of profane words directly as Python lists to the `detect_profanity` and `remove_profanity` functions via the new `custom_words_list` parameter.
+- **Standalone Custom Lists**: Utilize your custom profanity list exclusively by setting the `language` parameter to `None`. ValX will then only use the words provided in `custom_words_list`.
+- **Combined Lists**: Use a custom list in conjunction with ValX's built-in language-specific wordlists. Simply provide both a `language` (e.g., "English") and your `custom_words_list`. ValX will use the combined set of words.
+- **Loading Custom Lists from File**: A new helper function, `load_custom_profanity_from_file(filepath)`, allows you to easily load custom profanity words from a text file.
+    - **File Format**: The file should contain one profanity word per line.
+    - Lines starting with a hash symbol (`#`) are treated as comments and ignored.
+    - Empty lines or lines containing only whitespace are also ignored.
+- **Updated Detection Reporting**: The `detect_profanity` function's output now specifies the source of detected profanity more clearly (e.g., "Custom", "Custom + English").
+
+These features give users greater control over the profanity filtering process, allowing for more tailored and specific use cases.
+
 ## Changes in 0.2.4

 Fixed a major incompatibility issue with `scikit-learn` due to version changes in `scikit-learn v1.3.0` which causes compatibility issues with versions later than `1.2.2`. ValX can now be used with `scikit-learn` versions earlier and later than `1.3.0`!
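
Reviewer note on the 0.2.5 file format: the three rules listed for custom profanity files (one word per line, `#` comment lines ignored, empty or whitespace-only lines ignored) can be illustrated with a small stand-in parser. This is only a sketch of the documented format, not ValX's implementation of `load_custom_profanity_from_file()`. The name `parse_profanity_file`, the UTF-8 encoding, and the decision to strip surrounding whitespace before checking for `#` are all assumptions.

```python
from pathlib import Path
from typing import List


def parse_profanity_file(filepath: str) -> List[str]:
    """Hypothetical parser for the documented format (not ValX's own loader)."""
    words: List[str] = []
    # Assumption: the file is UTF-8 encoded.
    for raw_line in Path(filepath).read_text(encoding="utf-8").splitlines():
        # Assumption: leading/trailing whitespace is not significant.
        line = raw_line.strip()
        if not line:
            continue  # empty or whitespace-only line
        if line.startswith("#"):
            continue  # comment line
        words.append(line)
    return words
```

A list produced this way has the same shape as the Python list that the notes say `custom_words_list` accepts.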
@@ -113,25 +128,129 @@ Below is a complete list of all the available supported languages for ValX's pro

 ## Usage

-### Detect Profanity
+### Profanity Detection and Removal
+
+ValX allows for flexible profanity filtering using built-in language lists, custom word lists (provided as Python lists or loaded from files), or a combination of both.
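
Reviewer note: the three modes named here (built-in list only, custom list only, or both combined) can be sketched as a small resolver. Only the parameter names `language` and `custom_words_list` and the labels "Custom" and "Custom + English" come from the README text. `BUILTIN_WORDS`, `resolve_word_list`, the placeholder words, the `"English"` default, the lowercasing, and the `ValueError` are hypothetical, and ValX's internals may differ.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical stand-in for ValX's built-in, language-specific wordlists.
BUILTIN_WORDS: Dict[str, Set[str]] = {"English": {"darn", "heck"}}


def resolve_word_list(
    language: Optional[str] = "English",  # assumed default
    custom_words_list: Optional[List[str]] = None,
) -> Tuple[Set[str], str]:
    """Return the effective word set and a label naming its source."""
    # Assumption: matching is case-insensitive, so words are lowercased.
    custom = {word.lower() for word in (custom_words_list or [])}
    if language is None:
        # Standalone mode: only the custom words are used.
        if not custom:
            raise ValueError("Provide a language, a custom list, or both.")
        return custom, "Custom"
    builtin = BUILTIN_WORDS[language]
    if custom:
        # Combined mode: union of built-in and custom words.
        return builtin | custom, f"Custom + {language}"
    return builtin, language


words, source = resolve_word_list(language=None, custom_words_list=["frak"])
print(source, sorted(words))  # Custom ['frak']

words, source = resolve_word_list(language="English", custom_words_list=["frak"])
print(source, sorted(words))  # Custom + English ['darn', 'frak', 'heck']
```

Returning the label together with the word set mirrors the updated reporting described in the 0.2.5 notes, where detection results name the list that produced a match.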