Commit d8d447c
⚡️ Speed up method `CharacterRemover.remove_control_characters` by 46%
1 parent 0e5f79f commit d8d447c
Here’s an optimized version of your program. The main bottleneck is `re.sub`, which is relatively slow for simple tasks like filtering ASCII ranges, especially in tight loops. You can speed this up by using `str.translate` with a translation table that drops the unwanted control characters, which avoids the regex overhead.
**Why is this faster?**
- `str.translate` maps and drops characters in a single C-level pass, with no regex engine overhead.
- The translation table is created only once per instance.
- No function-call overhead inside loops.
**Guaranteed same results:** Control chars `chr(0)`–`chr(31)` and `chr(127)` are omitted, just as with your regex.
This reduces the time per call shown in your profile. If you want even more speed and you're always working with ASCII, you can potentially use bytes, but `str.translate` is already highly efficient for this use case.
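The diff body is not reproduced here, so the following is only a minimal sketch of the `str.translate` approach described in the commit message, not the committed code. The class and method names come from the commit title. The attribute `_table`, the helper `remove_with_regex`, and the regex pattern are assumptions chosen to match the stated behavior (dropping `chr(0)`–`chr(31)` and `chr(127)`).

```python
import re

# Assumed pattern for the original regex-based behavior; the commit text only
# states that chr(0)-chr(31) and chr(127) are removed.
_CONTROL_RE = re.compile(r"[\x00-\x1f\x7f]")


class CharacterRemover:
    """Illustrative stand-in for the class named in the commit title."""

    def __init__(self):
        # str.translate deletes any code point that maps to None.
        # The table is built once per instance and reused on every call.
        self._table = dict.fromkeys([*range(32), 127])

    def remove_control_characters(self, text: str) -> str:
        # Single pass over the string, no regex engine involved.
        return text.translate(self._table)


def remove_with_regex(text: str) -> str:
    """Hypothetical regex baseline, used only to compare results."""
    return _CONTROL_RE.sub("", text)


remover = CharacterRemover()
sample = "a\x00b\tc\nd\x7fe"
print(remover.remove_control_characters(sample))  # abcde
print(remover.remove_control_characters(sample) == remove_with_regex(sample))  # True
```

Note that tab and newline fall inside the `chr(0)`–`chr(31)` range, so both approaches strip them as well. To check the speedup on your own inputs, `timeit` can be run against both functions.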
1 file changed, +9 -4 lines changed