⚡️ Speed up method CharacterRemover.remove_control_characters by 46%

codeflash-ai[bot] · web-flow · commit d8d447c487a3 · 2025-06-03T23:44:28.000Z
This change optimizes `CharacterRemover.remove_control_characters`. The main bottleneck is `re.sub`, which carries regex-engine overhead that is unnecessary for deleting a fixed set of ASCII control characters, and that cost adds up when the method is called in a tight loop. Replacing it with `str.translate` and a translation table that maps the unwanted control characters to `None` avoids the regex overhead.



**Why is this faster?**
- `str.translate` removes the characters in a single pass implemented in C, with no regex engine overhead.
- The translation table is built once per instance, in `__init__`, rather than on every call.
- No Python-level function is called per character while the string is processed.
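
To check the claim on your own machine, here is a minimal timing sketch. The helper names (`strip_with_regex`, `strip_with_translate`) and the sample string are made up for illustration and are not part of this change. Absolute numbers will vary with the Python version and the input.

```python
import re
import timeit

# Hypothetical module-level equivalents of the old and new implementations.
CTRL_RE = re.compile(r"[\x00-\x1F\x7F]")
# Map code points 0-31 and 127 to None, which tells translate() to drop them.
CTRL_TABLE = str.maketrans(dict.fromkeys([*range(32), 127]))


def strip_with_regex(s: str) -> str:
    return CTRL_RE.sub("", s)


def strip_with_translate(s: str) -> str:
    return s.translate(CTRL_TABLE)


# Illustrative input: ordinary text with a few control characters mixed in.
sample = "line one\r\n\tline two\x00\x7f end" * 50

regex_time = timeit.timeit(lambda: strip_with_regex(sample), number=2000)
translate_time = timeit.timeit(lambda: strip_with_translate(sample), number=2000)
print(f"regex: {regex_time:.4f}s  translate: {translate_time:.4f}s")
```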

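**Same results for `str` input:** The characters `chr(0)`–`chr(31)` and `chr(127)` are removed, which is exactly the set matched by the original regex `[\x00-\x1F\x7F]`, and falsy input still returns `""`. Only the error path differs: a truthy non-`str` argument such as an `int` now raises `AttributeError` instead of the `TypeError` raised by `re.sub`.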
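
The equivalence can be spot-checked with a short script. This is a sketch only: `old_impl` and `new_impl` are hypothetical standalone copies of the two method bodies, and the range of code points tested is an arbitrary choice.

```python
import re

# Same table the new code builds: code points 0-31 and 127 map to None.
table = str.maketrans(dict.fromkeys([*range(32), 127]))


def old_impl(s):
    # Body of the original method.
    return re.sub("[\\x00-\\x1F\\x7F]", "", s) if s else ""


def new_impl(s):
    # Body of the optimized method.
    return s.translate(table) if s else ""


# Compare both versions on every code point below U+0300, embedded in text.
mismatches = [
    cp for cp in range(0x300)
    if old_impl(f"a{chr(cp)}b") != new_impl(f"a{chr(cp)}b")
]
print("mismatching code points:", mismatches)
```

Neither version touches the C1 control range (`chr(128)`–`chr(159)`), so that behavior is unchanged as well.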

This significantly reduces the time per call shown in your profile (a 46% speedup for this method overall). If the input is always ASCII, working on `bytes` with `bytes.translate` may be faster still, but `str.translate` is already highly efficient for this use case.
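
For reference, a bytes-based variant could look roughly like the sketch below. It is not part of this change, `remove_control_bytes` is a hypothetical name, and whether it is actually faster for your data would need to be measured.

```python
# Bytes to delete: 0-31 and 127, the same set as in the str version.
DELETE_BYTES = bytes([*range(32), 127])


def remove_control_bytes(data: bytes) -> bytes:
    """Remove ASCII control bytes; assumes the caller already has bytes."""
    # With table=None, bytes.translate only deletes the bytes listed in
    # the second argument and leaves every other byte unchanged.
    return data.translate(None, DELETE_BYTES) if data else b""


cleaned = remove_control_bytes(b"a\x00b\tc\x7fd")
```

Any encoding or decoding needed to get from `str` to `bytes` and back adds its own cost, which may cancel out the gain.
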
diff --git a/code_to_optimize/remove_control_chars.py b/code_to_optimize/remove_control_chars.py
@@ -1,10 +1,15 @@
-import re
-
-
 class CharacterRemover:
     def __init__(self):
         self.version = "0.1"
+        # Build translation table once in init.
+        self._ctrl_table = self._make_ctrl_table()
 
     def remove_control_characters(self, s) -> str:
         """Remove control characters from the string."""
-        return re.sub("[\\x00-\\x1F\\x7F]", "", s) if s else ""
+        return s.translate(self._ctrl_table) if s else ""
+
+    def _make_ctrl_table(self):
+        # Map delete (ASCII 127) and 0-31 to None
+        ctrl_chars = dict.fromkeys(range(32), None)
+        ctrl_chars[127] = None
+        return str.maketrans(ctrl_chars)