Commit 75a71cf
Here is an optimized version of your program.
**Key improvements:**
- Remove the regular expression and use the built-in `splitlines(keepends=True)`, which is **significantly** faster for splitting text into lines, especially on large files.
- Use `extend` instead of repeated `append` calls for cases with two appends.
- Minor local optimizations: bind frequently called methods to local names so the loop performs fewer attribute lookups.
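As a rough illustration of the first point, the sketch below compares a regex-based splitter with the built-in. The original regular expression is not shown in this commit, so `_LINE_RE` and both function names are hypothetical stand-ins, not the code that was removed.

```python
import re

# Hypothetical stand-in for the removed regex; the real pattern is not shown here.
# It matches either a (possibly empty) run of characters ending in "\n",
# or a final unterminated run of characters.
_LINE_RE = re.compile(r"[^\n]*\n|[^\n]+")


def split_lines_regex(text: str) -> list[str]:
    """Split text into lines, keeping line endings, using the regex."""
    return _LINE_RE.findall(text)


def split_lines_builtin(text: str) -> list[str]:
    """Split text into lines, keeping line endings, using str.splitlines."""
    return text.splitlines(keepends=True)


sample = "alpha\nbeta\n\ngamma"
print(split_lines_regex(sample))    # ['alpha\n', 'beta\n', '\n', 'gamma']
print(split_lines_builtin(sample))  # ['alpha\n', 'beta\n', '\n', 'gamma']
```

For text that uses only `\n` as a line terminator, the two functions return the same list under this assumed pattern.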
**Performance explanation:**
- The regex-based splitting accounted for a significant share of the runtime. `str.splitlines(keepends=True)` is implemented in C and avoids running the regex engine over every line.
- Using local variable lookups (e.g. `append = diff_output.append`) is slightly faster inside loops that append frequently.
- `extend` is ever-so-slightly faster (in CPython) than multiple `append` calls for the rare "no newline" case.
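The two micro-optimizations above can be sketched as follows. This is a minimal example, not the function from the commit: `mark_added_lines` and its output format are hypothetical, chosen only to show a localized `append` and a single `extend` call in the "no newline" branch.

```python
def mark_added_lines(lines):
    """Hypothetical helper: prefix each line with '+' in a diff-like style."""
    diff_output = []
    # Bind the bound methods once so the loop skips repeated attribute lookups.
    append = diff_output.append
    extend = diff_output.extend
    for line in lines:
        if line.endswith("\n"):
            append("+" + line)
        else:
            # Rare case: one extend call replaces two consecutive append calls.
            extend(("+" + line + "\n", "\\ No newline at end of file\n"))
    return diff_output


print(mark_added_lines(["a\n", "b"]))
# ['+a\n', '+b\n', '\\ No newline at end of file\n']
```

Any gain from these changes is small and specific to CPython, so it is worth measuring with `timeit` on representative input before relying on it.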
---
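**This code should produce the same output as your original for `\n`-terminated text, and should be much faster, especially for large inputs.** Note that `str.splitlines` also breaks on other line boundaries such as `\r`, `\x0b`, `\x0c`, and `\u2028`, so inputs containing those characters may split differently than with a `\n`-only pattern.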
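One way to check the equivalence claim is to compare both splitters on a few inputs. The sketch below assumes the removed regex split only on `\n`. That is an assumption, since the original pattern is not shown, and `_NEWLINE_ONLY_RE` and `outputs_match` are hypothetical names.

```python
import re

# Assumed "\n"-only pattern; the regex actually removed by the commit may differ.
_NEWLINE_ONLY_RE = re.compile(r"[^\n]*\n|[^\n]+")


def outputs_match(text: str) -> bool:
    """Return True when both splitting strategies yield the same lines."""
    return _NEWLINE_ONLY_RE.findall(text) == text.splitlines(keepends=True)


print(outputs_match("a\nb\n"))      # True
print(outputs_match("a\rb\n"))      # False: splitlines also breaks on "\r"
print(outputs_match("a\u2028b\n"))  # False: splitlines also breaks on U+2028
```

If the inputs can contain such characters and the original pattern was `\n`-only, the output is not guaranteed to be identical.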
1 parent 90014bd commit 75a71cf
1 file changed: +10 -9 lines changed