Commit 79e0bcc
diff: track total cost of search and bail if high
This is the last piece needed to get performance roughly comparable to
GNU diff without implementing all of its tricks, although GNU diff also
uses this one in its own way. A diff that still took over a minute with
the previous commit now finishes in under a second.
Benchmark 1: diff test-data/b.cpp test-data/c.cpp
Time (mean ± σ): 2.533 s ± 0.011 s [User: 2.494 s, System: 0.027 s]
Range (min … max): 2.519 s … 2.553 s 10 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: ./target/release/diffutils.local-heuristics diff test-data/b.cpp test-data/c.cpp
Time (mean ± σ): 65.798 s ± 1.080 s [User: 65.367 s, System: 0.053 s]
Range (min … max): 64.962 s … 68.137 s 10 runs
Warning: Ignoring non-zero exit code.
Benchmark 3: ./target/release/diffutils diff test-data/b.cpp test-data/c.cpp
Time (mean ± σ): 580.6 ms ± 6.5 ms [User: 521.9 ms, System: 38.8 ms]
Range (min … max): 570.7 ms … 589.6 ms 10 runs
Warning: Ignoring non-zero exit code.
Summary
./target/release/diffutils diff test-data/b.cpp test-data/c.cpp ran
4.36 ± 0.05 times faster than diff test-data/b.cpp test-data/c.cpp
113.33 ± 2.26 times faster than ./target/release/diffutils.local-heuristics diff test-data/b.cpp test-data/c.cpp
It keeps track of how much work has been done overall for a diff job
and gives up completely on trying to find ideal split points once the
accumulated cost implies the "too expensive" heuristic had to trigger
too often.
From that point forward it only does naive splitting of the work.
This should not generate diffs which are much worse than doing the
diagonal search, as it should only trigger in cases in which the
files are so different that the search won't find good split points
anyway.
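
The bookkeeping described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `SearchBudget`, `choose_split`, the cost unit, and the midpoint fallback are all hypothetical names and choices, and the real change ties the decision to how often the "too expensive" heuristic fired rather than to a bare sum.

```rust
/// Hypothetical per-job bookkeeping; names and limit are assumptions made
/// for illustration, not identifiers from the commit.
struct SearchBudget {
    total_cost: usize,
    limit: usize,
    gave_up: bool,
}

impl SearchBudget {
    fn new(limit: usize) -> Self {
        SearchBudget { total_cost: 0, limit, gave_up: false }
    }

    /// Add the work spent by one split-point search. Once the running
    /// total exceeds the limit, `gave_up` latches and stays set.
    fn charge(&mut self, cost: usize) {
        self.total_cost = self.total_cost.saturating_add(cost);
        if self.total_cost > self.limit {
            self.gave_up = true;
        }
    }
}

/// Choose a split point for ranges of length `len_a` and `len_b`.
/// `search` stands in for the diagonal search and returns the split it
/// found together with the work it spent finding it.
fn choose_split<F>(
    budget: &mut SearchBudget,
    len_a: usize,
    len_b: usize,
    search: F,
) -> (usize, usize)
where
    F: FnOnce() -> ((usize, usize), usize),
{
    if budget.gave_up {
        // Naive split (assumed here to be the midpoints): no search at all.
        return (len_a / 2, len_b / 2);
    }
    let (split, cost) = search();
    budget.charge(cost);
    split
}
```

With a limit of 100, two searches costing 60 each both return the split they found, but the second one pushes the total to 120 and latches `gave_up`; every later call returns the midpoints without running the search closure.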
This is another case in which GNU diff's additional work (hashing,
splitting large chunks of inclusion / deletion out of the diff work,
and trying harder to find ideal splits) seems to cause it to perform
slightly worse in the benchmark above.
That said, GNU diff probably still generates better diffs, not because
of this but because of its post-processing of the results, which tries
to create more hunks with nearby changes staying close to each other.
We do not do that (and did not before this commit either).

1 parent ed1bbfa commit 79e0bcc
3 files changed, +126 -13 lines changed