Skip to content

Broadword Find Unmatched Close Far

John Ky edited this page Apr 25, 2020 · 8 revisions

The library provides a broadword implementation of the findUnmatchedCloseFar function as well as a reference implementation.

The reference implementation is faster for scanning up to 15 bits. The broadword implementation doesn't scan at all so it has the same performance regardless of input. If the reference implementation were to scan for more than 15 bits, then the broadword implementation would be faster.

The graph below shows the amount of time for one call of the broadword implementation (orange) and reference implementation (gray)

Run Length Broadword Bit at a time
0 52.15 3.672
1 52.43 7.787
2 52.35 11.850
3 52.38 15.000
4 52.49 18.620
5 52.23 21.790
6 52.08 25.300
7 52.46 28.060
8 52.07 31.570
9 52.60 35.240
10 51.97 38.760
11 52.01 41.730
12 52.56 45.070
13 52.97 48.310
14 52.83 51.580
15 52.72 55.000
16 52.17 58.470
17 51.97 61.470
18 51.97 64.630
19 52.42 72.850
20 52.38 77.510
21 52.37 80.590
22 53.01 83.630
23 52.40 87.090
24 52.30 90.880
25 52.96 93.990
26 52.29 97.090
27 52.12 100.700
28 51.81 103.900
29 51.71 107.100
30 51.72 110.400
31 52.09 113.900
32 41.78 116.200
33 41.10 121.700
34 41.75 121.400
35 41.14 119.900
36 41.05 119.700
37 41.14 117.900
38 41.32 117.000
39 41.18 116.000
40 41.29 116.200
41 41.17 114.200
42 41.26 114.000
43 41.30 113.100
44 41.49 111.800
45 41.24 106.900
46 41.20 105.800
47 41.43 104.000
48 41.58 104.100
49 41.78 105.300
50 41.36 102.500
51 42.01 102.100
52 41.89 101.400
53 41.31 99.590
54 41.35 99.140
55 41.15 98.640
56 41.13 98.150
57 41.73 97.610
58 41.28 95.640
59 40.92 96.330
60 40.99 93.700
61 41.09 93.970
62 41.15 92.970
63 41.66 92.380
64 41.39 93.110
Clone this wiki locally