@@ -9,96 +9,96 @@ Containers<index>` is performance so we would be remiss not to produce this
9
9
page with comparisons.
10
10
11
11
The source for all benchmarks can be found under the "tests" directory in the
12
- files prefixed "benchmark." Measurements are made from the min, max, and mean
13
- of 5 repetitions. In the graphs below, the line follows the mean and at each
14
- point, the min/max displays the bounds. Note that the axes are log-log so
15
- properly reading two different lines would describe one metric as "X times"
16
- faster rather than "X seconds" faster. In all graphs, lower is
17
- better. Measurements are made by powers of ten: 100 through 1,000,000.
18
-
19
- Measurements up to 10,000,000,000 elements have been successfully tested and
12
+ files prefixed "benchmark." Measurements are made from the min, max, and median
13
+ of 5 repetitions. In the graphs below, the line follows the median at each
14
+ point. Note that the axes are log-log so properly reading two different lines
15
+ would describe one metric as "X times" faster rather than "X seconds"
16
+ faster. In all graphs, lower is better. Measurements are made by powers of ten:
17
+ 100 through 10,000,000.
18
+
19
+ Measurements up to ten billion elements have been successfully tested and
20
20
benchmarked. Read :doc:`performance-scale` for details. Only a couple
21
21
implementations (including :doc:`Sorted Containers<index>`) are capable of
22
22
handling so many elements. The major limiting factor at that size is
23
23
memory. Consider the simple case of storing CPython's integers in a
24
24
:doc:`sortedlist`. Each integer object requires ~24 bytes so one hundred
25
25
million elements will require about three gigabytes of memory. If the
26
- implemenation adds significant overhead then most systems will run out of
26
+ implementation adds significant overhead then most systems will run out of
27
27
memory. For all datasets which may be kept in memory, :doc:`Sorted
28
28
Containers<index>` is an excellent choice.
29
29
30
- A good effort has been made to find competing implementations. Six in total
30
+ A good effort has been made to find competing implementations. Seven in total
31
31
were found with various list, set, and dict implementations.
32
32
33
- blist
34
- Provides list, dict, and set containers based on the blist data-type.
35
- Implemented in Python and C. Last updated March, 2014. `blist on PyPI
36
- <https://pypi.org/project/blist/>`_
37
-
38
- bintrees
39
- Provides several tree-based implementations for dict and set containers.
40
- Fastest were AVL and Red-Black trees. Extends the conventional API to provide
41
- set operations for the dict type. Implemented in C. Last updated April, 2017.
42
- `bintrees on PyPI <https://pypi.org/project/bintrees/>`_
43
-
44
- banyan
45
- Provides a fast, C++-implementation for dict and set data types. Offers some
46
- features also found in sortedcontainers like accessing the n-th item in a set
47
- or dict. Last updated April, 2013. `banyan on PyPI
48
- <https://pypi.org/project/Banyan/>`_
49
-
50
- treap
51
- Uses Cython for improved performance and provides a dict container. Last
52
- updated June, 2017. `treap on PyPI <https://pypi.org/project/treap/>`_
53
-
54
- skiplistcollections
55
- Pure-Python implementation based on skip-lists providing a limited API for
56
- dict and set types. Last updated January, 2014. `skiplistcollections on PyPI
57
- <https://pypi.org/project/skiplistcollections/>`_
58
-
59
- sortedcollection
60
- Pure-Python implementation of sorted list based solely on a list.
61
- Feature-poor and inefficient for writes but included because it is written by
62
- Raymond Hettinger and linked from the official Python docs. Last updated
63
- April, 2011. `sortedcollection on ActiveState
64
- <http://code.activestate.com/recipes/577197-sortedcollection/>`_
65
-
66
- Several competing implementations were omitted because they were not easily
67
- installable or failed to build.
68
-
69
- rbtree
70
- C-implementation that only supports Python 2. Last updated
71
- March, 2012. Provides a fast, C-implementation for dict and set data types.
72
- `rbtree on PyPI <https://pypi.org/project/rbtree/>`_
73
-
74
- ruamel.ordereddict.sorteddict
75
- C-implementation that only supports Python 2. Performance was measured in
76
- correspondence with the module author. Performance was generally very good
77
- except for ``__delitem__``. At scale, deleting entries became exceedingly
78
- slow. Last updated July, 2017. `ruamel.ordereddict on PyPI
79
- <https://pypi.org/project/ruamel.ordereddict/>`_
80
-
81
- rbtree from NewCenturyComputers
82
- Pure-Python tree-based implementation. Not sure when this was last updated.
83
- Unlikely to be fast. `rbtree from NewCenturyComputers
84
- <http://newcenturycomputers.net/projects/rbtree.html>`_
85
-
86
- python-avl-tree from Github user pgrafov
87
- Pure-Python tree-based implementation. Last updated October, 2010. Unlikely
88
- to be fast. `python-avl-tree from Github user pgrafov
89
- <https://github.com/pgrafov/python-avl-tree>`_
90
-
91
- pyavl
92
- C-implementation for AVL tree-based dict and set containers. Claims to be
93
- fast. Lacking documentation and failed to build. Last updated December, 2008.
94
- `pyavl on PyPI <https://pypi.org/project/pyavl/>`_
95
-
96
- Several projects have deprecated themselves in favor of :doc:`Sorted
97
- Containers<index>`. Most notably those are `bintrees
98
- <https://pypi.org/project/bintrees/>`_ and `sorteddict
99
- <https://pypi.org/project/sorteddict/>`_. All of the projects above also use
100
- Python 2 semantics for :doc:`sorteddict` data types. Wherever possible,
101
- :doc:`Sorted Containers<index>` has adopted Python 3 semantics.
33
+ 1. *blist* -- Provides list, dict, and set containers based on the blist
34
+ data-type. Uses a `B-Tree`_ data structure. Implemented in Python and C. BSD
35
+ License. Last updated March, 2014. `blist on PyPI`_
36
+
37
+ 2. *bintrees* -- Provides several tree-based implementations for dict and set
38
+ containers. Fastest were AVL-Tree and Red-Black-Tree data
39
+ structures. Extends the conventional API to provide set operations for the
40
+ dict type. Now deprecated in favor of :doc:`Sorted Containers<index>`.
41
+ Implemented in C. MIT License. Last updated April, 2017. `bintrees on
42
+ PyPI`_
43
+
44
+ 3. *sortedmap* -- Provides a fast, C++ implementation for dict data types. Uses
45
+ the C++ standard library `std::map` data structure which is usually a
46
+ red-black tree. Last updated February, 2016. `sortedmap on PyPI`_
47
+
48
+ 4. *banyan* -- Provides a fast, C++ implementation for dict and set data
49
+ types. Offers some features also found in sortedcontainers like accessing
50
+ the n-th item in a set or dict. Uses sources from the `tree implementation`_
51
+ in GNU libstdc++. GPLv3 License. Last updated April, 2013. `banyan on PyPI`_
52
+
53
+ 5. *treap* -- Uses Cython for improved performance and provides a dict
54
+ container. Apache V2 License. Last updated June, 2017. `treap on PyPI`_
55
+
56
+ 6. *skiplistcollections* -- Pure-Python implementation based on skip-lists
57
+ providing a limited API for dict and set types. MIT License. Last updated
58
+ January, 2014. `skiplistcollections on PyPI`_
59
+
60
+ 7. *sortedcollection* -- Pure-Python implementation of sorted list based solely
61
+ on a list. Feature-poor and inefficient for writes but included because it
62
+ is written by Raymond Hettinger and linked from the official Python
63
+ docs. MIT License. Last updated April, 2011. `sortedcollection recipe`_
64
+
65
+ Several alternative implementations were omitted for reasons documented below:
66
+
67
+ A. *rbtree* -- C-implementation that only supports Python 2. Provides a fast,
68
+ C-implementation for dict and set data types. GPLv3 License. Last updated
69
+ March, 2012. `rbtree on PyPI`_
70
+
71
+ B. *ruamel.ordereddict.sorteddict* -- C-implementation that only supports
72
+ Python 2. Performance was measured in correspondence with the module
73
+ author. Performance was generally very good except for ``__delitem__``. At
74
+ scale, deleting entries became exceedingly slow. MIT License. Last updated
75
+ July, 2017. `ruamel.ordereddict on PyPI`_
76
+
77
+ C. *pyskiplist* -- Pure-Python skip-list based implementation supporting a
78
+ sorted-list-like interface. Now deprecated in favor of :doc:`Sorted
79
+ Containers<index>`. MIT License. Last updated July, 2015. `pyskiplist on
80
+ PyPI`_
81
+
82
+ D. *sorteddict* -- Pure-Python lazily-computed sorted dict implementation. Now
83
+ deprecated in favor of :doc:`Sorted Containers<index>`. GPLv3 License. Last
84
+ updated September, 2007. `sorteddict on PyPI`_
85
+
86
+ E. *rbtree from NewCenturyComputers* -- Pure-Python tree-based
87
+ implementation. Not sure when this was last updated. Unlikely to be
88
+ fast. Unknown license. Unknown last update. `rbtree from
89
+ NewCenturyComputers`_
90
+
91
+ F. *python-avl-tree from GitHub user pgrafov* -- Pure-Python tree-based
92
+ implementation. Unlikely to be fast. MIT License. Last updated
93
+ October, 2010. `python-avl-tree from GitHub user pgrafov`_
94
+
95
+ G. *pyavl* -- C-implementation for AVL tree-based dict and set
96
+ containers. Claims to be fast. Lacking documentation and failed to
97
+ build. Public Domain License. Last updated December, 2008. `pyavl on PyPI`_
98
+
99
+ H. *skiplist* -- C-implementation of sorted list based on skip-list data
100
+ structure. Only supports Python 2. Zlib/libpng License. Last updated
101
+ September, 2013. `skiplist from Bitbucket user mojaves`_
102
102
103
103
The most similar module to :doc:`Sorted Containers<index>` is
104
104
skiplistcollections given that each is implemented in Python. But as is
@@ -123,16 +123,51 @@ been made to simulate real-world workloads. The :doc:`simulated workload
123
123
performance comparison<performance-workload>` contains examples with
124
124
comparisons to other implementations, load factors, and runtimes.
125
125
126
- A couple final notes about the graphs below. Missing data indicates the
127
- benchmark either took too long or failed. The set operations with tiny, small,
128
- medium, and large variations indicate the size of the container involved in the
126
+ Some final notes about the graphs below. Missing data indicates the benchmark
127
+ either took too long or failed. The set operations with tiny, small, medium,
128
+ and large variations indicate the size of the container involved in the
129
129
right-hand-side of the operation: tiny is exactly 10 elements; small is 10% of
130
130
the size of the left-hand-side; medium is 50%; and large is 100%. :doc:`Sorted
131
131
Containers<index>` uses a different algorithm based on the size of the
132
132
right-hand-side of the operation for a dramatic improvement in performance.
133
133
134
+ The legends of the graphs below name the underlying data structure used by
135
+ each Python project. The correlation is as follows:
136
+
134
137
.. currentmodule:: sortedcontainers
135
138
139
+ ====================== ==================================
140
+ Data Structure         Project
141
+ ====================== ==================================
142
+ :class:`SortedList`    :doc:`Sorted Containers<index>`
143
+ :class:`SortedKeyList` :doc:`Sorted Containers<index>`
144
+ B-Tree                 `blist on PyPI`_
145
+ List                   `sortedcollection recipe`_
146
+ AVL-Tree               `bintrees on PyPI`_
147
+ RB-Tree                `banyan on PyPI`_
148
+ Skip-List              `skiplistcollections on PyPI`_
149
+ std::map               `sortedmap on PyPI`_
150
+ Treap                  `treap on PyPI`_
151
+ ====================== ==================================
152
+
153
+ .. _`B-Tree`: https://en.wikipedia.org/wiki/B-tree
154
+ .. _`blist on PyPI`: https://pypi.org/project/blist/
155
+ .. _`bintrees on PyPI`: https://pypi.org/project/bintrees/
156
+ .. _`sortedmap on PyPI`: https://pypi.org/project/sortedmap/
157
+ .. _`sorteddict on PyPI`: https://pypi.org/project/sorteddict/
158
+ .. _`pyskiplist on PyPI`: https://pypi.org/project/pyskiplist/
159
+ .. _`banyan on PyPI`: https://pypi.org/project/Banyan/
160
+ .. _`treap on PyPI`: https://pypi.org/project/treap/
161
+ .. _`skiplistcollections on PyPI`: https://pypi.org/project/skiplistcollections/
162
+ .. _`sortedcollection recipe`: http://code.activestate.com/recipes/577197-sortedcollection/
163
+ .. _`rbtree on PyPI`: https://pypi.org/project/rbtree/
164
+ .. _`ruamel.ordereddict on PyPI`: https://pypi.org/project/ruamel.ordereddict/
165
+ .. _`rbtree from NewCenturyComputers`: http://newcenturycomputers.net/projects/rbtree.html
166
+ .. _`python-avl-tree from GitHub user pgrafov`: https://github.com/pgrafov/python-avl-tree
167
+ .. _`pyavl on PyPI`: https://pypi.org/project/pyavl/
168
+ .. _`skiplist from Bitbucket user mojaves`: https://bitbucket.org/mojaves/pyskiplist/
169
+ .. _`tree implementation`: https://gcc.gnu.org/onlinedocs/libstdc%2B%2B/ext/pb_ds/tree_based_containers.html
170
+
136
171
Sorted List
137
172
-----------
138
173