Commit 40379ac

Abduqodiri Qurbonzoda authored and ilya-g committed
Add implementation details of default persistent collection impls
Add time complexity information about essentially all operations
1 parent 6cdb275 commit 40379ac

File tree

1 file changed

+59
-4
lines changed

proposal.md

Lines changed: 59 additions & 4 deletions
@@ -199,18 +199,73 @@ implementing such methods.
199199
200200
### Persistent collection implementations
201201
202-
TODO: describe some implementation details and performance characteristics of our persistent collection impls.
203-
204202
#### Default persistent list
205203
206-
#### Default persistent ordered hash set
204+
It's backed by a bit-mapped trie with a branching factor of 32.
205+
206+
Time complexity of operations:
207+
- `get(index)`, `set(index, element)` - O(log<sub>32</sub>N), where N is the size of the instance the operation is applied to.
208+
- `add(index, element)`, `removeAt(index)`, `remove(element)` - O(N).
209+
- `addAll(elements)` - O(N).
210+
- `addAll(index, elements)`, `removeAll(elements)`, `removeAll(predicate)` - O(N M), optimizable to O(N+M), where M is the number of elements to be inserted/removed.
211+
- Iterating elements - O(N).
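
To illustrate where the O(log<sub>32</sub>N) bound for `get(index)` comes from, here is a minimal sketch of a lookup in a trie with 32-way branching. It is not the actual implementation: the names `BITS`, `MASK`, `trieGet`, and `buildTwoLevelTrie` are hypothetical, and representing nodes as plain nested arrays is an assumption made for brevity.

```kotlin
const val BITS = 5                 // log2(32): bits of the index consumed per level
const val MASK = (1 shl BITS) - 1  // 0x1F, selects one 5-bit chunk

// Walks from the root to a leaf, using one 5-bit chunk of the index per level.
// `shift` is BITS * (height - 1), i.e. the bit offset of the chunk used at the root.
fun trieGet(root: Array<Any?>, shift: Int, index: Int): Any? {
    var node = root
    var level = shift
    while (level > 0) {
        // Inner nodes hold child arrays (an assumption of this sketch).
        node = node[(index ushr level) and MASK] as Array<Any?>
        level -= BITS
    }
    return node[index and MASK]
}

// Builds a full two-level trie holding the numbers 0 until 1024, for demonstration only.
fun buildTwoLevelTrie(): Array<Any?> =
    Array<Any?>(32) { i -> Array<Any?>(32) { j -> i * 32 + j } }
```

The loop runs once per trie level, so a list of N elements needs about log<sub>32</sub>N array accesses per lookup.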
212+
213+
To optimize the frequently used `add(element)` and `removeAt(size - 1)` operations, the rightmost leaf is referenced directly from the persistent list instance.
214+
This avoids path copying and gives O(1) time complexity for these two operations.
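
The effect of referencing the rightmost leaf directly can be sketched as follows. This is a simplified model under stated assumptions, not the real data structure: `TailList` is a hypothetical name, and a read-only `List` of full leaves stands in for the trie, so only the common-case branch of `add` reflects the cost described above.

```kotlin
// Hypothetical simplified list: `fullLeaves` stands in for the trie, `tail` is the
// rightmost leaf referenced directly from the list instance.
class TailList private constructor(
    private val fullLeaves: List<Array<Any?>>,
    private val tail: Array<Any?>
) {
    constructor() : this(emptyList(), emptyArray())

    val size: Int get() = fullLeaves.size * 32 + tail.size

    fun add(element: Any?): TailList =
        if (tail.size < 32) {
            // Common case: copy only the tail (at most 32 slots), no path copying.
            TailList(fullLeaves, tail + element)
        } else {
            // Tail is full: it moves into the trie and a new one-element tail starts.
            // In this stand-in the list copy is linear; a real trie would copy
            // only the nodes on one root-to-leaf path here.
            TailList(fullLeaves + listOf(tail), arrayOf(element))
        }

    fun get(index: Int): Any? {
        val tailOffset = fullLeaves.size * 32
        return if (index >= tailOffset) tail[index - tailOffset]
        else fullLeaves[index / 32][index % 32]
    }
}
```

Because the tail is bounded by 32 elements, copying it costs a constant amount of work regardless of the list size.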
215+
216+
Small persistent lists, with up to 32 elements, are backed by arrays of the corresponding size.
207217
208218
#### Persistent unordered hash set
209219
210-
#### Default persistent ordered hash map
220+
It's backed by a hash array mapped trie (a.k.a. HAMT). Every node has up to 32 children or elements.
221+
222+
Time complexity of operations:
223+
- `add(element)`, `remove(element)`, `contains(element)` - O(log<sub>32</sub>N) in general, but strongly depends on hash codes of the stored elements.
224+
- `addAll(elements)`, `removeAll(elements)`, `removeAll(predicate)`, `containsAll(elements)` - O(M log<sub>32</sub>N) in general.
225+
- Iterating elements - O(N).
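
As a rough illustration of how a hash array mapped trie locates a slot, the sketch below shows the two usual steps: taking a 5-bit chunk of the hash code per level, and mapping that chunk to a position in a compact array through a bitmap. The function names are hypothetical and the exact node layout of the actual implementation may differ.

```kotlin
// Returns the 5-bit chunk of `hash` used at the trie level identified by `shift`
// (shift = 0 at the root, 5 at the next level, and so on).
fun indexSegment(hash: Int, shift: Int): Int = (hash ushr shift) and 0x1F

// True when logical slot `segment` (0..31) is occupied in a node with this bitmap.
fun isOccupied(bitmap: Int, segment: Int): Boolean = (bitmap and (1 shl segment)) != 0

// A node stores only its occupied slots. The physical position of logical slot
// `segment` is the number of occupied slots below it.
fun physicalIndex(bitmap: Int, segment: Int): Int {
    val bit = 1 shl segment
    return (bitmap and (bit - 1)).countOneBits()
}
```

Elements whose hash codes share many low-order chunks end up deep in the same subtree, which is why the bounds above depend strongly on the quality of the hash codes.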
211226
212227
#### Persistent unordered hash map
213228
229+
It's backed by a compressed hash-array mapped prefix-tree (a.k.a. CHAMP). Every node has up to 32 children or entries.
230+
231+
Time complexity of operations:
232+
- `get(key)`, `put(key, value)`, `remove(key)`, `remove(key, value)`, `containsKey(key)` - O(log<sub>32</sub>N) on average, but strongly depends on hash codes of the stored keys.
233+
- `containsValue(value)` - O(N).
234+
- `putAll(map)` - O(M log<sub>32</sub>N), where M is the number of elements added.
235+
- Iterating `keys`, `values`, `entries` - O(N).
236+
237+
#### Default persistent ordered hash set
238+
239+
It's backed by the _persistent unordered hash map_, which maps every element in this set to the previous and next elements in insertion order.
240+
241+
Every operation on this set turns into one or more operations on the backing map, e.g.:
242+
- `add(element)` turns into updating the next reference of the last element (the new element becomes its next) and putting a new entry with a key equal to the specified element.
243+
- `remove(element)` turns into removing the entry with the key equal to the specified element and updating values for the next and previous elements.
244+
- `contains(element)` turns into `containsKey(element)`.
245+
246+
Iterating elements in this set takes O(N log<sub>32</sub>N) time.
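
The translation of set operations into map operations can be sketched as follows. This is a simplified model, not the actual implementation: `Links` and `OrderedSetSketch` are hypothetical names, a read-only Kotlin `Map` stands in for the persistent unordered hash map, and `null` is used to mark a missing neighbour, so `null` elements are not supported here.

```kotlin
// Hypothetical link record: neighbours of an element in insertion order.
data class Links<E>(val previous: E?, val next: E?)

class OrderedSetSketch<E : Any> private constructor(
    private val first: E?,
    private val last: E?,
    private val links: Map<E, Links<E>>
) {
    constructor() : this(null, null, emptyMap())

    fun contains(element: E): Boolean = links.containsKey(element)

    fun add(element: E): OrderedSetSketch<E> {
        if (contains(element)) return this
        if (last == null) {
            return OrderedSetSketch(element, element, mapOf(element to Links<E>(null, null)))
        }
        // Update the next reference of the last element, then put the new entry.
        val lastLinks = links.getValue(last)
        val updated = links + (last to lastLinks.copy(next = element)) +
            (element to Links<E>(last, null))
        return OrderedSetSketch(first, element, updated)
    }

    fun remove(element: E): OrderedSetSketch<E> {
        val removed = links[element] ?: return this
        val prev = removed.previous
        val next = removed.next
        var updated = links - element
        // Re-link the neighbours of the removed element to each other.
        if (prev != null) updated = updated + (prev to updated.getValue(prev).copy(next = next))
        if (next != null) updated = updated + (next to updated.getValue(next).copy(previous = prev))
        val newFirst = if (first == element) next else first
        val newLast = if (last == element) prev else last
        return OrderedSetSketch(newFirst, newLast, updated)
    }

    // Every step performs one map lookup to find the next element.
    fun toList(): List<E> {
        val result = ArrayList<E>()
        var current = first
        while (current != null) {
            result.add(current)
            current = links.getValue(current).next
        }
        return result
    }
}
```

Iteration follows the next references one map lookup at a time, so with a backing map whose lookups cost O(log<sub>32</sub>N), visiting all N elements costs O(N log<sub>32</sub>N).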
247+
248+
#### Default persistent ordered hash map
249+
250+
It is implemented the same way as the _persistent ordered hash set_,
251+
except that the backing map also stores the value alongside the next and previous keys.
252+
253+
#### Builders
254+
255+
Builders of the _persistent list_, _persistent unordered hash set_, and _persistent unordered hash map_
256+
are backed by the same backing data structures as the corresponding persistent collections.
257+
Thus, `persistentCollection.builder()` takes constant time: it only passes the backing storage to the new builder instance.
258+
But instead of copying every node to be modified, the builder makes sure the node has not already been
259+
copied by marking copies it makes with its unique identifier. Nodes marked by the builder's identifier can be
260+
modified in-place by that builder.
261+
`builder.build()` also takes constant time: it consists of passing the backing storage to the new persistent collection instance
262+
and updating the builder's identifier, as nodes marked by the current identifier are reachable from the built instance and
263+
cannot be modified in place any more.
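
The ownership check described above can be sketched with a single node. All names here (`Owner`, `Node`, `BuilderSketch`, `mutableNode`) are hypothetical, and a one-node structure is an assumption made to keep the example short; a real builder would apply the same check to every node on the path being modified.

```kotlin
// Hypothetical ownership token; a fresh instance acts as the builder's unique identifier.
class Owner

// Hypothetical node: `owner` records which builder (if any) created this copy.
class Node(val owner: Owner?, val slots: Array<Any?>)

class BuilderSketch(private var root: Node) {
    private var owner = Owner()

    fun set(index: Int, value: Any?) {
        root = mutableNode(root)
        root.slots[index] = value
    }

    // Returns the node itself if this builder already owns it, otherwise a copy
    // marked with the builder's identifier.
    private fun mutableNode(node: Node): Node =
        if (node.owner === owner) node else Node(owner, node.slots.copyOf())

    // Constant time: hands over the root and takes a new identifier, so nodes
    // reachable from the built instance are never modified in place again.
    fun build(): Node {
        owner = Owner()
        return root
    }
}
```

In this sketch the first `set` after construction or after `build()` copies the node, and later calls reuse that copy, which is the source of the allocation savings.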
264+
265+
Builders of the _persistent ordered hash set_ and _persistent ordered hash map_ are backed by the builder of the backing map.
266+
267+
Although the time complexity of all operations on a builder is the same as in its corresponding persistent collection,
268+
avoiding memory allocations in modification operations leads to a significant performance improvement in practice.
214269
215270
### Extension functions
216271
