|
1 | 1 | # PersistentCollections.jl |
2 | 2 |
|
3 | | -Julia AbstractDict and AbstractSet data structures persisted (ACID) to disk. |
| 3 | +Julia `Dict` and `Set` data structures safely persisted to disk. |
| 4 | + |
| 5 | +All collections are backed by [LMDB](https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database), a very fast B-Tree based embedded key-value database with ACID guarantees. |
| 6 | +As with other B-Tree based databases, reads are generally faster than writes. LMDB is no exception, although its write performance is relatively good too (expect 1k-10k TPS). |
| 7 | + |
| 8 | +Care was taken to make the data structures thread-safe. LMDB handles most of the locking well; we only have to serialise the writes to an LMDB Environment in Julia so that |
| 9 | +multiple threads do not attempt to write at once (otherwise a deadlock will occur). |
4 | 10 |
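The write serialisation described above can be illustrated with a minimal sketch. It does not use this package or LMDB: `SerializedStore` and `write_entry!` are hypothetical names, and a plain `Dict` stands in for a store that tolerates only one writer at a time. It only shows the general pattern of guarding every write with a single lock, and how the package does this internally may differ.

```julia
using Base.Threads: @threads

# Hypothetical stand-in for a store that allows only one writer at a time.
struct SerializedStore
    data::Dict{String,String}
    write_lock::ReentrantLock
end

SerializedStore() = SerializedStore(Dict{String,String}(), ReentrantLock())

# Every write takes the same lock, so at most one task mutates the store at once.
function write_entry!(store::SerializedStore, key::String, value::String)
    lock(store.write_lock) do
        store.data[key] = value
    end
    return store
end

# Many tasks write concurrently; the lock serialises the actual mutations.
store = SerializedStore()
@threads for i in 1:1000
    write_entry!(store, "key$i", "value$i")
end
```

Using the `lock(f, l)` form with a `do` block means the lock is released even if the write throws, so a failed write cannot leave other writers blocked.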
|
5 | 11 | ## Quick Start |
6 | 12 |
|
7 | | -TODO |
| 13 | +1. Install this package: |
| 14 | + ```julia |
| 15 | + import Pkg |
| 16 | + Pkg.add(url="https://github.com/blenessy/PersistentCollections.jl.git") |
| 17 | + ``` |
| 18 | +1. Create an `LMDB.Environment` in a directory called `data` (in your current working directory): |
| 19 | + ```julia |
| 20 | + using PersistentCollections |
| 21 | + env = LMDB.Environment("data") |
| 22 | + ``` |
| 23 | +1. Create an `AbstractDict` in your LMDB environment: |
| 24 | + ```julia |
| 25 | + dict = PersistentDict{String,String}(env) |
| 26 | + ``` |
| 27 | +1. Use it as any other dict: |
| 28 | + ```julia |
| 29 | + dict["foo"] = "bar" |
| 30 | + @assert dict["foo"] == "bar" |
| 31 | + @assert collect(keys(dict)) == ["foo"] |
| 32 | + @assert collect(values(dict)) == ["bar"] |
| 33 | + ``` |
| 34 | +1. (Optional) Note the asymmetric performance characteristics of an LMDB (B-Tree) based database: |
| 35 | + ```julia |
| 36 | + @time dict["bar"] = "baz"; # Writes to LMDB (B-Tree) are relatively slow |
| 37 | + @time dict["bar"]; # Reads are very fast though :) |
| 38 | + ``` |
| 39 | + |
| 40 | +## User Guide |
| 41 | + |
| 42 | +### Dynamic types |
| 43 | + |
| 44 | +It is possible to create a persistent collection with `Any` as its key or value type, although some methods will not be able to convert the stored value to the correct type because no type metadata is stored in the DB. |
| 45 | +Most notably, the `getindex` method (e.g. `dict["foo"]`) will not return a converted value. To work around this limitation, use the `get` method, which takes a default value. |
| 46 | +The type of the default value (if other than `nothing`) is used to convert the stored value to the desired type. |
| 47 | + |
| 48 | +```julia |
| 49 | +env = LMDB.Environment("data") |
| 50 | +dict = PersistentDict{Any,Any}(env) |
| 51 | +dict["foo"] = "bar" |
| 52 | +dict["foo"] # PersistentCollections.LMDB.MDBValue{Nothing}(0x0000000000000003, Ptr{Nothing} @0x000000012c806ffd, nothing) |
| 53 | +get(dict, "foo", "") # "bar" |
| 54 | +convert(String, dict["foo"]) # "bar" |
| 55 | +``` |
| 56 | + |
| 57 | +### Multiple persistent collections in the same LMDB Environment |
| 58 | + |
| 59 | +Multiple persistent collections can share one LMDB Environment, which is useful if you need transactional consistency between them: |
| 60 | + |
| 61 | +1. Create your `LMDB.Environment` with "named database" support by specifying the number of persistent collections you want with the `maxdbs` keyword argument: |
| 62 | + ```julia |
| 63 | + env = LMDB.Environment("data", maxdbs=2) |
| 64 | + ``` |
| 65 | +2. Instantiate your persistent collections with a unique (within LMDB env.) id: |
| 66 | + ```julia |
| 67 | + dict1 = PersistentDict{String,String}(env, id="mydict1") |
| 68 | + dict2 = PersistentDict{String,Int}(env, id="mydict2") |
| 69 | + ``` |
| 70 | + |
| 71 | +### Danger Zone: Manually syncing writes to disk |
| 72 | + |
| 73 | +Yes, you can expect a significant increase in write throughput if you are willing to risk losing your most recently written transactions. |
| 74 | +Please note that database integrity (risk of corruption) is not in danger here. |
| 75 | + |
| 76 | +```julia |
| 77 | +unsafe_env = LMDB.Environment("data", flags=LMDB.MDB_NOSYNC) |
| 78 | +unsafe_dict = PersistentDict{String,String}(unsafe_env) |
| 79 | +flush(unsafe_env) do |
| 80 | + unsafe_dict["foo"] = "bar" |
| 81 | + unsafe_dict["foo"] = "baz" |
| 82 | +end # <== data is flushed to disk here |
| 83 | +``` |
| 84 | + |
| 85 | +This is equivalent to: |
| 86 | + |
| 87 | +```julia |
| 88 | +unsafe_env = LMDB.Environment("data", flags=LMDB.MDB_NOSYNC) |
| 89 | +unsafe_dict = PersistentDict{String,String}(unsafe_env) |
| 90 | +try |
| 91 | + unsafe_dict["foo"] = "bar" |
| 92 | + unsafe_dict["foo"] = "baz" |
| 93 | +finally |
| 94 | + flush(unsafe_env) |
| 95 | +end |
| 96 | +``` |
8 | 97 |
|
9 | 98 | ## Running Tests |
10 | 99 |
|
|