|
| 1 | +AshDB |
| 2 | +======= |
| 3 | + |
| 4 | +The AshDB library provides a persistent indexed storage system. Data are stored in _segments_ that have a configurable maximum size. Data are retrieved by index. |
| 5 | + |
| 6 | +## Database Types |
| 7 | + |
| 8 | +The `ashdb::AshDB` class is the main database type. This is a templated class which accepts a single parameter -- the type of object that is to be stored in the database. |
| 9 | + |
| 10 | +The constructor accepts two parameters: the first is a `const std::string&` naming the database, and the second is an optional argument of type `ashdb::Options`. These arguments are discussed later in the section _Opening a Database_. |
| 11 | + |
| 12 | +For example, to create a database of `std::uint32_t` integers, the database object would be declared as follows: |
| 13 | + |
| 14 | +```cpp |
| 15 | +#include <ashdb/ashdb.h> |
| 16 | + |
| 17 | +ashdb::AshDB<std::uint32_t> db("integers"); |
| 18 | +``` |
| 19 | +
|
| 20 | +AshDB supports the C++ integral and floating-point types. AshDB also supports `std::string` by default, so a database of strings can be declared like so: |
| 21 | +
|
| 22 | +```cpp |
| 23 | +ashdb::AshDB<std::string> stringdb("strings"); |
| 24 | +``` |
| 25 | + |
| 26 | +The primitive types are implemented in `include/ashdb/primitives.h`. |
| 27 | + |
| 28 | +### Custom Data Types |
| 29 | + |
| 30 | +Any data type not natively supported by AshDB can be used with AshDB by implementing read and write functions for that type. The read and write functions can be composed from the primitive functions `ashdb_write()` and `ashdb_read()`. |
| 31 | + |
| 32 | +For example, consider a 3D `Point` structure that consists of `x`, `y`, and `z` members: |
| 33 | + |
| 34 | +```cpp |
| 35 | +struct Point |
| 36 | +{ |
| 37 | + std::uint32_t x; |
| 38 | + std::uint32_t y; |
| 39 | + std::uint32_t z; |
| 40 | +}; |
| 41 | +``` |
| 42 | + |
| 43 | +To use this type with AshDB, we implement `ashdb_write` and `ashdb_read`. These functions must be defined in the same namespace as the type itself. |
| 44 | + |
| 45 | + |
| 46 | +```cpp |
| 47 | +void ashdb_read(std::istream& stream, Point& p) |
| 48 | +{ |
| 49 | + ashdb::ashdb_read(stream, p.x); |
| 50 | + ashdb::ashdb_read(stream, p.y); |
| 51 | + ashdb::ashdb_read(stream, p.z); |
| 52 | +} |
| 53 | + |
| 54 | +void ashdb_write(std::ostream& stream, const Point& p) |
| 55 | +{ |
| 56 | + ashdb::ashdb_write(stream, p.x); |
| 57 | + ashdb::ashdb_write(stream, p.y); |
| 58 | + ashdb::ashdb_write(stream, p.z); |
| 59 | +} |
| 60 | +``` |
| 61 | +
|
| 62 | +**It is important to note that the two functions must read and write the same data members in the same order**. |
| 63 | +
|
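To see why the order matters, the round trip can be sketched without AshDB itself. In the sketch below, the `sketch` namespace, its `std::uint32_t` overloads, and the helpers `round_trip` and `read_swapped` are hypothetical stand-ins introduced for illustration. The primitives simply copy raw native-endian bytes, which may not match AshDB's actual encoding. Only the `Point` read and write functions mirror the ones above.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

namespace sketch
{
// Hypothetical stand-ins for AshDB's primitive functions (assumption:
// raw native-endian bytes; the real library may encode differently).
inline void ashdb_write(std::ostream& stream, std::uint32_t value)
{
    stream.write(reinterpret_cast<const char*>(&value), sizeof(value));
}

inline void ashdb_read(std::istream& stream, std::uint32_t& value)
{
    stream.read(reinterpret_cast<char*>(&value), sizeof(value));
}

struct Point
{
    std::uint32_t x;
    std::uint32_t y;
    std::uint32_t z;
};

// Writes the members in the order x, y, z.
inline void ashdb_write(std::ostream& stream, const Point& p)
{
    ashdb_write(stream, p.x);
    ashdb_write(stream, p.y);
    ashdb_write(stream, p.z);
}

// Reads the members in the same order they were written.
inline void ashdb_read(std::istream& stream, Point& p)
{
    ashdb_read(stream, p.x);
    ashdb_read(stream, p.y);
    ashdb_read(stream, p.z);
}

// Deliberately wrong reader: reads z, y, x while the writer wrote x, y, z.
inline void read_swapped(std::istream& stream, Point& p)
{
    ashdb_read(stream, p.z);
    ashdb_read(stream, p.y);
    ashdb_read(stream, p.x);
}

// Serializes a Point into an in-memory buffer and reads it back.
inline Point round_trip(const Point& in)
{
    std::stringstream buffer(std::ios::in | std::ios::out | std::ios::binary);
    ashdb_write(buffer, in);
    Point out{};
    ashdb_read(buffer, out);
    assert(buffer.good()); // all twelve bytes were available to read
    return out;
}
} // namespace sketch
```

With matching order, `sketch::round_trip(p)` returns a copy of `p`. Reading the same bytes with `read_swapped` succeeds without any error but exchanges `x` and `z`, which is the kind of silent corruption the rule above guards against.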
| 64 | +Now we can declare an AshDB that supports our `Point` type. |
| 65 | +
|
| 66 | +```cpp |
| 67 | +ashdb::AshDB<Point> pointdb("points"); |
| 68 | +``` |
| 69 | + |
| 70 | +Types can also be composed. For example, to define a `Triangle` type made of three `Point` objects, we can compose the read and write functions like so: |
| 71 | + |
| 72 | +```cpp |
| 73 | +struct Triangle |
| 74 | +{ |
| 75 | + Point a; |
| 76 | + Point b; |
| 77 | + Point c; |
| 78 | +}; |
| 79 | + |
| 80 | +void ashdb_read(std::istream& stream, Triangle& t) |
| 81 | +{ |
| 82 | + ashdb_read(stream, t.a); |
| 83 | + ashdb_read(stream, t.b); |
| 84 | + ashdb_read(stream, t.c); |
| 85 | +} |
| 86 | + |
| 87 | +void ashdb_write(std::ostream& stream, const Triangle& t) |
| 88 | +{ |
| 89 | + ashdb_write(stream, t.a); |
| 90 | + ashdb_write(stream, t.b); |
| 91 | + ashdb_write(stream, t.c); |
| 92 | +} |
| 93 | +``` |
| 94 | +
|
| 95 | +Notice that we used our previously defined `ashdb_read` and `ashdb_write` functions, which means we had to use the same namespace. Now we can define a `Triangle` database: |
| 96 | +
|
| 97 | +```cpp |
| 98 | +ashdb::AshDB<Triangle> triangledb("triangles"); |
| 99 | +``` |
| 100 | + |
| 101 | +## Opening a Database |
| 102 | + |
| 103 | +As discussed above, the `AshDB` constructor accepts one required parameter, the name of the database, and an optional parameter of type `Options`. |
| 104 | + |
| 105 | +The name of the AshDB corresponds to a file system directory. All contents of the database are stored in this directory. The string can be a relative path, a full path to the database, or simply a name, in which case the process looks for the database in the current working directory. |
| 106 | + |
| 107 | +The `Options` parameter controls some of the database's behavior. For example, by default the database folder is created if it does not exist, so the following example will create the folder and open the database for reading and writing: |
| 108 | + |
| 109 | +```cpp |
| 110 | +#include <ashdb/ashdb.h> |
| 111 | + |
| 112 | +... |
| 113 | +ashdb::AshDB<std::string> stringdb("StringDB"); |
| 114 | +ashdb::OpenStatus status = stringdb.open(); |
| 115 | +assert(status == ashdb::OpenStatus::OK); |
| 116 | +``` |
| 117 | +
|
| 118 | +This will create a directory named "StringDB" in the current working directory if it doesn't already exist. If we instead want to generate an error when the folder already exists, the default behavior can be overridden with the `Options` parameter passed to the constructor: |
| 119 | +
|
| 120 | +```cpp |
| 121 | +#include <ashdb/options.h>
| 122 | +
|
| 123 | +ashdb::Options options; |
| 124 | +options.error_if_exists = true; |
| 125 | +
|
| 126 | +ashdb::AshDB<std::string> stringdb("StringDB", options); |
| 127 | +ashdb::OpenStatus status = stringdb.open(); |
| 128 | +assert(status == ashdb::OpenStatus::OK); // will fail if folder exists |
| 129 | +``` |
| 130 | + |
| 131 | +### Options |
| 132 | + |
| 133 | +The `Options` struct is located in `include/ashdb/options.h`. |
| 134 | + |
| 135 | +* `create_if_missing`: Create the database if it doesn't exist. Existence is defined by whether or not the folder itself exists and not by any particular file or files. This condition is checked in the `AshDB<T>::open()` method. The default is true. |
| 136 | + |
| 137 | +* `error_if_exists`: Generate an error if the database exists. Existence is defined by whether or not the folder itself exists and not by any particular file or files. This condition is checked in the `AshDB<T>::open()` method. The default is false. |
| 138 | + |
| 139 | +* `filesize_max`: An unsigned integer that defines in bytes the upper size limit of each segment. Segment files can exceed this value since individual records are not split across segments. The default is 0 which means the segments have no size limit and everything will be stored in one data file. |
| 140 | + |
| 141 | +* `database_max`: The upper size limit of the database in bytes. Database size is calculated as the sum of the sizes of all the data files. Index files are **not** included in this total. When the total size is exceeded, the oldest data file is deleted. The default value is 0, which means the database has no size limit. |
| 142 | + |
| 143 | +* `prefix`: The prefix used for data files and index files. For example a value of "data" would give a filename like `data-00001.ash`. The default value is "data". |
| 144 | + |
| 145 | +* `extension`: The extension used for data files, which is also the leading part of the index file extension. For example, a value of "bin" would give a data file named "data-00001.bin" and an index file named "data-00001.binidx". The default is "ash". |
| 146 | + |
| 147 | +## Closing a Database |
| 148 | + |
| 149 | +Closing a database will reset the state of the database object to what it was before it was opened. |
| 150 | + |
| 151 | +```cpp |
| 152 | +ashdb::AshDB<std::string> db("stringdb"); |
| 153 | +assert(db.opened() == false); |
| 154 | +assert(db.open() == ashdb::OpenStatus::OK); |
| 155 | +assert(db.opened() == true); |
| 156 | +db.close(); |
| 157 | +assert(db.opened() == false); |
| 158 | +``` |
| 159 | +
|
| 160 | +## Writing Data |
| 161 | +
|
| 162 | +### Batch Writing |
| 163 | +
|
| 164 | +## Reading Data |
| 165 | +
|
| 166 | +### Batch Reads |
| 167 | +
|
| 168 | +### Iteration |
| 169 | +
|
| 170 | +## Truncating Data |
| 171 | +
|
| 172 | +The database can be truncated to a specific number of records. For example, to truncate the database down to its first 50 records, we can use `AshDB::truncate()`:
| 173 | +
|
| 174 | +```cpp |
| 175 | +ashdb::AshDB<std::string> db("strings"); |
| 176 | +assert(db.open() == ashdb::OpenStatus::OK); |
| 177 | +db.truncate(50); |
| 178 | +``` |
| 179 | + |
| 180 | +Data can only be removed from the end of the database. |
| 181 | +## Segments |
| 182 | + |
| 183 | +All data are stored in segments. Each segment consists of a data file and an index file, both located in the database folder. The data files are named using the format: |
| 184 | + |
| 185 | +``` |
| 186 | +<prefix>-<segment>.<extension> |
| 187 | +``` |
| 188 | + |
| 189 | +The `<prefix>` and `<extension>` can be set using `Options::prefix` and `Options::extension` respectively. The `<segment>` corresponds to the segment number, which is determined by the maximum size of each segment. That size can be set using `Options::filesize_max` when constructing the database object. |
| 190 | + |
| 191 | +Each data file has a corresponding index file that is named: |
| 192 | + |
| 193 | +``` |
| 194 | +<prefix>-<segment>.<extension>idx |
| 195 | +``` |
| 196 | + |
| 197 | +The "idx" suffix is hardcoded and cannot be changed. |
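The naming scheme can be sketched as a pair of small helpers. These functions are hypothetical and not part of AshDB. The five-digit, zero-padded segment number is an assumption taken from the example names `data-00001.ash` and `data-00001.binidx` in the _Options_ section, so the library's actual padding may differ.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds "<prefix>-<segment>.<extension>".
// Assumption: the segment number is zero-padded to five digits.
inline std::string data_filename(const std::string& prefix,
                                 std::uint32_t segment,
                                 const std::string& extension)
{
    std::ostringstream name;
    name << prefix << '-'
         << std::setw(5) << std::setfill('0') << segment
         << '.' << extension;
    return name.str();
}

// Hypothetical helper: the index file is the data file name plus "idx".
inline std::string index_filename(const std::string& prefix,
                                  std::uint32_t segment,
                                  const std::string& extension)
{
    return data_filename(prefix, segment, extension) + "idx";
}
```

Under these assumptions, the default options (`prefix` of "data" and `extension` of "ash") would give `data-00001.ash` and `data-00001.ashidx` for segment 1.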
| 198 | + |
| 199 | + |
| 200 | + |
| 201 | + |