Commit f11c7b8

Added better documentation; (#12)
1 parent 98ae905 commit f11c7b8

4 files changed

+221
-24
lines changed

README.md

Lines changed: 4 additions & 16 deletions
@@ -13,6 +13,10 @@ This project spawned from my desire to write a cryptocoin from scratch which can
1313

1414
This is designed as a write-once-read-many database. The database is stored in segments, the max size of each segment can be configured. Likewise, the database can have a limit as a whole which when exceeded, the oldest segments of the database will be deleted.
1515

16+
# Documentation
17+
18+
[AshDB documentation](doc/README.md) is bundled with the source code.
19+
1620
# Example
1721

1822
Write the digits 0-99 to a new database, then read those numbers back out and print them.
@@ -66,22 +70,6 @@ The CMake build options include:
6670
* `BUILD_EXAMPLES`
6771
* `BUILD_UNIT_TESTS`
6872
* `CODE_COVERAGE`
69-
70-
## Options
71-
72-
The database options can be configured through the `ashdb::Options` struct located in `include/ashdb/options.h`.
73-
74-
* `create_if_missing`: A boolean that defines if the database should be created if it doesn't exist. Existence is defined by whether or not the folder itself exists and not by any particular file or files. The default is true.
75-
76-
* `error_if_exists`: A boolean that controls if an error should be generated if the database folder exists. The default is false.
77-
78-
* `filesize_max`: An unsigned integer that defines in bytes the upper threshold of each segment. Segment files can exceed this value since individual records are not split across segments. The default is 0 which means the segments have no size limit and everything will be stored in one data file.
79-
80-
* `database_max`: The upper threshold of the database size. Database size is calculated by the sum of all the database files. Index files are **not** included in this total. When the total size is exceeded, the oldest data file is deleted. The default value is 0 which means the database has no size limit.
81-
82-
* `prefix`: The prefix used for data files and index files. For example a value of "data" would give a filename like `data-00001.ash`. The default value is "data".
83-
84-
* `extension`: The extension used for data files, and prefixed for the index file extensions. For example, a value of "bin" would give a data file with a name "data-00001.bin" and an index file with the name of "data-00001.binidx". The default is "ash".
8573
`
8674
## Performance
8775

doc/README.md

Lines changed: 201 additions & 0 deletions
@@ -0,0 +1,201 @@
1+
AshDB
2+
=======
3+
4+
The AshDB library provides a persistent indexed storage system. Data are stored in _segments_ that have a configurable maximum size. Data are retrieved by index.
5+
6+
## Database Types
7+
8+
The `ashdb::AshDB` class is the main database type. This is a templated class which accepts a single parameter -- the type of object that is to be stored in the database.
9+
10+
The constructor accepts two parameters: the first is a `const std::string&` naming the database, and the second is an optional argument of type `ashdb::Options`. These arguments are discussed later in the section _Opening a Database_.
11+
12+
For example, to create a database of `std::uint32_t` integers, the database object would be declared as follows:
13+
14+
```cpp
15+
#include <ashdb/ashdb.h>
16+
17+
ashdb::AshDB<std::uint32_t> db("integers");
18+
```
19+
20+
AshDB supports C++ integral and floating-point types. AshDB also supports `std::string` by default, hence one could declare a database of strings like so:
21+
22+
```cpp
23+
ashdb::AshDB<std::string> stringdb("strings");
24+
```
25+
26+
The primitive types are implemented in `include/ashdb/primitives.h`.
27+
28+
### Custom Data Types
29+
30+
Any data type not natively supported by AshDB can be used with AshDB by implementing read and write functions for that type. The read and write functions can be composed from the primitive functions `ashdb_write()` and `ashdb_read()`.
31+
32+
For example, consider a 3D `Point` structure that consists of `x`, `y`, and `z` members.
33+
34+
```cpp
35+
struct Point
36+
{
37+
std::uint32_t x;
38+
std::uint32_t y;
39+
std::uint32_t z;
40+
};
41+
```
42+
43+
In order to use this type with AshDB we implement `ashdb_write` and `ashdb_read`. These functions must be defined in the same namespace as the type itself.
44+
45+
46+
```cpp
47+
void ashdb_read(std::istream& stream, Point& p)
48+
{
49+
ashdb::ashdb_read(stream, p.x);
50+
ashdb::ashdb_read(stream, p.y);
51+
ashdb::ashdb_read(stream, p.z);
52+
}
53+
54+
void ashdb_write(std::ostream& stream, const Point& p)
55+
{
56+
ashdb::ashdb_write(stream, p.x);
57+
ashdb::ashdb_write(stream, p.y);
58+
ashdb::ashdb_write(stream, p.z);
59+
}
60+
```
61+
62+
**It is important to note that the two functions must read and write the same data members in the same order**.
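The ordering rule can be checked with an in-memory round trip. The following is a minimal sketch, not AshDB code: the functions in the `sketch` namespace are hypothetical stand-ins for the AshDB primitives (the real ones in `include/ashdb/primitives.h` may encode values differently), and `round_trip` and `read_wrong_order` are names introduced only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

namespace sketch
{
// ASSUMPTION: stand-ins for the AshDB primitives. They copy the raw
// bytes of the value; the real library may use another encoding.
inline void ashdb_write(std::ostream& stream, std::uint32_t value)
{
    stream.write(reinterpret_cast<const char*>(&value), sizeof(value));
}

inline void ashdb_read(std::istream& stream, std::uint32_t& value)
{
    stream.read(reinterpret_cast<char*>(&value), sizeof(value));
}
} // namespace sketch

struct Point
{
    std::uint32_t x;
    std::uint32_t y;
    std::uint32_t z;
};

// Writes the members in the order x, y, z.
void ashdb_write(std::ostream& stream, const Point& p)
{
    sketch::ashdb_write(stream, p.x);
    sketch::ashdb_write(stream, p.y);
    sketch::ashdb_write(stream, p.z);
}

// Reads the members back in the same order x, y, z.
void ashdb_read(std::istream& stream, Point& p)
{
    sketch::ashdb_read(stream, p.x);
    sketch::ashdb_read(stream, p.y);
    sketch::ashdb_read(stream, p.z);
}

// Deliberately wrong: reads z first, so values land in the wrong members.
void read_wrong_order(std::istream& stream, Point& p)
{
    sketch::ashdb_read(stream, p.z);
    sketch::ashdb_read(stream, p.y);
    sketch::ashdb_read(stream, p.x);
}

// Serializes a Point to an in-memory stream and reads it back.
Point round_trip(const Point& in)
{
    std::stringstream buffer;
    ashdb_write(buffer, in);
    Point out{};
    ashdb_read(buffer, out);
    return out;
}
```

With matching order, `round_trip(Point{1, 2, 3})` returns an identical point. Reading the same bytes with `read_wrong_order` silently swaps `x` and `z`, and nothing reports an error, which is why this kind of round-trip check is worth adding to unit tests for every custom type.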
63+
64+
Now we can declare an AshDB that supports our `Point` type.
65+
66+
```cpp
67+
ashdb::AshDB<Point> pointdb("points");
68+
```
69+
70+
Types can also be composed. For example if we wanted to define a `Triangle` type of three `Point` objects then we can compose the read and write functions like so:
71+
72+
```cpp
73+
struct Triangle
74+
{
75+
Point a;
76+
Point b;
77+
Point c;
78+
};
79+
80+
void ashdb_read(std::istream& stream, Triangle& t)
81+
{
82+
ashdb_read(stream, t.a);
83+
ashdb_read(stream, t.b);
84+
ashdb_read(stream, t.c);
85+
}
86+
87+
void ashdb_write(std::ostream& stream, const Triangle& t)
88+
{
89+
ashdb_write(stream, t.a);
90+
ashdb_write(stream, t.b);
91+
ashdb_write(stream, t.c);
92+
}
93+
```
94+
95+
Notice that we used our previously defined `ashdb_read` and `ashdb_write` functions, which means we had to use the same namespace. Now we can define a `Triangle` database:
96+
97+
```cpp
98+
ashdb::AshDB<Triangle> triangledb("triangles");
99+
```
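The same-namespace requirement is most likely a consequence of C++ argument-dependent lookup (ADL): a template that calls a function unqualified only finds overloads declared in the namespaces associated with the argument types. The sketch below shows that general mechanism in isolation. It is an assumption that AshDB resolves `ashdb_read` and `ashdb_write` this way, and the names `lib::store`, `save`, and `Segment` are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace lib
{
// Hypothetical library-side template. The call to save() is unqualified,
// so at instantiation the compiler searches the namespace of T via ADL.
template <class T>
std::string store(const T& value)
{
    std::ostringstream out;
    save(out, value);
    return out.str();
}
} // namespace lib

namespace app
{
struct Point
{
    int x;
    int y;
};

// Found by lib::store<Point> because it lives next to Point.
void save(std::ostream& stream, const Point& p)
{
    stream << p.x << ',' << p.y;
}

struct Segment
{
    Point a;
    Point b;
};

// Composes the Point overload, mirroring the Triangle example.
void save(std::ostream& stream, const Segment& s)
{
    save(stream, s.a);
    stream << ';';
    save(stream, s.b);
}
} // namespace app
```

If `save` for `Point` were moved to an unrelated namespace, `lib::store(app::Point{1, 2})` would fail to compile, because neither ordinary lookup at the template definition nor ADL at instantiation would find it.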
100+
101+
## Opening a Database
102+
103+
As discussed above, the `AshDB` constructor accepts one parameter, the name of the database, and an optional parameter of type `Options`.
104+
105+
The name of the AshDB corresponds to a file system directory. All contents of the database are stored in this directory. The string can be a relative path, a full path to the database, or simply a name, in which case the process looks for the database in the current working folder.
106+
107+
The `Options` parameter defines some behavior for the database. For example, the default behavior creates the database folder if it does not exist, so the following example will create the folder and open the database for reading and writing:
108+
109+
```cpp
110+
#include <ashdb/ashdb.h>
111+
112+
...
113+
ashdb::AshDB<std::string> stringdb("StringDB");
114+
ashdb::OpenStatus status = stringdb.open();
115+
assert(status == ashdb::OpenStatus::OK);
116+
```
117+
118+
This will create a directory named "StringDB" in the current working directory if it doesn't already exist. If we instead want to generate an error if the folder already exists, then the default behavior can be overridden with the `Options` parameter on the constructor:
119+
120+
```cpp
121+
#include <ashdb/options.h>
122+
123+
ashdb::Options options;
124+
options.error_if_exists = true;
125+
126+
ashdb::AshDB<std::string> stringdb("StringDB", options);
127+
ashdb::OpenStatus status = stringdb.open();
128+
assert(status == ashdb::OpenStatus::OK); // will fail if folder exists
129+
```
130+
131+
### Options
132+
133+
The `Options` struct is located in `include/ashdb/options.h`.
134+
135+
* `create_if_missing`: Create the database if it doesn't exist. Existence is defined by whether or not the folder itself exists and not by any particular file or files. This condition is checked in the `AshDB<T>::open()` method. The default is true.
136+
137+
* `error_if_exists`: Generate an error if the database exists. Existence is defined by whether or not the folder itself exists and not by any particular file or files. This condition is checked in the `AshDB<T>::open()` method. The default is false.
138+
139+
* `filesize_max`: An unsigned integer that defines in bytes the upper size limit of each segment. Segment files can exceed this value since individual records are not split across segments. The default is 0 which means the segments have no size limit and everything will be stored in one data file.
140+
141+
* `database_max`: The upper size limit of the database. Database size is calculated by the sum of all the database files. Index files are **not** included in this total. When the total size is exceeded, the oldest data file is deleted. The default value is 0 which means the database has no size limit.
142+
143+
* `prefix`: The prefix used for data files and index files. For example a value of "data" would give a filename like `data-00001.ash`. The default value is "data".
144+
145+
* `extension`: The extension used for data files, and prefixed for the index file extensions. For example, a value of "bin" would give a data file with a name "data-00001.bin" and an index file with the name of "data-00001.binidx". The default is "ash".
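The interaction between `filesize_max` and `database_max` can be illustrated with a small simulation. This is a sketch under stated assumptions, not the library's implementation: `SegmentSim` is a hypothetical type, and the policy it encodes (start a new segment once the current one has reached `filesize_max`, then drop the oldest segments while the total exceeds `database_max`, always keeping the active segment) is one reading of the descriptions above.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <numeric>

// Hypothetical model of the data-file sizes in a database folder.
struct SegmentSim
{
    std::size_t filesize_max = 0;     // 0 means no per-segment limit
    std::size_t database_max = 0;     // 0 means no database limit
    std::deque<std::size_t> segments; // data-file sizes in bytes, oldest first

    // Sum of all data-file sizes; index files are not modeled.
    std::size_t total() const
    {
        return std::accumulate(segments.begin(), segments.end(), std::size_t{0});
    }

    void write(std::size_t record_size)
    {
        // ASSUMED policy: roll over only after the limit has been reached.
        if (segments.empty() ||
            (filesize_max != 0 && segments.back() >= filesize_max))
        {
            segments.push_back(0);
        }

        // A record is never split, so a segment may end up above the limit.
        segments.back() += record_size;

        // ASSUMED policy: delete the oldest data files while over the limit.
        while (database_max != 0 && segments.size() > 1 && total() > database_max)
        {
            segments.pop_front();
        }
    }
};
```

With `filesize_max = 10`, three 6-byte records produce segments of 12 and 6 bytes: the first segment overshoots the limit because the second record is not split. Setting `database_max = 20` and writing one more record brings the total to 24 bytes, so the oldest segment is dropped.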
146+
147+
## Closing a Database
148+
149+
Closing a database will reset the state of the database object to what it was before it was opened.
150+
151+
```cpp
152+
ashdb::AshDB<std::string> db("stringdb");
153+
assert(db.opened() == false);
154+
assert(db.open() == ashdb::OpenStatus::OK);
155+
assert(db.opened() == true);
156+
db.close();
157+
assert(db.opened() == false);
158+
```
159+
160+
## Writing Data
161+
162+
### Batch Writing
163+
164+
## Reading Data
165+
166+
### Batch Reads
167+
168+
### Iteration
169+
170+
## Truncating Data
171+
172+
The database can be truncated to a specific number of records. For example, to truncate the database down to the first 50 records we can use `AshDB::truncate()`:
173+
174+
```cpp
175+
ashdb::AshDB<std::string> db("strings");
176+
assert(db.open() == ashdb::OpenStatus::OK);
177+
db.truncate(50);
178+
```
179+
180+
Data can only be removed from the end of the database.
181+
## Segments
182+
183+
All data are stored in segments. Each segment consists of a database file and an index file that are located in the database folder. The data files are named using the format:
184+
185+
```
186+
<prefix>-<segment>.<extension>
187+
```
188+
189+
The `<prefix>` and `<extension>` can be set using `Options::prefix` and `Options::extension` respectively. The `<segment>` corresponds to the segment number, which is determined by the maximum size of each segment. That size can be set using `Options::filesize_max` when constructing the database object.
190+
191+
Each data file has a corresponding index file that is named:
192+
193+
```
194+
<prefix>-<segment>.<extension>idx
195+
```
196+
197+
The "idx" suffix is hardcoded and cannot be changed.
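The naming scheme can be sketched as a pair of small helpers. These are hypothetical functions written for illustration and are not the library's own `BuildFilename`. The five-digit zero padding is an assumption taken from the `data-00001.ash` example in the options list, and the sketch makes no claim about which number AshDB assigns to the first segment.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds "<prefix>-<segment>.<extension>".
// ASSUMPTION: the segment number is zero padded to five digits.
std::string data_filename(const std::string& prefix,
                          std::size_t segment,
                          const std::string& extension)
{
    std::ostringstream name;
    name << prefix << '-'
         << std::setw(5) << std::setfill('0') << segment
         << '.' << extension;
    return name.str();
}

// Hypothetical helper: the index file appends the fixed "idx" suffix.
std::string index_filename(const std::string& prefix,
                           std::size_t segment,
                           const std::string& extension)
{
    return data_filename(prefix, segment, extension) + "idx";
}
```

For the default options, `data_filename("data", 1, "ash")` yields `data-00001.ash`, and `index_filename("data", 1, "bin")` yields `data-00001.binidx`, matching the examples given for `prefix` and `extension`.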
198+
199+
200+
201+

examples/example2.cpp

Lines changed: 7 additions & 7 deletions
@@ -15,20 +15,20 @@ struct Point
1515
std::uint32_t z;
1616
};
1717

18-
void ashdb_write(std::ostream& stream, const Point& p)
19-
{
20-
ashdb::ashdb_write(stream, p.x);
21-
ashdb::ashdb_write(stream, p.y);
22-
ashdb::ashdb_write(stream, p.z);
23-
}
24-
2518
void ashdb_read(std::istream& stream, Point& p)
2619
{
2720
ashdb::ashdb_read(stream, p.x);
2821
ashdb::ashdb_read(stream, p.y);
2922
ashdb::ashdb_read(stream, p.z);
3023
}
3124

25+
void ashdb_write(std::ostream& stream, const Point& p)
26+
{
27+
ashdb::ashdb_write(stream, p.x);
28+
ashdb::ashdb_write(stream, p.y);
29+
ashdb::ashdb_write(stream, p.z);
30+
}
31+
3232
} // namespace app
3333

3434
namespace std

include/ashdb/ashdb.h

Lines changed: 9 additions & 1 deletion
@@ -31,7 +31,7 @@ std::string BuildFilename(const std::string& folder,
3131
std::vector<std::size_t> ReadIndexFile(const std::string& filename);
3232

3333
template<class ThingT>
34-
class AshDB
34+
class AshDB final
3535
{
3636

3737
public:
@@ -42,6 +42,12 @@ class AshDB
4242
// 1 - the index of the offset within the segment index (i.e. _segmentIndicies[x][y])
4343
using IndexDetails = std::tuple<std::size_t, std::size_t>;
4444

45+
explicit AshDB(const std::string& folder)
46+
: AshDB(folder, Options{})
47+
{
48+
// nothing to do
49+
}
50+
4551
AshDB(const std::string& folder, const Options& options)
4652
: _dbfolder{ folder },
4753
_options{ options }
@@ -83,6 +89,8 @@ class AshDB
8389
std::optional<std::size_t> startIndex() const { return _startIndex; }
8490
std::optional<std::size_t> lastIndex() const { return _lastIndex; }
8591

92+
bool opened() const noexcept { return _open; }
93+
8694
// returns the size of all the "data-0001.dat" files on the disk
8795
// but does NOT include the size of the corresponding index
8896
// files (i.e. "data-0001.datidx")
