
πŸ¦€ RedoxQL is an L-Store database written in Rust and Python πŸš€


RedoxQL is our implementation of an L-Store database.


Important

Read the Structure section: we use both Rust and Python, and they go in different places

Setup

Create a virtual environment

python3 -m venv venv

Source the virtual environment

source venv/bin/activate

Install the dependencies (this installs maturin from requirements.txt)

pip install -r requirements.txt

Running

Build the Rust code

maturin build --release

Install the module (Note: the version will change so check the exact filename in target/wheels/)

pip install target/wheels/lstore-0.1.0-cp312-cp312-manylinux_2_34_x86_64.whl --force-reinstall

Run the database benchmark

python3 __main__.py

You should see this for milestone 1.

(venv) ecs-165a-database (main) Ξ» p __main__.py
Inserting 10k records took:  			 0.0077650810000000035
Updating 10k records took:  			 0.020893269
Selecting 10k records took:  			 0.016048745000000003
Aggregate 10k of 100 record batch took:	 0.0039221569999999956
Deleting 10k records took:  			 0.002314741000000009
(venv) ecs-165a-database (main) Ξ»

Attribution

  • Keanu - Secondary indexes, page.rs and all the page stuff, index.rs and all of the index stuff
  • Lucas & Andrew - update
  • Lucas - Merge, select_version, sum_version, matching
  • Abdulrasol - Merge, BaseContainer, TailContainer, PageDirectory, insert into containers, RecordAddress and Record
  • Jake - Persistence, RTable, RDatabase, RQuery, new RIndex, python bindings, inserts and reads for all props

Structure

RedoxQL system diagram

File Structure

.
├── benches
│   └── db_benchmark.rs
├── Cargo.lock
├── Cargo.toml
├── docs
│   └── old-system-diagrams.md
├── LICENSE
├── main_checking.py
├── __main__.py
├── Makefile
├── pyproject.toml
├── python
│   ├── benchmarks
│   │   ├── graph_scale.py
│   │   ├── scaling_tester.py
│   │   ├── simple_example.py
│   │   ├── simple_tester.py
│   │   └── speedtests.py
│   ├── lstore
│   │   ├── db.py
│   │   ├── __init__.py
│   │   ├── query.py
│   │   ├── transaction.py
│   │   └── transaction_worker.py
│   └── tests
│       ├── __init__.py
│       └── test_main.py
├── README.md
├── requirements.txt
└── src
    ├── container.rs
    ├── database.rs
    ├── index.rs
    ├── lib.rs
    ├── pagerange.rs
    ├── page.rs
    ├── query.rs
    ├── record.rs
    ├── system.rs
    └── table.rs

lstore

The lstore directory (./python/lstore) is where the Python code goes. This is what gets called by the tests in __main__.py. The lstore directory is a Python module.

src

The src (./src) directory is where the Rust code goes. This gets called by the code in the lstore module.

  • system.rs - the functions and structs for getting information from the host system
  • database.rs - the database struct and the related functions
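As a rough sketch of the database-struct pattern (the names below are simplified stand-ins for illustration, not the real definitions in src/database.rs):

```rust
use std::collections::HashMap;

// Illustrative sketch only: the real RDatabase in src/database.rs
// holds more state (page directory, index handles, etc.).
pub struct Table {
    pub name: String,
    pub num_columns: usize,
}

pub struct Database {
    // Tables are looked up by name.
    tables: HashMap<String, Table>,
}

impl Database {
    pub fn new() -> Self {
        Database { tables: HashMap::new() }
    }

    pub fn create_table(&mut self, name: &str, num_columns: usize) {
        self.tables.insert(
            name.to_string(),
            Table { name: name.to_string(), num_columns },
        );
    }

    pub fn get_table(&self, name: &str) -> Option<&Table> {
        self.tables.get(name)
    }
}
```

The real struct carries much more state, but the lookup-by-name shape is the same idea.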

Demo Project

In the demo project, we made an API that uses our database to store grades. Here we are sending a PUT and then doing a GET.

[screenshot: a PUT request followed by a GET against the demo API]

Testing

Rust testing

cargo test

Here is what the correct output should look like. You should see multiple tests passing.

[screenshot: cargo test output with all tests passing]

Unit Tests

Rust unit tests are located in each Rust file and can be found in ./src
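The in-file pattern looks like this (a generic sketch, not copied from any RedoxQL file):

```rust
// Unit tests sit in the same .rs file as the code they exercise,
// inside a #[cfg(test)] module. The function name is illustrative.
fn num_records(base: usize, tail: usize) -> usize {
    base + tail
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn counts_base_and_tail_records() {
        assert_eq!(num_records(2, 3), 5);
    }
}
```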

Integration Tests

The integration tests are located at ./tests and are also run with cargo test

Python testing

pytest

Python tests are located in a separate tests directory inside ./python

Rust Docs

Rust can generate docs from the source code. Run cargo doc and open the produced HTML page in your browser. Comments starting with /// are included in these docs.
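For example, a /// comment on a function (an illustrative one, not from the codebase) ends up on that function's page in the generated HTML:

```rust
/// Returns the number of physical pages needed to hold `n` records
/// when each page holds `cap` records. Everything in this `///`
/// comment appears in the HTML that `cargo doc` produces.
fn pages_needed(n: usize, cap: usize) -> usize {
    // Ceiling division without floating point.
    (n + cap - 1) / cap
}
```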

Speed Analysis

Using flamegraph to benchmark

You may need to install flamegraph with cargo install flamegraph

cargo flamegraph --test update_tests
# Open the svg (It's nice to view in a browser)
firefox flamegraph.svg

Preview:

[flamegraph preview]

Running cargo benchmarks

Running every benchmark takes a long time, but you can also run benchmarks individually.

cargo bench

You can use cargo bench to see if your changes significantly affect performance.

[screenshot: cargo bench comparison output]

Small differences often show up randomly between runs, even with no change in the code. Try to run the bench as the only thing running on the system.

Using maturin in release mode

Maturin is what builds our Python package. Maturin has a --release flag that significantly improves speed: it runs Cargo in release mode, which applies all of the optimization tricks to make the executable as fast as possible.

[charts: benchmark and test timings with and without --release]

Data for both graphs can be found here

Scaling of Insert Operation

[graphs: insert and update scaling]

Using release build settings

With compiler options

(venv) redoxql (main) Ξ» p __main__.py
Inserting 10k records took:                      0.006803145
Updating 10k records took:                       0.018702902999999996
Selecting 10k records took:                      0.016315803000000004
Aggregate 10k of 100 record batch took:  0.005981531999999998
Deleting 10k records took:                       0.002332115999999995
(venv) redoxql (main) Ξ» time p m1_tester.py
Insert finished

real    0m0.117s
user    0m0.106s
sys     0m0.010s
(venv) redoxql (main) Ξ» time p exam_tester_m1.py
Insert finished

real    0m0.282s
user    0m0.272s
sys     0m0.010s
(venv) redoxql (main) Ξ»

Without compiler options

(venv) redoxql (main) Ξ» p __main__.py
Inserting 10k records took:                      0.007401888000000002
Updating 10k records took:                       0.018957811999999997
Selecting 10k records took:                      0.015054888999999995
Aggregate 10k of 100 record batch took:  0.003300163000000002
Deleting 10k records took:                       0.002181812999999991
(venv) redoxql (main) Ξ» time p m1_tester.py
Insert finished

real    0m0.112s
user    0m0.108s
sys     0m0.004s
(venv) redoxql (main) Ξ» time p exam_tester_m1.py
Insert finished

real    0m0.268s
user    0m0.254s
sys     0m0.014s

Running with debug or info logging

To use the logger, import the debug, error, or info macro from the log crate, then add the macros to the code, e.g. debug!("Start database!");. When you run the code, set the env var RUST_LOG=debug. Docs: https://docs.rs/env_logger/latest/env_logger/.


We tried PyPy

We started trying out PyPy, a runtime for Python that is supposedly faster. Because of Amdahl's law, we actually can't get all that much performance out of it. We also found that the newest version of PyPy cannot use the newest version of PyO3, so future work is needed to get them running together.

Future questions:

  • How much time does the Python part take up?
  • How do we measure the improvement from Python to PyPy?
  • How do we downgrade PyPy to work with PyO3?

We opted to stick with CPython because it's much faster in this use case. We show the results of some tests here.
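Amdahl's law (mentioned above) puts a ceiling on what a faster runtime can buy: if only a fraction p of the total runtime is in Python, speeding that part up by a factor s can never improve the whole run by more than 1/(1-p). A quick sketch, with illustrative (not measured) numbers:

```rust
/// Amdahl's law: overall speedup when a fraction `p` of the runtime
/// is accelerated by a factor `s`.
fn amdahl_speedup(p: f64, s: f64) -> f64 {
    1.0 / ((1.0 - p) + p / s)
}

// If only 20% of the time were spent in Python and PyPy made that
// part 4x faster, the whole run would only get about 1.18x faster.
```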

Profiling Python

We use py-spy to profile the Python side, and we got the expected results: when testM2.py is run, the main Python cost is just the update function.

Here is how we ran it:

py-spy record -o profile.svg -- python3 testM2.py

[py-spy flame graph of testM2.py]

Moving everything possible to Rust

We used to make a ReturnRecord object for every single row! We would also turn the result of rquery into a list, wrap each row in ReturnRecord, and then make that back into a new list. ReturnRecord would also do list truncation each time it was initialized (for each row). We moved all of this logic into Rust and now simply return what the Rust function returns. This change improved performance by nearly 30%: the speed of testM2.py went from 1.30 seconds to 0.92 seconds. These results were consistent across 3 runs each, done in an interwoven fashion (with the change, then without, then with again, etc.) for a total of 6 runs. The change can be seen in PR#162.

-   class ReturnRecord:
-       def __init__(self, columns: List[int]):
-           self.columns = columns[4:]
-
-       def __str__(self):
-           return f"Record({self.columns})"

-   return [
-       ReturnRecord(
-           list(
-               self.rquery.select_version(
-                   search_key,
-                   search_key_index,
-                   projected_columns_index,
-                   relative_version,
-               )
-           )
-       )
-   ]
+   return self.rquery.select_version(
+       search_key,
+       search_key_index,
+       projected_columns_index,
+       relative_version,
+   )

The same change was made for insert and update.

Using FxHashMap instead of default HashMap

Rust's default HashMap is a great general-purpose implementation and is very fast, but FxHashMap is a decent bit faster because it uses a cheaper, non-cryptographic hash function. After changing the page directory to use FxHashMap in PR#186, the speed of many reads and writes improved by over 28% and the overall speed of all the benchmarks improved by over 26%.

use crate::container::{ReservedColumns, NUM_RESERVED_COLUMNS};
use crate::index::RIndexHandle;
use pyo3::prelude::*;
use rustc_hash::FxHashMap;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};

type RedoxQLHasher<K, V> = FxHashMap<K, V>;

#[derive(Default, Clone, Serialize, Deserialize)]
pub struct PageDirectoryMetadata {
-   pub directory: HashMap<i64, RecordMetadata>,
+   pub directory: RedoxQLHasher<i64, RecordMetadata>,
}

#[derive(Default, Clone)]
pub struct PageDirectory {
-   pub directory: HashMap<i64, Record>,
+   pub directory: RedoxQLHasher<i64, Record>,
}
// -- snip --

Using Rust compilation flags

You can set specific build flags in Cargo.toml that improve the optimization.

[profile.release]
lto = "fat"
codegen-units = 1
opt-level = 3
  • lto = "fat" is the max setting for "Link time optimization", meaning that when it links the binary together, it does the maximum amount of optimizations.

  • codegen-units = 1 means that rustc will not split up the source code into different parts to compile separately on different threads. Because of this, it's able to apply optimization tricks like inlining between libraries and our project source code.

  • opt-level = 3 is just setting the optimization level. The reason you might want it lower is compilation speed.