LangDB is an educational SQL database implementation written in Rust. It provides a minimal but functional in-memory SQL database with support for basic SQL operations. This project demonstrates core database concepts including SQL parsing, query execution, and data storage.
- 💾 In-memory database with table management
- 🔍 SQL parser built with the nom parsing library
- 📊 Support for basic SQL statements:
  - CREATE TABLE with column types
  - INSERT with value lists
  - SELECT with WHERE clauses
- 📋 REPL interface with special commands
- 🔒 Thread-safe operations for concurrent access
- 📝 Data types: INTEGER, TEXT, and NULL values
- ⚡ Extensible architecture for future enhancements
- Rust and Cargo (1.70+ recommended)
- Clone the repository:
  git clone https://github.com/Okemwag/langdb.git
  cd langdb
- Build the project:
  cargo build --release
- Run LangDB:
  cargo run --release
- (Optional) Run the test script:
  ./test_run.sh
For a detailed getting started guide, see QUICKSTART.md.
LangDB provides an interactive SQL prompt where you can enter SQL commands:
=================================================
LangDB - A Simple SQL Database
=================================================
Type SQL commands to execute them.
Commands end with semicolon (;)
Special commands:
.help - Display this help message
.exit, .quit - Exit the program
.tables - Show all tables
Examples:
CREATE TABLE users (id INTEGER, name TEXT);
INSERT INTO users VALUES (1, 'Alice');
SELECT * FROM users;
=================================================
langdb>
- .help - Display help information
- .exit or .quit - Exit the program
- .tables - List all tables in the database
CREATE TABLE users (id INTEGER, name TEXT, age INTEGER);
INSERT INTO users VALUES (1, 'Alice', 30);
INSERT INTO users VALUES (2, 'Bob', 25), (3, 'Charlie', 35);
INSERT INTO users (id, name) VALUES (4, 'Dave');
SELECT * FROM users;
SELECT name, age FROM users WHERE age > 25;
- CREATE TABLE with column definitions
  - Column types: INTEGER, TEXT
  - NULL/NOT NULL constraints
- INSERT statements
  - Full row inserts
  - Column-specific inserts
  - Multiple row inserts
- SELECT statements
  - Column projection (specific columns or *)
  - Basic WHERE clause with comparisons (=, <>, >, <, >=, <=)
  - Table scans
- No support for JOIN operations
- No support for aggregate functions (SUM, COUNT, etc.)
- No support for ORDER BY or GROUP BY
- No persistent storage (in-memory only)
- Limited data types (INTEGER and TEXT only)
- Basic WHERE clause (no AND/OR support)
LangDB follows a layered architecture pattern, separating concerns into distinct modules that work together to provide a complete SQL database system.
For a detailed architecture overview, see ARCHITECTURE.md
┌─────────────────────────────────────────────────────────┐
│ REPL Interface │
│ (main.rs) │
│ - User input handling │
│ - Command routing │
│ - Result formatting │
└────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ SQL Parser │
│ (parser/mod.rs) │
│ - Lexical analysis │
│ - Syntax parsing (nom combinators) │
│ - AST generation │
└────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Query Executor │
│ (executor/mod.rs) │
│ - Statement routing │
│ - Query planning │
│ - WHERE clause evaluation │
│ - Column projection │
└────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Storage Engine │
│ (storage/mod.rs) │
│ - Table management │
│ - Row storage (in-memory) │
│ - Thread-safe operations (RwLock) │
│ - Schema validation │
└────────────────────┬────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────┐
│ Type System │
│ (types/mod.rs) │
│ - Data type definitions │
│ - Value representation │
│ - Schema structures │
│ - Type validation & conversion │
└─────────────────────────────────────────────────────────┘
The entry point and user interaction layer:
- Implements a Read-Eval-Print Loop for interactive SQL execution
- Handles special commands (.help, .exit, .tables)
- Manages multi-line SQL input (statements ending with ;)
- Formats and displays query results
- Provides error handling and user feedback
Transforms SQL text into executable statements:
- Built using the nom parser combinator library
- Supports three statement types:
  - CREATE TABLE: Table definition with columns and types
  - INSERT: Data insertion with optional column specification
  - SELECT: Data retrieval with column projection and filtering
- Generates Abstract Syntax Tree (AST) representations
- Provides detailed error messages for syntax errors
Key parsing components:
- Identifier parsing (table/column names)
- Literal parsing (strings, integers, NULL)
- Operator parsing (=, <>, >, <, >=, <=)
- Clause parsing (WHERE conditions)
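The sketch below shows how these components can fit together when parsing a single WHERE condition. It is a standard-library stand-in, not the nom-based parser itself: the function names (identifier, operator, literal, condition) and the Literal type are hypothetical, and each parser returns the remaining input alongside the parsed value, loosely following the convention nom combinators use.

```rust
#[derive(Debug, PartialEq)]
enum Operator {
    Equals,
    NotEquals,
    GreaterThan,
    LessThan,
    GreaterOrEqual,
    LessOrEqual,
}

#[derive(Debug, PartialEq)]
enum Literal {
    Integer(i64),
    Text(String),
    Null,
}

/// Parses a leading table/column name; returns (rest, name).
fn identifier(input: &str) -> Option<(&str, &str)> {
    let end = input
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '_'))
        .unwrap_or(input.len());
    let first_ok = input
        .chars()
        .next()
        .map_or(false, |c| c.is_ascii_alphabetic() || c == '_');
    if end == 0 || !first_ok {
        None
    } else {
        Some((&input[end..], &input[..end]))
    }
}

/// Parses a comparison operator; returns (rest, operator).
fn operator(input: &str) -> Option<(&str, Operator)> {
    // Two-character operators come first so "<=" is not read as "<".
    let table = [
        ("<>", Operator::NotEquals),
        (">=", Operator::GreaterOrEqual),
        ("<=", Operator::LessOrEqual),
        ("=", Operator::Equals),
        (">", Operator::GreaterThan),
        ("<", Operator::LessThan),
    ];
    for (symbol, op) in table {
        if let Some(rest) = input.strip_prefix(symbol) {
            return Some((rest, op));
        }
    }
    None
}

/// Parses NULL, a single-quoted string, or an integer; returns (rest, literal).
fn literal(input: &str) -> Option<(&str, Literal)> {
    if let Some(rest) = input.strip_prefix("NULL") {
        return Some((rest, Literal::Null));
    }
    if let Some(body) = input.strip_prefix('\'') {
        let end = body.find('\'')?;
        return Some((&body[end + 1..], Literal::Text(body[..end].to_string())));
    }
    let digits_start = if input.starts_with('-') { 1 } else { 0 };
    let end = input[digits_start..]
        .find(|c: char| !c.is_ascii_digit())
        .map_or(input.len(), |i| i + digits_start);
    let value = input[..end].parse::<i64>().ok()?;
    Some((&input[end..], Literal::Integer(value)))
}

/// Chains the three parsers: column, operator, literal, then end of input.
fn condition(input: &str) -> Option<(String, Operator, Literal)> {
    let (rest, column) = identifier(input.trim_start())?;
    let (rest, op) = operator(rest.trim_start())?;
    let (rest, value) = literal(rest.trim_start())?;
    if rest.trim().is_empty() {
        Some((column.to_string(), op, value))
    } else {
        None
    }
}
```

With these definitions, condition("age > 25") yields the column name age, Operator::GreaterThan, and Literal::Integer(25). Escaped quotes and keyword boundaries are deliberately left out of this sketch.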
Executes parsed SQL statements:
- Routes statements to appropriate execution handlers
- Implements query logic:
  - CREATE TABLE: Validates schema and creates table structure
  - INSERT: Validates data types and inserts rows
  - SELECT: Performs table scans, applies filters, and projects columns
- Handles WHERE clause evaluation
- Manages column projection (selecting specific columns or *)
- Converts execution results into ResultSet objects
Execution flow:
- Receive parsed statement from parser
- Validate against storage schema
- Execute operation on storage layer
- Format results for display
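As a rough illustration of the SELECT path (scan, filter, project), here is a minimal sketch. The Table, Condition, and Operator types and the execute_select, column_index, and condition_holds functions are assumptions made for this example and may not match the real executor. In particular, treating NULL and mismatched types as "no match" is a choice made here, not documented behavior.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Text(String),
    Null,
}

#[derive(Debug, Clone, Copy)]
enum Operator {
    Equals,
    NotEquals,
    GreaterThan,
    LessThan,
    GreaterOrEqual,
    LessOrEqual,
}

struct Condition {
    column: String,
    op: Operator,
    value: Value,
}

struct Table {
    columns: Vec<String>,
    rows: Vec<Vec<Value>>,
}

/// Resolves a column name to its position in the table.
fn column_index(table: &Table, name: &str) -> Result<usize, String> {
    table
        .columns
        .iter()
        .position(|c| c.as_str() == name)
        .ok_or_else(|| format!("unknown column '{}'", name))
}

/// Evaluates `cell <op> rhs`; NULL or mismatched types never match here.
fn condition_holds(cell: &Value, op: Operator, rhs: &Value) -> bool {
    let ordering = match (cell, rhs) {
        (Value::Integer(a), Value::Integer(b)) => a.cmp(b),
        (Value::Text(a), Value::Text(b)) => a.cmp(b),
        _ => return false,
    };
    match op {
        Operator::Equals => ordering == Ordering::Equal,
        Operator::NotEquals => ordering != Ordering::Equal,
        Operator::GreaterThan => ordering == Ordering::Greater,
        Operator::LessThan => ordering == Ordering::Less,
        Operator::GreaterOrEqual => ordering != Ordering::Less,
        Operator::LessOrEqual => ordering != Ordering::Greater,
    }
}

/// Full table scan, optional filter, then projection ("*" keeps every column).
fn execute_select(
    table: &Table,
    columns: &[&str],
    filter: Option<&Condition>,
) -> Result<Vec<Vec<Value>>, String> {
    let projection: Vec<usize> = if columns.len() == 1 && columns[0] == "*" {
        (0..table.columns.len()).collect()
    } else {
        columns
            .iter()
            .map(|&name| column_index(table, name))
            .collect::<Result<Vec<usize>, String>>()?
    };
    // Validate the filter column once, before touching any row.
    let filter = match filter {
        Some(cond) => Some((column_index(table, &cond.column)?, cond)),
        None => None,
    };
    let rows = table
        .rows
        .iter()
        .filter(|row| match &filter {
            Some((index, cond)) => condition_holds(&row[*index], cond.op, &cond.value),
            None => true,
        })
        .map(|row| projection.iter().map(|&i| row[i].clone()).collect::<Vec<Value>>())
        .collect();
    Ok(rows)
}
```

Resolving column names to indexes up front means an unknown column is reported as an error before the scan starts, which mirrors the "validate against storage schema" step above.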
Manages in-memory data storage and retrieval (nothing is persisted to disk):
- In-memory storage using HashMap<String, Table>
- Thread-safe operations via Arc<RwLock<>>
- Table structure:
  - Metadata (name, schema)
  - Row collection (Vec)
- Operations:
  - Table creation/deletion
  - Row insertion (single and batch)
  - Table scanning
  - Row filtering
- Schema validation on all write operations
Concurrency model:
- Read-write locks allow multiple concurrent readers
- Exclusive write access for modifications
- Prevents data races and ensures consistency
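A minimal sketch of this concurrency model, assuming a Database handle that wraps Arc<RwLock<HashMap<String, Table>>>. The type and method names (Database, create_table, insert, row_count, concurrent_inserts) are hypothetical, and cells are reduced to i64 to keep the example short.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

/// Simplified table: column names plus rows of integer cells.
#[derive(Debug, Default)]
struct Table {
    columns: Vec<String>,
    rows: Vec<Vec<i64>>,
}

/// Shared handle; cloning it clones the `Arc`, not the underlying data.
#[derive(Clone, Default)]
struct Database {
    tables: Arc<RwLock<HashMap<String, Table>>>,
}

impl Database {
    /// Takes the write lock, so no reader or other writer runs concurrently.
    fn create_table(&self, name: &str, columns: &[&str]) -> Result<(), String> {
        let mut tables = self.tables.write().map_err(|e| e.to_string())?;
        if tables.contains_key(name) {
            return Err(format!("table '{}' already exists", name));
        }
        let table = Table {
            columns: columns.iter().map(|c| c.to_string()).collect(),
            rows: Vec::new(),
        };
        tables.insert(name.to_string(), table);
        Ok(())
    }

    /// Validates the row width against the schema before appending.
    fn insert(&self, name: &str, row: Vec<i64>) -> Result<(), String> {
        let mut tables = self.tables.write().map_err(|e| e.to_string())?;
        let table = tables
            .get_mut(name)
            .ok_or_else(|| format!("table '{}' not found", name))?;
        if row.len() != table.columns.len() {
            return Err("column count mismatch".to_string());
        }
        table.rows.push(row);
        Ok(())
    }

    /// Takes only the read lock; many readers may hold it at once.
    fn row_count(&self, name: &str) -> Option<usize> {
        let tables = self.tables.read().ok()?;
        tables.get(name).map(|t| t.rows.len())
    }
}

/// Spawns one writer thread per row and waits for all of them.
fn concurrent_inserts(db: &Database, table: &str, writers: i64) {
    let handles: Vec<_> = (0..writers)
        .map(|i| {
            let db = db.clone();
            let table = table.to_string();
            thread::spawn(move || db.insert(&table, vec![i, i * 10]))
        })
        .collect();
    for handle in handles {
        handle.join().unwrap().unwrap();
    }
}
```

Each writer briefly holds the exclusive lock, so inserts from different threads are serialized while reads such as row_count can proceed in parallel with each other.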
Defines core data structures:
- DataType: Supported SQL types (INTEGER, TEXT)
- Value: Runtime value representation (Integer, Text, Null)
- Column: Column definition with name, type, and nullability
- Schema: Collection of columns defining table structure
- Row: Collection of values representing a table row
- ResultSet: Query results with schema and rows
Type operations:
- Type validation and conversion
- Value comparison (for WHERE clauses)
- Schema validation
- Result formatting
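To make the relationship between these structures concrete, the following sketch models DataType, Value, Column, and Schema and checks a row against a schema. Field names and the validate_row method are assumptions for illustration; the actual definitions in the types module may differ.

```rust
/// Column types listed as supported.
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    Integer,
    Text,
}

/// Runtime values; `Null` is only valid in nullable columns.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Text(String),
    Null,
}

/// Column definition: name, type, and nullability.
#[derive(Debug, Clone)]
struct Column {
    name: String,
    data_type: DataType,
    nullable: bool,
}

/// Ordered collection of columns describing one table.
struct Schema {
    columns: Vec<Column>,
}

impl Schema {
    /// Checks arity first, then each value against its column.
    fn validate_row(&self, row: &[Value]) -> Result<(), String> {
        if row.len() != self.columns.len() {
            return Err(format!(
                "expected {} values, got {}",
                self.columns.len(),
                row.len()
            ));
        }
        for (column, value) in self.columns.iter().zip(row) {
            let ok = match (value, column.data_type) {
                (Value::Null, _) => column.nullable,
                (Value::Integer(_), DataType::Integer) => true,
                (Value::Text(_), DataType::Text) => true,
                _ => false,
            };
            if !ok {
                return Err(format!(
                    "value {:?} is not valid for column '{}'",
                    value, column.name
                ));
            }
        }
        Ok(())
    }
}
```

Keeping NULL as a Value variant rather than a DataType matches the list above, where NULL appears among the runtime values but not among the column types.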
User Input → Parser    → Executor → Storage → Result
    ↓          ↓           ↓          ↓         ↓
"SELECT"   Statement    Execute     Scan     Format
SQL text      AST        Query      Table    Display
- User enters: SELECT name FROM users WHERE id = 1;
- Parser creates SelectStatement with:
  - columns: ["name"]
  - table_name: "users"
  - where_clause: Condition { column: "id", op: Equals, value: Integer(1) }
- Executor:
  - Retrieves table schema from storage
  - Scans all rows from users table
  - Filters rows where id = 1
  - Projects only name column
- Storage returns matching rows
- Executor creates ResultSet with filtered/projected data
- REPL formats and displays results
- Layered Architecture: Clear separation of concerns across modules
- Parser Combinator: Composable parsing functions using nom
- Repository Pattern: Storage layer abstracts data access
- Visitor Pattern: Executor visits different statement types
- Builder Pattern: Schema and row construction
- Thread-Safe Singleton: Database instance with Arc<RwLock<>>
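In Rust, dispatching on statement type is commonly written as an enum with an exhaustive match rather than a classic visitor interface. The sketch below shows that shape under assumed names: the Statement variants and their reduced payloads are hypothetical, and the handlers only return a description instead of touching storage.

```rust
/// Hypothetical AST root; payloads are trimmed down for brevity.
#[derive(Debug)]
enum Statement {
    CreateTable { table: String, columns: Vec<String> },
    Insert { table: String, rows: usize },
    Select { table: String },
}

/// Routes each statement kind to its handler through one exhaustive match.
/// Adding a new variant makes this function fail to compile until handled.
fn execute(stmt: &Statement) -> String {
    match stmt {
        Statement::CreateTable { table, columns } => {
            format!("create {} with {} column(s)", table, columns.len())
        }
        Statement::Insert { table, rows } => {
            format!("insert {} row(s) into {}", rows, table)
        }
        Statement::Select { table } => format!("scan {}", table),
    }
}
```

The compiler's exhaustiveness check is the main design benefit: a statement type cannot be added to the parser without the executor being updated to handle it.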
LangDB uses Rust's ownership system and synchronization primitives:
- Arc<RwLock<HashMap>> for shared database access
- Multiple readers can access data simultaneously
- Writers get exclusive access
- Rust's ownership and Send/Sync rules rule out data races at compile time; the RwLock enforces reader/writer exclusion at runtime
Comprehensive error types for each layer:
- ParseError: Syntax and parsing errors
- ExecutionError: Query execution failures
- StorageError: Data access and validation errors
- TypeError: Type conversion and validation errors
All error types derive thiserror::Error, which implements std::error::Error for them and keeps error handling consistent across layers.
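The layering can be pictured with a hand-written, standard-library-only sketch of two such error types. The variant names and messages are hypothetical, and the Display, Error, and From implementations are written out manually here; they correspond roughly to what the thiserror derive generates from attributes.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical storage-layer error; variants are illustrative only.
#[derive(Debug, PartialEq)]
enum StorageError {
    TableNotFound(String),
    TableAlreadyExists(String),
}

impl fmt::Display for StorageError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            StorageError::TableNotFound(name) => write!(f, "table '{}' not found", name),
            StorageError::TableAlreadyExists(name) => {
                write!(f, "table '{}' already exists", name)
            }
        }
    }
}

impl Error for StorageError {}

/// Hypothetical executor-layer error that wraps lower-layer failures.
#[derive(Debug)]
enum ExecutionError {
    Storage(StorageError),
}

impl fmt::Display for ExecutionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ExecutionError::Storage(inner) => write!(f, "execution failed: {}", inner),
        }
    }
}

impl Error for ExecutionError {
    /// Exposes the wrapped error so callers can walk the cause chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            ExecutionError::Storage(inner) => Some(inner),
        }
    }
}

impl From<StorageError> for ExecutionError {
    fn from(inner: StorageError) -> Self {
        ExecutionError::Storage(inner)
    }
}

/// Stand-in for a storage lookup that always fails.
fn lookup(name: &str) -> Result<(), StorageError> {
    Err(StorageError::TableNotFound(name.to_string()))
}

/// The `?` operator converts StorageError into ExecutionError via `From`.
fn execute(name: &str) -> Result<(), ExecutionError> {
    lookup(name)?;
    Ok(())
}
```

The From implementation is what lets each layer use the ? operator on results from the layer below without manual mapping.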
LangDB is organized into several modules, each handling a specific database component:
- parser: SQL parsing using the nom library
  - Parses SQL strings into structured Statement objects
  - Handles lexical analysis and syntactic validation
- types: Core data types and structures
  - Defines Value, DataType, and other foundational types
  - Implements schema validation and type conversion
- storage: Data storage and retrieval
  - Manages tables and their data
  - Provides thread-safe access to the database
- executor: Query execution
  - Executes parsed SQL statements
  - Routes operations to the storage engine
  - Handles result formatting
- main: REPL interface
  - Provides the interactive command-line interface
  - Processes user input and displays results
langdb> CREATE TABLE products (id INTEGER, name TEXT, price INTEGER);
langdb> INSERT INTO products VALUES (1, 'Laptop', 1200), (2, 'Phone', 800);
Inserted rows. Total rows: 2
langdb> SELECT * FROM products;
| id | name | price |
+----+--------+-------+
| 1 | Laptop | 1200 |
| 2 | Phone | 800 |
2 row(s) returned
langdb> SELECT name, price FROM products WHERE price > 1000;
| name | price |
+--------+-------+
| Laptop | 1200 |
1 row(s) returned
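A table like the ones in the session above could be rendered along these lines. This is a sketch only: format_table and format_row are hypothetical names, and left-aligned padding to the widest cell is an assumption, chosen because it reproduces the +----+--------+-------+ separator shown for the products table.

```rust
/// Renders one row, left-padding every cell to its column width.
fn format_row(cells: &[&str], widths: &[usize]) -> String {
    let padded: Vec<String> = cells
        .iter()
        .zip(widths)
        .map(|(cell, width)| format!(" {:<w$} ", cell, w = *width))
        .collect();
    format!("|{}|", padded.join("|"))
}

/// Renders header, separator, data rows, and a row-count footer.
fn format_table(headers: &[&str], rows: &[Vec<String>]) -> String {
    // Each column is as wide as its longest cell, header included.
    let mut widths: Vec<usize> = headers.iter().map(|h| h.chars().count()).collect();
    for row in rows {
        for (width, cell) in widths.iter_mut().zip(row) {
            *width = (*width).max(cell.chars().count());
        }
    }
    let separator: Vec<String> = widths.iter().map(|w| "-".repeat(w + 2)).collect();
    let mut lines = vec![
        format_row(headers, &widths),
        format!("+{}+", separator.join("+")),
    ];
    for row in rows {
        let cells: Vec<&str> = row.iter().map(|c| c.as_str()).collect();
        lines.push(format_row(&cells, &widths));
    }
    lines.push(format!("{} row(s) returned", rows.len()));
    lines.join("\n")
}
```

Computing widths in a first pass over all rows keeps the columns aligned regardless of which row holds the longest value.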
LangDB includes comprehensive tests to verify functionality:
# Run all tests
cargo test
# Run specific test modules
cargo test --test integration_tests
- Unit tests: Individual module testing
- Integration tests: End-to-end testing of the database functionality
- Edge case testing: Validation of error handling and corner cases
langdb/
├── Cargo.toml # Project configuration
├── src/
│ ├── main.rs # REPL implementation
│ ├── parser/ # SQL parsing
│ ├── types/ # Core data types
│ ├── storage/ # Data storage
│ └── executor/ # Query execution
└── tests/
└── integration_tests.rs # Integration tests
- Persistent storage (file-based)
- Support for more SQL features (JOIN, GROUP BY, etc.)
- Additional data types (FLOAT, BOOLEAN, DATE, etc.)
- Indexing for improved query performance
- Transaction support
- More complex WHERE clause expressions
Contributions are welcome! Here's how you can contribute to LangDB:
- Fork the repository
- Create a feature branch: git checkout -b my-new-feature
- Make your changes and commit them: git commit -am 'Add some feature'
- Push to the branch: git push origin my-new-feature
- Submit a pull request
Please make sure your code follows the existing style and includes appropriate tests.
Want to understand the thought process, challenges, and learnings behind this project? Read the detailed journey:
- PROJECT_JOURNEY.md - A deep dive into why and how this database was built
See CHANGELOG.md for a detailed history of changes and version information.
This project is licensed under the MIT License - see the LICENSE file for details.
LangDB was created as an educational project to learn about database internals, SQL parsing, and Rust programming. It is not intended for production use.