Rust-based SQL development tooling POC #134

tobyhede · 2025-10-20T09:18:17Z

Description:

💪 What

Adds a proof-of-concept Rust-based development framework for EQL that demonstrates improved testing, documentation generation, and multi-database
support using the Config module.

- Implements type-safe dependency resolution via compile-time graph walking - components declare dependencies through associated types, eliminating manual dependency ordering
- Adds transaction-isolated test harness (TestDb) - tests run in auto-rollback transactions, removing need for manual database cleanup
- Creates automatic build tool that generates SQL installers (280 lines) by walking the component dependency graph
- Migrates Config module to new structure - add_column function works end-to-end with all dependencies loaded automatically
- Generates customer-facing documentation via rustdoc with SQL code examples inline with trait definitions
- Implements structured error handling with thiserror - errors include query context for better debugging
- Passes all 16 tests across 4 crates (eql-core, eql-postgres, eql-test, eql-build)

Workspace Structure

- eql-core: Trait definitions (Component, Config, Dependencies)
- eql-postgres: PostgreSQL implementation of Config module
- eql-test: Test harness with transaction isolation
- eql-build: Build tool for SQL extraction

🤔 Why

- Current pain: Manual dependency management with REQUIRE comments is error-prone and requires topological sort
- Current pain: Tests require manual database resets, causing interference between test runs
- Current pain: Documentation lives separately from code, leading to drift
- Rust solution: Type system enforces dependency relationships at compile time
- Rust solution: TestDb RAII pattern ensures automatic cleanup via Drop
- Rust solution: Rustdoc keeps documentation with trait definitions

This POC proves the approach is viable before investing in full migration. The Config module serves as a representative example of how all EQL modules
could be structured.

👀 Usage

Running Tests

cd .worktrees/rust-sql-tooling
cargo test --all

Building SQL Installer

cargo run --bin eql-build postgres
# Output: release/cipherstash-encrypt-postgres-poc.sql

Generating Documentation

cargo doc --no-deps --open
# Navigate to eql_core::config::Config to see SQL examples

👩‍🔬 How to validate

1. Verify automatic dependency resolution works

cd .worktrees/rust-sql-tooling
cargo run --bin eql-build postgres
Confirm that the output shows 8 dependencies resolved in the correct order (types.sql before functions_private.sql, etc.). Open
release/cipherstash-encrypt-postgres-poc.sql and verify that types are defined before the functions that use them.

2. Verify transaction isolation in tests

cargo test --package eql-test test_testdb_transaction_isolation -- --nocapture
Run the test multiple times - it should pass consistently without manual cleanup. The test creates a table, inserts data, queries it, then drops the
transaction.

3. Verify add_column works end-to-end

cargo test --package eql-postgres test_add_column_creates_config -- --nocapture
This test loads all SQL dependencies automatically, calls add_column, and verifies:
- Config JSONB has correct structure
- Configuration stored in database
- Encrypted constraint added to table

4. Verify documentation includes SQL examples

cargo doc --no-deps
open target/doc/eql_core/trait.Config.html
Navigate to the add_column method - documentation should show SQL usage examples, not Rust code.

🔗 Related links

- Implementation plan: docs/plans/2025-01-20-rust-sql-tooling-poc-v2.md

Add two new reference documents:
- eql-functions.md: Complete API reference for all 40+ EQL functions organized by category (configuration, query, JSONB, array, helpers, aggregates). Includes both operator and function forms with examples.
- database-indexes.md: PostgreSQL B-tree index guide covering creation, usage requirements, troubleshooting, and best practices. Based on operator_class_test.sql implementation.

These documents provide the missing comprehensive reference material for EQL's public API and PostgreSQL index integration.

Complete rewrite of json-support.md to match actual EQL API:
- Remove references to non-existent functions (ste_vec_value, ste_vec_term, ste_vec_terms)
- Document actual functions: jsonb_path_query, jsonb_array_elements, jsonb_path_exists, grouped_value
- Show correct operator usage (@>, <@, ->, ->>)
- Add proper selector-based querying examples
- Reduce from 887 to 297 lines, removing outdated content
- Add explanation of how ste_vec indexing works

All examples now verified against test files and source code.

Fix multiple documentation issues verified against implementation:
- index-config.md: Correct function names from cs_* to eql_v2.*, add missing cast types (real, double)
- proxy-configuration.md: Change from wrapped function calls to direct operator usage (= instead of hmac_256() wrappers), add function alternatives for Supabase
- docs/README.md: Fix broken links, add references to new documentation (database-indexes.md, eql-functions.md)

All changes verified against test files to ensure accuracy.

Add comprehensive user guidance to main README.md:
- Add step-by-step getting started example with SQL code showing table creation, column configuration, and index setup
- Add troubleshooting section with common errors, causes, and solutions
- Clarify that CipherStash Proxy/Protect.js are required for encryption
- Update TOC: change "Developing" to "Contributing"
- Improve installation instructions and upgrade process

Makes the README more actionable for new users with concrete examples and solutions to common setup issues.

Add "Adding SQL" section to DEVELOPMENT.md with guidelines for:
- Modular SQL file organization
- Operator wrappers vs implementation separation
- Dependency declarations using REQUIRE comments
- Test file naming conventions
- Build system usage with tsort

Provides clear guidance for contributors developing new SQL features.

Fix critical documentation issues and add missing function documentation:
- Fix return type names: change eql_v2_* to eql_v2.* for index term extraction functions (hmac_256, blake3, bloom_filter, ore_block_u64_8_256)
- Add missing function equivalents for Supabase compatibility: ilike(), lt(), lte(), gt(), gte()
- Document configuration lifecycle functions: migrate_config(), activate_config(), discard(), reload_config()
- Add aggregate functions: min(), max()
- Document jsonb overloads for JSONB path and array functions
- Add new Utility Functions section with version(), to_encrypted(), to_jsonb(), check_encrypted()
- Standardize terminology: change "context" to "prefix" for ste_vec index configuration
- Update table of contents to include new sections

This update brings the documentation in line with actual implementation and test behavior, addressing 15+ previously undocumented functions.

- Fix check_encrypted() description to reflect actual validation (v,c,i fields not v,k,i)
- Update check_encrypted() behavior: raises exceptions instead of returning false
- Fix version() example to not show hardcoded version string
- Enhance JSON path operator examples with specific selector types (text, encrypted, integer)
- Add array index access example for -> operator

…error

Changes to POC plan:
- Remove Task 7 (manual doc extraction to markdown)
- Add new Task 7 (structured error handling with thiserror)
- Update success criteria to use rustdoc for customer-facing docs
- Update verification checklist to use 'cargo doc --no-deps --open'

Rationale:
- Rustdoc comments already generate documentation automatically
- Writing /// comments for customers (with SQL examples) IS the docs
- No need for separate markdown extraction - rustdoc HTML is better
- thiserror provides better error messages for debugging

This simplifies the POC while addressing the original doc drift problem.

Create workspace with four crates:
- eql-core: Trait definitions for EQL API
- eql-postgres: PostgreSQL implementation
- eql-test: Test harness with transaction isolation
- eql-build: Build tool for SQL extraction

This is a proof of concept for Rust-based SQL development tooling to improve testing, documentation, and multi-database support.

Add error hierarchy:
- EqlError: Top-level error type
- ComponentError: SQL file and dependency errors
- DatabaseError: Database operation errors

Benefits:
- Clear error messages with context (e.g., which query failed)
- Type-safe error handling throughout the codebase
- Better debugging experience for tests and build tools

Errors defined first (before other code) to enable TDD.
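
As a rough illustration of that hierarchy, the sketch below writes the three error types by hand using only the standard library. The PR itself uses thiserror, whose derive would generate the Display and Error impls instead; the variant names (Query, SqlFileNotFound), their fields, and the message wording are assumptions for illustration, not the definitions in eql-core.

```rust
use std::error::Error;
use std::fmt;

/// Database operation errors; the failing SQL is kept as context.
/// (Variant and field names are hypothetical.)
#[derive(Debug)]
pub enum DatabaseError {
    Query { query: String, reason: String },
}

/// SQL file and dependency errors. (Variant name is hypothetical.)
#[derive(Debug)]
pub enum ComponentError {
    SqlFileNotFound { path: String },
}

/// Top-level error type wrapping the two families above.
#[derive(Debug)]
pub enum EqlError {
    Component(ComponentError),
    Database(DatabaseError),
}

impl fmt::Display for DatabaseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DatabaseError::Query { query, reason } => {
                write!(f, "query failed: {} ({})", query, reason)
            }
        }
    }
}

impl fmt::Display for ComponentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ComponentError::SqlFileNotFound { path } => {
                write!(f, "SQL file not found: {}", path)
            }
        }
    }
}

impl fmt::Display for EqlError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EqlError::Component(e) => write!(f, "component error: {}", e),
            EqlError::Database(e) => write!(f, "database error: {}", e),
        }
    }
}

impl Error for DatabaseError {}
impl Error for ComponentError {}

impl Error for EqlError {
    // Expose the wrapped error so callers can walk the chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            EqlError::Component(e) => Some(e),
            EqlError::Database(e) => Some(e),
        }
    }
}

// `From` impls let `?` convert the specific errors into EqlError.
impl From<ComponentError> for EqlError {
    fn from(e: ComponentError) -> Self {
        EqlError::Component(e)
    }
}

impl From<DatabaseError> for EqlError {
    fn from(e: DatabaseError) -> Self {
        EqlError::Database(e)
    }
}
```

Keeping the query text inside the error is what lets a failing test report which statement broke, rather than only the driver's message.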

Add core trait system for EQL API:
- Component trait: Represents SQL file with type-safe dependencies
- Dependencies trait: Automatic dependency collection via type system
- Config trait: Configuration management API with rustdoc examples

Key innovation: Component::collect_dependencies() walks the type graph at compile time to resolve SQL load order automatically.

The Config trait includes documentation that will be auto-generated into customer-facing docs, preventing documentation drift.
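
For reviewers who want a picture of the mechanism, here is a minimal sketch of how associated types can encode the dependency graph. It is not the code in eql-core: the method names sql_file, collect_into and collect_all, the tuple-only Dependencies impls, and the reduced dependency set for AddColumn are all assumptions made for illustration. The compiler checks that every declared dependency is itself a Component; in this sketch the ordered file list is assembled when collect_dependencies() runs.

```rust
/// A SQL file plus the components it depends on (hypothetical shape).
pub trait Component {
    /// Dependencies expressed as a type: `()`, `(A,)` or `(A, B)`.
    type Dependencies: Dependencies;

    /// Path of the SQL file backing this component.
    fn sql_file() -> &'static str;

    /// Append dependencies first, then this file, skipping files already seen.
    fn collect_into(files: &mut Vec<&'static str>) {
        <Self::Dependencies as Dependencies>::collect_all(files);
        let own = Self::sql_file();
        if !files.contains(&own) {
            files.push(own);
        }
    }

    /// Load order for this component: every dependency before its dependents.
    fn collect_dependencies() -> Vec<&'static str> {
        let mut files = Vec::new();
        Self::collect_into(&mut files);
        files
    }
}

/// Implemented for tuples of components so they can be walked recursively.
pub trait Dependencies {
    fn collect_all(files: &mut Vec<&'static str>);
}

impl Dependencies for () {
    fn collect_all(_files: &mut Vec<&'static str>) {}
}

impl<A: Component> Dependencies for (A,) {
    fn collect_all(files: &mut Vec<&'static str>) {
        A::collect_into(files);
    }
}

impl<A: Component, B: Component> Dependencies for (A, B) {
    fn collect_all(files: &mut Vec<&'static str>) {
        A::collect_into(files);
        B::collect_into(files);
    }
}

pub struct ConfigTypes;
pub struct ConfigPrivateFunctions;
pub struct MigrateActivate;
pub struct AddColumn;

impl Component for ConfigTypes {
    type Dependencies = ();
    fn sql_file() -> &'static str { "config/types.sql" }
}

impl Component for ConfigPrivateFunctions {
    type Dependencies = (ConfigTypes,);
    fn sql_file() -> &'static str { "config/functions_private.sql" }
}

impl Component for MigrateActivate {
    type Dependencies = (ConfigTypes,);
    fn sql_file() -> &'static str { "config/migrate_activate.sql" }
}

// Simplified for the sketch: the PR's AddColumn depends on more components.
impl Component for AddColumn {
    type Dependencies = (ConfigPrivateFunctions, MigrateActivate);
    fn sql_file() -> &'static str { "config/add_column.sql" }
}
```

With these declarations, AddColumn::collect_dependencies() yields the types file once, ahead of both files that depend on it, even though two components name ConfigTypes as a dependency.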

Create TestDb struct providing:
- Automatic transaction BEGIN on creation
- Auto-rollback on drop (clean slate for next test)
- Helper methods: execute(), query_one()
- Assertion helpers: assert_jsonb_has_key()
- Structured DatabaseError with query context

This solves current testing pain points:
- No more manual database resets between tests
- Clear error messages (shows which query failed)
- Foundation for parallel test execution (future)
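
The BEGIN-on-create, ROLLBACK-on-drop behaviour can be sketched without a database by recording statements on a stand-in connection. FakeConn and its methods are hypothetical and exist only so the example runs on its own; the real TestDb in eql-test talks to PostgreSQL and its constructor signature may differ.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Stand-in for a database connection: records each statement it receives.
/// (Hypothetical; not part of the PR.)
#[derive(Default)]
pub struct FakeConn {
    log: RefCell<Vec<String>>,
}

impl FakeConn {
    fn run(&self, sql: &str) {
        self.log.borrow_mut().push(sql.to_string());
    }

    /// Statements seen so far, in order.
    pub fn statements(&self) -> Vec<String> {
        self.log.borrow().clone()
    }
}

/// RAII guard: opens a transaction when created, rolls it back when dropped.
pub struct TestDb {
    conn: Rc<FakeConn>,
}

impl TestDb {
    pub fn new(conn: Rc<FakeConn>) -> Self {
        conn.run("BEGIN");
        TestDb { conn }
    }

    pub fn execute(&self, sql: &str) {
        self.conn.run(sql);
    }
}

impl Drop for TestDb {
    // Runs when the guard goes out of scope, including during a panic unwind,
    // so a failing test still leaves the database clean.
    fn drop(&mut self) {
        self.conn.run("ROLLBACK");
    }
}
```

Because cleanup lives in Drop, a test body never has to remember to undo its own changes; letting the TestDb value fall out of scope is enough.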

Add SQL implementations:
- config/types.sql: Configuration table and enum type
- config/functions_private.sql: Helper functions (config_default, etc.)
- config/migrate_activate.sql: Migration and activation functions
- config/add_column.sql: Main add_column function
- encrypted/check_encrypted.sql: Stub for encrypted data validation
- encrypted/add_encrypted_constraint.sql: Constraint helper

All dependencies for add_column now present. Next task will wire these up via Rust Component trait with automatic dependency resolution.

Add PostgreSQL component implementations:
- ConfigTypes: Configuration table/enum (no dependencies)
- ConfigPrivateFunctions: Helper functions (depends on ConfigTypes)
- CheckEncrypted: Validation stub (no dependencies)
- AddEncryptedConstraint: Constraint helper (depends on CheckEncrypted)
- MigrateActivate: Migration functions (depends on ConfigTypes)
- AddColumn: Main function (depends on all above)

Key achievement: Component::collect_dependencies() automatically resolves load order via type-level dependency graph.

Tests verify:
- SQL files exist at expected paths
- Dependencies collected without duplicates
- Dependency order respects constraints
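
The last two checks could look roughly like the helpers below. This is a sketch of the idea, not the tests in eql-postgres: the function names and the representation of constraints as (dependency, dependent) pairs of file paths are assumptions.

```rust
use std::collections::HashSet;

/// True when no file appears twice in the load order.
pub fn has_no_duplicates(order: &[&str]) -> bool {
    let mut seen = HashSet::new();
    // `insert` returns false on a repeat, which ends `all` early.
    order.iter().all(|file| seen.insert(*file))
}

/// True when, for every (dependency, dependent) pair, both files are present
/// and the dependency is loaded before the dependent.
pub fn respects_constraints(order: &[&str], constraints: &[(&str, &str)]) -> bool {
    let position = |name: &str| order.iter().position(|file| *file == name);
    constraints.iter().all(|&(dependency, dependent)| {
        match (position(dependency), position(dependent)) {
            (Some(first), Some(second)) => first < second,
            // A missing file counts as a violation.
            _ => false,
        }
    })
}
```

Checking pairwise constraints rather than one exact sequence keeps such tests stable when an unrelated component is added or when two independent files swap places.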

Add comprehensive integration tests:
1. test_add_column_creates_config: Verifies complete workflow
   - Loads all dependencies via Component::collect_dependencies()
   - Calls add_column function
   - Validates JSONB config structure
   - Confirms config stored in database
   - Checks encrypted constraint was added
2. test_add_column_rejects_duplicate: Verifies error handling
   - Ensures duplicate column config raises exception

Also added:
- batch_execute() to TestDb for loading multi-statement SQL files
- ConfigTables and ConfigIndexes components
- Fixed check_encrypted to accept eql_v2_encrypted type

Key achievement: add_column function works end-to-end in POC. All dependencies loaded automatically via type-safe dependency graph.

Create build tool that:
- Uses Component::collect_dependencies() for automatic ordering
- Reads SQL files in dependency order
- Generates release/cipherstash-encrypt-postgres-poc.sql
- Removes REQUIRE comments (metadata from old system)

Key achievement: Build tool uses type-level dependency graph to automatically resolve SQL load order. No manual topological sort or configuration files needed.

Tests verify:
- Output file created
- Contains all expected SQL
- Dependencies in correct order (types before functions using them)
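
The concatenate-and-strip step can be sketched as a pure function over file contents that are already in dependency order. Two things here are assumptions rather than facts from the PR: that a REQUIRE comment is a line beginning with "-- REQUIRE:", and the "-- Source:" banner written before each file. The real eql-build reads from disk and writes the release file.

```rust
/// Drop lines that are REQUIRE comments (assumed form: `-- REQUIRE: <path>`).
pub fn strip_require_comments(sql: &str) -> String {
    sql.lines()
        .filter(|line| !line.trim_start().starts_with("-- REQUIRE:"))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Concatenate (path, contents) pairs that are already in dependency order.
pub fn build_installer(files: &[(&str, &str)]) -> String {
    let mut out = String::new();
    for (path, sql) in files {
        // The banner is a hypothetical convenience for reading the output.
        out.push_str(&format!("-- Source: {}\n", path));
        out.push_str(&strip_require_comments(sql));
        out.push_str("\n\n");
    }
    out
}
```

Keeping this step free of I/O makes the ordering and stripping behaviour easy to test in isolation from the filesystem.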

- Add sql_component! macro with automatic path inference from module::Component
- Convert PascalCase component names to snake_case SQL filenames using paste crate
- Support path overrides when naming doesn't follow convention
- Handle single and multiple dependencies correctly (no tuple wrapping for single deps)
- Move customer-facing documentation from component types to trait methods. Customer docs now appear on functions (add_column, remove_column, etc.) not on internal component structs (AddColumn, RemoveColumn, etc.)
- Reduce config.rs from 155 lines to 51 lines (67% reduction)
- Component declarations: ~13 lines/component -> ~1 line/component

Example usage:
sql_component!(config::AddColumn, deps: [Dep1, Dep2, ...]);
sql_component!(config::ConfigTypes => "types.sql");
sql_component!(RemoveColumn => "not_implemented.sql");

All tests passing.
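
To make the naming convention concrete, the sketch below does the PascalCase to snake_case conversion as an ordinary runtime function. The macro in the PR does this at compile time with the paste crate; the function names here and the "module/file.sql" path layout are assumptions for illustration only.

```rust
/// Convert an ASCII PascalCase identifier to snake_case.
pub fn pascal_to_snake(name: &str) -> String {
    let mut out = String::with_capacity(name.len() + 4);
    for (index, ch) in name.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            // Each capital after the first starts a new word.
            if index > 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

/// Inferred SQL path for `module::Component` (assumed layout: module/file.sql).
pub fn inferred_sql_path(module: &str, component: &str) -> String {
    format!("{}/{}.sql", module, pascal_to_snake(component))
}
```

Under this rule config::AddColumn maps to config/add_column.sql, while ConfigTypes would map to config_types.sql, which is the kind of mismatch the explicit "types.sql" override in the example usage addresses.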

tobyhede added 19 commits October 13, 2025 12:16

chore: add .worktrees/ to .gitignore

1dbbae7

Prevents worktree contents from being tracked in repository.

chore: add Cargo.lock and implementation plan

0919c47

- Cargo.lock: Lock file for reproducible builds
- Implementation plan: Documents the POC approach and tasks
