tobyhede
Contributor

Description:

💪 What

Adds a proof-of-concept Rust-based development framework for EQL that demonstrates improved testing, documentation generation, and multi-database
support using the Config module.

  • Implements type-safe dependency resolution via compile-time graph walking - components declare dependencies through associated types,
    eliminating manual dependency ordering
  • Adds transaction-isolated test harness (TestDb) - tests run in auto-rollback transactions, removing the need for manual database cleanup
  • Creates automatic build tool that generates SQL installers (280 lines) by walking the component dependency graph
  • Migrates Config module to new structure - add_column function works end-to-end with all dependencies loaded automatically
  • Generates customer-facing documentation via rustdoc with SQL code examples inline with trait definitions
  • Implements structured error handling with thiserror - errors include query context for better debugging
  • Passes all 16 tests across 4 crates (eql-core, eql-postgres, eql-test, eql-build)

Workspace Structure

  • eql-core: Trait definitions (Component, Config, Dependencies)
  • eql-postgres: PostgreSQL implementation of Config module
  • eql-test: Test harness with transaction isolation
  • eql-build: Build tool for SQL extraction

🤔 Why

  • Current pain: Manual dependency management with REQUIRE comments is error-prone and requires a topological sort
  • Current pain: Tests require manual database resets, causing interference between test runs
  • Current pain: Documentation lives separately from code, leading to drift
  • Rust solution: Type system enforces dependency relationships at compile time
  • Rust solution: TestDb RAII pattern ensures automatic cleanup via Drop
  • Rust solution: Rustdoc keeps documentation with trait definitions

This POC proves the approach is viable before investing in full migration. The Config module serves as a representative example of how all EQL modules
could be structured.

👀 Usage

Running Tests

cd .worktrees/rust-sql-tooling
cargo test --all

Building SQL Installer

cargo run --bin eql-build postgres
# Output: release/cipherstash-encrypt-postgres-poc.sql

Generating Documentation

cargo doc --no-deps --open
# Navigate to eql_core::config::Config to see SQL examples

👩‍🔬 How to validate

1. Verify automatic dependency resolution works

cd .worktrees/rust-sql-tooling
cargo run --bin eql-build postgres

Observe that the output shows 8 dependencies resolved in the correct order (types.sql before functions_private.sql, etc.). Open
release/cipherstash-encrypt-postgres-poc.sql and verify that types are defined before the functions that use them.

2. Verify transaction isolation in tests

cargo test --package eql-test test_testdb_transaction_isolation -- --nocapture

Run the test multiple times. It should pass consistently without manual cleanup. The test creates a table, inserts data, queries it, then drops the
TestDb, which rolls back the transaction.

3. Verify add_column works end-to-end

cargo test --package eql-postgres test_add_column_creates_config -- --nocapture

This test loads all SQL dependencies automatically, calls add_column, and verifies:
- Config JSONB has correct structure
- Configuration stored in database
- Encrypted constraint added to table

4. Verify documentation includes SQL examples

cargo doc --no-deps
open target/doc/eql_core/trait.Config.html

Navigate to the add_column method. Its documentation should show SQL usage examples, not Rust code.
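
As a rough illustration of what "SQL examples inline with trait definitions" can look like, the sketch below puts a SQL snippet in a rustdoc comment on a trait method. The method signature, the `PostgresConfig` type, and the exact SQL call are assumptions made for illustration; the POC's real Config trait may differ. The `~~~sql` fence marks the snippet as non-Rust, so rustdoc renders it without trying to compile it as a doctest.

```rust
/// Configuration management API (illustrative sketch, not the POC's trait).
pub trait Config {
    /// Configure a column for encryption.
    ///
    /// # Example
    ///
    /// ~~~sql
    /// -- Assumed call shape, shown only to illustrate SQL in rustdoc.
    /// SELECT eql_v2.add_column('users', 'encrypted_email');
    /// ~~~
    ///
    /// Returns the path of the SQL file that implements the function.
    fn add_column(&self) -> &'static str;
}

/// Hypothetical PostgreSQL implementation of the sketch trait.
pub struct PostgresConfig;

impl Config for PostgresConfig {
    fn add_column(&self) -> &'static str {
        "config/add_column.sql"
    }
}
```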

🔗 Related links

- Implementation plan: docs/plans/2025-01-20-rust-sql-tooling-poc-v2.md

Add two new reference documents:
- eql-functions.md: Complete API reference for all 40+ EQL functions
  organized by category (configuration, query, JSONB, array, helpers,
  aggregates). Includes both operator and function forms with examples.
- database-indexes.md: PostgreSQL B-tree index guide covering creation,
  usage requirements, troubleshooting, and best practices. Based on
  operator_class_test.sql implementation.

These documents provide the missing comprehensive reference material for
EQL's public API and PostgreSQL index integration.

Complete rewrite of json-support.md to match actual EQL API:

- Remove references to non-existent functions (ste_vec_value,
  ste_vec_term, ste_vec_terms)
- Document actual functions: jsonb_path_query, jsonb_array_elements,
  jsonb_path_exists, grouped_value
- Show correct operator usage (@>, <@, ->, ->>)
- Add proper selector-based querying examples
- Reduce from 887 to 297 lines, removing outdated content
- Add explanation of how ste_vec indexing works

All examples now verified against test files and source code.

Fix multiple documentation issues verified against implementation:

- index-config.md: Correct function names from cs_* to eql_v2.*,
  add missing cast types (real, double)
- proxy-configuration.md: Change from wrapped function calls to direct
  operator usage (= instead of hmac_256() wrappers), add function
  alternatives for Supabase
- docs/README.md: Fix broken links, add references to new documentation
  (database-indexes.md, eql-functions.md)

All changes verified against test files to ensure accuracy.

Add comprehensive user guidance to main README.md:

- Add step-by-step getting started example with SQL code showing
  table creation, column configuration, and index setup
- Add troubleshooting section with common errors, causes, and solutions
- Clarify that CipherStash Proxy/Protect.js are required for encryption
- Update TOC: change "Developing" to "Contributing"
- Improve installation instructions and upgrade process

Makes the README more actionable for new users with concrete examples
and solutions to common setup issues.

Add "Adding SQL" section to DEVELOPMENT.md with guidelines for:
- Modular SQL file organization
- Operator wrappers vs implementation separation
- Dependency declarations using REQUIRE comments
- Test file naming conventions
- Build system usage with tsort

Provides clear guidance for contributors developing new SQL features.

Fix critical documentation issues and add missing function documentation:

- Fix return type names: change eql_v2_* to eql_v2.* for index term
  extraction functions (hmac_256, blake3, bloom_filter, ore_block_u64_8_256)
- Add missing function equivalents for Supabase compatibility:
  ilike(), lt(), lte(), gt(), gte()
- Document configuration lifecycle functions: migrate_config(),
  activate_config(), discard(), reload_config()
- Add aggregate functions: min(), max()
- Document jsonb overloads for JSONB path and array functions
- Add new Utility Functions section with version(), to_encrypted(),
  to_jsonb(), check_encrypted()
- Standardize terminology: change "context" to "prefix" for ste_vec
  index configuration
- Update table of contents to include new sections

This update brings the documentation in line with actual implementation
and test behavior, addressing 15+ previously undocumented functions.

- Fix check_encrypted() description to reflect actual validation (v,c,i fields not v,k,i)
- Update check_encrypted() behavior: raises exceptions instead of returning false
- Fix version() example to not show hardcoded version string
- Enhance JSON path operator examples with specific selector types (text, encrypted, integer)
- Add array index access example for -> operator

Prevents worktree contents from being tracked in repository.
…error

Changes to POC plan:
- Remove Task 7 (manual doc extraction to markdown)
- Add new Task 7 (structured error handling with thiserror)
- Update success criteria to use rustdoc for customer-facing docs
- Update verification checklist to use 'cargo doc --no-deps --open'

Rationale:
- Rustdoc comments already generate documentation automatically
- Writing /// comments for customers (with SQL examples) IS the docs
- No need for separate markdown extraction - rustdoc HTML is better
- thiserror provides better error messages for debugging

This simplifies the POC while addressing the original doc drift problem.

Create workspace with four crates:
- eql-core: Trait definitions for EQL API
- eql-postgres: PostgreSQL implementation
- eql-test: Test harness with transaction isolation
- eql-build: Build tool for SQL extraction

This is a proof of concept for Rust-based SQL development tooling
to improve testing, documentation, and multi-database support.

Add error hierarchy:
- EqlError: Top-level error type
- ComponentError: SQL file and dependency errors
- DatabaseError: Database operation errors

Benefits:
- Clear error messages with context (e.g., which query failed)
- Type-safe error handling throughout the codebase
- Better debugging experience for tests and build tools

Errors defined first (before other code) to enable TDD.
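
A minimal sketch of how such a hierarchy might be laid out follows. The variant names and message wording are assumptions, not the POC's definitions. To keep the example dependency-free, Display and Error are written by hand here; in the POC these impls would come from thiserror's derive instead.

```rust
use std::error::Error;
use std::fmt;

/// Database operation errors; the failing query is kept for context.
#[derive(Debug)]
pub enum DatabaseError {
    Query { query: String, message: String },
}

/// SQL file and dependency errors (variant is hypothetical).
#[derive(Debug)]
pub enum ComponentError {
    MissingSqlFile { path: String },
}

/// Top-level error type wrapping the more specific errors.
#[derive(Debug)]
pub enum EqlError {
    Component(ComponentError),
    Database(DatabaseError),
}

impl fmt::Display for DatabaseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DatabaseError::Query { query, message } => {
                write!(f, "query failed: {message} (query: {query})")
            }
        }
    }
}

impl fmt::Display for ComponentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ComponentError::MissingSqlFile { path } => {
                write!(f, "SQL file not found: {path}")
            }
        }
    }
}

impl fmt::Display for EqlError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EqlError::Component(e) => write!(f, "component error: {e}"),
            EqlError::Database(e) => write!(f, "database error: {e}"),
        }
    }
}

impl Error for DatabaseError {}
impl Error for ComponentError {}

impl Error for EqlError {
    /// Expose the wrapped error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            EqlError::Component(e) => Some(e),
            EqlError::Database(e) => Some(e),
        }
    }
}

// `From` impls let `?` convert specific errors into the top-level type.
impl From<DatabaseError> for EqlError {
    fn from(e: DatabaseError) -> Self {
        EqlError::Database(e)
    }
}

impl From<ComponentError> for EqlError {
    fn from(e: ComponentError) -> Self {
        EqlError::Component(e)
    }
}
```

Carrying the query text inside the error is what makes the "which query failed" message possible without extra logging at each call site.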

Add core trait system for EQL API:
- Component trait: Represents SQL file with type-safe dependencies
- Dependencies trait: Automatic dependency collection via type system
- Config trait: Configuration management API with rustdoc examples

Key innovation: Component::collect_dependencies() walks the type graph
at compile time to resolve SQL load order automatically.

The Config trait includes documentation that will be auto-generated
into customer-facing docs, preventing documentation drift.
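
The idea of declaring dependencies through associated types can be sketched as below. This is not the POC's code: the method names `collect_into` and `collect_all`, the tuple-based `Dependencies` impls, and the reduced four-component graph are assumptions chosen to keep the example small. In this sketch the graph is fixed by the type system at compile time, while the walk that produces the ordered file list runs when `collect_dependencies()` is called.

```rust
use std::any::TypeId;

/// A SQL file plus the components that must be loaded before it.
pub trait Component: 'static {
    type Dependencies: Dependencies;

    fn sql_file() -> &'static str;

    /// Depth-first walk: dependencies first, then this file, skipping repeats.
    fn collect_into(seen: &mut Vec<TypeId>, files: &mut Vec<&'static str>) {
        let id = TypeId::of::<Self>();
        if seen.contains(&id) {
            return;
        }
        seen.push(id);
        Self::Dependencies::collect_all(seen, files);
        files.push(Self::sql_file());
    }

    /// SQL files in load order, each listed once.
    fn collect_dependencies() -> Vec<&'static str> {
        let mut seen = Vec::new();
        let mut files = Vec::new();
        Self::collect_into(&mut seen, &mut files);
        files
    }
}

/// A (possibly empty) set of components, expressed as a tuple type.
pub trait Dependencies {
    fn collect_all(seen: &mut Vec<TypeId>, files: &mut Vec<&'static str>);
}

impl Dependencies for () {
    fn collect_all(_seen: &mut Vec<TypeId>, _files: &mut Vec<&'static str>) {}
}

impl<A: Component> Dependencies for (A,) {
    fn collect_all(seen: &mut Vec<TypeId>, files: &mut Vec<&'static str>) {
        A::collect_into(seen, files);
    }
}

impl<A: Component, B: Component> Dependencies for (A, B) {
    fn collect_all(seen: &mut Vec<TypeId>, files: &mut Vec<&'static str>) {
        A::collect_into(seen, files);
        B::collect_into(seen, files);
    }
}

// Reduced example graph; the dependency lists here are illustrative.
pub struct ConfigTypes;
impl Component for ConfigTypes {
    type Dependencies = ();
    fn sql_file() -> &'static str { "config/types.sql" }
}

pub struct ConfigPrivateFunctions;
impl Component for ConfigPrivateFunctions {
    type Dependencies = (ConfigTypes,);
    fn sql_file() -> &'static str { "config/functions_private.sql" }
}

pub struct MigrateActivate;
impl Component for MigrateActivate {
    type Dependencies = (ConfigTypes,);
    fn sql_file() -> &'static str { "config/migrate_activate.sql" }
}

pub struct AddColumn;
impl Component for AddColumn {
    type Dependencies = (ConfigPrivateFunctions, MigrateActivate);
    fn sql_file() -> &'static str { "config/add_column.sql" }
}
```

Because a dependency can only be named if its type exists and implements `Component`, a misspelled or missing dependency becomes a compile error rather than a runtime load failure.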

Create TestDb struct providing:
- Automatic transaction BEGIN on creation
- Auto-rollback on drop (clean slate for next test)
- Helper methods: execute(), query_one()
- Assertion helpers: assert_jsonb_has_key()
- Structured DatabaseError with query context

This solves current testing pain points:
- No more manual database resets between tests
- Clear error messages (shows which query failed)
- Foundation for parallel test execution (future)
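
The BEGIN-on-creation and rollback-on-drop behaviour can be sketched without a live database by putting a small trait in front of the connection. The `Connection` trait and `RecordingConnection` type below are hypothetical stand-ins introduced for this example; the real TestDb presumably wraps a PostgreSQL client and returns structured errors.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Minimal stand-in for a database connection (hypothetical).
pub trait Connection {
    fn run(&mut self, sql: &str);
}

/// Test handle that opens a transaction on creation and rolls it back on drop.
pub struct TestDb<C: Connection> {
    conn: C,
}

impl<C: Connection> TestDb<C> {
    pub fn new(mut conn: C) -> Self {
        conn.run("BEGIN");
        TestDb { conn }
    }

    pub fn execute(&mut self, sql: &str) {
        self.conn.run(sql);
    }
}

impl<C: Connection> Drop for TestDb<C> {
    /// Runs even if the test panics, so no changes outlive the test.
    fn drop(&mut self) {
        self.conn.run("ROLLBACK");
    }
}

/// Fake connection that records every statement it is asked to run.
pub struct RecordingConnection {
    pub log: Rc<RefCell<Vec<String>>>,
}

impl Connection for RecordingConnection {
    fn run(&mut self, sql: &str) {
        self.log.borrow_mut().push(sql.to_string());
    }
}
```

With the recording fake, a test can confirm that every statement ends up bracketed by BEGIN and ROLLBACK once the TestDb goes out of scope.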

Add SQL implementations:
- config/types.sql: Configuration table and enum type
- config/functions_private.sql: Helper functions (config_default, etc.)
- config/migrate_activate.sql: Migration and activation functions
- config/add_column.sql: Main add_column function
- encrypted/check_encrypted.sql: Stub for encrypted data validation
- encrypted/add_encrypted_constraint.sql: Constraint helper

All dependencies for add_column now present. Next task will wire
these up via Rust Component trait with automatic dependency resolution.

Add PostgreSQL component implementations:
- ConfigTypes: Configuration table/enum (no dependencies)
- ConfigPrivateFunctions: Helper functions (depends on ConfigTypes)
- CheckEncrypted: Validation stub (no dependencies)
- AddEncryptedConstraint: Constraint helper (depends on CheckEncrypted)
- MigrateActivate: Migration functions (depends on ConfigTypes)
- AddColumn: Main function (depends on all above)

Key achievement: Component::collect_dependencies() automatically
resolves load order via type-level dependency graph.

Tests verify:
- SQL files exist at expected paths
- Dependencies collected without duplicates
- Dependency order respects constraints

Add comprehensive integration tests:
1. test_add_column_creates_config: Verifies complete workflow
   - Loads all dependencies via Component::collect_dependencies()
   - Calls add_column function
   - Validates JSONB config structure
   - Confirms config stored in database
   - Checks encrypted constraint was added

2. test_add_column_rejects_duplicate: Verifies error handling
   - Ensures duplicate column config raises exception

Also added:
- batch_execute() to TestDb for loading multi-statement SQL files
- ConfigTables and ConfigIndexes components
- Fixed check_encrypted to accept eql_v2_encrypted type

Key achievement: add_column function works end-to-end in POC.
All dependencies loaded automatically via type-safe dependency graph.

Create build tool that:
- Uses Component::collect_dependencies() for automatic ordering
- Reads SQL files in dependency order
- Generates release/cipherstash-encrypt-postgres-poc.sql
- Removes REQUIRE comments (metadata from old system)

Key achievement: Build tool uses type-level dependency graph
to automatically resolve SQL load order. No manual topological
sort or configuration files needed.

Tests verify:
- Output file created
- Contains all expected SQL
- Dependencies in correct order (types before functions using them)
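
The concatenation step can be sketched as a pure function over already-ordered sources. The function name `build_installer`, the `-- Source:` header line, and the `-- REQUIRE:` comment prefix are assumptions for illustration; the real tool reads files from disk in the order given by Component::collect_dependencies().

```rust
/// Concatenate SQL sources in the given order, dropping `-- REQUIRE:` lines.
///
/// `sources` holds (path, file contents) pairs, already in dependency order.
pub fn build_installer(sources: &[(&str, &str)]) -> String {
    let mut out = String::new();
    for (path, sql) in sources {
        // Header comment so each section of the installer is traceable.
        out.push_str(&format!("-- Source: {path}\n"));
        for line in sql.lines() {
            // Metadata from the old build system is not needed in the output.
            if line.trim_start().starts_with("-- REQUIRE:") {
                continue;
            }
            out.push_str(line);
            out.push('\n');
        }
    }
    out
}
```

Keeping the function free of file I/O makes the ordering and comment-stripping checks easy to test in isolation.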

- Cargo.lock: Lock file for reproducible builds
- Implementation plan: Documents the POC approach and tasks

- Add sql_component! macro with automatic path inference from module::Component
- Convert PascalCase component names to snake_case SQL filenames using paste crate
- Support path overrides when naming doesn't follow convention
- Handle single and multiple dependencies correctly (no tuple wrapping for single deps)

- Move customer-facing documentation from component types to trait methods
  Customer docs now appear on functions (add_column, remove_column, etc.)
  not on internal component structs (AddColumn, RemoveColumn, etc.)

- Reduce config.rs from 155 lines to 51 lines (67% reduction)
- Component declarations: ~13 lines/component -> ~1 line/component

Example usage:
  sql_component!(config::AddColumn, deps: [Dep1, Dep2, ...]);
  sql_component!(config::ConfigTypes => "types.sql");
  sql_component!(RemoveColumn => "not_implemented.sql");

All tests passing.
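
The path-inference convention can be illustrated with ordinary functions. This is a sketch only: the macro does the conversion at compile time through the paste crate, whose snake_case rules may differ for edge cases such as acronyms, and the helper names `pascal_to_snake` and `sql_path` are hypothetical. It also assumes that an override filename is resolved relative to the module directory.

```rust
/// Convert a PascalCase component name to snake_case,
/// e.g. "AddColumn" becomes "add_column".
pub fn pascal_to_snake(name: &str) -> String {
    let mut out = String::new();
    for (i, ch) in name.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            // Every capital after the first starts a new word.
            if i > 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

/// Infer the SQL file path for `module::Component`, unless a filename
/// override is supplied for components that do not follow the convention.
pub fn sql_path(module: &str, component: &str, override_file: Option<&str>) -> String {
    let file = match override_file {
        Some(f) => f.to_string(),
        None => format!("{}.sql", pascal_to_snake(component)),
    };
    format!("{module}/{file}")
}
```

For example, `sql_path("config", "AddColumn", None)` yields `config/add_column.sql`, while passing `Some("types.sql")` for `ConfigTypes` yields `config/types.sql` instead of the inferred `config/config_types.sql`.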