Rust-based SQL development tooling POC #134
Open
tobyhede wants to merge 19 commits into main from feature/rust-sql-tooling
Add two new reference documents:
- eql-functions.md: Complete API reference for all 40+ EQL functions organized by category (configuration, query, JSONB, array, helpers, aggregates). Includes both operator and function forms with examples.
- database-indexes.md: PostgreSQL B-tree index guide covering creation, usage requirements, troubleshooting, and best practices. Based on operator_class_test.sql implementation.

These documents provide the missing comprehensive reference material for EQL's public API and PostgreSQL index integration.

Complete rewrite of json-support.md to match actual EQL API:
- Remove references to non-existent functions (ste_vec_value, ste_vec_term, ste_vec_terms)
- Document actual functions: jsonb_path_query, jsonb_array_elements, jsonb_path_exists, grouped_value
- Show correct operator usage (@>, <@, ->, ->>)
- Add proper selector-based querying examples
- Reduce from 887 to 297 lines, removing outdated content
- Add explanation of how ste_vec indexing works

All examples now verified against test files and source code.

Fix multiple documentation issues verified against implementation:
- index-config.md: Correct function names from cs_* to eql_v2.*, add missing cast types (real, double)
- proxy-configuration.md: Change from wrapped function calls to direct operator usage (= instead of hmac_256() wrappers), add function alternatives for Supabase
- docs/README.md: Fix broken links, add references to new documentation (database-indexes.md, eql-functions.md)

All changes verified against test files to ensure accuracy.

Add comprehensive user guidance to main README.md:
- Add step-by-step getting started example with SQL code showing table creation, column configuration, and index setup
- Add troubleshooting section with common errors, causes, and solutions
- Clarify that CipherStash Proxy/Protect.js are required for encryption
- Update TOC: change "Developing" to "Contributing"
- Improve installation instructions and upgrade process

Makes the README more actionable for new users with concrete examples and solutions to common setup issues.

Add "Adding SQL" section to DEVELOPMENT.md with guidelines for:
- Modular SQL file organization
- Operator wrappers vs implementation separation
- Dependency declarations using REQUIRE comments
- Test file naming conventions
- Build system usage with tsort

Provides clear guidance for contributors developing new SQL features.

Fix critical documentation issues and add missing function documentation:
- Fix return type names: change eql_v2_* to eql_v2.* for index term extraction functions (hmac_256, blake3, bloom_filter, ore_block_u64_8_256)
- Add missing function equivalents for Supabase compatibility: ilike(), lt(), lte(), gt(), gte()
- Document configuration lifecycle functions: migrate_config(), activate_config(), discard(), reload_config()
- Add aggregate functions: min(), max()
- Document jsonb overloads for JSONB path and array functions
- Add new Utility Functions section with version(), to_encrypted(), to_jsonb(), check_encrypted()
- Standardize terminology: change "context" to "prefix" for ste_vec index configuration
- Update table of contents to include new sections

This update brings the documentation in line with actual implementation and test behavior, addressing 15+ previously undocumented functions.

- Fix check_encrypted() description to reflect actual validation (v,c,i fields not v,k,i)
- Update check_encrypted() behavior: raises exceptions instead of returning false
- Fix version() example to not show hardcoded version string
- Enhance JSON path operator examples with specific selector types (text, encrypted, integer)
- Add array index access example for -> operator

Prevents worktree contents from being tracked in repository.
…error

Changes to POC plan:
- Remove Task 7 (manual doc extraction to markdown)
- Add new Task 7 (structured error handling with thiserror)
- Update success criteria to use rustdoc for customer-facing docs
- Update verification checklist to use 'cargo doc --no-deps --open'

Rationale:
- Rustdoc comments already generate documentation automatically
- Writing /// comments for customers (with SQL examples) IS the docs
- No need for separate markdown extraction - rustdoc HTML is better
- thiserror provides better error messages for debugging

This simplifies the POC while addressing the original doc drift problem.

Create workspace with four crates:
- eql-core: Trait definitions for EQL API
- eql-postgres: PostgreSQL implementation
- eql-test: Test harness with transaction isolation
- eql-build: Build tool for SQL extraction

This is a proof of concept for Rust-based SQL development tooling to improve testing, documentation, and multi-database support.

Add error hierarchy:
- EqlError: Top-level error type
- ComponentError: SQL file and dependency errors
- DatabaseError: Database operation errors

Benefits:
- Clear error messages with context (e.g., which query failed)
- Type-safe error handling throughout the codebase
- Better debugging experience for tests and build tools

Errors defined first (before other code) to enable TDD.

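To make the hierarchy above concrete, here is a minimal sketch using only the standard library. The PR uses thiserror, which would derive the `Display`, `Error`, and `From` implementations written out by hand here. The variant names, fields, and message wording are assumptions for illustration and are not taken from the PR's source.

```rust
use std::error::Error;
use std::fmt;

/// SQL file and dependency errors (variants are hypothetical).
#[derive(Debug)]
pub enum ComponentError {
    SqlFileNotFound { path: String },
}

/// Database operation errors; keeps the failing query for context.
#[derive(Debug)]
pub enum DatabaseError {
    Query { query: String, message: String },
}

/// Top-level error type wrapping both categories.
#[derive(Debug)]
pub enum EqlError {
    Component(ComponentError),
    Database(DatabaseError),
}

impl fmt::Display for ComponentError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ComponentError::SqlFileNotFound { path } => {
                write!(f, "SQL file not found: {}", path)
            }
        }
    }
}

impl fmt::Display for DatabaseError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DatabaseError::Query { query, message } => {
                write!(f, "query failed: {}: {}", query, message)
            }
        }
    }
}

impl fmt::Display for EqlError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EqlError::Component(e) => write!(f, "component error: {}", e),
            EqlError::Database(e) => write!(f, "database error: {}", e),
        }
    }
}

impl Error for ComponentError {}
impl Error for DatabaseError {}

impl Error for EqlError {
    // Expose the wrapped error so callers can walk the error chain.
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        match self {
            EqlError::Component(e) => Some(e),
            EqlError::Database(e) => Some(e),
        }
    }
}

// These conversions let `?` lift a specific error into EqlError.
impl From<ComponentError> for EqlError {
    fn from(e: ComponentError) -> Self {
        EqlError::Component(e)
    }
}

impl From<DatabaseError> for EqlError {
    fn from(e: DatabaseError) -> Self {
        EqlError::Database(e)
    }
}
```

With the `From` impls in place, a function returning `Result<_, EqlError>` can apply `?` to a `DatabaseError` or `ComponentError` directly, and the failing query text stays in the message.
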
Add core trait system for EQL API:
- Component trait: Represents SQL file with type-safe dependencies
- Dependencies trait: Automatic dependency collection via type system
- Config trait: Configuration management API with rustdoc examples

Key innovation: Component::collect_dependencies() walks the type graph at compile time to resolve SQL load order automatically.

The Config trait includes documentation that will be auto-generated into customer-facing docs, preventing documentation drift.

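One way such a trait pair could look is sketched below. This is not the PR's code: only `Component`, `Dependencies`, and `collect_dependencies()` are named in the commit message, while `sql_file`, `visit`, `visit_all`, and the tuple encoding are assumptions. In this sketch the dependency graph is fixed in the types at compile time, and the traversal itself runs when `collect_dependencies()` is called. The dependency set of `AddColumn` is simplified to two entries.

```rust
use std::any::TypeId;

/// A SQL file plus the components it depends on, encoded as an associated type.
pub trait Component: 'static {
    type Dependencies: Dependencies;

    /// Path of the SQL file this component represents.
    fn sql_file() -> &'static str;

    /// Depth-first visit: dependencies first, then this component, skipping repeats.
    fn visit(seen: &mut Vec<TypeId>, files: &mut Vec<&'static str>) {
        let id = TypeId::of::<Self>();
        if seen.contains(&id) {
            return;
        }
        seen.push(id);
        <Self::Dependencies as Dependencies>::visit_all(seen, files);
        files.push(Self::sql_file());
    }

    /// Returns SQL files in load order, each listed once.
    fn collect_dependencies() -> Vec<&'static str> {
        let mut seen = Vec::new();
        let mut files = Vec::new();
        Self::visit(&mut seen, &mut files);
        files
    }
}

/// A list of components, expressed as a tuple type.
pub trait Dependencies {
    fn visit_all(seen: &mut Vec<TypeId>, files: &mut Vec<&'static str>);
}

impl Dependencies for () {
    fn visit_all(_: &mut Vec<TypeId>, _: &mut Vec<&'static str>) {}
}

impl<A: Component> Dependencies for (A,) {
    fn visit_all(seen: &mut Vec<TypeId>, files: &mut Vec<&'static str>) {
        A::visit(seen, files);
    }
}

impl<A: Component, B: Component> Dependencies for (A, B) {
    fn visit_all(seen: &mut Vec<TypeId>, files: &mut Vec<&'static str>) {
        A::visit(seen, files);
        B::visit(seen, files);
    }
}

pub struct ConfigTypes;
pub struct ConfigPrivateFunctions;
pub struct MigrateActivate;
pub struct AddColumn;

impl Component for ConfigTypes {
    type Dependencies = ();
    fn sql_file() -> &'static str { "config/types.sql" }
}

impl Component for ConfigPrivateFunctions {
    type Dependencies = (ConfigTypes,);
    fn sql_file() -> &'static str { "config/functions_private.sql" }
}

impl Component for MigrateActivate {
    type Dependencies = (ConfigTypes,);
    fn sql_file() -> &'static str { "config/migrate_activate.sql" }
}

impl Component for AddColumn {
    // Simplified: the PR's AddColumn depends on more components than these two.
    type Dependencies = (ConfigPrivateFunctions, MigrateActivate);
    fn sql_file() -> &'static str { "config/add_column.sql" }
}
```

Calling `AddColumn::collect_dependencies()` yields the files with `config/types.sql` first and `config/add_column.sql` last, and `ConfigTypes` appears only once even though two components depend on it.
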
Create TestDb struct providing:
- Automatic transaction BEGIN on creation
- Auto-rollback on drop (clean slate for next test)
- Helper methods: execute(), query_one()
- Assertion helpers: assert_jsonb_has_key()
- Structured DatabaseError with query context

This solves current testing pain points:
- No more manual database resets between tests
- Clear error messages (shows which query failed)
- Foundation for parallel test execution (future)

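The BEGIN-on-creation and rollback-on-drop behaviour can be illustrated without a live database. The sketch below swaps the real PostgreSQL client for a hypothetical `Connection` trait and a `RecordingConnection` that only logs statements. Both are assumptions made so the example is self-contained; the error type is also simplified to `String` in place of the PR's `DatabaseError`.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a database client.
pub trait Connection {
    fn run(&mut self, sql: &str) -> Result<(), String>;
}

/// Test double that records every statement it is asked to run.
pub struct RecordingConnection {
    pub log: Rc<RefCell<Vec<String>>>,
}

impl Connection for RecordingConnection {
    fn run(&mut self, sql: &str) -> Result<(), String> {
        self.log.borrow_mut().push(sql.to_string());
        Ok(())
    }
}

/// Wraps a connection in a transaction that never commits.
pub struct TestDb<C: Connection> {
    conn: C,
}

impl<C: Connection> TestDb<C> {
    /// Opens the transaction as soon as the harness is created.
    pub fn new(mut conn: C) -> Result<Self, String> {
        conn.run("BEGIN")?;
        Ok(TestDb { conn })
    }

    /// Runs a statement and attaches the query text to any error.
    pub fn execute(&mut self, sql: &str) -> Result<(), String> {
        self.conn
            .run(sql)
            .map_err(|e| format!("query failed: {}: {}", sql, e))
    }
}

impl<C: Connection> Drop for TestDb<C> {
    /// Rolls back when the value goes out of scope, including during a panic unwind.
    fn drop(&mut self) {
        // Errors are ignored here because drop cannot return them.
        let _ = self.conn.run("ROLLBACK");
    }
}
```

Because the rollback lives in `Drop`, a test that fails an assertion partway through still leaves the database unchanged for the next test.
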
Add SQL implementations:
- config/types.sql: Configuration table and enum type
- config/functions_private.sql: Helper functions (config_default, etc.)
- config/migrate_activate.sql: Migration and activation functions
- config/add_column.sql: Main add_column function
- encrypted/check_encrypted.sql: Stub for encrypted data validation
- encrypted/add_encrypted_constraint.sql: Constraint helper

All dependencies for add_column now present. Next task will wire these up via Rust Component trait with automatic dependency resolution.

Add PostgreSQL component implementations:
- ConfigTypes: Configuration table/enum (no dependencies)
- ConfigPrivateFunctions: Helper functions (depends on ConfigTypes)
- CheckEncrypted: Validation stub (no dependencies)
- AddEncryptedConstraint: Constraint helper (depends on CheckEncrypted)
- MigrateActivate: Migration functions (depends on ConfigTypes)
- AddColumn: Main function (depends on all above)

Key achievement: Component::collect_dependencies() automatically resolves load order via type-level dependency graph.

Tests verify:
- SQL files exist at expected paths
- Dependencies collected without duplicates
- Dependency order respects constraints

Add comprehensive integration tests:

1. test_add_column_creates_config: Verifies complete workflow
   - Loads all dependencies via Component::collect_dependencies()
   - Calls add_column function
   - Validates JSONB config structure
   - Confirms config stored in database
   - Checks encrypted constraint was added
2. test_add_column_rejects_duplicate: Verifies error handling
   - Ensures duplicate column config raises exception

Also added:
- batch_execute() to TestDb for loading multi-statement SQL files
- ConfigTables and ConfigIndexes components
- Fixed check_encrypted to accept eql_v2_encrypted type

Key achievement: add_column function works end-to-end in POC. All dependencies loaded automatically via type-safe dependency graph.

Create build tool that:
- Uses Component::collect_dependencies() for automatic ordering
- Reads SQL files in dependency order
- Generates release/cipherstash-encrypt-postgres-poc.sql
- Removes REQUIRE comments (metadata from old system)

Key achievement: Build tool uses type-level dependency graph to automatically resolve SQL load order. No manual topological sort or configuration files needed.

Tests verify:
- Output file created
- Contains all expected SQL
- Dependencies in correct order (types before functions using them)

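The concatenation step might be sketched as follows. The functions `strip_require_comments` and `build_release` are hypothetical names, the `-- REQUIRE:` prefix is an assumed comment format, and the `-- Source:` header line is an addition for readability that the PR does not mention. File contents are passed in as strings so the example needs no filesystem access.

```rust
/// Drops lines that are `-- REQUIRE:` comments (assumed format) and keeps the rest.
pub fn strip_require_comments(sql: &str) -> String {
    sql.lines()
        .filter(|line| !line.trim_start().starts_with("-- REQUIRE:"))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Concatenates (path, contents) pairs, already in dependency order,
/// into a single release script.
pub fn build_release(files: &[(&str, &str)]) -> String {
    let mut out = String::new();
    for (path, sql) in files {
        // Header comment so each section can be traced back to its file.
        out.push_str(&format!("-- Source: {}\n", path));
        out.push_str(&strip_require_comments(sql));
        out.push_str("\n\n");
    }
    out
}
```

In the real tool the input order would come from `Component::collect_dependencies()`, so the type definitions land in the output before the functions that use them.
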
- Cargo.lock: Lock file for reproducible builds
- Implementation plan: Documents the POC approach and tasks

- Add sql_component! macro with automatic path inference from module::Component
- Convert PascalCase component names to snake_case SQL filenames using paste crate
- Support path overrides when naming doesn't follow convention
- Handle single and multiple dependencies correctly (no tuple wrapping for single deps)
- Move customer-facing documentation from component types to trait methods
  Customer docs now appear on functions (add_column, remove_column, etc.) not on internal component structs (AddColumn, RemoveColumn, etc.)
- Reduce config.rs from 155 lines to 51 lines (67% reduction)
- Component declarations: ~13 lines/component -> ~1 line/component

Example usage:

    sql_component!(config::AddColumn, deps: [Dep1, Dep2, ...]);
    sql_component!(config::ConfigTypes => "types.sql");
    sql_component!(RemoveColumn => "not_implemented.sql");

All tests passing.

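The naming convention behind the path inference can be shown with plain functions. The PR does this inside a macro with the paste crate at compile time; the runtime helpers below, `to_snake_case` and `sql_path`, are hypothetical and only illustrate the mapping. Resolving an override relative to the module directory is an assumption based on the `config::ConfigTypes => "types.sql"` example.

```rust
/// Converts a PascalCase name to snake_case, e.g. "AddColumn" becomes "add_column".
/// Only ASCII names are handled, which is enough for Rust type names.
pub fn to_snake_case(name: &str) -> String {
    let mut out = String::new();
    for (i, ch) in name.chars().enumerate() {
        if ch.is_ascii_uppercase() {
            // Insert a separator before every capital except the first character.
            if i > 0 {
                out.push('_');
            }
            out.push(ch.to_ascii_lowercase());
        } else {
            out.push(ch);
        }
    }
    out
}

/// Builds the SQL path for a component, using the override file name when given.
pub fn sql_path(module: &str, component: &str, override_file: Option<&str>) -> String {
    let file = match override_file {
        Some(f) => f.to_string(),
        None => format!("{}.sql", to_snake_case(component)),
    };
    format!("{}/{}", module, file)
}
```

Under these assumptions, `sql_path("config", "AddColumn", None)` gives `config/add_column.sql`, while passing `Some("types.sql")` for `ConfigTypes` gives `config/types.sql`.
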
Description:
💪 What
Adds a proof-of-concept Rust-based development framework for EQL that demonstrates improved testing, documentation generation, and multi-database
support using the Config module.
eliminating manual dependency ordering
Workspace Structure
- eql-core: Trait definitions (Component, Config, Dependencies)
- eql-postgres: PostgreSQL implementation of Config module
- eql-test: Test harness with transaction isolation
- eql-build: Build tool for SQL extraction

🤔 Why
This POC proves the approach is viable before investing in a full migration. The Config module serves as a representative example of how all EQL modules
could be structured.
👀 Usage
Running Tests