Skip to content

Releases: paiml/trueno-db

Release v0.3.7

16 Dec 20:34

Choose a tag to compare

Release v0.3.5

06 Dec 08:23

Choose a tag to compare

Release v0.3.4

30 Nov 17:44

Choose a tag to compare

Release v0.3.2

24 Nov 12:12

Choose a tag to compare

v0.3.1: Upgrade to trueno v0.7.0

22 Nov 21:30

Choose a tag to compare

Performance Improvements

Upgraded to trueno v0.7.0 for significant performance gains:

  • 18% faster large matrix operations (1024×1024) via 3-level cache blocking
  • 8.5% improvement for 512×512 matrices
  • Zero-allocation hot path implementation in trueno
  • Better cache utilization for large datasets

Benchmark Results (1M rows)

Operation Trueno SIMD Scalar Baseline Speedup
SUM 224.13 µs 569.11 µs 2.54x
MIN 223.57 µs 948.48 µs 4.24x
AVG 225.20 µs 584.24 µs 2.59x

Quality Assurance

  • 202 tests pass with zero regressions
  • Zero clippy warnings
  • 93% code coverage for trueno-db
  • ✅ Property-based testing
  • ✅ Backend equivalence testing (GPU == SIMD == Scalar)

Dependencies

  • trueno: v0.6.0 → v0.7.0 (crates.io)

Installation

```toml
[dependencies]
trueno-db = "0.3.1"
```

For GPU support:
```toml
[dependencies]
trueno-db = { version = "0.3.1", features = ["gpu"] }
```

Links

Full Changelog: v0.3.0...v0.3.1

Release v0.3.0

22 Nov 13:48

Choose a tag to compare

Release v0.2.1

21 Nov 22:37

Choose a tag to compare

Quality improvements and trueno v0.6.0 upgrade patch release.

Highlights

  • 🔧 Upgraded trueno to v0.6.0 (latest SIMD features)
  • 🐛 Fixed 43 .unwrap() calls across production and examples
  • 📊 TDG score: 96.3/100 (A+) ⬆️ +14.5 points
  • ✅ Zero critical defects
  • ✅ All 156 tests passing

Changes

Changed

  • Upgraded trueno dependency - Updated from v0.4.1 to v0.6.0
    • Latest SIMD performance improvements and features
    • All 156 tests pass with new version
    • Zero breaking changes

Fixed

  • Production error handling - Replaced 19 .unwrap() calls with proper error handling

    • src/gpu/kernels.rs: 6 unwraps → .expect() with descriptive messages
    • src/gpu/jit.rs: 3 unwraps → .expect() for mutex and cache operations
    • src/query/mod.rs: 1 unwrap → proper doc example with ? operator
    • src/topk.rs: 10 unwraps → .ok_or_else() for error propagation
    • Prevents panics in production code paths
  • Example code quality - Replaced 24 .unwrap() calls with .expect() in examples

    • examples/market_crashes.rs: 5 unwraps fixed
    • examples/benchmark_shootout.rs: 4 unwraps fixed
    • examples/topk_selection.rs: 6 unwraps fixed
    • examples/complete_pipeline.rs: 3 unwraps fixed
    • examples/gaming_leaderboards.rs: 6 unwraps fixed
    • Better error messages demonstrating best practices

Removed

  • Redundant stub benchmark - Deleted benches/backend_comparison.rs
    • Functionality already covered by tests/backend_equivalence_tests.rs
    • Functionality already covered by benches/competitive_benchmarks.rs
    • Toyota Way: Kaizen (Eliminate Waste)

Quality Metrics

  • TDG Score: 96.3/100 (A+) ⬆️ +14.5 points from 81.8
    • Critical defects: 25 → 0 (100% eliminated)
    • Grade A+ files: 73.7% (up from 66.7%)
    • Zero F-grade files
  • SATD Violations: 3 → 2 (eliminated TODO in stub file)
  • Tests: 156/156 passing (100%)
    • All tests pass with trueno v0.6.0

Full Changelog: v0.2.0...v0.2.1

Release v0.2.0

21 Nov 11:40

Choose a tag to compare

v0.2.0 - Phase 1 MVP Complete 🎉

Status: 9/9 Core Tasks Complete | 127/127 Tests Passing | 95.58% Coverage

Phase 1 MVP Achievements

All 9 core tasks implemented, tested, and validated:

  • Arrow/Parquet storage with morsel-based paging (CORE-001)
  • Cost-based backend dispatcher with 5x rule (CORE-002)
  • JIT WGSL compiler for kernel fusion (CORE-003)
  • GPU kernels: SUM, MIN, MAX, COUNT, AVG (CORE-004)
  • SIMD fallback via trueno (AVX-512/AVX2) (CORE-005)
  • Backend equivalence tests (GPU == SIMD == Scalar) (CORE-006)
  • SQL parser (SELECT, WHERE, GROUP BY) (CORE-007)
  • PCIe transfer benchmarks (CORE-008)
  • Competitive benchmarking infrastructure (CORE-009)

SIMD Performance Benchmarks

Validated on AMD Threadripper 7960X with 1M rows:

Operation SIMD (µs) Scalar (µs) Speedup Status
SUM 228 634 2.78x ✅ Validated
MIN 228 1,048 4.60x ✅ Validated
MAX 228 257 1.13x ✅ Validated
AVG 228 634 2.78x ✅ Validated

Quality Metrics

  • Tests: 127/127 passing (100%)
    • Unit tests: 45/45 (includes JIT compiler tests)
    • Integration tests: 30/30
    • Backend tests: 23/23 (equivalence + selection + errors)
    • Property tests: 11/11 (1,100 scenarios)
    • Doc tests: 8/8 (2 ignored for GPU-only examples)
    • OOM prevention: 6/6
    • Query tests: 10/10
  • Code Coverage: 95.58% (target: >90%)
  • Clippy: 0 warnings (strict mode)
  • Property Tests: 1,100 scenarios validated

Major Features

JIT WGSL Compiler (CORE-003)

  • ShaderCache with Arc for thread-safe caching
  • Template-based code generation (Phase 1 MVP approach)
  • Fused filter+sum kernel (single pass, eliminates intermediate buffer)
  • Supports operators: gt, lt, eq, gte, lte, ne
  • Toyota Way: Muda elimination (waste of intermediate memory writes)

GPU Kernels (CORE-004)

  • MIN_I32 and MAX_I32 kernels using Harris 2007 2-stage reduction
  • atomicMin/atomicMax operations for global reduction
  • Workgroup size: 256 threads (8 GPU warps)
  • Parallel reduction algorithm for efficient aggregation

SIMD Fallback (CORE-005)

  • trueno v0.4.0 integration for auto-detected SIMD backend
  • Graceful degradation: AVX-512 → AVX2 → SSE2 → Scalar
  • Kahan summation for numerical stability
  • spawn_blocking isolation for CPU-bound work

Infrastructure

  • GitHub Actions CI/CD Pipeline: Fully automated quality gates
  • Performance Hero Shot: Visual comparison graphics
  • Production Examples: 3 comprehensive demo applications
  • Property-Based Testing: 1,100 scenarios validated
  • Red Team Audit: Adversarial verification (APPROVED)

Toyota Way Principles Applied

  • Jidoka (Built-in Quality): EXTREME TDD, backend equivalence tests
  • Kaizen (Continuous Improvement): All claims backed by benchmarks
  • Muda (Waste Elimination): Kernel fusion, zero-copy operations
  • Poka-Yoke (Mistake Proofing): Morsel paging prevents VRAM OOM
  • Genchi Genbutsu (Go and See): Physics-based cost model from real measurements
  • Heijunka (Level Load): spawn_blocking for async isolation

Installation

```toml
[dependencies]

Default: SIMD-only (fast compile, small binary)

trueno-db = "0.2.0"

With GPU support (opt-in, slower compile)

trueno-db = { version = "0.2.0", features = ["gpu"] }
```

Documentation

What's Next?

Phase 2: Multi-GPU

  • Local multi-GPU data partitioning
  • Work-stealing scheduler
  • Multi-GPU aggregation + reduce

See CHANGELOG.md for complete details.


Full Changelog: v0.1.0...v0.2.0

Academic Foundation: Built on peer-reviewed research from CIDR, SIGMOD, and VLDB conferences

🎉 Generated with Claude Code