5 | 5 | </picture> |
6 | 6 | </div> |
7 | 7 |
8 | | -# Rivet DB |
9 | | - |
10 | 8 | > _Query everything — **instantly**._ |
11 | 9 |
12 | | -**RivetDB** lets you run Trino-style federated queries over a smart, queryable cache. |
13 | | -Remote data is fetched just-in-time and cached for ultra-fast reuse. |
| 10 | +# RivetDB |
| 11 | + |
| 12 | +Query everything — instantly. |
| 13 | + |
| 14 | +RivetDB is a high-performance, federated query engine built on a smart, just-in-time cache. |
| 15 | +It provides a Trino-style SQL interface for querying remote data sources without needing heavy infrastructure. |
| 16 | +The goal is to unify data access across systems while still getting fast, reliable performance on a single node. |
| 17 | + |
| 18 | +RivetDB takes inspiration from the [DuckDB](https://duckdb.org/) and SmallData communities, which have shown how much you can get out of a single machine when you pair simplicity, vectorized execution, and efficient columnar formats. RivetDB applies that same thinking to federated data, using Arrow-native execution to make remote sources feel local. We think this will be especially important as agents query disparate systems and want to avoid scattering data-fetching logic.
| 19 | + |
| 20 | +Under the hood, RivetDB is built in Rust and powered by [Apache DataFusion](https://datafusion.apache.org/). The combination gives us strong safety guarantees, solid performance, and a proven execution engine that plays well with Arrow. |
| 21 | + |
| 22 | +RivetDB is under active development, and the APIs will continue to shift as we move toward a stable 1.0 release. |
| 24 | + |
| 25 | + |
| 26 | +> **🚧 Early Development:** We are targeting a public preview with caching, lookup APIs, and a CLI in **Q1 2026**. |
| 27 | +> Expect breaking changes until the API surface stabilizes. |
| 28 | +
| 29 | +--- |
| 30 | + |
| 31 | +## Getting Started |
| 32 | + |
| 33 | +RivetDB is evolving quickly and installation instructions will change. |
| 34 | +To experiment with the current engine: |
| 35 | + |
| 36 | +```bash |
| 37 | +docker pull ghcr.io/hotdata-dev/rivetdb:latest |
| 38 | +docker run -p 3000:3000 ghcr.io/hotdata-dev/rivetdb:latest |
| 39 | +``` |
| 40 | + |
| 41 | +Alternatively, you can build and run the server locally:
| 42 | +```bash |
| 43 | +cargo run --bin server config-local.toml
| 44 | +``` |
| 45 | + |
| 46 | +--- |
| 47 | + |
| 48 | + |
| 49 | +## What RivetDB Does Today
| 50 | + |
| 51 | +- Just-in-time retrieval of remote data |
| 52 | +- Federated SQL interface inspired by Trino |
| 53 | +- Rust-powered engine focused on performance and correctness |
| 54 | +- Basic caching and early internal APIs |
| 55 | +- Initial examples and tests |
| 56 | +- **Current adapter support:** |
| 57 | + - **Postgres** |
| 58 | + - **DuckDB** |
| 59 | + - **MotherDuck** |
| 60 | + |
| 61 | +This foundation supports the larger roadmap described below. |
| 62 | + |
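To make "just-in-time retrieval" and "basic caching" concrete, here is a minimal read-through cache sketch in Python. It is an illustration only, not RivetDB's implementation or API: the `JitCache` class, the `fake_remote` function, and the `users` table are all hypothetical names invented for this example.

```python
from typing import Callable, Dict, List, Tuple

Row = Tuple[object, ...]


class JitCache:
    """Read-through cache: fetch a remote table on first use, reuse it afterwards."""

    def __init__(self, fetch: Callable[[str], List[Row]]) -> None:
        self._fetch = fetch  # hypothetical callable that reads from a remote source
        self._tables: Dict[str, List[Row]] = {}
        self.remote_calls = 0  # counts how often the remote source is actually hit

    def table(self, name: str) -> List[Row]:
        # Only go to the remote source when the table is not cached yet.
        if name not in self._tables:
            self.remote_calls += 1
            self._tables[name] = self._fetch(name)
        return self._tables[name]


def fake_remote(name: str) -> List[Row]:
    # Stand-in for a remote database; returns fixed rows for the "users" table.
    return [(1, "a"), (2, "b")] if name == "users" else []


cache = JitCache(fake_remote)
first = cache.table("users")   # fetched from the (fake) remote source
second = cache.table("users")  # served from the cache, no second fetch
```

In this sketch the second lookup never touches the remote source, which is the behavior the feature list describes at a high level. A real engine would also need invalidation and size limits, which are left out here.
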
| 63 | +--- |
| 64 | + |
| 65 | +## Vision |
| 66 | + |
| 67 | +RivetDB aims to become a unified query engine that removes the friction of working across disparate data systems. The project emphasizes:
| 68 | + |
| 69 | +- A consistent SQL interface for structured, semi-structured, and remote data sources |
| 70 | +- Intelligent caching that adapts to query patterns and reduces data movement |
| 71 | +- Tight integration with modern data platforms |
| 72 | +- Tooling that gives developers fast introspection into data, metadata, and performance |
| 73 | +- Clear APIs and predictable behavior |
| 74 | + |
| 75 | +The long-term trajectory includes distributed caching, richer connectors, real-time introspection, and seamless orchestration integration. |
| 76 | + |
| 77 | +--- |
14 | 78 |
15 | 79 | ## Roadmap |
16 | | -Coming soon! We're planning to release the first version in early January. |
| 80 | + |
| 81 | +Roadmap items below correspond directly to open issues in the repository. |
| 82 | +For the latest view, consult: https://github.com/hotdata-dev/rivetdb/issues |
| 83 | + |
| 84 | +### **📍 Version 0.1 — Core Feature Set (Current Development)** |
| 85 | + |
| 86 | +These features define the first public preview of RivetDB:
| 87 | + |
| 88 | +- **Parquet Metadata Cache** |
| 89 | + Cache Parquet metadata to optimize planning and repeated reads. |
| 90 | + |
| 91 | +- **Result Lookup API** |
| 92 | + Provide structured access to intermediate query results for debugging and tooling. |
| 93 | + |
| 94 | +- **Query Metadata API** |
| 95 | + Expose information about query execution: planning, caching, durations, and more. |
| 96 | + |
| 97 | +- **Optional Session IDs** |
| 98 | + Support optional sessions for multi-query workflows. |
| 99 | + |
| 100 | +- **Table Caching Support** |
| 101 | + Enable caching of entire remote tables for repeated or incremental queries. |
| 102 | + |
| 103 | +- **Arrow Flight SQL Support** |
| 104 | + Add high-performance transport for large result sets and client libraries. |
| 105 | + |
| 106 | +- **RivetDB CLI** |
| 107 | + Build a command-line interface for running queries, inspecting cache state, and interacting with the engine. |
| 108 | + |
| 109 | +These are high-priority items and represent the minimum surface for an early release. |
| 110 | + |
| 111 | +--- |
| 112 | + |
| 113 | +### **🚀 Future Themes (Post-0.1)** |
| 114 | + |
| 115 | +These represent emerging priorities after the first public preview: |
| 116 | + |
| 117 | +- **Additional Connectors** |
| 118 | + Planned support for major databases and warehouses, including: |
| 119 | + - MySQL / MariaDB |
| 120 | + - SQLite |
| 121 | + - BigQuery |
| 122 | + - Snowflake |
| 123 | + - Redshift |
| 124 | + - Databricks / Unity Catalog |
| 125 | + - ClickHouse |
| 126 | + - Athena |
| 127 | + - Synapse / Fabric |
| 128 | + - Generic JDBC |
| 129 | + - REST and streaming sources |
| 130 | + - Cloud storage systems (S3, GCS, Azure Blob) |
| 131 | + |
| 132 | +- **Distributed Caching & Invalidation** |
| 133 | + Peer-aware caching and consistency guarantees across nodes. |
| 134 | + |
| 135 | +- **Observability & Telemetry** |
| 136 | + Metrics, tracing, execution timelines, and debugger-friendly diagnostics. |
| 137 | + |
| 138 | +- **Developer Tooling & IDE Integrations** |
| 139 | + Schema discovery, query explain tools, and interactive exploration. |
| 140 | + |
| 141 | +--- |
| 142 | + |
| 143 | +## Feature Status (At-a-Glance) |
| 144 | + |
| 145 | +| Feature | Status | |
| 146 | +|--------|--------| |
| 147 | +| Parquet Metadata Cache | Planned | |
| 148 | +| Result Lookup API | Planned | |
| 149 | +| Query Metadata API | Planned | |
| 150 | +| Table Caching | Planned | |
| 151 | +| Arrow Flight SQL | Planned | |
| 152 | +| RivetDB CLI | Planned | |
| 153 | +| Cache | Alpha | |
| 154 | +| Current Connectors: Postgres, DuckDB, MotherDuck | Alpha | |
| 155 | +| Observability | Backlog | |
| 156 | +| Additional Connectors | Backlog | |