RivetDB

Query everything — instantly.

RivetDB is a high-performance, federated query engine built on a smart, just-in-time cache.

It provides a Trino-style SQL interface for querying remote data sources without needing heavy infrastructure. The goal is to unify data access across systems while still getting fast, reliable performance on a single node.

RivetDB takes inspiration from DuckDB and the Small Data community, which has shown how much a single machine can deliver when you pair simplicity, vectorized execution, and efficient columnar formats. RivetDB applies the same thinking to federated data, using Arrow-native execution to make remote sources feel local. We expect this to matter even more as agents query disparate systems and want to avoid scattering data-fetching logic.

Under the hood, RivetDB is built in Rust and powered by Apache DataFusion. The combination gives us strong safety guarantees, solid performance, and a proven execution engine that plays well with Arrow.

RivetDB is under active development, and the APIs will continue to shift as we move toward a stable 1.0 release.

🚧 Early Development: We are targeting a public preview with caching, lookup APIs, and a CLI in Q1 2026.
Expect breaking changes until the API surface stabilizes.

Getting Started

RivetDB is evolving quickly and installation instructions will change.
To experiment with the current engine:

docker pull ghcr.io/hotdata-dev/rivetdb:latest
docker run -p 3000:3000 ghcr.io/hotdata-dev/rivetdb:latest

Alternatively, build and run the server locally from a checkout of the repository:

cargo run --bin server config-local.toml

What RivetDB Does Today

- Just-in-time retrieval of remote data
- Federated SQL interface inspired by Trino
- Rust-powered engine focused on performance and correctness
- Basic caching and early internal APIs
- Initial examples and tests
- Current adapter support:
  - Postgres
  - DuckDB
  - MotherDuck

This foundation supports the larger roadmap described below.
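
To make "just-in-time retrieval" with basic caching concrete, here is a minimal read-through cache sketch. It is not RivetDB's implementation: `RemoteSource`, `CountingSource`, and `JitCache` are hypothetical names invented for this illustration, and rows are plain strings rather than Arrow batches. It only shows the general idea that a remote table is fetched the first time a query needs it and served locally afterwards.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a remote source (for example a Postgres table).
trait RemoteSource {
    fn fetch(&mut self, table: &str) -> Vec<String>;
}

/// Counts fetches so the demo can show when the remote is actually hit.
struct CountingSource {
    fetches: usize,
}

impl RemoteSource for CountingSource {
    fn fetch(&mut self, table: &str) -> Vec<String> {
        self.fetches += 1;
        vec![format!("{table}:row1"), format!("{table}:row2")]
    }
}

/// Read-through cache: data is pulled only when a query first needs it.
struct JitCache<S: RemoteSource> {
    source: S,
    tables: HashMap<String, Vec<String>>,
}

impl<S: RemoteSource> JitCache<S> {
    fn new(source: S) -> Self {
        Self { source, tables: HashMap::new() }
    }

    /// Returns the rows of `table`, fetching from the remote on first use.
    fn scan(&mut self, table: &str) -> &[String] {
        if !self.tables.contains_key(table) {
            let rows = self.source.fetch(table);
            self.tables.insert(table.to_string(), rows);
        }
        &self.tables[table]
    }
}

fn main() {
    let mut cache = JitCache::new(CountingSource { fetches: 0 });
    let first = cache.scan("orders").len();
    let second = cache.scan("orders").len();
    // prints: rows: 2 then 2, remote fetches: 1
    println!("rows: {first} then {second}, remote fetches: {}", cache.source.fetches);
}
```

The second scan never touches the remote source. A real engine would also need eviction and invalidation, which this sketch leaves out.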

Vision

RivetDB aims to become a unified query engine that eliminates the challenges of working across disparate data systems. The project emphasizes:

- A consistent SQL interface for structured, semi-structured, and remote data sources
- Intelligent caching that adapts to query patterns and reduces data movement
- Tooling that gives developers introspection into data, metadata, and performance
- Millisecond startup times for on-demand ephemeral compute
- Developer-friendly documentation and APIs

The long-term goal includes distributed caching, additional connectors, real-time introspection, and seamless orchestration integration.

Roadmap

Roadmap items below correspond directly to open issues in the repository.
For the latest view, consult: https://github.com/hotdata-dev/rivetdb/issues

📍 Version 0.1 — Core Feature Set (Current Development)

These features define the first stable preview of RivetDB:

- Parquet Metadata Cache: Cache Parquet metadata to optimize planning and repeated reads.
- Result Lookup API: Provide structured access to intermediate query results for debugging and tooling.
- Query Metadata API: Expose information about query execution: planning, caching, durations, and more.
- Optional Session IDs: Support optional sessions for multi-query workflows.
- Table Caching Support: Enable caching of entire remote tables for repeated or incremental queries.
- Arrow Flight SQL Support: Add high-performance transport for large result sets and client libraries.
- RivetDB CLI: Build a command-line interface for running queries, inspecting cache state, and interacting with the engine.

These are high-priority items and represent the minimum surface for an early release.
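
As one illustration of the Parquet Metadata Cache item above, the sketch below caches per-file footer metadata and reloads it only when the file's version tag changes. Everything here is an assumption made for illustration: `FileMeta`, `MetadataCache`, `get_or_load`, and the use of a version tag (such as an object-store ETag) are hypothetical and do not describe the planned RivetDB design.

```rust
use std::collections::HashMap;

/// Hypothetical summary of a Parquet footer; real footers hold much more.
#[derive(Clone, Debug, PartialEq)]
struct FileMeta {
    num_rows: u64,
    row_groups: usize,
}

/// Caches footer metadata per file path, validated by a version tag
/// (for example an object-store ETag; an assumption in this sketch).
#[derive(Default)]
struct MetadataCache {
    entries: HashMap<String, (String, FileMeta)>,
    hits: u64,
    misses: u64,
}

impl MetadataCache {
    /// Returns cached metadata when the version tag matches; otherwise calls
    /// `load` (a stand-in for reading the footer) and stores the result.
    fn get_or_load<F>(&mut self, path: &str, version: &str, load: F) -> FileMeta
    where
        F: FnOnce() -> FileMeta,
    {
        if let Some((cached_version, meta)) = self.entries.get(path) {
            if cached_version.as_str() == version {
                self.hits += 1;
                return meta.clone();
            }
        }
        self.misses += 1;
        let meta = load();
        self.entries
            .insert(path.to_string(), (version.to_string(), meta.clone()));
        meta
    }
}

fn main() {
    let mut cache = MetadataCache::default();
    // Stand-in loader; a real one would read and parse the file footer.
    let load = || FileMeta { num_rows: 1_000, row_groups: 4 };

    let meta = cache.get_or_load("s3://bucket/a.parquet", "v1", load); // miss
    cache.get_or_load("s3://bucket/a.parquet", "v1", load); // hit
    cache.get_or_load("s3://bucket/a.parquet", "v2", load); // file changed: miss

    println!("{} rows in {} row groups", meta.num_rows, meta.row_groups);
    // prints: hits=1 misses=2
    println!("hits={} misses={}", cache.hits, cache.misses);
}
```

Keying on path plus version lets the planner skip repeated footer reads while still noticing that a file was rewritten.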

🚀 Future Themes (Post-0.1)

These represent emerging priorities after the first public preview:

- Additional Connectors: Planned support for major databases and warehouses, including:
  - MySQL / MariaDB
  - SQLite
  - BigQuery
  - Snowflake
  - Redshift
  - Databricks / Unity Catalog
  - ClickHouse
  - Athena
  - Synapse / Fabric
  - Generic JDBC
  - REST and streaming sources
  - Cloud storage systems (S3, GCS, Azure Blob)
- Distributed Caching & Invalidation: Peer-aware caching and consistency guarantees across nodes.
- Observability & Telemetry: Metrics, tracing, execution timelines, and debugger-friendly diagnostics.
- Developer Tooling & IDE Integrations: Schema discovery, query explain tools, and interactive exploration.

Feature Status (At-a-Glance)

| Feature | Status |
| --- | --- |
| Parquet Metadata Cache | Planned |
| Result Lookup API | Planned |
| Query Metadata API | Planned |
| Table Caching | Planned |
| Arrow Flight SQL | Planned |
| RivetDB CLI | Planned |
| Cache | Alpha |
| Current Connectors: Postgres, DuckDB, MotherDuck | Alpha |
| Observability | Backlog |
| Additional Connectors | Backlog |
