Commit 2daeb05

Revise README with project details and roadmap (#29)
Updated the README to enhance project description, installation instructions, and roadmap details.
1 parent ac277f2 commit 2daeb05

1 file changed: +145 -5 lines changed

README.md

Lines changed: 145 additions & 5 deletions
@@ -5,12 +5,152 @@
 </picture>
 </div>
 
-# Rivet DB
-
 > _Query everything — **instantly**._
 
-**RivetDB** lets you run Trino-style federated queries over a smart, queryable cache.
-Remote data is fetched just-in-time and cached for ultra-fast reuse.
+# RivetDB
+
+Query everything — instantly.
+
+RivetDB is a high-performance, federated query engine built on a smart, just-in-time cache.
+It provides a Trino-style SQL interface for querying remote data sources without needing heavy infrastructure.
+The goal is to unify data access across systems while still getting fast, reliable performance on a single node.
+
+RivetDB takes inspiration from the [DuckDB](https://duckdb.org/) and SmallData community, which has shown how much you can get out of a single machine when you pair simplicity, vectorized execution, and efficient columnar formats. RivetDB applies that same thinking to federated data, using Arrow-native execution to make remote sources feel local. We think this will be especially important as agents query disparate systems and want to avoid scattering data-fetching logic.
+
+Under the hood, RivetDB is built in Rust and powered by [Apache DataFusion](https://datafusion.apache.org/). The combination gives us strong safety guarantees, solid performance, and a proven execution engine that plays well with Arrow.
+
+RivetDB is under active development, and the APIs will continue to shift as we move toward a stable 1.0 release.
+
+
+> **🚧 Early Development:** We are targeting a public preview with caching, lookup APIs, and a CLI in **Q1 2026**.
+> Expect breaking changes until the API surface stabilizes.
+
+---
+
+## Getting Started
+
+RivetDB is evolving quickly and installation instructions will change.
+To experiment with the current engine:
+
+```bash
+docker pull ghcr.io/hotdata-dev/rivetdb:latest
+docker run -p 3000:3000 ghcr.io/hotdata-dev/rivetdb:latest
+```
+
+Alternatively, you can build and run the server locally:
+```bash
+cargo run --bin server config-local.toml
+```
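
Once the container or the local server is running, a plain TCP check against the published port is a quick way to confirm that it is reachable. The sketch below assumes only what the `docker run -p 3000:3000` command implies, namely that something listens on port 3000 of localhost. `port_is_open` is a hypothetical helper, and the check says nothing about the query protocol the server speaks.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Port 3000 is taken from the `-p 3000:3000` mapping; adjust it if you remap the port.
    print("RivetDB port reachable:", port_is_open("127.0.0.1", 3000))
```

A successful check only shows that the port accepts connections; it does not verify that queries work.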
+
+---
+
+
+## What RivetDB Does Today
+
+- Just-in-time retrieval of remote data
+- Federated SQL interface inspired by Trino
+- Rust-powered engine focused on performance and correctness
+- Basic caching and early internal APIs
+- Initial examples and tests
+- **Current adapter support:**
+  - **Postgres**
+  - **DuckDB**
+  - **MotherDuck**
+
+This foundation supports the larger roadmap described below.
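
The "just-in-time retrieval" and "basic caching" items above describe a read-through pattern: fetch from the remote source on first use, then serve later reads from the cache. The following is a minimal sketch of that general idea, assuming a simple time-based expiry. The `JitCache` class, its `ttl_seconds` parameter, and the `fetch` callback are hypothetical names for illustration, not RivetDB's API or implementation.

```python
import time
from typing import Callable, Dict, List, Tuple

Rows = List[tuple]


class JitCache:
    """Read-through cache: fetch a table on first access, reuse it until it expires."""

    def __init__(
        self,
        fetch: Callable[[str], Rows],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # hypothetical callback that reads the remote source
        self._ttl = ttl_seconds      # assumed expiry policy; real engines may differ
        self._clock = clock          # injectable so expiry can be exercised in tests
        self._entries: Dict[str, Tuple[float, Rows]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, table: str) -> Rows:
        now = self._clock()
        entry = self._entries.get(table)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        # Cache miss or expired entry: go to the remote source and remember the result.
        self.misses += 1
        rows = self._fetch(table)
        self._entries[table] = (now, rows)
        return rows


if __name__ == "__main__":
    def fake_remote(table: str) -> Rows:
        # Stand-in for a remote Postgres/DuckDB read.
        return [(table, 1), (table, 2)]

    cache = JitCache(fake_remote, ttl_seconds=30.0)
    cache.get("orders")  # first call reaches the "remote" source
    cache.get("orders")  # second call is served from the cache
    print("hits:", cache.hits, "misses:", cache.misses)
```

Keeping hit and miss counters on the cache makes it easy to see how much remote traffic the cache actually avoids.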
+
+---
+
+## Vision
+
+RivetDB aims to become a unified query engine that eliminates the challenges of working across disparate data systems. The project emphasizes:
+
+- A consistent SQL interface for structured, semi-structured, and remote data sources
+- Intelligent caching that adapts to query patterns and reduces data movement
+- Tight integration with modern data platforms
+- Tooling that gives developers fast introspection into data, metadata, and performance
+- Clear APIs and predictable behavior
+
+The long-term trajectory includes distributed caching, richer connectors, real-time introspection, and seamless orchestration integration.
+
+---
 
 ## Roadmap
-Coming soon! We're planning to release the first version in early January.
+
+Roadmap items below correspond directly to open issues in the repository.
+For the latest view, consult: https://github.com/hotdata-dev/rivetdb/issues
+
+### **📍 Version 0.1 — Core Feature Set (Current Development)**
+
+These features define the first stable preview of RivetDB:
+
+- **Parquet Metadata Cache**
+  Cache Parquet metadata to optimize planning and repeated reads.
+
+- **Result Lookup API**
+  Provide structured access to intermediate query results for debugging and tooling.
+
+- **Query Metadata API**
+  Expose information about query execution: planning, caching, durations, and more.
+
+- **Optional Session IDs**
+  Support optional sessions for multi-query workflows.
+
+- **Table Caching Support**
+  Enable caching of entire remote tables for repeated or incremental queries.
+
+- **Arrow Flight SQL Support**
+  Add high-performance transport for large result sets and client libraries.
+
+- **RivetDB CLI**
+  Build a command-line interface for running queries, inspecting cache state, and interacting with the engine.
+
+These are high-priority items and represent the minimum surface for an early release.
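
To illustrate the Parquet Metadata Cache item in the list above: one common way to cache file metadata is to key it on the file path and invalidate the entry when the file's size or modification time changes. The sketch below shows that general technique only; it is not RivetDB's design. `MetadataCache` and the `load` callback (which in practice would wrap a real Parquet footer reader) are hypothetical names.

```python
import os
import tempfile
from typing import Any, Callable, Dict, Tuple


class MetadataCache:
    """Cache per-file metadata, reloading when the file's size or mtime changes."""

    def __init__(self, load: Callable[[str], Any]) -> None:
        self._load = load  # hypothetical loader, e.g. a Parquet footer reader
        self._entries: Dict[str, Tuple[Tuple[int, int], Any]] = {}

    def get(self, path: str) -> Any:
        st = os.stat(path)
        # A cheap fingerprint: if size and mtime are unchanged, assume the footer is too.
        fingerprint = (st.st_size, st.st_mtime_ns)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == fingerprint:
            return cached[1]
        metadata = self._load(path)
        self._entries[path] = (fingerprint, metadata)
        return metadata


if __name__ == "__main__":
    def fake_load(path: str) -> dict:
        # Stand-in for parsing a Parquet footer; only records the file size.
        return {"bytes": os.path.getsize(path)}

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.parquet")
        with open(path, "wb") as f:
            f.write(b"not a real parquet file")
        cache = MetadataCache(fake_load)
        print(cache.get(path))  # loaded from disk
        print(cache.get(path))  # served from the cache
```

Size and modification time are a heuristic; object stores typically expose an ETag or version ID that would serve the same purpose for remote files.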
+
+---
+
+### **🚀 Future Themes (Post-0.1)**
+
+These represent emerging priorities after the first public preview:
+
+- **Additional Connectors**
+  Planned support for major databases and warehouses, including:
+  - MySQL / MariaDB
+  - SQLite
+  - BigQuery
+  - Snowflake
+  - Redshift
+  - Databricks / Unity Catalog
+  - ClickHouse
+  - Athena
+  - Synapse / Fabric
+  - Generic JDBC
+  - REST and streaming sources
+  - Cloud storage systems (S3, GCS, Azure Blob)
+
+- **Distributed Caching & Invalidation**
+  Peer-aware caching and consistency guarantees across nodes.
+
+- **Observability & Telemetry**
+  Metrics, tracing, execution timelines, and debugger-friendly diagnostics.
+
+- **Developer Tooling & IDE Integrations**
+  Schema discovery, query explain tools, and interactive exploration.
+
+---
+
+## Feature Status (At-a-Glance)
+
+| Feature | Status |
+|--------|--------|
+| Parquet Metadata Cache | Planned |
+| Result Lookup API | Planned |
+| Query Metadata API | Planned |
+| Table Caching | Planned |
+| Arrow Flight SQL | Planned |
+| RivetDB CLI | Planned |
+| Cache | Alpha |
+| Current Connectors: Postgres, DuckDB, MotherDuck | Alpha |
+| Observability | Backlog |
+| Additional Connectors | Backlog |
