Revise README with project details and roadmap (#29)

eddietejeda · web-flow · commit 2daeb05827d7 · 2025-12-11T16:08:37.000-08:00
Updated the README to enhance project description, installation instructions, and roadmap details.
diff --git a/README.md b/README.md
@@ -5,12 +5,152 @@
   </picture>
 </div>
 
-# Rivet DB
-
 > _Query everything — **instantly**._
 
-**RivetDB** lets you run Trino-style federated queries over a smart, queryable cache.
-Remote data is fetched just-in-time and cached for ultra-fast reuse.
+# RivetDB
+
+Query everything — instantly.
+
+RivetDB is a high-performance, federated query engine built on a smart, just-in-time cache.
+It provides a Trino-style SQL interface for querying remote data sources without needing heavy infrastructure.
+The goal is to unify data access across systems while still getting fast, reliable performance on a single node.
+
+RivetDB takes inspiration from the [DuckDB](https://duckdb.org/) and SmallData communities, which have shown how much you can get out of a single machine when you pair simplicity, vectorized execution, and efficient columnar formats. RivetDB applies that same thinking to federated data, using Arrow-native execution to make remote sources feel local. We think this will be especially important as agents query disparate systems and want to avoid scattering data-fetching logic.
+
+Under the hood, RivetDB is built in Rust and powered by [Apache DataFusion](https://datafusion.apache.org/). The combination gives us strong safety guarantees, solid performance, and a proven execution engine that plays well with Arrow.
+
+RivetDB is under active development, and the APIs will continue to shift as we move toward a stable 1.0 release.
+
+
+> **🚧 Early Development:** We are targeting a public preview with caching, lookup APIs, and a CLI in **Q1 2026**.  
+> Expect breaking changes until the API surface stabilizes.
+
+---
+
+## Getting Started
+
+RivetDB is evolving quickly and installation instructions will change.  
+To experiment with the current engine:
+
+```bash
+docker pull ghcr.io/hotdata-dev/rivetdb:latest
+docker run -p 3000:3000 ghcr.io/hotdata-dev/rivetdb:latest
+```
+
+Alternatively, you can build and run locally:
+```bash
+cargo run --bin server config-local.toml
+```
+
+---
+
+
+## What RivetDB Does Today
+
+- Just-in-time retrieval of remote data  
+- Federated SQL interface inspired by Trino  
+- Rust-powered engine focused on performance and correctness  
+- Basic caching and early internal APIs  
+- Initial examples and tests  
+- **Current adapter support:**  
+  - **Postgres**  
+  - **DuckDB**  
+  - **MotherDuck**  
+
+This foundation supports the larger roadmap described below.
+
+---
+
+## Vision
+
+RivetDB aims to become a unified query engine that eliminates the challenges of working across disparate data systems. The project emphasizes:
+
+- A consistent SQL interface for structured, semi-structured, and remote data sources
+- Intelligent caching that adapts to query patterns and reduces data movement
+- Tight integration with modern data platforms
+- Tooling that gives developers fast introspection into data, metadata, and performance
+- Clear APIs and predictable behavior
+
+The long-term trajectory includes distributed caching, richer connectors, real-time introspection, and seamless orchestration integration.
+
+---
 
 ## Roadmap
-Coming soon! We're planning to release the first version in early January.
+
+Roadmap items below correspond directly to open issues in the repository.  
+For the latest view, consult: https://github.com/hotdata-dev/rivetdb/issues
+
+### **📍 Version 0.1 — Core Feature Set (Current Development)**
+
+These features define the first public preview of RivetDB:
+
+- **Parquet Metadata Cache**  
+  Cache Parquet metadata to optimize planning and repeated reads.
+
+- **Result Lookup API**  
+  Provide structured access to intermediate query results for debugging and tooling.
+
+- **Query Metadata API**  
+  Expose information about query execution: planning, caching, durations, and more.
+
+- **Optional Session IDs**  
+  Support optional sessions for multi-query workflows.
+
+- **Table Caching Support**  
+  Enable caching of entire remote tables for repeated or incremental queries.
+
+- **Arrow Flight SQL Support**  
+  Add high-performance transport for large result sets and client libraries.
+
+- **RivetDB CLI**  
+  Build a command-line interface for running queries, inspecting cache state, and interacting with the engine.
+
+These are high-priority items and represent the minimum surface for an early release.
+
+---
+
+### **🚀 Future Themes (Post-0.1)**
+
+These represent emerging priorities after the first public preview:
+
+- **Additional Connectors**  
+  Planned support for major databases and warehouses, including:  
+  - MySQL / MariaDB  
+  - SQLite  
+  - BigQuery  
+  - Snowflake  
+  - Redshift  
+  - Databricks / Unity Catalog  
+  - ClickHouse  
+  - Athena  
+  - Synapse / Fabric  
+  - Generic JDBC  
+  - REST and streaming sources  
+  - Cloud storage systems (S3, GCS, Azure Blob)
+
+- **Distributed Caching & Invalidation**  
+  Peer-aware caching and consistency guarantees across nodes.
+
+- **Observability & Telemetry**  
+  Metrics, tracing, execution timelines, and debugger-friendly diagnostics.
+
+- **Developer Tooling & IDE Integrations**  
+  Schema discovery, query explain tools, and interactive exploration.
+
+---
+
+## Feature Status (At-a-Glance)
+
+| Feature | Status |
+|--------|--------|
+| Parquet Metadata Cache | Planned |
+| Result Lookup API | Planned |
+| Query Metadata API | Planned |
+| Table Caching | Planned |
+| Arrow Flight SQL | Planned |
+| RivetDB CLI | Planned |
+| Cache | Alpha |
+| Current Connectors: Postgres, DuckDB, MotherDuck | Alpha |
+| Observability | Backlog |
+| Additional Connectors | Backlog |