|
9 | 9 | [](https://central.sonatype.com/artifact/dev.vortex/vortex-spark) |
10 | 10 | [](https://codecov.io/github/vortex-data/vortex) |
11 | 11 |
|
12 | | -🫶 [Join the community on Slack!](https://vortex.dev/slack) | 📚 [Documentation](https://docs.vortex.dev/) | 📊 [Performance Benchmarks](https://bench.vortex.dev) |
| 12 | +[Join the community on Slack!](https://vortex.dev/slack) | [Documentation](https://docs.vortex.dev/) | [Performance Benchmarks](https://bench.vortex.dev) |
13 | 13 |
|
14 | 14 | ## Overview |
15 | 15 |
|
16 | 16 | Vortex is a next-generation columnar file format and toolkit designed for high-performance data processing. |
17 | 17 | It is the fastest and most extensible format for building data systems backed by object storage. It provides: |
18 | 18 |
|
19 | | -- **⚡️ Blazing Fast Performance** |
| 19 | +- **Blazing Fast Performance** |
20 | 20 | - 100x faster random access reads (vs. modern Apache Parquet) |
21 | 21 | - 10-20x faster scans |
22 | 22 | - 5x faster writes |
23 | 23 | - Similar compression ratios |
24 | 24 | - Efficient support for wide tables with zero-copy/zero-parse metadata |
25 | 25 |
|
26 | | -- **🔧 Extensible Architecture** |
| 26 | +- **Extensible Architecture** |
27 | 27 | - Modeled after Apache DataFusion's extensible approach |
28 | 28 | - Pluggable encoding system, type system, compression strategy, & layout strategy |
29 | 29 | - Zero-copy compatibility with Apache Arrow |
30 | 30 |
|
31 | | -- **🗳️ Open Source, Neutral Governance** |
| 31 | +- **Open Source, Neutral Governance** |
32 | 32 | - A Linux Foundation (LF AI & Data) Project |
33 | 33 | - Apache-2.0 Licensed |
34 | 34 |
|
35 | | -- **↔️ Integrations** |
| 35 | +- **Integrations** |
36 | 36 | - Arrow, DataFusion, DuckDB, Spark, Pandas, Polars, & more |
37 | 37 | - Apache Iceberg (coming soon) |
38 | 38 |
|
39 | 39 | > 🟢 **Development Status**: Library APIs may change from version to version, but we now consider |
40 | | -> the file format <ins>*stable*</ins>. From release 0.36.0, all future releases of Vortex should |
| 40 | +> the file format <ins>_stable_</ins>. From release 0.36.0, all future releases of Vortex should |
41 | 41 | > maintain backwards compatibility of the file format (i.e., be able to read files written by |
42 | 42 | > any earlier version >= 0.36.0). |
43 | 43 |
|
44 | 44 | ## Key Features |
45 | 45 |
|
46 | 46 | ### Core Capabilities |
47 | 47 |
|
48 | | -- ✨ **Logical Types** - Clean separation between logical schema and physical layout |
49 | | -- 🔄 **Zero-Copy Arrow Integration** - Seamless conversion to/from Apache Arrow arrays |
50 | | -- 🧩 **Extensible Encodings** - Pluggable physical layouts with built-in optimizations |
51 | | -- 📦 **Cascading Compression** - Support for nested encoding schemes |
52 | | -- 🚀 **High-Performance Computing** - Optimized compute kernels for encoded data |
53 | | -- 📊 **Rich Statistics** - Lazy-loaded summary statistics for optimization |
| 48 | +- **Logical Types** - Clean separation between logical schema and physical layout |
| 49 | +- **Zero-Copy Arrow Integration** - Seamless conversion to/from Apache Arrow arrays |
| 50 | +- **Extensible Encodings** - Pluggable physical layouts with built-in optimizations |
| 51 | +- **Cascading Compression** - Support for nested encoding schemes |
| 52 | +- **High-Performance Computing** - Optimized compute kernels for encoded data |
| 53 | +- **Rich Statistics** - Lazy-loaded summary statistics for optimization |
54 | 54 |
|
55 | 55 | ### Technical Architecture |
56 | 56 |
|
@@ -152,7 +152,7 @@ If you discovery a security vulnerability, please email < [email protected]> |
152 | 152 | Copyright © Vortex a Series of LF Projects, LLC. |
153 | 153 | For terms of use, trademark policy, and other project policies please see <https://lfprojects.org> |
154 | 154 |
|
155 | | -## Acknowledgments 🏆 |
| 155 | +## Acknowledgments |
156 | 156 |
|
157 | 157 | The Vortex project benefits enormously from groundbreaking work from the academic & open-source communities. |
158 | 158 |
|