Commit 4f1306b

Describe upcoming API breaks in docs (#5583)
This probably needs some more detail, but I could do with fresh eyes to let me know what's missing as I'm too deep into the context.

Signed-off-by: Nicholas Gates <[email protected]>
1 parent 191228e commit 4f1306b

3 files changed, +57 -0 lines changed


docs/guides/wip-lazy-evaluation.md

Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+# Work-in-progress: Lazy Evaluation in Vortex
+
+This guide provides an overview of the in-flight and upcoming changes to Vortex that enable
+fully lazy evaluation of Vortex arrays.
+
+It is intended to help users and contributors understand the design decisions and plan around
+the upcoming breaking API changes required to implement this feature.
+
+The motivation for this work comes in several parts, including:
+
+* Support for alternate execution models such as GPU, pipelined CPU, or JIT-compiled CPU.
+* Improved scan performance with common-subtree elimination.
+* Improved visibility into the optimizations that Vortex applies by making the computation graph explicit.
+* Easier to benchmark and improve the performance of individual compute functions by isolating them from
+  lazy decompression logic.
+* Easier to extend Vortex with new compute functions, such as geo-spatial functionality.
+* Simpler to implement custom arrays and layouts by reducing the API surface area.
+* Enabling more advanced statistics and pruning, such as using bloom filters and free-text indexes.
+
+## Summary of Changes
+
+* Define `vortex-vector` as a fully decompressed in-memory format used for CPU computation.
+* Have the Vortex `Array` represent a logical decompression plan.
+* Introduce `ScalarFn` to define the semantics and implementation of scalar compute over Vortex vectors.
+* Make `Expression` a non-pluggable closed enum. Plugins will implement `ScalarFn` instead.
+    * Note that this avoids the current situation, in which all arrays need to know about all compute functions.
+* Introduce `ScalarFnArray` to represent lazy application of a `ScalarFn` over one or more Vortex arrays.
+* Re-implement the existing compute function dispatch as `Array` optimization rules.
+* Redesign the `Layout` API to use simpler optimization rules instead of complex expression partitioning.
+* Implement statistics falsification as optimizer rules over expressions.
+    * e.g. `falsify(a > 10)` becomes `stat.max(a) <= 10`.
+    * This also enables custom falsification expressions such as bloom filter checks.
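
The falsification rule in the summary (`falsify(a > 10)` becomes `stat.max(a) <= 10`) can be illustrated with a small Python sketch. This is only a toy model under stated assumptions: the `Column`, `Literal`, `Stat`, and `Compare` classes and the `falsify` and `evaluate_stat_predicate` functions are hypothetical names invented for this example and are not part of the Vortex API. It also ignores nulls and NaN values, which a real rule would presumably have to handle.

```python
import operator
from dataclasses import dataclass
from typing import Union


# Hypothetical, simplified expression nodes (not the Vortex `Expression` enum).
@dataclass(frozen=True)
class Column:
    name: str


@dataclass(frozen=True)
class Literal:
    value: float


@dataclass(frozen=True)
class Stat:
    kind: str  # "min" or "max"
    column: Column


@dataclass(frozen=True)
class Compare:
    op: str  # one of ">", ">=", "<", "<="
    left: Union[Column, Stat]
    right: Literal


# For each predicate operator: which statistic to consult, and which
# comparison on that statistic proves that no row can satisfy the predicate.
_FALSIFIERS = {
    ">": ("max", "<="),
    ">=": ("max", "<"),
    "<": ("min", ">="),
    "<=": ("min", ">"),
}

_OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def falsify(pred: Compare) -> Compare:
    """Rewrite `column <op> literal` into a predicate over statistics.

    If the returned predicate is true for a chunk, no row in that chunk
    can satisfy `pred`, so the chunk can be skipped.
    """
    if not isinstance(pred.left, Column):
        raise ValueError("this sketch only handles `column <op> literal`")
    stat_kind, stat_op = _FALSIFIERS[pred.op]
    return Compare(stat_op, Stat(stat_kind, pred.left), pred.right)


def evaluate_stat_predicate(pred: Compare, stats: dict) -> bool:
    """Evaluate a falsified predicate against stats such as {("max", "a"): 7}."""
    if not isinstance(pred.left, Stat):
        raise ValueError("expected a predicate produced by `falsify`")
    value = stats[(pred.left.kind, pred.left.column.name)]
    return _OPS[pred.op](value, pred.right.value)


# Usage: rewrite `a > 10`, then test it against one chunk's statistics.
rewritten = falsify(Compare(">", Column("a"), Literal(10)))
can_skip = evaluate_stat_predicate(rewritten, {("max", "a"): 7})
```

Expressing the rewrite as a table of rules keeps each rule independent, which is one plausible way custom falsifiers such as bloom filter checks could be added alongside the built-in ones.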

docs/index.md

Lines changed: 1 addition & 0 deletions
@@ -111,6 +111,7 @@ maxdepth: 1
 caption: User Guides
 ---
 
+guides/wip-lazy-evaluation
 guides/python-integrations
 guides/writing-an-encoding
 ```

uv.lock

Lines changed: 24 additions & 0 deletions
Some generated files are not rendered by default.
