
DeepOLA: Project Tracker #52

@pyongjoo

Paper Writing

Our Implementation

  • (T1) In place of ArrayRow, we use a separate DataFrame class
    • Conceptually, we process data partition by partition. A partition is a DataFrame holding a projected subset of columns.
    • We need to pass a series of (DataFrame, Meta) pairs
    • How can we read/construct DataFrames in partitions? Options:
      • Use iterator-based reads in Rust Arrow, if it supports them (we doubt it does)
      • Pre-partition a table into multiple CSV files
  • (T2, blocked by T1) Convert join-free TPC-H queries into a node-based structure
  • (T3) Persistent hash tables for joins: it is better to keep hash tables across partitions than to rebuild them for every partition
  • (T4, blocked by T3) Convert TPC-H queries with joins into a node-based structure
  • (T5) Add support for "where subquery"
  • (T6, blocked by T5) Convert the remaining (i.e., with-subquery) TPC-H queries
  • Example: https://github.com/illinoisdata/DeepOLA/blob/main/rust/runtime/examples/tpch/q1.rs
  • Supawit: working on estimation logic
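
To make T1 and T3 concrete, here is a minimal sketch of partition-by-partition processing with a hash table that is built once and reused for every partition. It uses only the Rust standard library. The `DataFrame` and `Meta` structs, their fields, and the functions `partitions`, `build_hash_table`, `probe`, and `demo` are all hypothetical stand-ins for illustration; they are not the types or APIs in the DeepOLA runtime.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one partition: column names plus integer rows.
#[derive(Debug, Clone)]
struct DataFrame {
    columns: Vec<String>,
    rows: Vec<Vec<i64>>,
}

/// Hypothetical metadata passed along with each partition.
#[derive(Debug, Clone)]
struct Meta {
    partition_index: usize,
    total_partitions: usize,
}

/// Split a table into partitions of at most `batch_size` rows,
/// producing the series of (DataFrame, Meta) pairs described in T1.
fn partitions(table: &DataFrame, batch_size: usize) -> Vec<(DataFrame, Meta)> {
    assert!(batch_size > 0, "batch_size must be positive");
    let total = (table.rows.len() + batch_size - 1) / batch_size;
    table
        .rows
        .chunks(batch_size)
        .enumerate()
        .map(|(i, chunk)| {
            let df = DataFrame {
                columns: table.columns.clone(),
                rows: chunk.to_vec(),
            };
            let meta = Meta {
                partition_index: i,
                total_partitions: total,
            };
            (df, meta)
        })
        .collect()
}

/// Build the join hash table once (T3): key value -> all build-side rows with that key.
fn build_hash_table(build: &DataFrame, key_col: usize) -> HashMap<i64, Vec<Vec<i64>>> {
    let mut table: HashMap<i64, Vec<Vec<i64>>> = HashMap::new();
    for row in &build.rows {
        table.entry(row[key_col]).or_default().push(row.clone());
    }
    table
}

/// Probe one partition against the retained hash table (inner join).
/// Each output row is the probe row followed by the matching build row.
fn probe(
    part: &DataFrame,
    key_col: usize,
    table: &HashMap<i64, Vec<Vec<i64>>>,
) -> Vec<Vec<i64>> {
    let mut out = Vec::new();
    for row in &part.rows {
        if let Some(matches) = table.get(&row[key_col]) {
            for m in matches {
                let mut joined = row.clone();
                joined.extend_from_slice(m);
                out.push(joined);
            }
        }
    }
    out
}

/// Join a partitioned probe table against a small build table.
fn demo() -> Vec<Vec<i64>> {
    let build = DataFrame {
        columns: vec!["key".to_string(), "attr".to_string()],
        rows: vec![vec![1, 100], vec![2, 200]],
    };
    let probe_table = DataFrame {
        columns: vec!["key".to_string(), "qty".to_string()],
        rows: vec![vec![1, 5], vec![2, 7], vec![3, 9], vec![1, 11], vec![2, 13]],
    };

    // The hash table is constructed once, outside the partition loop.
    let hash_table = build_hash_table(&build, 0);

    let mut result = Vec::new();
    for (df, meta) in partitions(&probe_table, 2) {
        assert!(meta.partition_index < meta.total_partitions);
        result.extend(probe(&df, 0, &hash_table));
    }
    result
}
```

Calling `demo()` from a `main` function joins five probe rows in three partitions against one shared hash table. The batch size passed to `partitions` is the knob that the "Impact of batch size" experiment would vary.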

Experiments: end-to-end

Experiments: estimation accuracy

Experiments: others

  • Impact of batch size

Reference
