Dashboard: support Results Preview for Flotilla (distributed) queries #6559

@samstokes

Problem

The Results Preview tab in the dashboard only works for Swordfish (native) queries. For Flotilla (distributed Ray) queries, the tab is currently hidden (see #6557).

Why it doesn't work today

The dashboard subscriber collects result previews via on_result_out(), which requires a MicroPartition. In the Flotilla path, results are ray.ObjectRefs, so calling ctx._notify_result_out() would require a ray.get() to materialize each partition on the driver. That is expensive and defeats the purpose of lazy distributed execution.

The native runner (native_runner.py:138) calls ctx._notify_result_out(query_id, result.partition()) because results are already local MicroPartitions. The Ray runner (ray_runner.py:625-628) skips this call entirely.

Suggested approach

Options to explore:

  1. Materialize a small preview on the driver: After yielding the first few results, do a ray.get() on a small subset (e.g. first partition, head N rows) and call _notify_result_out() with that. This keeps the cost bounded.
  2. Post-hoc preview: After the query finishes, materialize a small preview from the collected result partition set and send it to the dashboard.
  3. Worker-side preview: Have Flotilla workers send preview data back to the driver as a side-channel, avoiding full materialization.
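A rough sketch of option 1, to make the "bounded cost" idea concrete. It is not tied to the actual runner code: `yield_with_preview`, `fetch`, `head`, and `notify` are hypothetical names. In the Flotilla path `fetch` would presumably be `ray.get`, `head` would take the first N rows of a MicroPartition, and `notify` would wrap `ctx._notify_result_out(query_id, ...)`. Here they are plain callables so the sketch runs without Ray.

```python
from typing import Any, Callable, Iterable, Iterator, List


def yield_with_preview(
    refs: Iterable[Any],
    fetch: Callable[[Any], Any],      # assumption: ray.get in the real runner
    head: Callable[[Any, int], Any],  # assumption: first-N-rows of a partition
    notify: Callable[[Any], None],    # assumption: wraps ctx._notify_result_out
    max_rows: int = 10,
) -> Iterator[Any]:
    """Pass result refs through unchanged, sending one small preview."""
    preview_sent = False
    for ref in refs:
        if not preview_sent:
            # Only the first partition is materialized on the driver;
            # every later ref stays lazy.
            notify(head(fetch(ref), max_rows))
            preview_sent = True
        yield ref


# Stand-in demo: dict keys play the role of object refs.
store = {"ref0": list(range(100)), "ref1": list(range(100, 200))}
fetched: List[str] = []
previews: List[Any] = []


def fake_fetch(ref: str) -> Any:
    fetched.append(ref)  # record which refs were materialized
    return store[ref]


out = list(
    yield_with_preview(
        store.keys(), fake_fetch, lambda part, n: part[:n], previews.append, max_rows=3
    )
)
```

Note that this still transfers the whole first partition to the driver before trimming it. Running the head step as a remote task and fetching only its result would keep the transfer small too, at the cost of one extra task. If the first partition happens to be empty, the preview is empty as well, so a real implementation might keep going until it has collected enough rows.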

Context

  • _notify_result_out in daft/context.py:99 explicitly raises ValueError("Query Managers only support the Native Runner for now")
