Description
In #3081, we are integrating VegaFusion to make it possible to extract the transformed data from an Altair Chart object. As discussed in #3081 (comment), it will even be possible to extract the transformed data from a chart without any marks configured. For example:
import altair as alt
import pandas as pd

chart = alt.Chart(
    pd.DataFrame({"a": [1, 2, 3], "b": ["A", "BB", "CCC"]})
)
chart.transform_filter("datum.a > 1")._transformed_data()

|   | a | b |
|---|---|---|
| 0 | 2 | BB |
| 1 | 3 | CCC |
In this case, the Chart object is actually being used more like a lazy DataFrame than a chart. What if we added an alt.DataFrame class that includes a subset of the alt.Chart methods? In particular:
- The .transform_* methods, which would return a new alt.DataFrame. And maybe we even drop the transform_* prefix.
- The .mark_* methods, which would return a new alt.Chart.
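To make the shape of that proposal concrete, here is a minimal standard-library sketch of the two rules above. LazyFrame and ChartSketch are hypothetical stand-ins for the proposed alt.DataFrame and for alt.Chart; neither is an existing Altair class, and the filter method name assumes the transform_ prefix is dropped. The sketch only records transforms in a Vega-Lite-style dictionary and does not evaluate them, which is the part VegaFusion would handle.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ChartSketch:
    """Hypothetical stand-in for alt.Chart: data, transforms, and a mark."""

    values: tuple[dict[str, Any], ...]
    transforms: tuple[dict[str, Any], ...]
    mark: str

    def to_dict(self) -> dict[str, Any]:
        # Vega-Lite-style layout: inline data, a transform list, and a mark.
        return {
            "data": {"values": list(self.values)},
            "transform": list(self.transforms),
            "mark": self.mark,
        }


@dataclass(frozen=True)
class LazyFrame:
    """Hypothetical stand-in for the proposed alt.DataFrame."""

    values: tuple[dict[str, Any], ...]
    transforms: tuple[dict[str, Any], ...] = ()

    def filter(self, expr: str) -> LazyFrame:
        # Transform methods return a new LazyFrame; the original is unchanged.
        return LazyFrame(self.values, self.transforms + ({"filter": expr},))

    def mark_point(self) -> ChartSketch:
        # Mark methods leave the lazy-frame API and return a chart.
        return ChartSketch(self.values, self.transforms, "point")


rows = ({"a": 1, "b": "A"}, {"a": 2, "b": "BB"}, {"a": 3, "b": "CCC"})
frame = LazyFrame(rows)
filtered = frame.filter("datum.a > 1")
spec = filtered.mark_point().to_dict()
```

Keeping both classes immutable means a filtered frame can be reused for several charts without one affecting another, which mirrors how the existing .transform_* methods return a new Chart.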
VegaFusion doesn't do this efficiently yet, but I'd also picture supporting a .dtypes property that would return the output pandas data types for the alt.DataFrame. We could even use these output dtypes for encoding type inference (the way we currently only do for pandas DataFrames).
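As a rough illustration of how output dtypes could drive encoding type inference, the sketch below maps pandas-style dtype names to Vega-Lite encoding types. The function infer_encoding_type and its rules are simplified assumptions for illustration, not Altair's actual inference logic, and the dtypes dictionary is a hypothetical stand-in for what a .dtypes property might report for the filtered example above.

```python
def infer_encoding_type(dtype_name: str) -> str:
    """Map a pandas-style dtype name to a Vega-Lite encoding type (simplified)."""
    if dtype_name.startswith(("int", "uint", "float")):
        return "quantitative"
    if dtype_name.startswith("datetime"):
        return "temporal"
    # Strings, booleans, categoricals, and anything unrecognized fall back to nominal.
    return "nominal"


# Hypothetical .dtypes output, written as plain strings to avoid a pandas dependency.
dtypes = {"a": "int64", "b": "object"}
encoding_types = {name: infer_encoding_type(dtype) for name, dtype in dtypes.items()}
```

A real implementation would also need to handle cases this sketch ignores, such as nullable extension dtypes and ordered categoricals.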
Alternative: maybe this functionality could be combined with the existing alt.Data subclasses, so that you could do things like:
alt.UrlData("https://path/to/file.csv").filter("datum.a > 1").transformed_data()
alt.UrlData("https://path/to/file.csv").filter("datum.a > 1").mark_point().encode(...)