-
Notifications
You must be signed in to change notification settings - Fork 16
feat: DH-18281: Add dx filter support #1185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: DH-18281: Add dx filter support #1185
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces filtering support for dx charts by adding two new parameters – filter_by and required_filter_by – that enable users to filter data based on specific column(s) when rendering plots. Key changes include new tests verifying widget sendMessage behavior with filters, a new FilterColumn type for handling filter metadata, and modifications across multiple plot functions and rendering utilities to integrate filtering.
Reviewed Changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| plugins/plotly-express/src/js/src/PlotlyExpressChartModel.test.ts | Added tests to validate widget messaging when optional and required filters are applied; also introduced an import that appears unused. |
| plugins/plotly-express/src/deephaven/plot/express/types/utility.py | Introduced the FilterColumn namedtuple for filter metadata. |
| plugins/plotly-express/src/deephaven/plot/express/plots/* | Updated plot function signatures and docstrings to document filter_by and required_filter_by parameters. |
| plugins/plotly-express/src/deephaven/plot/express/plots/_private_utils.py | Added helper functions to process filter_by arguments for both normal and PartitionedTables. |
| plugins/plotly-express/src/deephaven/plot/express/plots/_layer.py, PartitionManager.py, and DeephavenFigure* | Integrated filter column handling into figure generation, layering, and graph management. |
| plugins/plotly-express/docs/filter-by.md | Added documentation and examples for using the new filtering functionality. |
plugins/plotly-express/src/js/src/PlotlyExpressChartModel.test.ts
Outdated
Show resolved
Hide resolved
| if filters is None and (filter_by or required_filter_by): | ||
| # if there are input filters wait for them before creating the proper chart | ||
| # the python figure is created, then the filters are sent from the client | ||
| self.send_default_figure = True |
Copilot
AI
Jun 4, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The logic for setting 'send_default_figure' based on the presence of 'filters' along with filter_by and required_filter_by is a bit nested; consider refactoring or adding explanatory comments to improve maintainability.
This requires deephaven/deephaven-plugins#1185 for full support, but this is all that's needed to support linker directly with those changes.
mofojed
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some things to cleanup.
For my comment about how it will tie in with sharing tables between workers... I know that's a bit more of a complicated problem, but it's something I think we should think about/discuss soon.
| PlotlyChartWidgetData, | ||
| setDefaultValueFormat, | ||
| } from './PlotlyExpressChartUtils'; | ||
| import { l } from 'vite/dist/node/types.d-aGj9QkWt'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with Copilot - what is this import? I'm guessing it was auto-added accidentally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
|
||
| To plot a subset of a table based on a column value, use the `filter_by` and `required_filter_by` parameters. These parameters accept column(s) denoting variables to filter on in the dataset. The plot shows only the data that matches the filter criteria. `filter_by` does not require the [input filter](https://deephaven.io/core/docs/how-to-guides/user-interface/filters/#input-filters) or [linker](https://deephaven.io/core/docs/how-to-guides/user-interface/filters/#linker) to be set on that column whereas `required_filter_by` does. | ||
|
|
||
| Under the hood, the Deephaven query engine performs a `partition_by` table operation on the given filter column. This efficient implementation means that plots with many groups can be filtered and redrawn quickly, even with large datasets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slightly different than how one_click currently works on plots, and I think it's an improvement, but I want to acknowledge the difference.
Existing one_click documentation: https://deephaven.io/core/docs/reference/plot/one-click/
I rewrote the example to show it with express:
from deephaven import read_csv
from deephaven.plot.selectable_dataset import one_click
from deephaven.plot.figure import Figure
source = read_csv(
"https://media.githubusercontent.com/media/deephaven/examples/main/CryptoCurrencyHistory/CSV/CryptoTrades_20210922.csv"
)
oc = one_click(t=source, by=["Instrument"])
plot = Figure().plot_xy(series_name="Plot", t=oc, x="Timestamp", y="Price").show()
flp2 = dx.line(source, x="Timestamp", y="Price", filter_by="Instrument")And the two plots look quite different initially, with plot looking dumb because it's not filtered, and flp2 looking good just showing all the different partitions:

Just want to acknowledge it. Once filtered they look the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definitely a good thing to point out, and has gotten me thinking. I've definitely seen that dumb example before and it doesn't make sense since it's implying continuity where there is none. I think if you're dealing with categorical values to filter on, the new behavior is way better. It's possible, however, you would want the old behavior, especially in cases of numeric columns that have lots of different values in the column. In that case, the new behavior is definitely worse in terms of performance and usability.
I think this calls for other arguments that are also filters, but behave differently. They probably don't have to be implemented now, although they could be and will touch much of the same code.
These arguments could be called filter (if we don't mind that there could be confusion) or even where (with required counterparts in either case). I'd lean just naming them filter, as it forms a natural progression.
filter just filters on the column -> filter_by filters and partitions on the columns -> required_filter_by filters, partitions, and is required
required_filter is also a natural progression of filter, in that it is required. It's a bit of an odd one in that it will often look the same as required_filter_by, but won't do the partition so is much better suited for cases when the partition isn't desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a little note about this for now. If we want to add the separate filter capability let me know and I will create a ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth adding that even a numeric column creates a partitioned table, so really adding this sort of filter behavior would be a new backend behavior that can emulate the old frontend behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created DH-19810 just in case we want to some day add this.
| import deephaven.plot.express as dx | ||
|
|
||
| stocks = dx.data.stocks() # import the example stocks data set | ||
|
|
||
| # specify `x` and `y` columns, as well as additional filter variables with `filter_by` | ||
| filtered_line_plot = dx.line(stocks, x="Timestamp", y="Price", filter_by="Sym") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| # specify `x` and `y` columns, as well as additional filter variables with `filter_by` and `required_filter_by` | ||
| filtered_line_plot = dx.line( | ||
| stocks, x="Timestamp", y="Price", filter_by="Sym", required_filter_by="Exchange" | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Random - I thought we fixed the title margin? Wonder why it's still so big?

I thought this was fixed with deephaven/web-client-ui#2381 but still seeing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I missed removing it from the default figure as the default figure was rarely sent before. Fixed.
|
|
||
| # specify `x` and `y` columns, as well as additional filter variables with `filter_by` and `required_filter_by` | ||
| filtered_line_plot = dx.line( | ||
| stocks, x="Timestamp", y="Price", filter_by="Sym", required_filter_by="Exchange" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm a jerk and specify the same column for both filter_by and required_filter_by, it just treats it as a required filter. Maybe we should just raise an exception right away instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
| `make_subplots` maintains any `filter_by` and `required_filter_by` filter columns originally passed into the subplots. | ||
|
|
||
| > [!WARNING] | ||
| > Multiple filters with the same name but different types are not currently supported. Rename columns so that they are unique if necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice note.
| elif message["type"] == "FILTER": | ||
| self._figure.update_filters(message["filterMap"]) | ||
| revision = self._revision_manager.get_revision() | ||
| # updating the filters automatically recreates the figure, so it's ready to send | ||
| figure = self._get_figure() | ||
| try: | ||
| self._connection.on_data(*self._build_figure_message(figure, revision)) | ||
| except RuntimeError: | ||
| # trying to send data when the connection is closed, ignore | ||
| pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something we're going to want to think about going forward is how this ties into sharing tables between workers, and resolving on the client: https://deephaven.atlassian.net/issues/DH-19001
In context of plotting, I'm imagining something like:
WorkerAhas a tabletWorkerBuses this lazy resolve API and passes it into the plotting API, e.g.p = dx.line(uri.client_resolve("pq://WorkerA/t"), x="X", y="Y"). Importantly, theWorkerBdoes not fetch the table from WorkerA, just keeps a reference to it.- Client fetches
pfromWorkerB, which sends a plot figure definition and tells the client to fetch the table/data fromWorkerA
Of course this flow becomes more complicated the more server side processing we do before passing the data table to the client. We're a ways away from this but it's something I want to keep in my mind...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think as you allude to the problem we're really going to run into is the server side processing. Ultimately, dx was built with the idea that it has access to tables that it can transform when running.
I think it would be relatively straightforward to detect that there is a URI passed in, suspend processing, and save off arguments, similar to what happens in the background with plot by charts.
So now we're back to retrieving the suspended chart, and doing the table processing. It seems like the problem really is we don't want to retrieve a table then have all the table processing happen in a users code studio in case the table is just too big. In my fantasy land, what I'd really want is some sort of identical table interface where I can just say "use this worker to do everything with this table but I can still pull out what I need here". Maybe like a pydeephaven table? Really I wish we would have that for every case we deal with, because the idea that we have to build logic to handle this sort of thing in every plugin we write seems like a pain. And if it's a pain for us, what about external developers trying to build custom plugins? This seems like something we want to strive for a general purpose abstraction for.
Otherwise, I think there just has to be the concept of a worker that handles the table transformations. As in I could pass in worker="WorkerC", suspend processing as mentioned above, then when I actually ask for that chart, all the processing happens in WorkerC because it has to happen somewhere.
| fireFilterUpdated(filterMap: FilterMap): void { | ||
| // Only send the filter update if filters are not required | ||
| // They will either be set or none are required | ||
| if (!this.isFilterRequired()) { | ||
| this.widget?.sendMessage( | ||
| JSON.stringify({ | ||
| type: 'FILTER', | ||
| filterMap: Object.fromEntries(filterMap), | ||
| }) | ||
| ); | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this function name to sendFilterUpdated. The fire* functions are messages that are emitted by this model, whereas this is a directed message to the underlying widget.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
@mofojed I'm wondering if instead of trying to refactor for the new filter hooks, I should just address your comments and merge for now. The thing is we will still need to use the existing overlay, and until DH-19613 is done there is going to still be some weirdness. I added a comment to DH-19613 about adding the types to the overlay as that is really where the inconsistency comes in. The new hooks can allow both a unique name and type combo, but the overlay can't currently, so we just end up in a case where the overlay would show something that isn't correct. |
|
plotly-express docs preview (Available for 14 days) |
|
plotly-express docs preview (Available for 14 days) |
|
plotly-express docs preview (Available for 14 days) |
|
plotly-express docs preview (Available for 14 days) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's up with these snapshot updates? I see the legend now has a title, and in indicator the indicators are rounded to no decimal points. Are those changes expected? Not sure why they'd be related to filter_by.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The legend title was added. Instead of using chart title like one_click charts do this is how someone can see what columns they are filtering on. Worth noting that px already does it. I just missed it before and realized it would be good to add especially for filters as a replacement for the chart title.
I can pull out out to a PR if really necessary but I thought it would be fine here since it it especially important to have with filters.
I didn't see those rounding changes, I will take a look as I'm not sure why that would happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a fix for the rounding changes as what was happening was the client would ask for a new chart render when it shouldn't. I think that combined with DH-19811 caused it.
|
plotly-express docs preview (Available for 14 days) |
|
plotly-express docs preview (Available for 14 days) |
| > If you are familiar with the `one_click` API it works similarly to `filter_by`, but there are some differences in behavior: | ||
| > In the `one_click` API, if filters are provided but not set then one trace is charted. | ||
| > In the `filter_by` API, if filters are provided but not set then all values within the filter columns are charted on separate traces. | ||
| > This provides a consistent experience with plot by behavior, but may not be optimal if filtering on numeric columns with many unique values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
plot_by?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
by argument, made that more explicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@margaretkennedy is this clearer now? Just want to make sure
Co-authored-by: margaretkennedy <[email protected]>
|
plotly-express docs preview (Available for 14 days) |
|
plotly-express docs preview (Available for 14 days) |

Adds filtering to
dxwith thefilter_byandrequired_filter_byarguments that take columns and allow them to be filtered on with input filters and linkers.This PR is sufficient to work with input filters. deephaven/web-client-ui#2456 is required for linkers.