Commit 10434b3

maximlt and hoxbro authored
Add selector to display sample information on hover for rasterized/datashaded plots (holoviz#1585)
Co-authored-by: Simon Høxbro Hansen <[email protected]>
1 parent be5339c commit 10434b3

5 files changed

+203
-51
lines changed

doc/ref/plotting_options/interactivity.ipynb

Lines changed: 2 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@
3333
"Enables or disables hover tooltips on the plot, also accepts `'hline'` and `'vline'` to change the hit-testing mode.\n",
3434
"\n",
3535
"::: {note}\n",
36-
"This option is True by default for most plots, but is automatically set to False when `datashade=True` since [Datashader](https://datashader.org/) returns an image that doesn’t support interactivity. If you’re using `datashade=True` and still want interactivity, consider alternatives like using `rasterize=True` or combining datashade with [dynspread](https://datashader.org/api.html#datashader.transfer_functions.dynspread) and overlays that retain interactivity.\n",
36+
"This option is `True` by default for most plots, but is automatically set to `False` when [`datashade=True`](option-datashade) and [`selector`](option-selector) is not set, since no relevant data can be displayed as HoloViews returns to the front-end an RGB element that doesn’t include the aggregated data. If you’re using `datashade=True` and still want interactivity, consider alternatives like using [`rasterize=True`](option-rasterize), combining `datashade` with [`dynspread`](option-dynspread), or enabling [`resample_when`](option-resample_when).\n",
3737
":::\n",
3838
"\n",
3939
"::: {note}\n",
@@ -99,8 +99,7 @@
9999
"\n",
100100
"Specifies additional columns from the dataset to be shown in the hover tooltip.\n",
101101
"- Accepts a list of column names, a single column name as a string, or 'all' to include all available columns.\n",
102-
"- When set to 'all', it includes index columns only if `use_index=True`.\n",
103-
"- Ignored for `datashade=True` plots, as those do not support interactivity.\n",
102+
"- When set to 'all', it includes index columns only if [`use_index=True`](option-use_index).\n",
104103
"\n",
105104
"::: {note} \n",
106105
"`hover_cols` complements the default dimensions shown in the tooltip but does not override them.\n",
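
The updated note on `hover` describes a default that depends on both `datashade` and `selector`. A minimal sketch of that decision is shown below, assuming it mirrors the `hover = True if self.selector else not self.datashade` line changed in `hvplot/converter.py`. The helper name `default_hover` is hypothetical and not part of hvPlot.

```python
def default_hover(kind, hover=None, datashade=False, selector=None):
    """Return the effective hover setting (hypothetical helper, not hvPlot API)."""
    if kind == 'errorbars':
        # Hover is switched off for errorbars in the converter.
        return False
    if hover is None:
        # A selector gives the tooltip real data to show, so hover is enabled
        # even for datashaded plots; otherwise datashade disables it.
        return True if selector else not datashade
    # An explicit user value (True, False, 'hline', 'vline') is kept as-is.
    return hover


plain = default_hover('points')
shaded = default_hover('points', datashade=True)
shaded_with_selector = default_hover('points', datashade=True, selector='first')
explicit_off = default_hover('points', hover=False, datashade=True, selector='first')
```

An explicit `hover=False` still wins over the `selector` default in this sketch, since the default only applies when `hover` is left unset.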

doc/ref/plotting_options/resampling.ipynb

Lines changed: 114 additions & 44 deletions
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@
1919
"cell_type": "markdown",
2020
"metadata": {},
2121
"source": [
22-
"The `hvsampledata.synthetic_clusters` dataset is in many examples below."
22+
"The `hvsampledata.synthetic_clusters` dataset is used in many examples below. This dataset, returned as a DataFrame object, consists of five sub-datasets combined. Each of the sub-dataset has a random x, y-coordinate based on a normal distribution centered at a specific (x, y) location, with standard deviations derived from a power law, resulting in very dense to very scattered clusters. Each point also carries a `val` (`0` to `4`) and `cat` (`d1` to `d5`) column to identify its dataset and category. The total dataset contains 1,000,000 points, evenly split across the five distributions."
2323
]
2424
},
2525
{
@@ -61,7 +61,7 @@
6161
"- Selection of data from a dimension of the supplied dataset, or the index of the corresponding row in the dataset, including: `'first'`, `'last'`, `'min'`, `'max'`.\n",
6262
"\n",
6363
"`aggregator` accepts either:\n",
64-
"- A [Datashader reduction object](https://datashader.org/api.html#reductions), such as `ds.count()` or `ds.mean('val')`.\n",
64+
"- A [Datashader reduction instance](https://datashader.org/api.html#reductions), such as `ds.count()` or `ds.mean('val')`.\n",
6565
"- A string (e.g. `'mean'`, `'count'`, `'min'`, `'max'`, etc.), in which case the aggregated dimension can be defined by setting the [`color`](option-color) option (if not, the first non-coordinate variable found is used).\n",
6666
"\n",
6767
"The `'count_cat'` or `'by'` aggregators can be used for categorical cata. `ds.by(<column>, <reduction>)` allows to define the per-category reduction function (default is `count`). Alternatively, setting the [`by`](option-by) option to a categorical column is equivalent to setting `aggregator=ds.by(<cat_column>)`.\n",
@@ -133,33 +133,6 @@
133133
"The next examples show how to leverage `ds.summary()` and `ds.where()`. Hover over the plots to see how what information is made available in the tooltip."
134134
]
135135
},
136-
{
137-
"cell_type": "code",
138-
"execution_count": null,
139-
"metadata": {},
140-
"outputs": [],
141-
"source": [
142-
"ds.summary(min_s=ds.min('s'), min_val=ds.min('val'))"
143-
]
144-
},
145-
{
146-
"cell_type": "code",
147-
"execution_count": null,
148-
"metadata": {},
149-
"outputs": [],
150-
"source": [
151-
"ds.where(ds.min('s'), 'val')"
152-
]
153-
},
154-
{
155-
"cell_type": "code",
156-
"execution_count": null,
157-
"metadata": {},
158-
"outputs": [],
159-
"source": [
160-
"ds.summary(min_s=ds.min('s'), min_val=ds.min('val'))"
161-
]
162-
},
163136
{
164137
"cell_type": "code",
165138
"execution_count": null,
@@ -199,7 +172,7 @@
199172
"This approach can turn even the largest datasets into an image that captures patterns such as density or value distribution, making it ideal for high-volume scatter plots. When `datashade=True`, hvPlot returns a [`DynamicMap`](inv:holoviews#reference/containers/bokeh/DynamicMap) containing an [`RGB`](inv:holoviews#reference/elements/bokeh/RGB) instead of individual glyphs.\n",
200173
"\n",
201174
":::{tip}\n",
202-
"Since `datashade=True` produces an RGB image, the underlying data (e.g. the aggregated values per pixel) is not directly available to the plot. Enabling the `'hover'` [tool](options-hover) (disabled by default when `datashade=True`) would only show the RGB value per pixel, and no meaningful colorbar can be attached to the plot. To let the frontend apply colormapping instead of the backend, and as a consequence expose the underlying data, we recommend setting [`rasterize=True`](option-rasterize) instead of `datashade=True`.\n",
175+
"Since `datashade=True` produces an RGB image, the underlying data (e.g. the aggregated values per pixel) is not directly available to the plot. Enabling the `'hover'` [tool](options-hover) (disabled by default when `datashade=True` unless [`selector`](option-selector) is set) would only show the RGB value per pixel, and no meaningful colorbar can be attached to the plot. To let the frontend apply colormapping instead of the backend, and as a consequence expose the underlying data, we recommend setting [`rasterize=True`](option-rasterize) instead of `datashade=True`.\n",
203176
":::\n",
204177
"\n",
205178
"The [`cnorm`](option-cnorm) option defaults to `'eq_hist'` when `datashade=True`."
@@ -216,9 +189,9 @@
216189
"\n",
217190
"df = hvsampledata.synthetic_clusters(\"pandas\")\n",
218191
"\n",
219-
"df.hvplot.scatter(\n",
192+
"df.hvplot.points(\n",
220193
" x='x', y='y', datashade=True, data_aspect=1, frame_height=250,\n",
221-
" title='Datashaded scatter plot with\\n\"count\" aggregator and\\n\"eq_hist\" cnorm'\n",
194+
" title='Datashaded points plot with\\n\"count\" aggregator and\\n\"eq_hist\" cnorm'\n",
222195
")"
223196
]
224197
},
@@ -305,11 +278,11 @@
305278
" x='x', y='y', frame_height=250, data_aspect=1,\n",
306279
" xlim=(-5.5, -5), ylim=(2.5, 3),\n",
307280
")\n",
308-
"df.hvplot.scatter(\n",
281+
"df.hvplot.points(\n",
309282
" rasterize=True, dynspread=False,\n",
310283
" title=\"Datashade without dynspread\", **plot_opts,\n",
311284
") +\\\n",
312-
"df.hvplot.scatter(\n",
285+
"df.hvplot.points(\n",
313286
" rasterize=True, dynspread=True,\n",
314287
" title=\"Datashade with dynspread\", **plot_opts,\n",
315288
")"
@@ -339,11 +312,11 @@
339312
" x='x', y='y', frame_height=250, data_aspect=1,\n",
340313
" xlim=(-5.5, -5), ylim=(2.5, 3),\n",
341314
")\n",
342-
"df.hvplot.scatter(\n",
315+
"df.hvplot.points(\n",
343316
" rasterize=True, dynspread=True,\n",
344317
" title=\"Dynspread with max_px=3 (default)\", **plot_opts,\n",
345318
") +\\\n",
346-
"df.hvplot.scatter(\n",
319+
"df.hvplot.points(\n",
347320
" rasterize=True, dynspread=True, max_px=8,\n",
348321
" title=\"Dynspread with max_px=8\", **plot_opts\n",
349322
")"
@@ -383,7 +356,7 @@
383356
"\n",
384357
"df = hvsampledata.synthetic_clusters(\"pandas\")\n",
385358
"\n",
386-
"df.hvplot.scatter(\n",
359+
"df.hvplot.points(\n",
387360
" x='x', y='y', datashade=True, pixel_ratio=0.1, frame_height=250,\n",
388361
" data_aspect=1, title=\"Datashade with low pixel ratio\"\n",
389362
")"
@@ -430,9 +403,9 @@
430403
"\n",
431404
"df = hvsampledata.synthetic_clusters(\"pandas\")\n",
432405
"\n",
433-
"df.hvplot.scatter(\n",
406+
"df.hvplot.points(\n",
434407
" x='x', y='y', rasterize=True, data_aspect=1, frame_height=250, cnorm='log',\n",
435-
" title='Rasterized scatter with count aggregator\\nand log cnorm'\n",
408+
" title='Rasterized points with count aggregator\\nand log cnorm'\n",
436409
")"
437410
]
438411
},
@@ -464,7 +437,7 @@
464437
"\n",
465438
"df = hvsampledata.synthetic_clusters(\"pandas\")\n",
466439
"\n",
467-
"df.hvplot.scatter(\n",
440+
"df.hvplot.points(\n",
468441
" x='x', y='y', rasterize=True, resample_when=1_000,\n",
469442
" data_aspect=1, frame_height=250, cnorm='log',\n",
470443
" title=\"Rasterize only when >1000 points in view\"\n",
@@ -478,6 +451,103 @@
478451
"When running the code above, you will notice that after zooming in enough, the original data points appear. This gives a hybrid experience: raw points at low density, rasterized aggregates when zoomed out."
479452
]
480453
},
454+
{
455+
"cell_type": "markdown",
456+
"metadata": {},
457+
"source": [
458+
"(option-selector)=\n",
459+
"## `selector`\n",
460+
"\n",
461+
":::{versionadded} 0.12.0\n",
462+
"Requires `holoviews>=1.21`.\n",
463+
"Requires `bokeh>=3.7`.\n",
464+
":::\n",
465+
"\n",
466+
"When a Datashader operation is applied, with [`datashade=True`](option-datashade) or [`rasterize=True`](option-rasterize), the `selector` option allows to augment the tooltip with information computed (*selected*) from variables other than the aggregated one, effectively showing a sample of the dataset in the tooltip.\n",
467+
"\n",
468+
"Datashader operations allow to easily identify *macro level patterns* in large datasets by aggregating the data appropriately. However, they do not by default expose information about *individual data points*. Let's take for example a simple scatter plots set with `rasterize=True`; hovering over the image will only display the aggregated value per pixel (`'count'` by default), with no way to know more about each point (unless [`resample_when`](option-resample_when) is enabled and the user zooms in enough). Setting `selector` in this case would augment the tooltip with sample information from other variables, selected from *one unique row* of the dataset. Find out more about `selector` in HoloViews' [Interactive Hover for Big Data guide](https://dev.holoviews.org/user_guide/Interactive_Hover_for_Big_Data.html).\n",
469+
"\n",
470+
"Like the [`aggregator`](option-aggregator) option, a `selector` refers to a [Datashader `Reduction` object](https://datashader.org/api.html#reductions). However, unlike `aggregator` that accepts reductions that can combine data in a pixel (e.g. `'mean'` or `'count'`), `selector` only accepts reductions that *select* values, including: `'first'`, `'last'`, `'min'`, and `'max'`. Valid options include:\n",
471+
"- A string object for reductions that do not require a variable name, including `'first'` and `'last'`.\n",
472+
"- A 2-tuple with a reduction name and a variable name, for reductions that require a variable name, including `'min'` and `'max'` (e.g. `('min', 'column')`).\n",
473+
"- A reduction instance, including `ds.first()`, `ds.last()`, `ds.min()`, and `ds.max()`.\n",
474+
"\n",
475+
"::: {note}\n",
476+
"The hover tooltip always requires a live kernel when `selector` is set as the values displayed need to be sent by the Python server. Without a live kernel, like on this webpage, all the values are displayed as `'undefined'`.\n",
477+
":::\n",
478+
"\n",
479+
"When you hover over the first plot below, you will see a value for `s`, `val`, and `cat` in the bottom part of the tooltip. All these values originate from the same row in the DataFrame, that row being the first one found in the subdataset contained within this pixel. In the second plot, the values displayed are derived from the row where `val` is minimum within the hovered pixel."
480+
]
481+
},
482+
{
483+
"cell_type": "code",
484+
"execution_count": null,
485+
"metadata": {},
486+
"outputs": [],
487+
"source": [
488+
"import hvplot.pandas # noqa\n",
489+
"import hvsampledata\n",
490+
"\n",
491+
"df = hvsampledata.synthetic_clusters(\"pandas\")\n",
492+
"\n",
493+
"plot_opts = dict(x='x', y='y', rasterize=True, data_aspect=1, frame_height=250, cnorm='log')\n",
494+
"(\n",
495+
" df.hvplot.points(selector='first', title='selector=\"first\"', **plot_opts) +\n",
496+
" df.hvplot.points(selector=('min', 'val'), title='selector=(\"min\", \"val\")', **plot_opts)\n",
497+
")"
498+
]
499+
},
500+
{
501+
"cell_type": "markdown",
502+
"metadata": {},
503+
"source": [
504+
"`datashade=True` plots get their hover tool enabled by default when `selector` is set."
505+
]
506+
},
507+
{
508+
"cell_type": "code",
509+
"execution_count": null,
510+
"metadata": {},
511+
"outputs": [],
512+
"source": [
513+
"import datashader as ds\n",
514+
"import hvplot.pandas # noqa\n",
515+
"import hvsampledata\n",
516+
"\n",
517+
"df = hvsampledata.synthetic_clusters(\"pandas\")\n",
518+
"\n",
519+
"df.hvplot.points(\n",
520+
" x='x', y='y', data_aspect=1, frame_height=250, cnorm='log',\n",
521+
" datashade=True, selector=ds.min('val'), title='datashade=True',\n",
522+
")"
523+
]
524+
},
525+
{
526+
"cell_type": "markdown",
527+
"metadata": {},
528+
"source": [
529+
"`selector` can also be set when datashading categorical data."
530+
]
531+
},
532+
{
533+
"cell_type": "code",
534+
"execution_count": null,
535+
"metadata": {},
536+
"outputs": [],
537+
"source": [
538+
"import hvplot.pandas # noqa\n",
539+
"import hvsampledata\n",
540+
"import datashader as ds\n",
541+
"\n",
542+
"df = hvsampledata.synthetic_clusters(\"pandas\")\n",
543+
"\n",
544+
"df.hvplot.points(\n",
545+
" x='x', y='y', data_aspect=1, frame_height=250, colorbar=False,\n",
546+
" rasterize=True, aggregator=ds.by('cat'), selector='first',\n",
547+
" title=\"Categorical rasterizing with\\n'count' aggregator'\",\n",
548+
")"
549+
]
550+
},
481551
{
482552
"cell_type": "markdown",
483553
"metadata": {},
@@ -503,9 +573,9 @@
503573
" x='x', y='y', datashade=True, dynspread=True,\n",
504574
" data_aspect=1, frame_width=200, xlim=(-2, 0), ylim=(7, 9),\n",
505575
")\n",
506-
"df.hvplot.scatter(threshold=0.0, title=\"Dynspread threshold=0.0\", **plot_opts) +\\\n",
507-
"df.hvplot.scatter(threshold=0.5, title=\"Dynspread threshold=0.5\", **plot_opts) +\\\n",
508-
"df.hvplot.scatter(threshold=1.0, title=\"Dynspread threshold=1.0\", **plot_opts)"
576+
"df.hvplot.points(threshold=0.0, title=\"Dynspread threshold=0.0\", **plot_opts) +\\\n",
577+
"df.hvplot.points(threshold=0.5, title=\"Dynspread threshold=0.5\", **plot_opts) +\\\n",
578+
"df.hvplot.points(threshold=1.0, title=\"Dynspread threshold=1.0\", **plot_opts)"
509579
]
510580
},
511581
{
@@ -529,7 +599,7 @@
529599
"\n",
530600
"df = hvsampledata.synthetic_clusters(\"pandas\")\n",
531601
"\n",
532-
"df.hvplot.scatter(\n",
602+
"df.hvplot.points(\n",
533603
" x='x', y='y', rasterize=True, x_sampling=0.1, y_sampling=0.1,\n",
534604
" data_aspect=1, cnorm='log', xlim=(0, 1), ylim=(0, 1), frame_height=250,\n",
535605
" title='Zoomed in rasterized plot\\nwith custom x/y-sampling'\n",
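
The new `selector` section states that every extra tooltip value comes from one single row per pixel. The pure-Python sketch below illustrates that idea only. It is not how Datashader or HoloViews implement it, and the function name `aggregate_with_selector`, the square `pixel_size` binning, and the sample rows are all assumptions made for the illustration.

```python
import math
from collections import defaultdict


def aggregate_with_selector(rows, pixel_size, selector='first'):
    """Count rows per pixel and pick one sample row per pixel (conceptual only)."""
    pixels = defaultdict(list)
    for row in rows:
        pixel = (math.floor(row['x'] / pixel_size), math.floor(row['y'] / pixel_size))
        pixels[pixel].append(row)

    result = {}
    for pixel, members in pixels.items():
        if selector == 'first':
            sample = members[0]
        elif selector == 'last':
            sample = members[-1]
        else:
            # Tuple form, e.g. ('min', 'val'): select the row where the column is extreme.
            reduction, column = selector
            pick = {'min': min, 'max': max}[reduction]
            sample = pick(members, key=lambda r: r[column])
        # 'count' plays the role of the aggregator, 'sample' the role of the selector.
        result[pixel] = {'count': len(members), 'sample': sample}
    return result


rows = [
    {'x': 0.1, 'y': 0.2, 's': 5.0, 'val': 3, 'cat': 'd4'},
    {'x': 0.4, 'y': 0.3, 's': 1.0, 'val': 0, 'cat': 'd1'},
    {'x': 1.5, 'y': 0.5, 's': 2.0, 'val': 2, 'cat': 'd3'},
]
first = aggregate_with_selector(rows, 1.0, 'first')
by_min = aggregate_with_selector(rows, 1.0, ('min', 'val'))
```

In both results the pixel `(0, 0)` has a count of 2, but the sample row differs: `'first'` reports the `d4` row, while `('min', 'val')` reports the `d1` row, and `s`, `val`, and `cat` always come from that same row.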

hvplot/converter.py

Lines changed: 38 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -52,6 +52,7 @@
5252

5353
from .backend_transforms import _transfer_opts_cur_backend
5454
from .util import (
55+
_HV_GE_1_21_0,
5556
filter_opts,
5657
is_tabular,
5758
is_series,
@@ -407,7 +408,7 @@ class HoloViewsConverter:
407408
408409
Resampling Options
409410
------------------
410-
aggregator : str datashader.Reduction or None, default=None
411+
aggregator : str, datashader.Reduction, or None, default=None
411412
Aggregator to use when applying rasterize or datashade operation
412413
(valid options include 'mean', 'count', 'min', 'max' and more, and
413414
datashader reduction objects)
@@ -466,6 +467,18 @@ class HoloViewsConverter:
466467
Applies a resampling operation (datashade, rasterize or downsample) if
467468
the number of individual data points present in the current viewport
468469
is above this threshold. The raw plot is displayed otherwise.
470+
selector : datashader.Reduction | str | tuple | None, default=None
471+
Datashader reduction to apply during a ``rasterize`` or ``datashade``
472+
operation, used to select additional information for inclusion in the
473+
hover tooltip. Supported options include:
474+
475+
- string: only ``'first'`` and ``'last'``
476+
- tuple of two strings: ``(<reduction>, <column>)``, e.g. ``('min', 'value')``.
477+
- Datashader object: ``ds.first``, ``ds.last``, ``ds.min``, and ``ds.max``.
478+
479+
.. versionadded:: 0.12.0
480+
Requires ``holoviews>=1.21``.
481+
Requires ``bokeh>=3.7``.
469482
threshold : float, default=0.5
470483
When using ``dynspread``, this value defines the minimum density of overlapping points
471484
required before the spreading operation is applied.
@@ -610,6 +623,7 @@ class HoloViewsConverter:
610623
'dynspread',
611624
'max_px',
612625
'precompute',
626+
'selector',
613627
'threshold',
614628
]
615629

@@ -794,6 +808,7 @@ def __init__(
794808
debug=False,
795809
framewise=True,
796810
aggregator=None,
811+
selector=None,
797812
projection=None,
798813
global_extent=None,
799814
geo=False,
@@ -911,12 +926,20 @@ def __init__(
911926
'At least one resampling operation (rasterize, datashader, '
912927
'downsample) must be enabled when resample_when is set.'
913928
)
929+
if selector is not None:
930+
if not _HV_GE_1_21_0:
931+
msg = 'selector requires holoviews>=1.21.'
932+
raise ImportError(msg)
933+
if not (datashade or rasterize):
934+
msg = 'rasterize or datashade must be enabled when selector is set.'
935+
raise ValueError(msg)
914936
self.resample_when = resample_when
915937
self.datashade = datashade
916938
self.rasterize = rasterize
917939
self.downsample = downsample
918940
self.dynspread = dynspread
919941
self.aggregator = aggregator
942+
self.selector = selector
920943
self.precompute = precompute
921944
self.x_sampling = x_sampling
922945
self.y_sampling = y_sampling
@@ -1043,7 +1066,7 @@ def __init__(
10431066
if kind == 'errorbars':
10441067
hover = False
10451068
elif hover is None:
1046-
hover = not self.datashade
1069+
hover = True if self.selector else not self.datashade
10471070
if hover and not any(
10481071
t for t in tools if isinstance(t, HoverTool) or t in ['hover', 'vline', 'hline']
10491072
):
@@ -1962,13 +1985,24 @@ def method_wrapper(ds, x, y):
19621985
layers = _transfer_opts_cur_backend(layers)
19631986
return layers
19641987

1965-
import_datashader()
1988+
ds = import_datashader()
19661989
from holoviews.operation.datashader import datashade, rasterize, dynspread
19671990

19681991
categorical, agg = self._process_categorical_datashader()
19691992
if agg:
19701993
opts['aggregator'] = agg
1971-
1994+
if self.selector:
1995+
selector = self.selector
1996+
try:
1997+
if isinstance(selector, str):
1998+
selector = getattr(ds, selector)()
1999+
elif isinstance(selector, tuple):
2000+
selector = getattr(ds, selector[0])(selector[1])
2001+
except AttributeError as e:
2002+
sel = selector[0] if isinstance(selector, tuple) else selector
2003+
msg = f'Invalid selector value {sel!r}.'
2004+
raise ValueError(msg) from e
2005+
opts['selector'] = selector
19722006
if self.precompute:
19732007
opts['precompute'] = self.precompute
19742008
if self.x_sampling:
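
The `selector` handling added to `converter.py` can be read as one standalone function. The sketch below is written under assumptions: `resolve_selector` is a hypothetical name, the HoloViews version check is left out, and `fake_ds` is a stub standing in for the `datashader` module so the example runs without Datashader installed.

```python
from types import SimpleNamespace


def resolve_selector(selector, ds, datashade=False, rasterize=False):
    """Turn a selector spec into a reduction built from `ds` (hypothetical helper)."""
    if selector is None:
        return None
    if not (datashade or rasterize):
        raise ValueError('rasterize or datashade must be enabled when selector is set.')
    try:
        if isinstance(selector, str):
            # e.g. 'first' -> ds.first()
            return getattr(ds, selector)()
        if isinstance(selector, tuple):
            # e.g. ('min', 'val') -> ds.min('val')
            name, column = selector
            return getattr(ds, name)(column)
    except AttributeError as e:
        name = selector[0] if isinstance(selector, tuple) else selector
        raise ValueError(f'Invalid selector value {name!r}.') from e
    # Anything else is assumed to already be a reduction instance.
    return selector


# Stub module: each "reduction" just records its name and column.
fake_ds = SimpleNamespace(
    first=lambda column=None: ('first', column),
    min=lambda column=None: ('min', column),
)

from_string = resolve_selector('first', fake_ds, rasterize=True)
from_tuple = resolve_selector(('min', 'val'), fake_ds, datashade=True)
```

Passing the module in as `ds` keeps the sketch testable, whereas the converter obtains it from `import_datashader()`.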
