Changes from 3 commits
386 changes: 386 additions & 0 deletions doc/explanation/grouping_options.ipynb
Member

When you reference e.g. `groupby`, can you add a link to the reference doc, like `[groupby](option-groupby)`? That won't work for users running the notebook locally, but that's fine.

Collaborator Author

OK

Original file line number Diff line number Diff line change
@@ -0,0 +1,386 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "02909b6b",
"metadata": {},
"source": [
"# Understanding Grouping and Coloring Options\n",
"\n",
"This guide explains the key differences between hvPlot's three main approaches to handling categorical data: `groupby`, `by`, and `color`/`c`. Understanding these differences is important for creating effective visualizations and choosing the right approach for your specific use case."
]
},
{
"cell_type": "markdown",
"id": "817592aa",
"metadata": {},
"source": [
"## Overview of the Three Approaches\n",
Member

Adding this comment after going through the whole notebook. I like it but think it can be re-organized a little to be even better. How I see it is that there are three main use cases:

  • Drilling down a dataset with widgets -> groupby
  • Faceting a dataset -> by + subplots
  • Color a dataset

The first two are pretty straightforward to explain. The last one is where things are a bit more subtle, as users can use either by or c/color, but there are differences:

  • by:
    • Creates containers (can be indexed and stuff)
    • Accepts a list of dimensions
    • Generates an overlay -> If many overlays (many values in a dimension) then performance can be affected
    • Works only for categorical data (i.e. no nice colorbar for numerical data)
  • color:
    • Single Element returned
    • Does not accept a list of dimensions
    • Can also be used for getting a nice colorbar

And I would document color before by as I think most users should probably use color?

Collaborator Author (@Azaya89, Aug 28, 2025)

Arranging them this way is a bit more 'poetic' as it kinda rolls off the tongue better: groupby -> by -> color. Putting color in-between both kinda breaks the flow IMO.

"\n",
"hvPlot provides three primary ways to handle categorical data in your plots:\n",
"\n",
"1. **`groupby`**: Creates interactive widgets with [HoloMap](inv:holoviews#holoviews.HoloMap) / [DynamicMap](inv:holoviews#holoviews.DynamicMap) containers\n",
Member

Arf sorry my bad. I found out that intersphinx links aren't great for a user running the notebook (they just don't work there). So can we:

  • use absolute HTTP links in notebooks
  • use intersphinx links in docstrings and markdown files?

Collaborator Author

OK. I can do that for this notebook but finding all the other places we used the intersphinx links is not going to be easy.
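
Finding the remaining intersphinx links could be scripted. The sketch below is one possible approach, not an existing project tool: it uses only the standard library, assumes the links follow the `[label](inv:...)` form used in this notebook, and the helper names `find_inv_links` and `scan` and the default `doc` root are hypothetical.

```python
import json
import re
from pathlib import Path

# Matches MyST intersphinx links such as [HoloMap](inv:holoviews#holoviews.HoloMap)
INV_LINK = re.compile(r"\[([^\]]+)\]\((inv:[^)\s]+)\)")


def find_inv_links(notebook_path):
    """Return (cell_index, label, target) for each intersphinx link in markdown cells."""
    nb = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    hits = []
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "markdown":
            continue
        source = cell.get("source", "")
        # nbformat allows "source" to be a single string or a list of strings
        text = source if isinstance(source, str) else "".join(source)
        for label, target in INV_LINK.findall(text):
            hits.append((index, label, target))
    return hits


def scan(root="doc"):
    """Yield (path, cell_index, label, target) for every notebook under root."""
    for path in sorted(Path(root).rglob("*.ipynb")):
        for index, label, target in find_inv_links(path):
            yield path, index, label, target
```

Iterating over `scan("doc")` and printing each tuple would give a list of notebooks and cells to update by hand; code cells are skipped on purpose, since links only render in markdown.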

"2. **`by`**: Creates multiple plot elements in [NdOverlay](inv:holoviews#holoviews.NdOverlay) or [NdLayout](inv:holoviews#holoviews.NdLayout) containers\n",
"3. **`color`/`c`**: Creates vectorized coloring within a single plot element\n",
"\n",
"Each approach produces different outputs, offers different interaction capabilities, and has different performance characteristics."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dbeabe0c",
"metadata": {},
"outputs": [],
"source": [
"import hvplot.pandas\n",
"\n",
"penguins = hvplot.sampledata.penguins(\"pandas\").dropna()\n",
"penguins.head(3)"
]
},
{
"cell_type": "markdown",
"id": "98a0b527",
"metadata": {},
"source": [
"## `groupby`: Interactive Widgets\n",
"\n",
"The `groupby` parameter creates **interactive widgets** that allow users to filter and explore different subsets of your data dynamically.\n",
"\n",
"### What it creates:\n",
"- **HoloViews containers**: [HoloMap](inv:holoviews#holoviews.HoloMap) or [DynamicMap](inv:holoviews#holoviews.DynamicMap) object\n",
"- **Interactive widgets**: Automatically generated based on data type\n",
"- **Single view at a time**: Only one category visible per interaction\n",
"\n",
"### When to use:\n",
"- When you want to explore different subsets of data interactively\n",
"- When you have many categories that would clutter a single plot\n",
"- When building dashboards or interactive reports\n",
"- When you need to reduce visual complexity"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a2005058",
"metadata": {},
"outputs": [],
"source": [
"plot_groupby = penguins.hvplot.scatter(\n",
" x='bill_length_mm',\n",
" y='bill_depth_mm',\n",
" groupby='species',\n",
" title=\"Groupby: Interactive Widget\",\n",
" width=400,\n",
")\n",
"plot_groupby"
]
},
{
"cell_type": "markdown",
"id": "ae7b6cd5",
"metadata": {},
"source": [
"### Indexing and access:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "55e5fd82",
"metadata": {},
"outputs": [],
"source": [
"# Access specific category: plot['category_value']\n",
Member

Nice but I think this needs just a bit of text. Also, I'm not sure we introduce anywhere in the docs that some holoviews objects can be indexed and inspected this way. So a small introduction to that (or a link to a place where we document that, or to holoviews docs) wouldn't hurt.

"adelie_only = plot_groupby['Adelie'] # Shows only Adelie penguins\n",
"print(f\"Type of groupby plot: {type(plot_groupby)}\")\n",
"print(f\"Type of specific species: {type(adelie_only)}\")\n",
"adelie_only"
]
},
{
"cell_type": "markdown",
"id": "30b3be0e",
"metadata": {},
"source": [
"## `by`: Multiple Plot Elements\n",
"\n",
"The `by` parameter creates **multiple plot elements** shown simultaneously, either overlaid or in separate subplots.\n",
"\n",
"### What it creates:\n",
"- [NdOverlay](inv:holoviews#holoviews.NdOverlay): Multiple elements overlaid (default)\n",
"- [NdLayout](inv:holoviews#holoviews.NdLayout): Separate subplots when `subplots=True`\n",
"- **All categories visible**: Simultaneously displayed\n",
"\n",
"### When to use:\n",
"- When you want to compare categories side-by-side\n",
"- When you have a manageable number of categories (typically < 10)\n",
"- When color differentiation is sufficient for your analysis\n",
"- When you need all data visible at once"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "44a01c85",
"metadata": {},
"outputs": [],
"source": [
"overlay_plot = penguins.hvplot.scatter(\n",
" x='bill_length_mm',\n",
" y='bill_depth_mm',\n",
" by='species',\n",
" title=\"By: Overlaid Elements\"\n",
")\n",
"overlay_plot"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "2d18719e",
"metadata": {},
"outputs": [],
"source": [
"plot_by_subplots = penguins.hvplot.scatter(\n",
" x='bill_length_mm',\n",
" y='bill_depth_mm',\n",
" by='species',\n",
" width=300,\n",
" subplots=True,\n",
")\n",
"plot_by_subplots.cols(2)"
]
},
{
"cell_type": "markdown",
"id": "429174c7",
"metadata": {},
"source": [
"### Indexing and access:"
Member

Same, it needs a bit of text to explain what you do and why you can do it in this case.

]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d21ed90c",
"metadata": {},
"outputs": [],
"source": [
"# Access specific category: plot['category_value']\n",
"adelie_element = overlay_plot['Adelie'] # Returns just the element for Adelie penguins\n",
"print(f\"Type of 'by' plot: {type(overlay_plot)}\")\n",
"print(f\"Type of specific species element: {type(adelie_element)}\")\n",
"adelie_element"
]
},
{
"cell_type": "markdown",
"id": "3fe7048d",
"metadata": {},
"source": [
"## `color`/`c`: Vectorized Coloring\n",
"\n",
"The `color` parameter creates **vectorized coloring** within a single plot element, where each data point is colored based on the category value.\n",
"\n",
"### What it creates:\n",
"- **Single plot element**: One unified [Scatter](inv:holoviews#holoviews.Scatter), [Curve](inv:holoviews#holoviews.Curve), etc.\n",
"- **Vectorized coloring**: Each point colored by category\n",
"- **Cannot be indexed**: Single element, not separable by category\n",
"\n",
"### When to use:\n",
"- When you want the best performance for large datasets\n",
"- When you need smooth, continuous color mapping\n",
"- When you don't need to isolate specific categories"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f54ccbed",
"metadata": {},
"outputs": [],
"source": [
"plot_color = penguins.hvplot.scatter(\n",
" x='bill_length_mm',\n",
" y='bill_depth_mm',\n",
" color='species',\n",
" title=\"Color: Vectorized Coloring\"\n",
")\n",
"plot_color"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d4fb28a9",
"metadata": {},
"outputs": [],
"source": [
"# Alternative syntax using 'c'\n",
Member

To be removed, I don't think it brings much.

"plot_c = penguins.hvplot.scatter(\n",
" x='bill_length_mm',\n",
" y='bill_depth_mm',\n",
" c='species',\n",
" title=\"Using 'c' parameter (same result)\"\n",
")\n",
"plot_c"
]
},
{
"cell_type": "markdown",
"id": "60b4c757",
"metadata": {},
"source": [
"### Indexing and access:"
Member

Same, some more text to explain why in this case this is not possible.

]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ab54c88e",
"metadata": {},
"outputs": [],
"source": [
"# A single element has no per-category keys, so indexing by category fails:\n",
"print(f\"Type of 'color' plot: {type(plot_color)}\")\n",
"try:\n",
" adelie_color = plot_color['Adelie'] # This will raise an error!\n",
"except Exception as err:\n",
" print(f\"Error ({type(err).__name__}): you cannot index a single element by category!\")"
]
},
{
"cell_type": "markdown",
"id": "1626f2c0",
"metadata": {},
"source": [
"## Visual Output Comparison\n",
"\n",
"Let's compare all three approaches side by side to see the differences:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1545149",
"metadata": {},
"outputs": [],
"source": [
"import panel as pn\n",
"\n",
"width = 300\n",
"\n",
"groupby_plot = penguins.hvplot.scatter(\n",
" x='bill_length_mm', y='bill_depth_mm',\n",
" groupby='species', title=\"groupby='species'\",\n",
" frame_width=width, widget_location='bottom_right',\n",
Member

Add shared_axes=False to groupby_plot as otherwise updating the widget affects the view of the other plots.

")\n",
"\n",
"by_plot = penguins.hvplot.scatter(\n",
" x='bill_length_mm', y='bill_depth_mm',\n",
" by='species', title=\"by='species'\",\n",
" frame_width=width,\n",
")\n",
"\n",
"color_plot = penguins.hvplot.scatter(\n",
" x='bill_length_mm', y='bill_depth_mm',\n",
" color='species', title=\"color='species'\",\n",
" frame_width=width,\n",
")\n",
"\n",
"pn.Column(pn.Row(by_plot, color_plot), groupby_plot)"
Member

Put groupby first? Like in the order of how they are documented. In fact all in a Column would work I think.

]
},
{
"cell_type": "markdown",
"id": "bdcbe3e1",
"metadata": {},
"source": [
"### Key Visual Differences:\n",
"\n",
"- **`groupby='species'`** shows only one species at a time with a widget\n",
"- **`by='species'`** shows all species overlaid with different colors and a legend\n",
"- **`color='species'`** looks similar to `by` but is a single plot element\n",
"\n",
"When using `subplots=True` with `by`, you get separate panels:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3b777d41",
"metadata": {},
"outputs": [],
"source": [
"penguins.hvplot.scatter(\n",
" x='bill_length_mm', y='bill_depth_mm',\n",
" by='species', subplots=True,\n",
" width=300, height=250,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "0658c3a1",
"metadata": {},
"source": [
"## Advanced: Combining Approaches\n",
"\n",
"You can combine these approaches for more complex visualizations:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "754427cf",
"metadata": {},
"outputs": [],
"source": [
"penguins.hvplot.scatter(\n",
" x='bill_length_mm', y='bill_depth_mm',\n",
" groupby='island', color='species',\n",
" width=500, height=400,\n",
" title=\"Groupby island, colored by species\"\n",
")\n"
]
},
{
"cell_type": "markdown",
"id": "05067653",
"metadata": {},
"source": [
"## Summary\n",
"\n",
"| Aspect | `groupby` | `by` | `color`/`c` |\n",
"| ----------------- | ------------------ | ------------------ | ---------------------- |\n",
"| **HoloViews Object** | [HoloMap](inv:holoviews#holoviews.HoloMap) / [DynamicMap](inv:holoviews#holoviews.DynamicMap) | [NdOverlay](inv:holoviews#holoviews.NdOverlay) / [NdLayout](inv:holoviews#holoviews.NdLayout) | Single Element |\n",
"| **Interactivity** | Widget-based | Static overlay | Static single plot |\n",
"| **Indexing** | `plot['category']` | `plot['category']` | Not available |\n",
"| **Performance** | Variable | Medium | Best |\n",
"| **Visual** | One at a time | All simultaneous | All simultaneous |\n",
"| **Use case** | Exploration | Comparison | Performance/Aesthetics |\n",
"\n",
"### Choose the approach that best matches your needs:\n",
"\n",
"- **Use `groupby`** for interactive exploration of many categories\n",
"- **Use `by`** for direct comparison of a manageable number of categories\n",
"- **Use `color`** for performance with large datasets or aesthetic coloring\n",
"\n",
":::{admonition} Further Reading\n",
":class: seealso\n",
"See the [HoloViews Reference Manual](https://holoviews.org/reference_manual/index.html) for more information on the various objects created by an hvPlot plot.\n",
":::"
]
}
],
"metadata": {
"language_info": {
"name": "python",
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
11 changes: 11 additions & 0 deletions doc/explanation/index.md
@@ -0,0 +1,11 @@
# Explanation

Explanation guides provide in-depth understanding of key concepts in hvPlot. These guides help you understand the reasoning behind design decisions and when to use different approaches.

```{toctree}
:titlesonly:
:hidden:
:maxdepth: 2

grouping_options
```