11 changes: 11 additions & 0 deletions doc/explanation/index.md
@@ -0,0 +1,11 @@
# Explanation

Explanation guides provide in-depth understanding of key concepts in hvPlot. These guides help you understand the reasoning behind design decisions and when to use different approaches.

```{toctree}
:titlesonly:
:hidden:
:maxdepth: 2

statistical_plot_types
```
214 changes: 214 additions & 0 deletions doc/explanation/statistical_plot_types.ipynb
@@ -0,0 +1,214 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "1aaf9273",
"metadata": {},
"source": [
"# Understanding hvPlot's Statistical Plot Types\n",
"\n",
"hvPlot provides several statistical plotting functions that go beyond basic charts. Each plot type reveals different aspects of your data and has specific strengths and limitations. This guide explains when and why to use each type."
]
},
{
"cell_type": "markdown",
"id": "b8b10a4b",
"metadata": {},
"source": [
"## Multivariate Data Visualization\n",
"\n",
"When working with datasets containing multiple variables, understanding relationships between all dimensions becomes challenging. hvPlot offers three complementary approaches:"
]
},
{
"cell_type": "markdown",
"id": "5f52ca70",
"metadata": {},
"source": [
"### Scatter Matrix\n",
"\n",
"**What it shows:** All pairwise relationships between numeric variables\n",
"\n",
"**Strengths:**\n",
"- Shows the direction, strength, and shape of each pairwise relationship\n",
"- Interactive linking allows exploration across all variable pairs\n",
"- Familiar scatter plot format is easy to interpret\n",
"\n",
"**Best for:** Identifying correlations, outliers, and clustering patterns between variable pairs\n",
"\n",
"**Limitations:** Can become cluttered with many variables; doesn't show patterns across all dimensions simultaneously"
]
},
{
"cell_type": "markdown",
"id": "2d894a15",
"metadata": {},
"source": [
"### Parallel Coordinates\n",
"\n",
"**What it shows:** Patterns and relationships across all variables simultaneously\n",
"\n",
"**Strengths:**\n",
"- Reveals patterns across all dimensions at once\n",
"- Excellent for identifying distinct groups or classes\n",
"- Shows which variables contribute most to group differences\n",
"\n",
"**Best for:** Comparing groups across multiple dimensions, identifying which variables distinguish different classes\n",
"\n",
"**Limitations:** Can be difficult to read with many observations; requires some practice to interpret effectively"
]
},
{
"cell_type": "markdown",
"id": "9d1ba8b8",
"metadata": {},
"source": [
"### Andrews Curves\n",
"\n",
"**What it shows:** Aggregate differences between classes using Fourier series representation\n",
"\n",
"**Strengths:**\n",
"- Smooth curves make group differences visually apparent\n",
"- Good for showing overall class separation\n",
"- Less cluttered than parallel coordinates with many observations\n",
"\n",
"**Best for:** Visualizing overall differences between classes when you care more about separation than specific variable contributions\n",
"\n",
"**Limitations:** Provides less quantitative insight into which specific features drive differences; mathematical transformation makes individual variable contributions less interpretable"
]
},
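{
"cell_type": "markdown",
"id": "7c3e5a91",
"metadata": {},
"source": [
"For reference, the textbook Andrews construction (stated from the general definition, not taken from hvPlot's source) maps each observation $x = (x_1, x_2, \\dots, x_d)$ to the curve\n",
"\n",
"$$\n",
"f_x(t) = \\frac{x_1}{\\sqrt{2}} + x_2 \\sin t + x_3 \\cos t + x_4 \\sin 2t + x_5 \\cos 2t + \\cdots, \\qquad -\\pi \\le t \\le \\pi\n",
"$$\n",
"\n",
"Each variable is the coefficient of one term in the sum, so its effect is spread along the whole curve instead of being shown on its own axis. Observations with similar values trace similar curves, but a single variable's contribution is hard to read off."
]
},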
{
"cell_type": "markdown",
"id": "b925b654",
"metadata": {},
"source": [
"## Time Series Analysis\n",
"\n",
"### Lag Plots\n",
"\n",
"**What it shows:** Relationship between current values and values at a previous time point\n",
"\n",
"**Strengths:**\n",
"- Reveals autocorrelation patterns in time series\n",
"- Identifies volatility and stability in temporal data\n",
"- Helps detect seasonal or cyclical patterns\n",
"\n",
"**Best for:** Understanding temporal dependencies, comparing volatility between different time series, detecting autocorrelation\n",
"\n",
"**Key insight:** Points clustered tightly along the diagonal indicate strong positive autocorrelation at the chosen lag, so consecutive values are similar and the series is predictable; a structureless cloud indicates weak or no autocorrelation at that lag"
]
},
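{
"cell_type": "markdown",
"id": "4b9d2e6f",
"metadata": {},
"source": [
"The contrast described above can be sketched with synthetic data. This is a minimal illustration, not a recipe: the random walk, the white noise, the seed, and the column names are all assumptions made for the example. It relies on `hvplot.lag_plot` accepting a pandas Series and a `lag` argument."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a1f06c38",
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"import hvplot.pandas # noqa\n",
"\n",
"# Synthetic series, assumed purely for illustration\n",
"rng = np.random.default_rng(0)\n",
"noise = rng.normal(size=500)\n",
"series = pd.DataFrame({\n",
"    \"random_walk\": noise.cumsum(), # each value builds on the previous one\n",
"    \"white_noise\": noise, # values are independent of each other\n",
"})\n",
"\n",
"# Lag-1 autocorrelation puts a number on what the lag plot shows visually\n",
"print(series.apply(lambda s: s.autocorr(lag=1)))\n",
"\n",
"walk = hvplot.lag_plot(series[\"random_walk\"], lag=1, title=\"Random walk\")\n",
"white = hvplot.lag_plot(series[\"white_noise\"], lag=1, title=\"White noise\")\n",
"\n",
"# Unlink the axes so each panel keeps its own range\n",
"(walk + white).opts(shared_axes=False)"
]
},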
{
"cell_type": "markdown",
"id": "ff27fc8e",
"metadata": {},
"source": [
"## Distribution Analysis\n",
"\n",
"Understanding the distribution of your data is fundamental to statistical analysis. hvPlot provides several plot types that reveal different aspects of data distributions:\n",
"\n",
"### Histograms\n",
"\n",
"**What it shows:** Frequency distribution of values in a single variable\n",
"\n",
"**Strengths:**\n",
"- Clear visualization of data distribution shape\n",
"- Easy to identify skewness, modality, and outliers\n",
"- Familiar and intuitive for most users\n",
"- Customizable bin sizes for different levels of detail\n",
"\n",
"**Best for:** Understanding the overall shape and spread of a single variable, identifying distribution patterns\n",
"\n",
"**Limitations:** Can be sensitive to bin size choices; doesn't show relationships between variables\n",
"\n",
"### Box Plots\n",
"\n",
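"**What it shows:** The median and quartiles (Q1, Q3), whiskers that typically extend to the most extreme values within 1.5 times the interquartile range of the box, and any points beyond the whiskers drawn as outliers\n",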
"\n",
"**Strengths:**\n",
"- Compact summary of distribution characteristics\n",
"- Excellent for comparing distributions across groups\n",
"- Clearly identifies outliers and quartile ranges\n",
"- Robust to extreme values\n",
"\n",
"**Best for:** Comparing distributions between groups, identifying outliers, understanding data spread and central tendency\n",
"\n",
"**Limitations:** Hides detailed distribution shape; can miss bimodal or complex distributions\n",
"\n",
"### Violin Plots\n",
"\n",
"**What it shows:** Combination of box plot information with kernel density estimation\n",
"\n",
"**Strengths:**\n",
"- Shows both summary statistics and distribution shape\n",
"- Reveals multimodal distributions that box plots miss\n",
"- Good for comparing complex distributions across groups\n",
"- More informative than box plots for understanding distribution shape\n",
"\n",
"**Best for:** Comparing detailed distribution shapes across groups, when you need both summary statistics and distribution density\n",
"\n",
"**Limitations:** Can be more complex to interpret; kernel density estimation may smooth over important details"
]
},
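{
"cell_type": "markdown",
"id": "c58e1b74",
"metadata": {},
"source": [
"To compare the three views on the same data, here is a minimal sketch using the penguins sample dataset from the how-to guides. The choice of `body_mass_g`, the grouping by `species`, and `bins=30` are assumptions made for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d90a37e2",
"metadata": {},
"outputs": [],
"source": [
"import hvplot.pandas # noqa\n",
"\n",
"penguins = hvplot.sampledata.penguins(\"pandas\").dropna()\n",
"\n",
"# Same variable and grouping in all three plots, so only the view changes\n",
"hist = penguins.hvplot.hist(\"body_mass_g\", by=\"species\", bins=30, alpha=0.5)\n",
"box = penguins.hvplot.box(\"body_mass_g\", by=\"species\")\n",
"violin = penguins.hvplot.violin(\"body_mass_g\", by=\"species\")\n",
"\n",
"# Stack the plots in a single column for easier comparison\n",
"(hist + box + violin).cols(1)"
]
},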
{
"cell_type": "markdown",
"id": "3969834d",
"metadata": {},
"source": [
"## Interactive Advantages\n",
"\n",
"With the default Bokeh backend, hvPlot's statistical plots are interactive:\n",
"\n",
"- **Linked brushing:** In multi-panel plots such as the scatter matrix, selecting points in one panel highlights the same observations in the other panels\n",
"- **Linked zooming/panning:** Coordinated exploration across multiple plot panels\n",
"- **Hover tooltips:** Detailed information about individual data points\n",
"\n",
"These features let you inspect individual observations and follow them across views, which static plots cannot do."
]
},
{
"cell_type": "markdown",
"id": "5dd00ffb",
"metadata": {},
"source": [
"## Choosing the Right Plot Type\n",
"\n",
"| Goal | Recommended Plot | Why |\n",
"|------|------------------|-----|\n",
"| Find correlations between variable pairs | Scatter Matrix | Shows quantitative relationships clearly |\n",
"| Compare groups across many variables | Parallel Coordinates | Reveals which variables distinguish groups |\n",
"| Show overall class separation | Andrews Curves | Emphasizes aggregate differences |\n",
"| Analyze temporal dependencies | Lag Plot | Designed specifically for time series patterns |\n",
"| Understand single variable distribution | Histogram | Clear frequency distribution visualization |\n",
"| Compare distributions across groups | Box Plot or Violin Plot | Box plots for simple comparisons, violin plots for detailed shapes |\n",
"| Identify outliers | Box Plot | Explicitly marks points that fall beyond the whiskers |\n",
"| Detect multimodal distributions | Violin Plot or Histogram | Violin plots show density curves, histograms show frequency peaks |\n",
"| Quick distribution summary | Box Plot | Compact five-number summary |\n",
"| Detailed distribution analysis | Violin Plot | Combines summary statistics with full distribution shape |\n",
"| Detect outliers in multivariate data | Scatter Matrix + Parallel Coordinates | Combine pairwise and multi-dimensional views |"
]
},
{
"cell_type": "markdown",
"id": "41d345c1",
"metadata": {},
"source": [
"## Next Steps\n",
"\n",
"- Learn how to create:\n",
" - [multivariate statistical plots](../how_to/multivariate_statistical_plots.ipynb)\n",
" - [time series lag plots](../how_to/time_series_lag_plots.ipynb)\n",
"- See the [reference documentation](../ref/api/index.md) for complete parameter lists\n",
"- Explore more visualization options at [holoviews.org](https://holoviews.org)"
]
}
],
"metadata": {
"language_info": {
"name": "python",
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
12 changes: 12 additions & 0 deletions doc/how_to/index.md
@@ -0,0 +1,12 @@
# How-To Guides

How-to guides are practical, problem-oriented instructions that help you accomplish specific tasks with hvPlot. These guides assume you're already familiar with the basics and want to solve particular problems or achieve specific goals.

```{toctree}
:titlesonly:
:hidden:
:maxdepth: 2

multivariate_statistical_plots
time_series_lag_plots
```
121 changes: 121 additions & 0 deletions doc/how_to/multivariate_statistical_plots.ipynb
@@ -0,0 +1,121 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "22a52969",
"metadata": {},
"source": [
"# How to Visualize Multivariate Data with Statistical Plots\n",
"\n",
"When working with datasets that have multiple variables, hvPlot provides several statistical plotting functions to help you explore relationships and patterns. This guide shows you how to use three key methods: scatter matrices, parallel coordinates, and Andrews curves."
]
},
{
"cell_type": "markdown",
"id": "c461984a",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"First, import hvplot and load a multivariate dataset:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19eb19e7",
"metadata": {},
"outputs": [],
"source": [
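"# Importing hvplot.pandas registers the .hvplot accessor on pandas objects\n",
"import hvplot.pandas # noqa\n",
"\n",
"# Load the sample dataset and drop rows with missing values\n",
"penguins = hvplot.sampledata.penguins(\"pandas\").dropna()\n",
"penguins.head(3)"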
]
},
{
"cell_type": "markdown",
"id": "ea800985",
"metadata": {},
"source": [
"## Scatter Matrix\n",
"\n",
"Use a scatter matrix to visualize all pairwise relationships between numeric variables:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "66bc47a4",
"metadata": {},
"outputs": [],
"source": [
"# Keep the class column plus the four numeric measurements\n",
"num_penguins = penguins[['species', 'bill_length_mm', 'bill_depth_mm',\n",
" 'flipper_length_mm', 'body_mass_g']]\n",
"# Color each point by species\n",
"hvplot.scatter_matrix(num_penguins, c=\"species\")"
]
},
{
"cell_type": "markdown",
"id": "ac6e0be3",
"metadata": {},
"source": [
"## Parallel Coordinates\n",
"\n",
"Use parallel coordinates to see patterns across all dimensions simultaneously:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1cca872a",
"metadata": {},
"outputs": [],
"source": [
"# The second argument names the column that holds the class labels\n",
"hvplot.parallel_coordinates(num_penguins, \"species\")"
]
},
{
"cell_type": "markdown",
"id": "926ca828",
"metadata": {},
"source": [
"## Andrews Curves\n",
"\n",
"Use Andrews curves to visualize aggregate differences between classes:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d510d131",
"metadata": {},
"outputs": [],
"source": [
"# One curve per penguin, colored by the class column\n",
"hvplot.andrews_curves(num_penguins, \"species\")"
]
},
{
"cell_type": "markdown",
"id": "bc2a7c0e",
"metadata": {},
"source": [
":::{admonition} Next Steps\n",
":class: seealso\n",
"\n",
"- See the [explanation guide](../explanation/statistical_plot_types.ipynb) to understand when to use each plot type\n",
"- Check the [reference documentation](../ref/api/index.md) for complete parameter lists\n",
"- For time series analysis, see [how to analyze time series relationships](time_series_lag_plots.ipynb)\n",
":::"
]
}
],
"metadata": {
"language_info": {
"name": "python",
"pygments_lexer": "ipython3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}