Commit 72d29ee

More concise terminology and description of the supported input data formats.
1 parent df4c389 commit 72d29ee

2 files changed

+117
-36
lines changed

doc/_embedded_plots/grouped_bar.py

Lines changed: 2 additions & 2 deletions
@@ -1,14 +1,14 @@
11
import matplotlib.pyplot as plt
22

3-
group_labels = ['group A', 'group B']
3+
categories = ['A', 'B']
44
data0 = [1.0, 3.0]
55
data1 = [1.4, 3.4]
66
data2 = [1.8, 3.8]
77

88
fig, ax = plt.subplots(figsize=(4, 2.2))
99
ax.grouped_bar(
1010
[data0, data1, data2],
11-
tick_labels=group_labels,
11+
tick_labels=categories,
1212
labels=['dataset 0', 'dataset 1', 'dataset 2'],
1313
colors=['#1f77b4', '#58a1cf', '#abd0e6'],
1414
)
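
The script above leaves all bar positioning to `grouped_bar`. As a rough sketch of what that involves, the following pure-Python helper derives a bar width and left edges from the documented meaning of `group_spacing` and `bar_spacing` (both in units of bar width, with the defaults 1.5 and 0 shown in the signature in this commit). The helper name `bar_layout` is hypothetical, and the arithmetic is derived from the parameter descriptions only; it is not taken from Matplotlib's implementation and may differ from it in detail.

```python
def bar_layout(num_datasets, group_centers, group_spacing=1.5, bar_spacing=0.0):
    """Return (bar_width, left_edges); left_edges[i][j] is dataset i, category j.

    Hypothetical helper for illustration only, not Matplotlib code.
    """
    # Distance between neighbouring group centers (assumed equidistant).
    if len(group_centers) > 1:
        group_distance = group_centers[1] - group_centers[0]
    else:
        group_distance = 1.0
    # One group slot holds n bars, n - 1 gaps between bars, and one gap
    # between groups, all measured in bar widths.
    slots = num_datasets + (num_datasets - 1) * bar_spacing + group_spacing
    bar_width = group_distance / slots
    # Half of the gap between groups sits on each side of a group.
    margin = 0.5 * group_spacing * bar_width
    step = bar_width * (1 + bar_spacing)
    left_edges = [
        [c - 0.5 * group_distance + margin + i * step for c in group_centers]
        for i in range(num_datasets)
    ]
    return bar_width, left_edges


# Three datasets and two categories at positions 0 and 1, as in the script.
width, lefts = bar_layout(3, [0, 1])
gap = lefts[0][1] - (lefts[2][0] + width)  # space between the two groups
print(f"bar width: {width:.4f}, gap in bar widths: {gap / width:.2f}")
```

Under these assumptions, each row of `lefts` could be handed to a plain `ax.bar(lefts[i], heights_i, width=width, align="edge")` call, which is the kind of manual bookkeeping that `grouped_bar` is meant to remove.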

lib/matplotlib/axes/_axes.py

Lines changed: 115 additions & 34 deletions
@@ -2506,6 +2506,7 @@ def bar(self, x, height, width=0.8, bottom=None, *, align="center",
25062506
See Also
25072507
--------
25082508
barh : Plot a horizontal bar plot.
2509+
grouped_bar : Plot multiple datasets as a grouped bar plot.
25092510
25102511
Notes
25112512
-----
@@ -3073,84 +3074,102 @@ def grouped_bar(self, heights, *, positions=None, group_spacing=1.5, bar_spacing
30733074
Make a grouped bar plot.
30743075
30753076
.. note::
3076-
This function is new in v3.10, and the API is still provisional.
3077+
This function is new in v3.11, and the API is still provisional.
30773078
We may still fine-tune some aspects based on user feedback.
30783079
3079-
This is a convenience function to plot bar charts for multiple datasets
3080-
into one Axes. In particular, it simplifies positioning of the bars
3081-
compared to individual `~.Axes.bar` plots.
3080+
This is a convenience function to plot bars for multiple datasets.
3081+
In particular, it simplifies positioning of the bars compared to individual
3082+
`~.Axes.bar` plots.
30823083
3083-
Terminology: A bar *group* is a set of bars drawn next to each other. They
3084-
can be associated with a group name, which is visualized as the tick label
3085-
below that group. A *dataset* is a set of values, one for each bar group.
3086-
This means *dataset_0* will be rendered as the first bar in each bar group.
3084+
Bar plots present categorical data as a sequence of bars, one bar per category.
3085+
We call one set of such values a *dataset* and its bars all share the same
3086+
color. Grouped bar plots show multiple such datasets, where the values per
3087+
category are grouped together. The category names are drawn as tick labels
3088+
below the bar groups. Each dataset has a distinct bar color, and can optionally
3089+
get a label that is used for the legend.
3090+
3091+
Here is an example call structure and the corresponding plot:
3092+
3093+
.. code-block:: python
3094+
3095+
grouped_bar([dataset_1, dataset_2, dataset_3],
3096+
tick_labels=['A', 'B'],
3097+
labels=['dataset 1', 'dataset 2', 'dataset 3'])
30873098
30883099
.. plot:: _embedded_plots/grouped_bar.py
30893100
30903101
Parameters
30913102
----------
3092-
heights : list of array-like or dict of array-like or 2D array
3103+
heights : list of array-like or dict of array-like or 2D array \
3104+
or pandas.DataFrame
30933105
The heights for all x and groups. One of:
30943106
30953107
- list of array-like: A list of datasets, each dataset must have
30963108
the same number of elements.
30973109
30983110
.. code-block:: none
30993111
3100-
# group_A group_B
3101-
dataset_0 = [ds0_a, ds0_b]
3102-
dataset_1 = [ds1_a, ds1_b]
3103-
dataset_2 = [ds2_a, ds2_b]
3104-
3105-
heights = [dataset_0, dataset_1, dataset_2]
3112+
# category_A, category_B
3113+
dataset_0 = [ds0_A, ds0_B]
3114+
dataset_1 = [ds1_A, ds1_B]
3115+
dataset_2 = [ds2_A, ds2_B]
31063116
31073117
Example call::
31083118
31093119
grouped_bar([dataset_0, dataset_1, dataset_2])
31103120
3111-
- dict of array-like: A mapping names to datasets. Each dataset
3112-
(dict value) must have the same number of elements elements.
3121+
- dict of array-like: A mapping from names to datasets. Each dataset
3122+
(dict value) must have the same number of elements.
31133123
31143124
This is similar to passing a list of array-like, with the addition that
31153125
each dataset gets a name.
31163126
3117-
Example call::
3127+
Example call:
3128+
3129+
.. code-block:: python
31183130
31193131
grouped_bar({'ds0': dataset_0, 'ds1': dataset_1, 'ds2': dataset_2})
31203132
31213133
The names are used as *labels*, i.e. the following two calls are
3122-
equivalent::
3134+
equivalent:
3135+
3136+
.. code-block:: python
31233137
31243138
data_dict = {'ds0': dataset_0, 'ds1': dataset_1, 'ds2': dataset_2}
31253139
grouped_bar(data_dict)
31263140
grouped_bar(data_dict.values(), labels=data_dict.keys())
31273141
31283142
When using a dict-like input, you must not pass *labels* explicitly.
31293143
3130-
- a 2D array: The columns are the different datasets.
3144+
- a 2D array: The rows are the categories, the columns are the different
3145+
datasets.
31313146
31323147
.. code-block:: none
31333148
3134-
dataset_0 dataset_1 dataset_2
3135-
group_A ds0_a ds1_a ds2_a
3136-
group_B ds0_b ds1_b ds2_b
3149+
dataset_0 dataset_1 dataset_2
3150+
category_A ds0_a ds1_a ds2_a
3151+
category_B ds0_b ds1_b ds2_b
3152+
3153+
Example call:
31373154
3138-
.. code-block::
3155+
.. code-block:: python
31393156
31403157
categories = ["category_A", "category_B"]
31413158
dataset_labels = ["dataset_0", "dataset_1", "dataset_2"]
31423159
array = np.random.random((2, 3))
31433160
31443161
Note that this is consistent with pandas. These two calls produce
3145-
the same bar plot structure::
3162+
the same bar plot structure:
31463163
3147-
grouped_bar(array, tick_labels=group_labels, labels=dataset_labels)
3148-
df = pd.DataFrame(array, index=group_labels, columns=dataset_labels)
3164+
.. code-block:: python
3165+
3166+
grouped_bar(array, tick_labels=categories, labels=dataset_labels)
3167+
df = pd.DataFrame(array, index=categories, columns=dataset_labels)
31493168
df.plot.bar()
31503169
31513170
- a `pandas.DataFrame`.
31523171
3153-
.. code-block::
3172+
.. code-block:: python
31543173
31553174
df = pd.DataFrame(
31563175
np.random.random((2, 3))
@@ -3159,15 +3178,15 @@ def grouped_bar(self, heights, *, positions=None, group_spacing=1.5, bar_spacing
31593178
)
31603179
grouped_bar(df)
31613180
3162-
Note that ``grouped_bar(df)`` produced a structurally equivalent plot like
3163-
``df.plot.bar()`.
3181+
Note that ``grouped_bar(df)`` produces a plot that is structurally equivalent to
3182+
``df.plot.bar()``.
31643183
31653184
positions : array-like, optional
31663185
The center positions of the bar groups. The values have to be equidistant.
31673186
If not given, a sequence of integer positions 0, 1, 2, ... is used.
31683187
31693188
tick_labels : list of str, optional
3170-
The group labels, which are placed on ticks at the center *positions*
3189+
The category labels, which are placed on ticks at the center *positions*
31713190
of the bar groups.
31723191
31733192
If not set, the axis ticks (positions and labels) are left unchanged.
@@ -3176,10 +3195,13 @@ def grouped_bar(self, heights, *, positions=None, group_spacing=1.5, bar_spacing
31763195
The labels of the datasets, i.e. the bars within one group.
31773196
These will show up in the legend.
31783197
3179-
group_spacing : float
3198+
group_spacing : float, default: 1.5
31803199
The space between two bar groups in units of bar width.
31813200
3182-
bar_spacing : float
3201+
The default value of 1.5 thus means that there's a gap of
3202+
1.5 bar widths between bar groups.
3203+
3204+
bar_spacing : float, default: 0
31833205
The space between bars in units of bar width.
31843206
31853207
orientation : {"vertical", "horizontal"}, default: "vertical"
@@ -3202,7 +3224,7 @@ def grouped_bar(self, heights, *, positions=None, group_spacing=1.5, bar_spacing
32023224
_GroupedBarReturn
32033225
32043226
A provisional result object. This will be refined in the future.
3205-
For now, the API is limited to
3227+
For now, the guaranteed API on the returned object is limited to
32063228
32073229
- the attribute ``bar_containers``, which is a list of
32083230
`.BarContainer`, i.e. the results of the individual `~.Axes.bar`
@@ -3211,6 +3233,65 @@ def grouped_bar(self, heights, *, positions=None, group_spacing=1.5, bar_spacing
32113233
- a ``remove()`` method that removes all bars from the Axes.
32123234
See also `.Artist.remove()`.
32133235
3236+
See Also
3237+
--------
3238+
bar : A lower-level API for bar plots, with more degrees of freedom like
3239+
individual bar sizes and colors.
3240+
3241+
Notes
3242+
-----
3243+
For a better understanding, we compare the `~.Axes.grouped_bar` API with
3244+
those of `~.Axes.bar` and `~.Axes.boxplot`.
3245+
3246+
**Comparison to bar()**
3247+
3248+
`~.Axes.grouped_bar` intentionally deviates from the `~.Axes.bar` API in some
3249+
aspects. ``bar(x, y)`` is a lower-level API and places bars with height *y*
3250+
at explicit positions *x*. It also allows specifying individual bar widths
3251+
and colors. This kind of detailed control and flexibility is difficult to
3252+
manage and often not needed when plotting multiple datasets as a grouped bar
3253+
plot. Therefore, ``grouped_bar`` focusses on the abstraction of bar plots
3254+
as visualization of categorical data.
3255+
3256+
The following examples may help when migrating from ``bar`` to
3257+
``grouped_bar``.
3258+
3259+
Because the data is categorical, positions are de-emphasized and default to integer values.
3260+
If you have used ``range(N)`` as positions, you can leave that value out::
3261+
3262+
bar(range(N), heights)
3263+
grouped_bar([heights])
3264+
3265+
If needed, positions can be passed as a keyword argument::
3266+
3267+
bar(x, heights)
3268+
grouped_bar([heights], positions=x)
3269+
3270+
To place category labels in `~.Axes.bar` you could use the argument
3271+
*tick_label* or use a list of category names as *x*.
3272+
`~.Axes.grouped_bar` expects them in the argument *tick_labels*::
3273+
3274+
bar(range(N), heights, tick_label=["A", "B"])
3275+
bar(["A", "B"], heights)
3276+
grouped_bar([heights], tick_labels=["A", "B"])
3277+
3278+
Dataset labels, which are shown in the legend, are passed via the *label*
3279+
parameter of ``bar`` and the *labels* parameter of ``grouped_bar``::
3280+

bar(..., label="dataset")
3282+
grouped_bar(..., labels=["dataset"])
3283+
3284+
**Comparison to boxplot()**
3285+
3286+
Both `~.Axes.grouped_bar` and `~.Axes.boxplot` visualize categorical data
3287+
from multiple datasets. The basic API for *tick_labels* and *positions*
3288+
is the same, so that you can easily switch between plotting all
3289+
individual values as `~.Axes.grouped_bar` or the statistical distribution
3290+
per category as `~.Axes.boxplot`::
3291+
3292+
grouped_bar(values, positions=..., tick_labels=...)
3293+
boxplot(values, positions=..., tick_labels=...)
3294+
32143295
"""
32153296
if cbook._is_pandas_dataframe(heights):
32163297
if labels is None:
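
The docstring in this commit describes several input layouts for *heights*: a list of datasets, a dict mapping names to datasets, and a 2D table whose rows are categories and whose columns are datasets. As a minimal sketch of how these relate, the pure-Python helpers below convert the dict and table layouts into the list-of-datasets form. The helper names (`datasets_from_mapping`, `datasets_from_table`, `check_equal_lengths`) are hypothetical and exist only for this illustration; they are not part of Matplotlib.

```python
def datasets_from_mapping(data):
    """Split a name -> dataset mapping into (datasets, labels)."""
    labels = list(data.keys())
    datasets = [list(values) for values in data.values()]
    return datasets, labels


def datasets_from_table(rows):
    """Turn rows-are-categories data into one list per dataset (column)."""
    return [list(column) for column in zip(*rows)]


def check_equal_lengths(datasets):
    """Raise ValueError unless every dataset has the same number of values."""
    lengths = {len(d) for d in datasets}
    if len(lengths) > 1:
        raise ValueError(f"datasets differ in length: {sorted(lengths)}")


# Values taken from the embedded plot script in this commit.
dataset_0 = [1.0, 3.0]
dataset_1 = [1.4, 3.4]
dataset_2 = [1.8, 3.8]

# Layout 1: a list of datasets.
from_list = [dataset_0, dataset_1, dataset_2]

# Layout 2: a dict; the keys play the role of the dataset labels.
from_dict, dict_labels = datasets_from_mapping(
    {"ds0": dataset_0, "ds1": dataset_1, "ds2": dataset_2}
)

# Layout 3: a 2D table with one row per category, one column per dataset.
table = [
    [1.0, 1.4, 1.8],  # category_A
    [3.0, 3.4, 3.8],  # category_B
]
from_table = datasets_from_table(table)

check_equal_lengths(from_list)
print(from_dict == from_list, from_table == from_list, dict_labels)
```

The transpose in `datasets_from_table` reflects the row/column convention stated in the docstring, which is also why the 2D layout lines up with a DataFrame whose index holds the categories.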
