Commit 65050e3

starting with the vega-altair episode
1 parent 0ca5a4d commit 65050e3

2 files changed: +99 -0 lines changed

content/index.rst

Lines changed: 2 additions & 0 deletions
@@ -72,6 +72,7 @@ to learn yourself as you need to.
 60 min ; :doc:`pandas`
 30 min ; :doc:`xarray`
 60 min ; :doc:`plotting-matplotlib`
+60 min ; :doc:`plotting-vega-altair`
 30 min ; :doc:`data-formats`
 60 min ; :doc:`scripts`
 40 min ; :doc:`profiling`
@@ -97,6 +98,7 @@ to learn yourself as you need to.
 pandas
 xarray
 plotting-matplotlib
+plotting-vega-altair
 data-formats
 scripts
 profiling

content/plotting-vega-altair.md

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
# Plotting with [Vega-Altair](https://altair-viz.github.io/)

:::{instructor-note}
- 10 min: Introduction
- 10 min: Type-along (creating a first plot)
- 20 min: Exercise (using visual channels)
- 20 min: Exercise (adapting a gallery example and customizing)
- 10 min: Key points, discussion, and Q&A
:::


## Repeatability/reproducibility

From [Claus O. Wilke: "Fundamentals of Data Visualization"](https://clauswilke.com/dataviz/):

> *One thing I have learned over the years is that automation is your friend. I
> think figures should be autogenerated as part of the data analysis pipeline
> (which should also be automated), and they should come out of the pipeline
> ready to be sent to the printer, no manual post-processing needed.*

- **Try to minimize manual post-processing**. This could bite you when you need to regenerate 50
  figures one day before a submission deadline or regenerate a set of figures
  after the person who created them has left the group.
- There is no single perfect language and **no single perfect library** for everything.
- Within Python, many libraries exist:
  - [Vega-Altair](https://altair-viz.github.io/gallery/index.html):
    declarative visualization, statistics built in
  - [Matplotlib](https://matplotlib.org/stable/gallery/index.html):
    probably the most standard and most widely used
  - [Seaborn](https://seaborn.pydata.org/examples/index.html):
    high-level interface to Matplotlib, statistical functions built in
  - [Plotly](https://plotly.com/python/):
    interactive graphs
  - [Bokeh](https://demo.bokeh.org/):
    also good for interactivity
  - [plotnine](https://plotnine.readthedocs.io/):
    implementation of a grammar of graphics in Python, based on [ggplot2](https://ggplot2.tidyverse.org/)
  - [ggplot](https://yhat.github.io/ggpy/):
    R users will feel more at home
  - [PyNGL](https://www.pyngl.ucar.edu/Examples/gallery.shtml):
    used in the weather forecast community
  - [K3D](https://k3d-jupyter.org/gallery/index.html):
    Jupyter Notebook extension for 3D visualization
  - ...
- Two main families of libraries: procedural (e.g. Matplotlib) and declarative (e.g. Vega-Altair).
47+
48+
## Why are we starting with [Vega-Altair](https://altair-viz.github.io/)?
49+
50+
- Concise and powerful
51+
- "Simple, friendly and consistent API" allows us to focus on the data
52+
visualization part and get started without too much Python knowledge
53+
- The way it **combines visual channels with data columns** can feel intuitive
54+
- Interfaces very nicely with [pandas](https://pandas.pydata.org/)
55+
- Easy to change figures
56+
- Good documentation
57+
- Open source
58+
- Makes it easy to save figures in a number of formats
59+
- Easy to save interactive visualizations to be used in websites

## Exercise

In this exercise we adapt existing scripts to either **tweak how the
plot looks** or **modify the input data**. This is very close to real life:
there are so many options and possibilities that it is almost impossible to
remember everything, so this strategy is useful to practice:
- Select an example that is close to what you have in mind
- Adapt it to your needs
- Search for help
- Understand the answers to help requests (not easy)

:::{challenge} Exercise Customization-1: Adapting a gallery example
**This is a great exercise which is very close to real life.**

- Browse the [Vega-Altair example gallery](https://altair-viz.github.io/gallery/index.html).
- Select one example that is close to your current/recent visualization project
  or simply interests you.
- First try to reproduce this example, as-is, in the Jupyter Notebook.
- Then try to print out the data that is used in this example just before the plotting function is called
  to learn about its structure.
- Then try to modify the data a bit.
- If you have time, try to feed it different, simplified data.
  This will be key for adapting the examples to your projects.
:::

---

:::{keypoints}
- Browse a number of example galleries to help you choose the library
  that best fits your work/style.
- Figures for presentation slides and figures for manuscripts have
  different requirements.
- Think about color-vision deficiencies when choosing colors. There are color
  palettes optimized for this.
- Minimize manual post-processing and try to script all steps.
:::
