|
| 1 | +# Plotting with [Vega-Altair](https://altair-viz.github.io/) |
| 2 | + |
| 3 | +:::{instructor-note} |
| 4 | +- 10 min: Introduction |
| 5 | +- 10 min: Type-along (creating a first plot) |
| 6 | +- 20 min: Exercise (using visual channels) |
| 7 | +- 20 min: Exercise (adapting a gallery example and customizing) |
| 8 | +- 10 min: Key points, discussion, and Q&A |
| 9 | +::: |
| 10 | + |
| 11 | + |
| 12 | +## Repeatability/reproducibility |
| 13 | + |
| 14 | +From [Claus O. Wilke: "Fundamentals of Data Visualization"](https://clauswilke.com/dataviz/): |
| 15 | + |
| 16 | +> *One thing I have learned over the years is that automation is your friend. I |
| 17 | +> think figures should be autogenerated as part of the data analysis pipeline |
| 18 | +> (which should also be automated), and they should come out of the pipeline |
| 19 | +> ready to be sent to the printer, no manual post-processing needed.* |
| 20 | +
|
| 21 | +- **Try to minimize manual post-processing**. It could bite you when you need to regenerate 50 |
| 22 | + figures the day before a submission deadline, or regenerate a set of figures |
| 23 | + after the person who created them has left the group. |
| 24 | +- There is no single perfect language and **no single perfect library** for everything. |
| 25 | +- Within Python, many libraries exist: |
| 26 | + - [Vega-Altair](https://altair-viz.github.io/gallery/index.html): |
| 27 | + declarative visualization, statistics built in |
| 28 | + - [Matplotlib](https://matplotlib.org/stable/gallery/index.html): |
| 29 | + probably the most standard and most widely used |
| 30 | + - [Seaborn](https://seaborn.pydata.org/examples/index.html): |
| 31 | + high-level interface to Matplotlib, statistical functions built in |
| 32 | + - [Plotly](https://plotly.com/python/): |
| 33 | + interactive graphs |
| 34 | + - [Bokeh](https://demo.bokeh.org/): |
| 35 | + also good for interactivity |
| 36 | + - [plotnine](https://plotnine.readthedocs.io/): |
| 37 | + implementation of a grammar of graphics in Python, based on [ggplot2](https://ggplot2.tidyverse.org/) |
| 38 | + - [ggplot](https://yhat.github.io/ggpy/): |
| 39 | + R users will feel more at home |
| 40 | + - [PyNGL](https://www.pyngl.ucar.edu/Examples/gallery.shtml): |
| 41 | + used in the weather forecast community |
| 42 | + - [K3D](https://k3d-jupyter.org/gallery/index.html): |
| 43 | + Jupyter Notebook extension for 3D visualization |
| 44 | + - ... |
| 45 | +- Two main families of libraries: procedural (e.g. Matplotlib) and declarative (e.g. Vega-Altair). |
| 46 | + |
| 47 | + |
| 48 | +## Why are we starting with [Vega-Altair](https://altair-viz.github.io/)? |
| 49 | + |
| 50 | +- Concise and powerful |
| 51 | +- "Simple, friendly and consistent API" allows us to focus on the data |
| 52 | + visualization part and get started without too much Python knowledge |
| 53 | +- The way it **combines visual channels with data columns** can feel intuitive |
| 54 | +- Interfaces very nicely with [pandas](https://pandas.pydata.org/) |
| 55 | +- Easy to change figures |
| 56 | +- Good documentation |
| 57 | +- Open source |
| 58 | +- Makes it easy to save figures in a number of formats |
| 59 | +- Easy to save interactive visualizations to be used in websites |
| 60 | + |
| 61 | + |
| 62 | +## Exercise |
| 63 | + |
| 64 | +In this exercise we adapt existing scripts to either **tweak how the |
| 65 | +plot looks** or **modify the input data**. This is very close to real life: |
| 66 | +there are so many options and possibilities that it is almost impossible to |
| 67 | +remember everything, so this strategy is worth practicing: |
| 68 | +- Select an example that is close to what you have in mind |
| 69 | +- Adapt it to your needs |
| 70 | +- Search for help |
| 71 | +- Understand the answers to help requests (not easy) |
| 72 | + |
| 73 | +:::{challenge} Exercise Customization-1: Adapting a gallery example |
| 74 | +**This is a great exercise which is very close to real life.** |
| 75 | + |
| 76 | +- Browse the [Vega-Altair example gallery](https://altair-viz.github.io/gallery/index.html). |
| 77 | +- Select one example that is close to your current/recent visualization project |
| 78 | + or simply interests you. |
| 79 | +- First try to reproduce this example, as-is, in the Jupyter Notebook. |
| 80 | +- Then print the data used in this example just before the plotting function is called, |
| 81 | + to learn about its structure. |
| 82 | +- Then try to modify the data a bit. |
| 83 | +- If you have time, try to feed it different, simplified data. |
| 84 | + This will be key for adapting the examples to your projects. |
| 85 | +::: |
| 86 | + |
| 87 | +--- |
| 88 | + |
| 89 | +:::{keypoints} |
| 90 | +- Browse a number of example galleries to help you choose the library |
| 91 | + that best fits your work/style. |
| 92 | +- Figures for presentation slides and figures for manuscripts have |
| 93 | + different requirements. |
| 94 | +- Think about color-vision deficiencies when choosing colors. There are color |
| 95 | + palettes optimized for this. |
| 96 | +- Minimize manual post-processing and try to script all steps. |
| 97 | +::: |