Commit 8bd02a5

Merge pull request #60 from sibusiso16/patch-3
Update introduction.Rmd
2 parents 237af65 + 085d3eb commit 8bd02a5

1 file changed: 11 additions, 11 deletions

introduction.Rmd

@@ -10,7 +10,7 @@ As @r4ds argue, the exploratory phase of a data science workflow (Figure \@ref(f
knitr::include_graphics("images/workflow.svg")
```

13-
In fact, packages within the **tidyverse** such as **dplyr** (transformation) and **ggplot2** (visualization) are such productive tools that many analysts use _static_ **ggplot2** graphics for EDA. Then, when it comes to communicating results, some analysts switch to another tool or language altogether (e.g., JavaScript) to generate interactive web graphics presenting their most important findings [@flowingdata-r]; [@nyt-r]. Unfortunately, this forces a heavy context switch that demands a totally different skillset and impedes productivity. Moreover, for the average analyst, the opportunity cost involved with becoming component with the complex world of web technologies is simply not worth the required investment.
13+
In fact, packages within the **tidyverse** such as **dplyr** (transformation) and **ggplot2** (visualization) are such productive tools that many analysts use _static_ **ggplot2** graphics for EDA. Then, when it comes to communicating results, some analysts switch to another tool or language altogether (e.g., JavaScript) to generate interactive web graphics presenting their most important findings [@flowingdata-r]; [@nyt-r]. Unfortunately, this forces a heavy context switch that demands a totally different skillset and impedes productivity. Moreover, for the average analyst, the opportunity cost involved with becoming competent with the complex world of web technologies is simply not worth the required investment.

Even before the web, interactive graphics were shown to have great promise in aiding the exploration of high-dimensional data [@Cook:2007uk]. The ASA maintains an incredible video library, <http://stat-graphics.org/movies/>, documenting the use of interactive statistical graphics for tasks that otherwise wouldn't have been easy or possible using numerical summaries and/or static graphics alone. Roughly speaking, these tasks tend to fall under three categories:

@@ -20,14 +20,14 @@ Even before the web, interactive graphics were shown to have great promise in ai

Today, you can find and run some of these and similar Graphical User Interface (GUI) systems for creating interactive graphics: `DataDesk` <https://datadescription.com/>, `GGobi` <http://www.ggobi.org/>, `Mondrian` <http://www.theusrus.de/Mondrian/>, `JMP` <https://www.jmp.com>, `Tableau` <https://www.tableau.com/>. Although these GUI-based systems have nice properties, they don't gel with a code-based workflow: any tasks you complete through a GUI likely can't be replicated without human intervention. That means, if at any point, the data changes, and analysis outputs must be regenerated, you need to remember precisely how to reproduce the outcome, which isn't necessarily easy, trustworthy, or economical. Moreover, GUI-based systems are typically 'closed' systems that don't allow themselves to be easily customized, extended, or integrated with another system.

23-
Programming interactive graphics allows you to leverage all the benefits of a code-based workflow while also helping with tasks that are difficult to accomplish with code alone. For example, if you were to visualize engine displacement (`displ`) versus miles per gallon (`hwy`) using the `mpg` dataset, you might wonder: "what are these cars with an unusually high value of `hwy` given their `displ`?" Rather than trying to write code to query those observations, it would be easier and more intuitive to draw an outline around the points to query the data behind them.
23+
Programming interactive graphics allows you to leverage all the benefits of a code-based workflow while also helping with tasks that are difficult to accomplish with code alone. For example, if you were to visualize engine displacement (`displ`) versus miles per gallon (`hwy`) using the `mpg` dataset, you might wonder: "what are these cars with an unusually high value of `hwy` given their `displ`?". Rather than trying to write code to query those observations, it would be easier and more intuitive to draw an outline around the points to query the data behind them.

```{r mpg-static, fig.cap = "(ref:mpg-static)"}
library(ggplot2)
ggplot(mpg, aes(displ, hwy)) + geom_point()
```
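
For comparison, the code-based query that the paragraph above alludes to might look roughly like the sketch below. The cutoffs `displ_min` and `hwy_min` are hypothetical values read off the scatterplot by eye, not thresholds taken from the book; picking them well is exactly the guesswork that a lasso selection avoids.

```r
library(ggplot2)  # provides the mpg dataset

# Hypothetical cutoffs chosen by eye from the scatterplot (an assumption)
displ_min <- 5
hwy_min <- 23

# Keep cars with a large engine but unusually good highway mileage
unusual <- subset(mpg, displ > displ_min & hwy > hwy_min)
unusual[, c("manufacturer", "model", "year", "displ", "hwy")]
```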

30-
Figure \@ref(fig:mpg-lasso) demonstrates how we can transform Figure \@ref(fig:mpg-static) into an interactive version that can be used to query and inspect points of interest. The framework that enables this kind of linked brushing is discussed in depth within Chapter \@ref(graphical-queries), but the point here is that the added effort required to enable such functionality is relatively small. This is important, because although interactivity _can_ augment exploration by allowing us to pursue follow-up questions, it's typically only _practical_ when we can create and alter them quickly. That's because, in a true exploratory setting, you have to make lots of visualizations, and investigate lots of follow-up questions, before stumbling across something truly valuable.
30+
Figure \@ref(fig:mpg-lasso) demonstrates how we can transform Figure \@ref(fig:mpg-static) into an interactive version that can be used to query and inspect points of interest. The framework that enables this kind of linked brushing is discussed in depth within Section \@ref(graphical-queries), but the point here is that the added effort required to enable such functionality is relatively small. This is important, because although interactivity _can_ augment exploration by allowing us to pursue follow-up questions, it's typically only _practical_ when we can create and alter them quickly. That's because, in a true exploratory setting, you have to make lots of visualizations, and investigate lots of follow-up questions, before stumbling across something truly valuable.

```r
library(plotly)
@@ -58,7 +58,7 @@ ggplot(mpg, aes(displ, hwy)) +
)
```

61-
This simple example quickly shows how interactive web graphics can assist EDA (for another, slightly more in-depth example, see Chapter \@ref(intro-ggplotly)). Being able to program these graphics from `R` allows one to combine their functionality within a world-class computing environment for data analysis and statistics. Programming interactive graphics may not be as intuitive as using a GUI-based system, but making the investment pays dividends in terms of workflow improvements: automation, scaling, provenance, and flexibility.
61+
This simple example quickly shows how interactive web graphics can assist EDA (for another, slightly more in-depth example, see Section \@ref(intro-ggplotly)). Being able to program these graphics from `R` allows one to combine their functionality within a world-class computing environment for data analysis and statistics. Programming interactive graphics may not be as intuitive as using a GUI-based system, but making the investment pays dividends in terms of workflow improvements: automation, scaling, provenance, and flexibility.

## What you will learn {#what-you-will-learn}

@@ -74,7 +74,7 @@ plot_ly(diamonds, x = ~cut, color = ~clarity, colors = "Accent")
include_vimeo("315707813")
```

77-
To its ability to link multiple data views purely client-side (see Chapter \@ref(graphical-queries)):
77+
To its ability to link multiple data views purely client-side (see Section \@ref(graphical-queries)):

```{r storms-preview, echo = FALSE, fig.cap = "(ref:storms-preview)"}
include_vimeo("257149623")
@@ -103,19 +103,19 @@ By going through the code behind these examples, you'll see that many of them le

This book contains six parts and each part contains numerous chapters. A summary of each part is provided below.

106-
1. _Creating views:_ introduces the process of transforming data into graphics via **plotly**'s programmatic interface. It focuses mostly on `plot_ly()`, which can interface directly with the underlying plotly.js graphing library, but emphasis is put on features unique to the `R` package that make it easier to transform data into graphics. Another way to create graphs with **plotly** is to use the `ggplotly()` function to transform **ggplot2** graphs into **plotly** graphs. Chapter \@ref(intro-ggplotly) discusses when and why `ggplotly()` might be preferable to `plot_ly()`. It's also worth mentioning that neither this part nor the book as a whole intends to cover every possible chart type and option available in **plotly** -- it's more of a presentation of the most generally useful techniques with the greater `R` ecosystem in mind. For a more exhaustive gallery of examples of what **plotly** itself is capable of, see <https://plot.ly/r/>.
106+
1. _Creating views:_ introduces the process of transforming data into graphics via **plotly**'s programmatic interface. It focuses mostly on `plot_ly()`, which can interface directly with the underlying plotly.js graphing library, but emphasis is put on features unique to the `R` package that make it easier to transform data into graphics. Another way to create graphs with **plotly** is to use the `ggplotly()` function to transform **ggplot2** graphs into **plotly** graphs. Section \@ref(intro-ggplotly) discusses when and why `ggplotly()` might be preferable to `plot_ly()`. It's also worth mentioning that neither this part nor the book as a whole intends to cover every possible chart type and option available in **plotly** -- it's more of a presentation of the most generally useful techniques with the greater `R` ecosystem in mind. For a more exhaustive gallery of examples of what **plotly** itself is capable of, see <https://plot.ly/r/>.

2. _Publishing views:_ discusses various techniques for exporting (as well as embedding) **plotly** graphs to various file formats (e.g., HTML, svg, pdf, png, etc). Also, Chapter \@ref(editing-views) demonstrates how one could leverage editable layout components HTML to touch-up a graph, then export to a static file format of interest before publication. Indeed, this book was created using the techniques from this section.

3. _Combining multiple views:_ demonstrates how to combine multiple data views into a single web page (arranging) or graphic (animation). Most of these techniques are shown using **plotly** graphs, but techniques from Section \@ref(arranging-htmlwidgets) extend to any HTML content generated via **htmltools** (which includes **htmlwidgets**).

112-
4. _Linking multiple views:_ provides an overview of the two models for linking **plotly** graph(s) to other data views. The first model, covered in Chapter \@ref(graphical-queries), outlines **plotly**'s support for linking views purely client-side, meaning the resulting graphs render in any web browser on any machine without requiring external software. The second model, covered in Chapter \@ref(linking-views-with-shiny), demonstrates how to link **plotly** with other views via **shiny**, a reactive web application framework for `R`. Relatively speaking, the second model grants the `R` user way more power and flexibility, but comes at the cost of requiring more computational infrastructure. That being said, RStudio provides accessible resources for deploying **shiny** apps <https://shiny.rstudio.com/articles/#deployment>.
112+
4. _Linking multiple views:_ provides an overview of the two models for linking **plotly** graph(s) to other data views. The first model, covered in Section \@ref(graphical-queries), outlines **plotly**'s support for linking views purely client-side, meaning the resulting graphs render in any web browser on any machine without requiring external software. The second model, covered in Chapter \@ref(linking-views-with-shiny), demonstrates how to link **plotly** with other views via **shiny**, a reactive web application framework for `R`. Relatively speaking, the second model grants the `R` user way more power and flexibility, but comes at the cost of requiring more computational infrastructure. That being said, RStudio provides accessible resources for deploying **shiny** apps <https://shiny.rstudio.com/articles/#deployment>.

5. _Custom behavior with JavaScript:_ demonstrates various ways to customize **plotly** graphs by writing custom JavaScript to handle certain user events. This part of the book is designed to be approachable for `R` users who want to learn just enough JavaScript to get **plotly** to do something it doesn't "natively" support.

6. _Various special topics_: offers a grab-bag of topics that address common questions, mostly related to the customization of **plotly** graphs in `R`.

118-
You might already notice that this book often uses the term 'view' or 'data view', so here we take a moment to frame its use in a wider context. As @Wills2008 puts it: "a 'data view' is anything that gives the user a way of examining data so as to gain insight and understanding. A data view is usually thought of as a barchart, scatterplot, or other traditional statistical graphic, but we use the term more generally, including 'views' such as the results of a regression analysis, a neural net prediction, or a set of descriptive statistics". In this book, more often than not, the term 'view' typically refers to a **plotly** graph or other **htmlwidgets** (e.g., **DT**, **leaflet**, etc). In particular, Chapter \@ref(graphical-queries) is all about linking multiple **htmlwidgets** together through a graphical database querying framework. However, the term 'view' takes on a more general interpretation in Chapter \@ref(linking-views-with-shiny) since the reactive programming framework that **shiny** provides allows us to have a more general conversation surrounding linked data views.
118+
You might already notice that this book often uses the term 'view' or 'data view', so here we take a moment to frame its use in a wider context. As @Wills2008 puts it: "a 'data view' is anything that gives the user a way of examining data so as to gain insight and understanding. A data view is usually thought of as a barchart, scatterplot, or other traditional statistical graphic, but we use the term more generally, including 'views' such as the results of a regression analysis, a neural net prediction, or a set of descriptive statistics". In this book, more often than not, the term 'view' typically refers to a **plotly** graph or other **htmlwidgets** (e.g., **DT**, **leaflet**, etc). In particular, Section \@ref(graphical-queries) is all about linking multiple **htmlwidgets** together through a graphical database querying framework. However, the term 'view' takes on a more general interpretation in Chapter \@ref(linking-views-with-shiny) since the reactive programming framework that **shiny** provides allows us to have a more general conversation surrounding linked data views.


## What you won't learn (much of)
@@ -124,7 +124,7 @@ You might already notice that this book often uses the term 'view' or 'data view

Although this book is fundamentally about creating web graphics, it does not aim to teach you web technologies (e.g., HTML, SVG, CSS, JavaScript, etc.). It's true that mastering these technologies grants you the ability to build really impressive websites, but even expert web developers would say their skillset is much better suited for expository rather than exploratory visualization. That's because most web programming tools are not well-suited for the exploratory phase of a data science workflow where iteration between data visualization, transformation, and modeling is a necessary task that often impedes hypothesis generation and sense-making. As a result, for most data analysts whose primary function is to derive insight from data, the opportunity cost involved with mastering web technologies is usually not worth the investment.

127-
That being said, learning a little about web technologies can have a relatively large payoff with directed learning and instruction. In Chapter \@ref(javascript), you'll learn how to customize **plotly** graphs with JavaScript -- even if you haven't seen JavaScript before, this Chapter should be approachable, insightful, and provide you with some useful examples.
127+
That being said, learning a little about web technologies can have a relatively large payoff with directed learning and instruction. In Chapter \@ref(javascript), you'll learn how to customize **plotly** graphs with JavaScript -- even if you haven't seen JavaScript before, this chapter should be approachable, insightful, and provide you with some useful examples.

### d3js

@@ -144,7 +144,7 @@ Encoding information in a graphic (concisely and effectively) is a large topic u

## Prerequisites

147-
For those new to `R` and/or data visualization, [R for Data Science](https://r4ds.had.co.nz/) provides an excellent foundation for understanding the vast majority of concepts covered in this book [@r4ds]. In particular, if you have a solid grasp on [Part I: Explore](https://r4ds.had.co.nz/explore-intro.html), [Part II: Wrangle](https://r4ds.had.co.nz/wrangle-intro.html), and [Part III: Program](https://r4ds.had.co.nz/program-intro.html), you should be able to understand most everything here. Although not explicitly covered, the book does make references to (and was created using) **rmarkdown**, so if you're new to **rmarkdown**, I also recommend reading the [R Markdown chapter](https://r4ds.had.co.nz/r-markdown.html).
147+
For those new to `R` and/or data visualization, [R for Data Science](https://r4ds.had.co.nz/) provides an excellent foundation for understanding the vast majority of concepts covered in this book [@r4ds]. In particular, if you have a solid grasp on [Part I: Explore](https://r4ds.had.co.nz/explore-intro.html), [Part II: Wrangle](https://r4ds.had.co.nz/wrangle-intro.html), and [Part III: Program](https://r4ds.had.co.nz/program-intro.html), you should be able to understand almost everything here. Although not explicitly covered, the book does make references to (and was created using) **rmarkdown**, so if you're new to **rmarkdown**, I also recommend reading the [R Markdown chapter](https://r4ds.had.co.nz/r-markdown.html).

## Run code examples

@@ -165,7 +165,7 @@ This book wouldn't be possible without the generous assistance and mentorship of

* Heike Hofmann and Di Cook for their mentorship and many helpful conversations about interactive graphics.
* Toby Dylan Hocking for many helpful conversations, his mentorship in the `R` packages **animint** and **plotly**, and laying the original foundation behind `ggplotly()`.
168-
* Joe Cheng for many helpful conversations and inspiring section \@ref(graphical-queries).
168+
* Joe Cheng for many helpful conversations and inspiring Section \@ref(graphical-queries).
* Étienne Tétreault-Pinard, Alex Johnson, and the other plotly.js core developers for responding to my feature requests and bug reports.
* Yihui Xie for his work on **knitr**, **rmarkdown**, **bookdown**, [bookdown-crc](https://github.com/yihui/bookdown-crc), and responding to my feature requests.
* Anthony Unwin for helpful feedback, suggestions, and for inspiring Figure \@ref(fig:epl).
