Rename index entry predictive to predictive question (and similar changes elsewhere)

trevorcampbell · trevorcampbell · commit de2e0b80f55f · 2021-09-27T10:22:49.000-07:00
diff --git a/intro.Rmd b/intro.Rmd
@@ -77,12 +77,12 @@ all of which are defined in Table \@ref(tab:questions-table).
 Carefully formulating a question as early as possible in your analysis&mdash;and 
 correctly identifying which type of question it is&mdash;will guide your overall approach to 
 the analysis as well as the selection of appropriate tools.\index{question!data analysis}
-\index{descriptive!definition}
-\index{exploratory!definition}
-\index{predictive!definition}
-\index{inferential!definition}
-\index{causal!definition}
-\index{mechanistic!definition}
+\index{descriptive question!definition}
+\index{exploratory question!definition}
+\index{predictive question!definition}
+\index{inferential question!definition}
+\index{causal question!definition}
+\index{mechanistic question!definition}
 
 Table: (\#tab:questions-table) Types of data analysis question. From [What is the question?](https://science.sciencemag.org/content/347/6228/1314) [@leek2015question] and [The Art of Data Science](https://leanpub.com/artofdatascience) [@peng2015art].
 
@@ -157,9 +157,9 @@ Since we are using R for data analysis in this book, the first step for us is to
 load the data into R. When we load tabular data into
 R, it is represented as a *data frame* object\index{data frame!overview}. Figure
 \@ref(fig:img-spreadsheet-vs-dataframe) shows that an R data frame is very similar
-to a spreadsheet. We refer to the rows as **observations** \index{observation}; these are the things that we
-collect the data on, e.g., voters, cities, etc. We refer to the columns as
-**variables** \index{variable}; these are the characteristics of those observations, e.g., voters' political
+to a spreadsheet. We refer to the rows as \index{observation} **observations**; these are the things that we
+collect the data on, e.g., voters, cities, etc. We refer to the columns as \index{variable}
+**variables**; these are the characteristics of those observations, e.g., voters' political
 affiliations, cities' populations, etc. 
 
 
@@ -479,7 +479,7 @@ ggplot(ten_lang, aes(x = language, y = mother_tongue)) +
 > time, a single expression in R must be contained in a single line of code.
 > However, there *are* a small number of situations in which you can have a
 > single R expression span multiple lines. Above is one such case: here, R knows that a line cannot
-> end with a `+` symbol, and so it keeps reading the next line to figure out
+> end with a `+` symbol, \index{plussymb@$+$} and so it keeps reading the next line to figure out
 > what the right hand side of the `+` symbol should be.  We could, of course,
 > put all of the added layers on one line of code, but splitting them across
 > multiple lines helps a lot with code readability. \index{multi-line expression}
diff --git a/regression1.Rmd b/regression1.Rmd
@@ -32,7 +32,7 @@ By the end of the chapter, readers will be able to:
 
 ## The regression problem
 
-Regression, like classification, is a predictive \index{predictive} problem setting where we want
+Regression, like classification, is a predictive \index{predictive question} problem setting where we want
 to use past information to predict future observations. But in the case of
 regression, the goal is to predict *numerical* values instead of *categorical* values. 
 The variable that you want to predict is often called the *response variable*. \index{response variable}