
Commit 9cd0d18

committed
Updated notebooks with proofreading fixes.
1 parent fbb5d59

File tree

4 files changed: +43 additions, -1656 deletions

notebooks/1c_visualization.ipynb

Lines changed: 34 additions & 34 deletions
Large diffs are not rendered by default.

notebooks/2a_planning.ipynb

Lines changed: 5 additions & 5 deletions
@@ -10,7 +10,7 @@
 "\n",
 "1. Identifier transformation \n",
 "1. Identifier coding\n",
-"1. Data mock ups"
+"1. Data mockups"
 ]
 },
 {
@@ -718,18 +718,18 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"# Data mock ups\n",
+"# Data mockups\n",
 "\n",
 "At the planning and pilot study stage, we may have a complex and labor-intensive data collection yet to do.\n",
 "As a result, we will not have some of the data that we need in order to make sure that we can fit everything together.\n",
 "\n",
-"A data mock up is a form of data that we create—often manually—to simulate the form of the data that we will retrieve in a subsequent collection.\n",
+"A data mockup is a form of data that we create—often manually—to simulate the form of the data that we will retrieve in a subsequent collection.\n",
 "This is common for data obtained by web scraping, human coding, or other time-intensive processes.\n",
 "Before starting such a collection, we need to know that it will produce the data that we need.\n",
 "If we are designing the collection ourselves, it may serve as a target for the form of data produced.\n",
 "\n",
 "My favorite tool for producing data mockups is a manually-created CSV file.\n",
-"Unlike Excel spreadsheets (with a lot of internal complexity and sometimes well-intended but harmful automatic behavior), a CSV file is what the name describes: comma separated values.\n",
+"Unlike Excel spreadsheets (with a lot of internal complexity and sometimes well-intended but harmful automatic behavior), a CSV file is what the name describes: comma-separated values.\n",
 "To make one manually, we simply type (or, more likely, copy and paste) into a file in a text editor."
 ]
 },
@@ -739,7 +739,7 @@
 "source": [
 "## CSV example\n",
 "\n",
-"The contents of a CSV file looks like this:\n",
+"The contents of a CSV file look like this:\n",
 "\n",
 "```csv\n",
 "price,tic,yr\n",
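The workflow this cell describes, typing a small comma-separated file by hand and reading it back, can be sketched with Python's standard csv module. The column names `price`, `tic`, and `yr` come from the diff; the values below are invented placeholders, not data from the commit:

```python
import csv
import io

# A hand-typed data mockup with the same three columns shown in the
# notebook's CSV example; the rows are invented placeholders.
mockup = """price,tic,yr
10.5,AAA,2020
12.0,BBB,2020
"""

# Reading the mockup back confirms it has the shape that later
# pipeline code will expect from the real collection.
rows = list(csv.DictReader(io.StringIO(mockup)))
print(rows[0])  # {'price': '10.5', 'tic': 'AAA', 'yr': '2020'}
```

In practice the mockup would live in a small `.csv` file opened in a text editor, as the cell says; the in-memory string here just keeps the sketch self-contained.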

notebooks/3a_retrieval3.ipynb

Lines changed: 4 additions & 4 deletions
@@ -480,7 +480,7 @@
 "\n",
 "First, I used the `LIMIT` keyword with a value of `10`.\n",
 "Compustat is a huge dataset, and retrieving everything would be a big download.\n",
-"While we are experimenting or iterating on a query, using `LIMIT` asks the server to provide only a number of results up to the parameter to limit.\n",
+"When we are experimenting or iterating on a query, using `LIMIT` asks the server to provide only a number of results up to the parameter to limit.\n",
 "This is a strong norm when using this kind of data, as it dramatically reduces the load on the server.\n",
 "`LIMIT` becomes more important as we ask the server to do transformation work for us, which increases the computational demand.\n",
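The `LIMIT` behavior the cell describes is easy to demonstrate locally. This sketch uses an in-memory SQLite database as a stand-in for the remote Compustat server; the table name `funda` and its contents are assumptions for illustration:

```python
import sqlite3

# In-memory stand-in for a large remote table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE funda (tic TEXT, price REAL)")
con.executemany("INSERT INTO funda VALUES (?, ?)",
                [(f"T{i}", float(i)) for i in range(100)])

# LIMIT caps the number of rows the engine returns, which is why it is
# polite to use it while iterating on a query against a shared server.
rows = con.execute("SELECT tic, price FROM funda LIMIT 10").fetchall()
print(len(rows))  # 10
```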
@@ -894,7 +894,7 @@
 "First, we asked for the `cusip` column to be called `cusip9` in our results using `AS`.\n",
 "Second, we used a function to transform the `cusip` column (using the `SUBSTRING()` function) to give us only eight characters and to name it `cusip8`.\n",
 "This is a simple example of having the server do prep work for us.\n",
-"Finally, we added a second condition to `WHERE`, a year restriction."
+"Finally, we added a second condition to `WHERE`: a year restriction."
 ]
 },
 {
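The rename-and-truncate pattern this cell discusses can be sketched in SQLite, which spells the function `SUBSTR()` rather than `SUBSTRING()`. The table name and the 9-character value below are made-up placeholders, not real identifiers from the commit:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE names (cusip TEXT)")
con.execute("INSERT INTO names VALUES ('123456789')")  # placeholder 9-char id

# AS renames a column in the result set; SUBSTR has the server (here,
# SQLite) truncate the identifier to its first eight characters for us.
row = con.execute(
    "SELECT cusip AS cusip9, SUBSTR(cusip, 1, 8) AS cusip8 FROM names"
).fetchone()
print(row)  # ('123456789', '12345678')
```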
@@ -903,14 +903,14 @@
 "source": [
 "# Aggregation\n",
 "\n",
-"Sometimes, the data in a table is more granular than the data that we want out.\n",
+"Sometimes, the data in a table is more granular than the data that we want returned to us.\n",
 "So, we can ask the server to aggregate it for us, returning an aggregated dataset.\n",
 "\n",
 "There are a few important things to know:\n",
 "\n",
 "1. We use `GROUP BY` to tell the DBMS how to group rows before aggregating.\n",
 "2. Every column must either be in the `GROUP BY` or have an aggregation function applied. A notable example here is that we ask for the `MAX` of the company name. If the name changes in the rows of the search, the DBMS would need to know how to choose. However, this is enforced as a general rule, not only when there is an actual conflict to resolve.\n",
-"3. Order of the statements matter. For example, `WHERE` needs to be after `FROM` and before `GROUP BY`. I've done them here, so it will work, but this is a topic better explored in a book on the topic."
+"3. The order of the statements matters. For example, `WHERE` needs to be after `FROM` and before `GROUP BY`. I've ordered them correctly here, so it will work, but this is a topic better explored in an introductory book on SQL."
 ]
 },
 {
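The three rules in this cell can be sketched with an in-memory SQLite table. The column names (`tic`, `conml`, `yr`, `price`) and all values below are assumptions for illustration, not data from the commit:

```python
import sqlite3

# A granular table with one row per (tic, yr); we aggregate down to
# one row per tic. All names and values here are invented.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE funda (tic TEXT, conml TEXT, yr INT, price REAL)")
con.executemany("INSERT INTO funda VALUES (?, ?, ?, ?)", [
    ("AAA", "Alpha Inc",  2020, 10.0),
    ("AAA", "Alpha Corp", 2021, 12.0),  # company name changed between years
    ("BBB", "Beta Inc",   2021,  7.0),
])

# Every selected column is either in GROUP BY (tic) or aggregated
# (MAX(conml), AVG(price)); note WHERE comes after FROM and before
# GROUP BY, matching the statement-order rule above.
rows = con.execute(
    "SELECT tic, MAX(conml) AS conml, AVG(price) AS avg_price "
    "FROM funda WHERE yr >= 2020 GROUP BY tic ORDER BY tic"
).fetchall()
print(rows)
```

Wrapping the name column in `MAX()` satisfies the rule even for `BBB`, where there is no actual conflict to resolve.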
