jorisvandenbossche
diff --git a/_solved/case1_bike_count.ipynb b/_solved/case1_bike_count.ipynb
Lines changed: 1 addition & 1 deletion
diff --git a/_solved/case2_observations_processing.ipynb b/_solved/case2_observations_processing.ipynb
Lines changed: 5 additions & 5 deletions
diff --git a/_solved/pandas_06_groupby_operations.ipynb b/_solved/pandas_06_groupby_operations.ipynb
Lines changed: 122 additions & 125 deletions
@@ -1884,7 +1884,7 @@
     "editable": true
    },
    "source": [
-    "2013-06-05 was supposed to be the first time more than 10,000 bikers passed on one day (and not by coincidence: http://www.nieuwsblad.be/cnt/dmf20130605_022. Although the data shows it was not actually the first time ..."
+    "The high number of bikers passing on 2013-06-05 was not by coincidence: http://www.nieuwsblad.be/cnt/dmf20130605_022 ;-)"
    ]
   },
   {
 
@@ -58,7 +58,7 @@
     "* `scientificName`: the accepted scientific name of the species\n",
     "* `decimalLatitude`/`decimalLongitude`: coordinates of the occurrence in WGS84 format\n",
     "* `sex`: either `male` or `female` to characterize the sex of the occurrence\n",
-    "* `occurrenceID`: a identifier within the data set to identify the individual records\n",
+    "* `occurrenceID`: an identifier within the data set to identify the individual records\n",
     "* `datasetName`: a static string defining the source of the data\n",
     "\n",
     "Furthermore, additional information concerning the taxonomy will be added using an external API service"
@@ -529,7 +529,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "To check what the frequency of occurrences is for male/female of the categories, a bar chart is one possible representation:"
+    "To check what the frequency of occurrences is for male/female of the categories, a bar chart is a possible representation:"
    ]
   },
   {
@@ -859,7 +859,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "There apparently exists a double entry: `'DM and SH'`, which basically defines two records and should be decoupled to two individual records (i.e. rows). Hence, we should be able to create a additional row based on this split. To do so, Pandas provides a dedicated function since version 0.25, called `explode`. Starting from a small subset example:"
+    "There apparently exists a double entry: `'DM and SH'`, which basically defines two records and should be decoupled to two individual records (i.e. rows). Hence, we should be able to create an additional row based on this split. To do so, Pandas provides a dedicated function since version 0.25, called `explode`. Starting from a small subset example:"
    ]
   },
   {
@@ -1050,7 +1050,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The function takes a `DataFrame` as input, splits the record into separate rows and returns an updated `DataFrame`. We can use this function to get an update of the `DataFrame`, with the an additional row (observation) added by decoupling the specific field. Let's apply this new function."
+    "The function takes a `DataFrame` as input, splits the record into separate rows and returns an updated `DataFrame`. We can use this function to get an update of the `DataFrame`, with an additional row (observation) added by decoupling the specific field. Let's apply this new function."
    ]
   },
   {
@@ -1358,7 +1358,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The `record_id` is no longer a unique identifier for each observation after the decoupling of this data set. We will make a new data set specific identifier, by adding a column called `occurrenceID` that takes a new counter as identifier. As a simply and straightforward approach, we will use a new counter for the whole dataset, starting with 1:"
+    "The `record_id` is no longer a unique identifier for each observation after the decoupling of this data set. We will make a new data set specific identifier, by adding a column called `occurrenceID` that takes a new counter as identifier. As a simple and straightforward approach, we will use a new counter for the whole dataset, starting with 1:"
    ]
   },
   {
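
For readers of this diff who have not opened the notebook: the markdown cells touched in `case2_observations_processing.ipynb` describe decoupling a combined `'DM and SH'` entry into two rows with `explode`, then adding an `occurrenceID` counter starting at 1. A minimal sketch of that idea follows. The toy data, the `species` column name, and the `" and "` separator are assumptions for illustration, not the notebook's actual code.

```python
import pandas as pd

# Hypothetical toy data; the real notebook uses its own survey columns.
df = pd.DataFrame({"record_id": [1, 2], "species": ["DM and SH", "DO"]})

# Turn the combined entry into a list, then give each list element its own row.
decoupled = (
    df.assign(species=df["species"].str.split(" and "))
      .explode("species")
      .reset_index(drop=True)
)

# record_id is no longer unique after the split, so add a counter starting at 1.
decoupled["occurrenceID"] = range(1, len(decoupled) + 1)

print(decoupled)
```

`DataFrame.explode` requires pandas 0.25 or later, as the cell text notes. Resetting the index before adding the counter keeps the row labels unique as well.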