Commit 012ceaf

stijnvanhoey authored and jorisvandenbossche committed
Fix #129
1 parent ab524c1 commit 012ceaf

5 files changed

+315
-148313
lines changed

_solved/case1_bike_count.ipynb

Lines changed: 1 addition & 1 deletion
@@ -1884,7 +1884,7 @@
 "editable": true
 },
 "source": [
-"2013-06-05 was supposed to be the first time more than 10,000 bikers passed on one day (and not by coincidence: http://www.nieuwsblad.be/cnt/dmf20130605_022. Although the data shows it was not actually the first time ..."
+"The high number of bikers passing on 2013-06-05 was not by coincidence: http://www.nieuwsblad.be/cnt/dmf20130605_022 ;-)"
 ]
 },
 {

_solved/case2_observations_processing.ipynb

Lines changed: 5 additions & 5 deletions
@@ -58,7 +58,7 @@
 "* `scientificName`: the accepted scientific name of the species\n",
 "* `decimalLatitude`/`decimalLongitude`: coordinates of the occurrence in WGS84 format\n",
 "* `sex`: either `male` or `female` to characterize the sex of the occurrence\n",
-"* `occurrenceID`: a identifier within the data set to identify the individual records\n",
+"* `occurrenceID`: an identifier within the data set to identify the individual records\n",
 "* `datasetName`: a static string defining the source of the data\n",
 "\n",
 "Furthermore, additional information concerning the taxonomy will be added using an external API service"
@@ -529,7 +529,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"To check what the frequency of occurrences is for male/female of the categories, a bar chart is one possible representation:"
+"To check what the frequency of occurrences is for male/female of the categories, a bar chart is a possible representation:"
 ]
 },
 {
@@ -859,7 +859,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"There apparently exists a double entry: `'DM and SH'`, which basically defines two records and should be decoupled to two individual records (i.e. rows). Hence, we should be able to create a additional row based on this split. To do so, Pandas provides a dedicated function since version 0.25, called `explode`. Starting from a small subset example:"
+"There apparently exists a double entry: `'DM and SH'`, which basically defines two records and should be decoupled to two individual records (i.e. rows). Hence, we should be able to create an additional row based on this split. To do so, Pandas provides a dedicated function since version 0.25, called `explode`. Starting from a small subset example:"
 ]
 },
 {
@@ -1050,7 +1050,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The function takes a `DataFrame` as input, splits the record into separate rows and returns an updated `DataFrame`. We can use this function to get an update of the `DataFrame`, with the an additional row (observation) added by decoupling the specific field. Let's apply this new function."
+"The function takes a `DataFrame` as input, splits the record into separate rows and returns an updated `DataFrame`. We can use this function to get an update of the `DataFrame`, with an additional row (observation) added by decoupling the specific field. Let's apply this new function."
 ]
 },
 {
@@ -1358,7 +1358,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The `record_id` is no longer a unique identifier for each observation after the decoupling of this data set. We will make a new data set specific identifier, by adding a column called `occurrenceID` that takes a new counter as identifier. As a simply and straightforward approach, we will use a new counter for the whole dataset, starting with 1:"
+"The `record_id` is no longer a unique identifier for each observation after the decoupling of this data set. We will make a new data set specific identifier, by adding a column called `occurrenceID` that takes a new counter as identifier. As a simple and straightforward approach, we will use a new counter for the whole dataset, starting with 1:"
 ]
 },
 {

_solved/pandas_06_groupby_operations.ipynb

Lines changed: 122 additions & 125 deletions
Large diffs are not rendered by default.
