episodes/04-transforming-data.md
3 additions & 3 deletions
@@ -42,7 +42,7 @@ clean leading and trailing white spaces from all data when importing the data in
 
 Look at the data in the column `coordinates` and split these values to obtain latitude and longitude. Make sure that the option for `Guess cell type` is checked and that `Remove this column` is not. Rename the new columns.
 
-What type of data does OpenRefine assign to the new colunms?
+What type of data does OpenRefine assign to the new columns?
 
 ::::::::::::::: solution
 
@@ -157,7 +157,7 @@ Once the new column is created, convert it to date using `Edit cells` > `Common
 Clustering allows you to find groups of entries that are not identical but are
 sufficiently similar that they may be alternative representations of the same thing (term or data value).
 For example, the two strings `New York` and `new york` are very likely to refer to the same concept and just have a
-capitalization differences. Likewise, `Björk` and `Bjork` probably refer to the same person. These kinds of variations
+capitalization difference. Likewise, `Björk` and `Bjork` probably refer to the same person. These kinds of variations
 occur a lot in scientific data. Clustering gives us a tool to resolve them.
 
 OpenRefine provides different clustering algorithms. The best way to understand how they work is to experiment with them.
@@ -172,7 +172,7 @@ The dataset has several near-identical entries in `scientificName`. For example,
 
 2. In the resulting pop-up window, you can change the `Method` and the `Keying Function`. Try different combinations to see what different mergers of values are suggested.
 
-3. If you select the `key collision` method and the `metaphone3` keying function. It should identify one cluster:
+3. If you select the `key collision` method and the `metaphone3` keying function, it should identify one cluster:
 
 {alt='OpenRefine window for clustering'}
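
Note on the clustering passage in the second hunk: the `key collision` idea can be illustrated outside OpenRefine with a short Python sketch. This is only an approximation under stated assumptions. The `fingerprint` function below is a simplified stand-in for a fingerprint-style keying function, not OpenRefine's own code, and it is not `metaphone3`. The names `fingerprint`, `key_collision_clusters`, and the sample list `names` are hypothetical and chosen for illustration.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Simplified fingerprint-style key (an approximation, not OpenRefine's code)."""
    # Decompose accented characters and drop the combining marks (ö -> o).
    decomposed = unicodedata.normalize("NFKD", value)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Trim, lowercase, and strip ASCII punctuation.
    cleaned = unaccented.strip().lower().translate(
        str.maketrans("", "", string.punctuation)
    )
    # Split on whitespace, de-duplicate, sort, and rejoin the tokens.
    return " ".join(sorted(set(cleaned.split())))


def key_collision_clusters(values):
    """Group distinct values whose keys collide; keep groups with 2+ members."""
    buckets = defaultdict(set)
    for value in values:
        buckets[fingerprint(value)].add(value)
    return [sorted(group) for group in buckets.values() if len(group) > 1]


# Hypothetical sample values taken from the examples in the lesson text.
names = ["New York", "new york", "Björk", "Bjork", "Boston"]
clusters = key_collision_clusters(names)
print(clusters)
```

Values that reduce to the same key land in the same bucket, which is why `New York` and `new york` are suggested as one cluster while `Boston` stays on its own. Swapping in a different keying function changes which values collide, which matches the lesson's advice to try several combinations.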