Fix typos and enhance clarity in tutorial.md

Sch-Da · web-flow · commit 4a278b553a13 · 2025-10-20T13:59:44.000+02:00
Corrected typos and improved clarity in the tutorial.
diff --git a/topics/digital-humanities/tutorials/open-refine-tutorial/tutorial.md b/topics/digital-humanities/tutorials/open-refine-tutorial/tutorial.md
@@ -445,39 +445,40 @@ What can you see here? To follow along, we made all substeps of the task availab
 To answer our question of which year most elements in the museum derive from, we first cut the column of production time from the table.
 You can see this in the file `Cut on Data (Number)`.
 From this, we filter only the dates that derive from specific years, not year ranges. (See `Filter Tabular on Data (Number)`.) 
-You can click on the arrow in a cirlce button of this dataset (`Run job again`) to see what exact input was used to exclude year ranges.
+You can click on the arrow in the circle button of this dataset (`Run job again`) to see what exact input was used to exclude year ranges.
 Regular expressions help clean remaining inconsistencies in the dataset. (Dataset: `Column Regex Find And Replace on Data (Number)`)
-Sorting the production date in descending order, as done in dataset `Sort on data (lowest Number)`, reveals that one faulty dataset, which is supposed to have been created in 2041, is part of the table. 
+Sorting the production date in descending order, as done in the dataset `Sort on data (lowest Number)`, reveals that one faulty dataset, which is supposed to have been created in 2041, is part of the table. 
 We remove it in the next step with the tool `Remove beginning`.
 
 The tool **Datamash** allows for summarising how many elements arrived at the museum in each year.  (Dataset: `Datamash on data (Number)`.)
 After we apply this tool, the dataset is no longer 7738 lines long, but only 259. 
-This is because the amount of times, each year appeared in the table was summed up in a second column.
-Sorting in ascending order (`Sort on data (Number)`) shows a chronological dataset with the earliest enties in the beginning and the most recent entries at the end of the table.
-This, we can easily visualise in a (particularly crowded) bar chart directly within Galaxy. (`Bar chart on data (Number)`)
-But this is not the most optimal view to show us, what year most objects derive from.
+This is because the number of times each year appeared in the table was summed up in a second column.
+Sorting in ascending order (`Sort on data (Number)`) shows a chronological dataset with the earliest entries at the beginning and the most recent entries at the end of the table.
+We can easily visualise this in a (particularly crowded) bar chart directly within Galaxy. (`Bar chart on data (Number)`)
+But this is not the best view to show us which year most objects derive from.
 
 To determine from which year most objects originate, we use another sorting order (`Sort on Data (highest number)`). 
 
 > <question-title></question-title>
 >
-> 1. From what year does the museum have most objects?
+> 1. From what year does the museum have the most objects?
 >
 > > <solution-title></solution-title>
 > >
-> > 1. The dataset `Sort on data (highest number)` shows the amount of objexts by year, sorted from most to least. 288 items are noted for the year 1969 in the first row. This is the year from which the museum has most (clearly datable) objects.
+> > 1. The dataset `Sort on data (highest number)` shows the number of objects by year, sorted from most to least. 288 items are noted for the year 1969 in the first row. This is the year from which the museum has the most (clearly datable) objects.
 > >
 > {: .solution}
 {: .question}
 
 The next four steps parse this year as a conditional statement step by step. (`Select first on data (Number)`, `Cut on data (Number)`, `Parse parameter value on data (Number)` and `Compose text parameter value`.)  
 This means, even if you upload another dataset, the highest number is always selected and taken as an input for the next steps.
 
-Based on this input, which is determined by the year with the highest input, ` Search in textfiles on data (Number)` now searches for object descriptions from the 288 objects of the most prominent year.
-
-From all object descriptions from that year, we create a word cloud of the object descriptions by using the offered stop word list.
-This helps us quickly determine, what kinds of objects the museum has from this popular year.
-The dataset `Word cloud image` shows that most objects from the museum are negatives from Davis Mist, a famous Australian photographer, which he created that year and donated to the museum.
+Based on this input, which is the year with the highest number of objects, `Search in textfiles on data (Number)` searches for the object descriptions of the 288 objects from that year.
+The table is very rich in information, but not that easy to digest.
+To make the table more accessible, we create a word cloud of the object descriptions with the offered stop word list.
+If you click on the stop word list we provided, you can see which filler words are excluded from the word cloud. In essence, only words conveying meaning remain.
+This helps us quickly determine what kinds of objects the museum has from this popular year.
+The dataset `Word cloud image` shows that most of the museum's objects from that year are negatives by Davis Mist, a famous Australian photographer, who created them that year and donated them to the museum.
 
 ![Word cloud of objects' descriptions](images/display_1969.png)
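
Reviewer note: the counting steps described in this hunk (keep single years, drop the faulty 2041 entry, count per year, sort chronologically, pick the most frequent year) can be sketched outside Galaxy as follows. This is only a rough illustration, not the workflow's actual implementation. The function name, the `max_year` cutoff, and the sample values are assumptions made for the example. The tutorial removes the faulty row with `Remove beginning` after sorting, whereas this sketch uses a simple year cutoff instead.

```python
import re
from collections import Counter


def count_objects_by_year(production_dates, max_year=2025):
    """Count objects per single production year.

    Year ranges and other non-year values are skipped, and years later
    than max_year (a hypothetical cutoff) are treated as faulty entries.
    """
    single_year = re.compile(r"^\s*(\d{4})\s*$")
    counts = Counter()
    for value in production_dates:
        match = single_year.match(value)
        if match is None:
            continue  # e.g. "1950-1960" or "c. 1900"
        year = int(match.group(1))
        if year > max_year:
            continue  # e.g. the faulty 2041 entry
        counts[year] += 1
    return counts


# Made-up sample values, not taken from the museum dataset.
sample = ["1969", "1969", "1950-1960", "2041", " 1880 ", "c. 1900", "1969"]
counts = count_objects_by_year(sample)

# Chronological view, comparable to the ascending sort.
chronological = sorted(counts.items())

# Year with the most objects, comparable to the "highest number" sort.
top_year, top_count = counts.most_common(1)[0]
```

On the real table, `top_year` would be expected to match the year reported in the solution box, provided the same filtering rules are applied.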