Commit 8ae66e2

Add some adjustments while reviewing
1 parent 4d5e342 commit 8ae66e2

9 files changed, +52 -31 lines changed

docs/02_Introduction_into_Metafacture-Flux.md

Lines changed: 2 additions & 1 deletion
@@ -214,7 +214,7 @@ Luckily, we cannot only open the data we have in our `inputFile-content` field,
 
 Clear your playground and copy the following Flux workflow:
 
-```
+```text
 "https://openlibrary.org/books/OL2838758M.json"
 | open-http
 | as-lines
@@ -236,6 +236,7 @@ Let's take a look at what a Flux workflow does. The Flux workflow is a combinati
 6. Finally, we tell MF to `print` everything.
 
 So let's have a small recap of what we've done and learned so far:
+
 * We've played around with the Metafacture Playground.
 * We've learned that a Metafacture Flux workflow is a combination of modules with an inital text string or a variable.
 * We got to know different modules like `open-http`, `as-lines`. `decode-json`, `encode-yaml`, `print`

docs/03_Introduction_into_Metafacture-Fix.md

Lines changed: 5 additions & 5 deletions
@@ -96,13 +96,13 @@ Using a separate Fix file is recommended if you need to write many Fix functions
 To add more Fixes we can again edit the Fix file.
 Lets add these lines in front of the retain function:
 
-```
+```perl
 move_field("type.key", "pub_type")
 ```
 
 Also change the `retain` function so that you keep the new element `"pub_type"` instead of the not existing nested `"key"` element.
 
-```
+```perl
 move_field("type.key","pub_type")
 retain("title", "publish_date", "notes.value", "pub_type")
 ```
@@ -121,7 +121,7 @@ notes:
 With `move_field` we moved and renamed an existing element.
 As next step add the following function before the `retain` function.
 
-```
+```perl
 replace_all("pub_type","/type/","")
 ```
 
@@ -142,7 +142,7 @@ We cleaned up the value of `"pub_type"` element for better readability.
 
 Metafacture contains many Fix functions to manipulate data. Also there are many Flux commands/modules that can be used.
 
-Check the documentation to get a complete list of [Flux commands](https://metafacture.github.io/metafacture-documentation/docs/flux/flux-commands.html) and [Fix functions](https://metafacture.github.io/metafacture-documentation/docs/fix/fix-functions.html). This post only presented a short introduction into Metafacture. In the next posts we will go deeper into its capabilities.
+Check the documentation to get a complete list of [Flux commands](https://metafacture.github.io/metafacture-documentation/docs/flux/flux-commands.html) and [Fix functions](https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html). This post only presented a short introduction into Metafacture. In the next posts we will go deeper into its capabilities.
 
 Besides Fix functions you can also add as many comments and linebreaks as you want to a Fix.
 
@@ -169,7 +169,7 @@ retain("title", "publish_date", "pub_type")
 
 2) [Add a field with todays date called `"map_date"`.](https://metafacture.org/playground/?flux=%22https%3A//openlibrary.org/books/OL2838758M.json%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-json%0A%7C+fix+%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=move_field%28%22type.key%22%2C%22pub_type%22%29%0Areplace_all%28%22pub_type%22%2C%22/type/%22%2C%22%22%29%0A...%28%22mape_date%22%2C%22...%22%29%0Aretain%28%22title%22%2C+%22publish_date%22%2C+%22by_statement%22%2C+%22pub_type%22%29)
 
-Have a look at the [Fix functions](https://metafacture.org/metafacture-documentation/docs/fix/fix-functions.html). (Hint: you could use `add_field` or `timestamp`. And don't forget to add the new element to `retain`)
+Have a look at the [Fix functions](https://metafacture.org/metafacture-documentation/docs/fix/Fix-functions.html). (Hint: you could use `add_field` or `timestamp`. And don't forget to add the new element to `retain`)
 
 
 <details>

docs/05-More-Fix-Concepts.md

Lines changed: 9 additions & 9 deletions
@@ -7,9 +7,9 @@ parent: Tutorial
 
 # Lesson 5: More Fix concepts
 
-We already learned about simple Fixes aka *[Fix functions](https://metafacture.github.io/metafacture-documentation/docs/fix/fix-functions.html)* but there are three additional concepts in Fix: selector, conditionals and binds.
+We already learned about simple Fixes aka *[Fix functions](https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html)* but there are three additional concepts in Fix: selector, conditionals and binds.
 
-These Fix concepts were introduced by Catmandu (see [functions](https://librecat.org/Catmandu/#functions), [selector](https://librecat.org/Catmandu/#selectors), [conditionals](https://librecat.org/Catmandu/#conditionals) and [binds](https://librecat.org/Catmandu/#binds)). Be aware that Metafacture Fix does not support all of the specific functions, selectors, conditionals and binds from Catmandu. Check the documentation for a full overview of the supported [Fix functions](https://metafacture.org/metafacture-documentation/docs/fix/fix-functions.html).
+These Fix concepts were introduced by Catmandu (see [functions](https://librecat.org/Catmandu/#functions), [selector](https://librecat.org/Catmandu/#selectors), [conditionals](https://librecat.org/Catmandu/#conditionals) and [binds](https://librecat.org/Catmandu/#binds)). Be aware that Metafacture Fix does not support all of the specific functions, selectors, conditionals and binds from Catmandu. Check the documentation for a full overview of the supported [Fix functions](https://metafacture.org/metafacture-documentation/docs/fix/Fix-functions.html).
 
 ## Additional concepts
 
@@ -47,11 +47,11 @@ Fix functions are used to add, change, remove or otherwise manipulate elements.
 
 The other three concepts help when you intend to use more complex transformations:
 
-*[Conditionals](https://metafacture.github.io/metafacture-documentation/docs/fix/fix-functions.html#conditionals)* are used to control the processing of Fix functions. The included Fix functions are not processed with every workflow but only under certain conditions.
+*[Conditionals](https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html#conditionals)* are used to control the processing of Fix functions. The included Fix functions are not processed with every workflow but only under certain conditions.
 
-*[Selectors](https://metafacture.github.io/metafacture-documentation/docs/fix/fix-functions.html#selectors)* can be used to filter the records you want.
+*[Selectors](https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html#selectors)* can be used to filter the records you want.
 
-*[Binds](https://metafacture.github.io/metafacture-documentation/docs/fix/fix-functions.html#binds)* are wrappers for one or more Fixes. They give extra control functionality for Fixes such as loops. All binds have the same syntax:
+*[Binds](https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html#binds)* are wrappers for one or more Fixes. They give extra control functionality for Fixes such as loops. All binds have the same syntax:
 
 ```perl
 do Bind(params,...)
@@ -160,7 +160,7 @@ else
 end
 ```
 
-Metafacture supports lots of conditionals, find a list of all of them [here](https://metafacture.github.io/metafacture-documentation/docs/fix/fix-functions.html#conditionals).
+Metafacture supports lots of conditionals, find a list of all of them [here](https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html#conditionals).
 
 ## Selectors
 
@@ -176,11 +176,11 @@ end
 
 Selectors work in combination with conditionals to define the conditions that you want to kick out.
 
-See the [list of supported selectors](https://metafacture.github.io/metafacture-documentation/docs/fix/fix-functions.html#selectors).
+See the [list of supported selectors](https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html#selectors).
 
 ## Binds
 
-As mentioned above [Binds](https://metafacture.github.io/metafacture-documentation/docs/fix/fix-functions.html#binds) are wrappers for one or more Fixes. They give extra control functionality for Fixes such as loops. All binds have the same syntax:
+As mentioned above [Binds](https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html#binds) are wrappers for one or more Fixes. They give extra control functionality for Fixes such as loops. All binds have the same syntax:
 
 ```perl
 do Bind(params,...)
@@ -247,7 +247,7 @@ end
 
 [See this example in the playground.](https://metafacture.org/playground/?flux=inputFile%0A%7C+open-file%0A%7C+as-records%0A%7C+decode-yaml%0A%7C+fix%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=do+list%28path%3A%22colours%5B%5D%22%2C%22var%22%3A%22%24i%22%29%0A++++if+any_equal%28%22%24i%22%2C%22green%22%29%0A++++++++upcase%28%22%24i%22%29%0A++++++++append%28%22%24i%22%2C%22+is+a+nice+color%22%29%0A++++++++copy_field%28%22%24i%22%2C%22result%5B%5D.%24append%22%29%0A++++end%0Aend&data=---%0Acolours%3A%0A+-+red%0A+-+yellow%0A+-+green)
 
-See the [list of supported binds](https://metafacture.github.io/metafacture-documentation/docs/fix/fix-functions.html#binds).
+See the [list of supported binds](https://metafacture.github.io/metafacture-documentation/docs/fix/Fix-functions.html#binds).
 
 TODO: Add excercises.
 

docs/06_MetafactureCLI.md

Lines changed: 21 additions & 5 deletions
@@ -9,7 +9,7 @@ parent: Tutorial
 
 ## Get Metafacture Runner as CLI Tool
 
-_This lesson requires basic practical knowledge of the command line and Shell.
+This lesson requires basic practical knowledge of the command line and Shell.
 If you want to get familiar with it, have a look at the great [intro to Unix Shell by Library Carpentry](https://librarycarpentry.github.io/lc-shell/) (Session 1 - 3). You could also have a look at the great [introdution by the Programming Historian to Powershell](https://programminghistorian.org/en/lessons/intro-to-powershell)_
 
 While we had fun with our Metafacture Playground another way to use Metafacture is by
@@ -28,6 +28,22 @@ In the folder you find the `flux.bat` and `flux.sh`
 
 The code below assumes you moved the resulting folder to your home directory and renamed it to `"metafacture"`.
 
+If you run
+
+Unix:
+
+```bash
+~/metafacture/flux.sh
+```
+
+or Windows:
+
+```bash
+~\metafacture\flux.bat
+```
+
+Metafacture will list all currently available Flux Commands.
+
 ## How to run Metafacture via CLI
 
 You can run your workflows:
@@ -74,7 +90,7 @@ To simplify the code examples we will be using unix paths for the terminal comma
 The result of running the Flux script via CLI should be the same as with the Playground.
 
 The Metafacture CLI tool expects a Flux file for every workflow.
-Our workflow only has the following Flux and no additional files since it is fetching data from the web and it has no fix transformations. The file should have the following content, defining the playground specific variables and the Flux workflow that you also saw in the playground. You can delete the playground specific variables since they are not needed so you woul end with this:
+Our workflow only has the following Flux and no additional files since it is fetching data from the web and it has no Fix transformations. The file should have the following content, defining the playground specific variables and the Flux workflow that you also saw in the playground. You can delete the playground specific variables since they are not needed so you would end with this:
 
 ```text
 "https://openlibrary.org/books/OL2838758M.json"
@@ -108,7 +124,7 @@ Run it again as shown above.
 
 It should output:
 
-```JSON
+```json
 {
 "professionOrOccupation" : [ {
 "id" : "https://d-nb.info/gnd/4629643-8",
@@ -276,7 +292,7 @@ It should output:
 }
 ```
 
-If we want to use Fix we need to reference the Fix file (in the playground we only referenced t via `| fix`):
+If we want to use Fix we need to reference the Fix file (in the playground we only referenced the variable `transformationFile` via `| fix`):
 
 ```text
 "path/to/your/file/11942150X.json"
@@ -329,7 +345,7 @@ FILE
 ;
 ```
 
-which you use like:
+Which you use like:
 
 ```bash
 ~/metafacture/flux.sh path/to/your.flux FILE="path/to/your/file.json"

docs/07_Processing_MARC.md

Lines changed: 5 additions & 4 deletions
@@ -10,15 +10,15 @@ parent: Tutorial
 
 In the previous lessons we learned how we can use Metafacture to process structured data like JSON. In this lesson we will use Metafacture to process MARC metadata records. In this process we will see that MARC can be processed using FIX paths.
 
-[Transformation of MARC data with Mmetafacture can be used for multiple things, e.g. you could transform MARC binary files to MARC XML.](https://metafacture.org/playground/?flux=%22https%3A//raw.githubusercontent.com/metafacture/metafacture-tutorial/main/data/sample.mrc%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-marc21%28emitleaderaswhole%3D%22true%22%29%0A%7C+encode-marcxml%0A%7C+print%0A%3B)
+[Transformation of MARC data with Metafacture can be used for multiple things, e.g. you could transform MARC binary files to MARC XML.](https://metafacture.org/playground/?flux=%22https%3A//raw.githubusercontent.com/metafacture/metafacture-tutorial/main/data/sample.mrc%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-marc21%28emitleaderaswhole%3D%22true%22%29%0A%7C+encode-marcxml%0A%7C+print%0A%3B)
 
 As always, we will need to set up a small metafacture Flux script.
 
 Lets inspect a MARC file: https://raw.githubusercontent.com/metafacture/metafacture-tutorial/main/data/sample.marc
 
 Create the following Flux in a new file and name it e.g. `marc1.flux`:
 
-```
+```text
 "https://raw.githubusercontent.com/metafacture/metafacture-tutorial/main/data/sample.mrc"
 | open-http
 | as-lines
@@ -27,7 +27,8 @@ Create the following Flux in a new file and name it e.g. `marc1.flux`:
 ```
 
 Run this Flux via CLI, e.g.:
-```
+
+```bash
 /path/to/your/metafix-runner path/to/your/marc1.flux
 ```
 
@@ -352,7 +353,7 @@ You will see this as output:
 
 In the Fix above we mapped the field 245 to the title, and iterated over every subfield with the help of the list-bind and the `?`- wildcard. The ISBN is in the 020-field. Because MARC records can contain one or more 020 fields we created an isbn array with add_array and added the values using the isbn.$append syntax. Next we turned the isbn array back into a comma separated string using the join_field fix. As last step we deleted all the fields we didn’t need in the output with the `retain` syntax.
 
-Different versions of MARC-Serialization need different workflows: e.g. h[ere see an example of Aseq-MARC Files that are transformed to MARCxml.](https://test.metafacture.org/playground/?flux=%22https%3A//raw.githubusercontent.com/LibreCat/Catmandu-MARC/dev/t/rug01.aleph%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-aseq%0A%7C+merge-same-ids%0A%7C+encode-marcxml%0A%7C+print%0A%3B)
+Different versions of MARC-Serialization need different workflows: e.g. [here see an example of Aseq-MARC Files that are transformed to MARCxml.](https://test.metafacture.org/playground/?flux=%22https%3A//raw.githubusercontent.com/LibreCat/Catmandu-MARC/dev/t/rug01.aleph%22%0A%7C+open-http%0A%7C+as-lines%0A%7C+decode-aseq%0A%7C+merge-same-ids%0A%7C+encode-marcxml%0A%7C+print%0A%3B)
 
 In this post we demonstrated how to process MARC data. In the next post we will show some examples how Catmandu typically can be used to process library data.
 

docs/08_Harvest_data_with_OAI-PMH.md

Lines changed: 2 additions & 2 deletions
@@ -49,7 +49,7 @@ You can also harvest MARC data, serialize it to MARC-binary and store it in a fi
 ;
 ```
 
-You can also transform incoming data and store/index it with MongoDB or Elasticsearch. For the transformation you need to create a fix (see Lesson 3) in the playground or in a text editor:
+You can also transform incoming data and prepare it for indexing it with Elasticsearch. For the transformation you need to create a fix (see Lesson 3) in the playground or in a text editor:
 
 Add the following fixes to the file:
 
@@ -61,7 +61,7 @@ copy_field("260??.c","date")
 retain("_id","title","creator[]","date")
 ```
 
-Now you can run an ETL process (extract, transform, load) with this worklflow:
+Now you can run an ETL process (extract, transform, load) with this worklflow, we use `json-to-elasticsearch-bulk` to prepare the output for elastic search indexing:
 
 ```text
 "https://lib.ugent.be/oai"

docs/09_Working_with_CSV.md

Lines changed: 3 additions & 3 deletions
@@ -43,7 +43,6 @@ Convert the data to different serializations, like JSON, YAML and XML by decodin
 See that the elements have no literal names but only numbers.
 As the CSV has a header we need to add the option `(hasHeader="true")` to `decode-csv` in the Flux.
 
-
 You can extract specified fields while converting to another tabular format by using the Fix. This is quite handy for analysis of specific fields or to generate reports. In the following example we only keep three columns (`"ISBN"`,`"Title"`,`"Author"`):
 
 Flux:
@@ -60,7 +59,8 @@ Flux:
 ```
 
 With Fix:
-```
+
+```perl
 retain("ISBN","Title","Author")
 ```
 
@@ -82,7 +82,7 @@ Flux:
 
 Fix:
 
-```
+```perl
 replace_all("?","^\\$|\\$$","")
 ```
 

docs/10_Working_with_XML.md

Lines changed: 3 additions & 1 deletion
@@ -103,7 +103,7 @@ title:
 [For our example above, to get rid of the value subfields in the yaml, we need to change the hirachy:](https://metafacture.org/playground/?flux=inputFile%0A%7C+open-file%0A%7C+decode-xml%0A%7C+handle-generic-xml%0A%7C+fix%28transformationFile%29%0A%7C+encode-yaml%0A%7C+print%0A%3B&transformation=move_field%28%22title.value%22%2C%22@title%22%29%0Amove_field%28%22@title%22%2C%22title%22%29%0Amove_field%28%22author.value%22%2C%22@author%22%29%0Amove_field%28%22@author%22%2C%22author%22%29%0Amove_field%28%22datePublished.value%22%2C%22@datePublished%22%29%0Amove_field%28%22@datePublished%22%2C%22datePublished%22%29&data=%3C%3Fxml+version%3D%221.0%22+encoding%3D%22utf-8%22%3F%3E%0A%3Crecord%3E%0A++%3Ctitle%3EGRM%3C/title%3E%0A++%3Cauthor%3ESibille+Berg%3C/author%3E%0A++%3CdatePublished%3E2019%3C/datePublished%3E%0A%3C/record%3E)
 
 
-```
+```text
 inputFile
 | open-file
 | decode-xml
@@ -115,6 +115,7 @@ inputFile
 ```
 
 with Fix:
+
 ```perl
 move_field("title.value","@title")
 move_field("@title","title")
@@ -135,6 +136,7 @@ inputFile
 | print
 ;
 ```
+
 results in:
 
 ```xml

docs/11_MARC_to_Dublin_Core.md

Lines changed: 2 additions & 1 deletion
@@ -6,7 +6,8 @@ parent: Tutorial
 ---
 
 
-# Lesson 11 : From MARC to Dublin Core as Linked Open Usable Data (LOUD)
+## Lesson 11 : From MARC to Dublin Core as Linked Open Usable Data (LOUD)
+
 TODO: Use better example. But the following is missing isbns: https://github.com/metafacture/metafacture-examples/blob/master/Swissbib-Extensions/MARC-CSV/
 
 Today we will look a bit further into MARC processing with Metafacture. We already saw a bit of MARC processing and today we will transform MARC records into Dublin Core providing the data as linked open usable data.
