docs/08_Harvest_data_with_OAI-PMH.md
9 additions & 11 deletions
@@ -10,22 +10,20 @@ parent: Tutorial

 The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) is a protocol to harvest metadata records from OAI compliant repositories. It was developed by the Open Archives Initiative as a low-barrier mechanism for repository interoperability. The Open Archives Initiative maintains a registry of OAI data providers.

-Metafacture provides an opener flux module for harvesting metadata from OAI-PMH: `open-oaipmh`
+Metafacture provides a Flux module for harvesting metadata from OAI-PMH: `open-oaipmh`.

 Lets have a look at the documentation of open-oaipmh:
-There you see the specific options that can be used to configure your OAI PMH Harvesting.
+You see the specific options that can be used to configure your OAI PMH harvesting.

-Every OAI server must provide metadata records in Dublin Core, other (bibliographic) formats like MARC may be supported additionally. Available metadata formats can be detected with the OAI verb `ListMetadataFormats`: https://lib.ugent.be/oai?verb=ListMetadataFormats
-
-This OAI-PMH API provides MODS and Dublin Core. For specifying the metadataformat you use the `metadataprefix:` Option.
+Every OAI server must provide metadata records in Dublin Core, other (bibliographic) formats like MARC may be supported additionally. Available metadata formats can be detected with the OAI verb `ListMetadataFormats`, [see an example](https://lib.ugent.be/oai?verb=ListMetadataFormats) which provides MODS and Dublin Core. For specifying the metadata format use the `metadataprefix` option.
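To make the `ListMetadataFormats` step concrete outside of Flux, here is a minimal Python sketch. It only builds the request URL and reads the `metadataPrefix` values out of a response. The function names and the trimmed `SAMPLE_RESPONSE` are assumptions made for illustration; a real server returns more fields per format, and this is not how Metafacture itself performs the request.

```python
from typing import List
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the OAI-PMH 2.0 response envelope.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_formats_url(base_url: str) -> str:
    """Build a ListMetadataFormats request URL for an OAI-PMH base URL."""
    return f"{base_url}?{urlencode({'verb': 'ListMetadataFormats'})}"


def metadata_prefixes(response_xml: str) -> List[str]:
    """Return every metadataPrefix announced in a ListMetadataFormats response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:metadataPrefix", OAI_NS)]


# Hypothetical, heavily trimmed response used only for this example.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListMetadataFormats>
    <metadataFormat><metadataPrefix>oai_dc</metadataPrefix></metadataFormat>
    <metadataFormat><metadataPrefix>mods</metadataPrefix></metadataFormat>
  </ListMetadataFormats>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_formats_url("https://lib.ugent.be/oai"))
    print(metadata_prefixes(SAMPLE_RESPONSE))
```

One of the returned prefixes is what you would then pass as the metadata format when harvesting.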

 The OAI server may support selective harvesting, so OAI clients can get only subsets of records from a repository.
-The client requests could be limited via datestamps (`datefrom`, `dateuntil`) or set membership (`setSpec`).
+The client requests could be limited via datestamps (`datefrom`, `dateuntil`) or by setting the membership (`setSpec`).
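At the protocol level, selective harvesting comes down to extra query parameters on a `ListRecords` request: `from`, `until` and `set`, alongside the required `metadataPrefix`. The sketch below shows how such a URL could be assembled. The helper name `list_records_url` and the set name `example_set` are hypothetical; check the repository's `ListSets` response for the sets that actually exist.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(
    base_url: str,
    metadata_prefix: str,
    date_from: Optional[str] = None,
    date_until: Optional[str] = None,
    set_spec: Optional[str] = None,
) -> str:
    """Build a ListRecords URL, adding only the selective-harvesting
    parameters that were actually supplied."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if date_from:
        params["from"] = date_from    # lower datestamp bound
    if date_until:
        params["until"] = date_until  # upper datestamp bound
    if set_spec:
        params["set"] = set_spec      # restrict to one set
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # "example_set" is a placeholder, not a set known to exist on this server.
    print(list_records_url(
        "https://lib.ugent.be/oai",
        "oai_dc",
        date_from="2023-01-01",
        set_spec="example_set",
    ))
```

Leaving out all three optional arguments gives an unrestricted harvest of the whole repository in the chosen format.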

-To get some Dublin Core records from the collection of Ghent University Library and convert it to JSON (default) run the following Metafacture worklow via Playground or CLI:
+To get some Dublin Core records from the collection of Ghent University Library and convert it to JSON (default) run the following Metafacture workflow via Playground or CLI:

 ```text
 "https://lib.ugent.be/oai"
@@ -37,9 +35,9 @@ To get some Dublin Core records from the collection of Ghent University Library
 ;
 ```

-But if you just want to use the specific metadata records and not the oai-pmh specific metadata wrappers then specify the xml handler like this: `| handle-generic-xml(recordtagname="dc")`
+If you just want to use the specific metadata records and not the OAI-PMH specific metadata wrappers then specify the XML handler like this: `| handle-generic-xml(recordtagname="dc")`
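To illustrate what "without the OAI-PMH wrappers" means, the following Python sketch takes a `ListRecords` response and keeps only the content of each `dc` element, dropping the surrounding `record`, `header` and `metadata` envelope. It is an analogy for the effect described above, not Metafacture's implementation, and the sample record (identifier, title, creator) is invented for the example.

```python
from typing import Dict, List
import xml.etree.ElementTree as ET

# Qualified name of the Dublin Core container used by the oai_dc format.
OAI_DC_TAG = "{http://www.openarchives.org/OAI/2.0/oai_dc/}dc"


def extract_dc_records(response_xml: str) -> List[Dict[str, List[str]]]:
    """Return one dict per <oai_dc:dc> element, mapping each Dublin Core
    field name (without namespace) to the list of its values."""
    root = ET.fromstring(response_xml)
    records = []
    for dc in root.iter(OAI_DC_TAG):
        record: Dict[str, List[str]] = {}
        for field in dc:
            local_name = field.tag.split("}", 1)[1]  # strip "{namespace}"
            record.setdefault(local_name, []).append(field.text or "")
        records.append(record)
    return records


# Hypothetical response with a single record, for illustration only.
SAMPLE_LIST_RECORDS = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example title</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(extract_dc_records(SAMPLE_LIST_RECORDS))
```

The header identifier does not appear in the result, which is the point: only the Dublin Core payload survives.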

-You can also harvest MARC data, serialze it to marc-binary and store it in a file:
+You can also harvest MARC data, serialize it to MARC-binary and store it in a file:

 ```text
 "https://lib.ugent.be/oai"
@@ -51,7 +49,7 @@ You can also harvest MARC data, serialze it to marc-binary and store it in a fil
 ;
 ```

-You can also transform incoming data and immediately store/index it with MongoDB or Elasticsearch. For the transformation you need to create a fix (see Lesson 3) in the playground or in a text editor:
+You can also transform incoming data and store/index it with MongoDB or Elasticsearch. For the transformation you need to create a fix (see Lesson 3) in the playground or in a text editor:

 Add the following fixes to the file:

@@ -77,7 +75,7 @@ Now you can run an ETL process (extract, transform, load) with this worklflow:
 ;
 ```

-Excercise: Try to fetch data from a OAI-PMH you know. (e.g. the [DNB OAI](https://www.dnb.de/DE/Professionell/Metadatendienste/Datenbezug/OAI/oai_node.html))
+Excercise: Try to fetch data from an OAI-PMH you know. (e.g. the [DNB OAI](https://www.dnb.de/DE/Professionell/Metadatendienste/Datenbezug/OAI/oai_node.html))