This repository was archived by the owner on Dec 4, 2019. It is now read-only.

Use Case Recipes

Mark Allen Matney, Jr edited this page Aug 11, 2017 · 12 revisions

Here are some ideas for how your institution might use this software.

Content provider (source)

Each set of resources that you want to make discoverable via ResourceSync needs to be processed with rs_oaipmh_src.py. The only requirement is that those resources are already discoverable via OAI-PMH.

A real-world example:

  • your OAI-PMH provider's base URL is http://example.com/oai/provider
  • the collection's setSpec is testcol
  • you want to share the records in MODS format
  • you will host the ResourceSync sitemap documents at http://test.com with the Apache 2 HTTP Server
  • you want the ResourceSync sitemap documents for this collection to be available at http://test.com/resourcesync/testcol/

The very first time you process this collection, you generate a resourcelist:

sudo python3 rs_oaipmh_src.py single http://test.com apache http://example.com/oai/provider mods resourcelist testcol

A couple of notes:

  • if you're hosting with Tomcat 7, you can replace apache with tomcat; otherwise, you must explicitly specify the server's root directory (e.g., for Tomcat 6, it would be /usr/local/tomcat6/webapps/default)
  • if you want to make Dublin Core records available instead, replace mods with oai_dc
  • for full usage information: python3 rs_oaipmh_src.py single --help
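If you script these runs, it can help to assemble the command from named values so that the positional arguments stay in the right order. The following is a minimal sketch, not part of this software: the function name and parameter names are hypothetical labels, and the argument order is simply copied from the example command above.

```python
import shlex
from typing import List


def build_single_command(
    dest_base_url: str,    # where the sitemap documents are hosted, e.g. http://test.com
    server: str,           # "apache", "tomcat", or an explicit server root directory
    oai_base_url: str,     # the OAI-PMH provider's base URL
    metadata_format: str,  # e.g. "mods" or "oai_dc"
    strategy: str,         # "resourcelist" or "inc_changelist"
    set_spec: str,         # the collection's setSpec
) -> List[str]:
    """Return the argv list for the `single` subcommand.

    The parameter names are illustrative assumptions; only the positional
    order is taken from the example command on this page.
    """
    return [
        "sudo", "python3", "rs_oaipmh_src.py", "single",
        dest_base_url, server, oai_base_url,
        metadata_format, strategy, set_spec,
    ]


cmd = build_single_command(
    "http://test.com", "apache", "http://example.com/oai/provider",
    "mods", "resourcelist", "testcol",
)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) on the machine where the script is installed.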

Whenever changes are made to resources (records) in this collection, you need to create (or update) an inc_changelist (incremental changelist):

sudo python3 rs_oaipmh_src.py single http://test.com apache http://example.com/oai/provider mods inc_changelist testcol

If you have many collections to generate ResourceSync documents for, you can use the multi subcommand, passing it a CSV file with the parameters for each collection:

sudo python3 rs_oaipmh_src.py multi collections.csv

IMPORTANT: you must not overwrite resourcelists by using the resourcelist strategy after any changes have been made! This will cause destinations to miss changes!
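One way to guard against accidentally regenerating a resourcelist is to pick the strategy automatically. The sketch below is an illustration only: it assumes you know where the collection's resourcelist is written on disk, and both the function name and the path you pass in are hypothetical, since the actual file location depends on your server setup.

```python
from pathlib import Path


def choose_strategy(resourcelist_path: Path) -> str:
    """Pick the generation strategy for a collection.

    `resourcelist_path` is a hypothetical, caller-supplied location of the
    collection's existing resourcelist. If it already exists, return
    "inc_changelist" so the resourcelist is never overwritten; otherwise
    this is the first run and "resourcelist" is returned.
    """
    if resourcelist_path.exists():
        return "inc_changelist"
    return "resourcelist"


# A path that does not exist is treated as a first run.
print(choose_strategy(Path("/nonexistent/resourcesync/testcol/resourcelist")))
```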

Content aggregator (destination)

I want to populate a Solr index with OAI-PMH resources from one or more content providers. I want to use ResourceSync to do this. I have a local TinyDB instance at /my/tiny/db.json with one row per resource set (according to the schema) and a Solr index at http://example.com/solr/resourcesync. I want to update Solr every Sunday at 2 AM.

# /etc/crontab
...
0 2 * * 0 root python3 rs-oaipmh-dest.py /my/tiny/db.json http://example.com/solr/resourcesync
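To double-check the schedule, the cron fields 0 2 * * 0 mean minute 0, hour 2, any day of month, any month, day of week 0 (Sunday). The following sketch, which is independent of this software and uses a hypothetical helper name, computes when such an entry would next fire for a given local time.

```python
from datetime import datetime, timedelta


def next_sunday_2am(now: datetime) -> datetime:
    """Return the first Sunday 02:00 strictly after `now`.

    Mirrors the cron fields `0 2 * * 0`. Note that Python numbers weekdays
    Monday=0 .. Sunday=6, whereas cron uses 0 for Sunday.
    """
    days_ahead = (6 - now.weekday()) % 7  # days until the coming Sunday (0 if today)
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=2, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # It is Sunday and 02:00 has already passed, so move to next week.
        candidate += timedelta(days=7)
    return candidate


print(next_sunday_2am(datetime(2017, 8, 11, 12, 0)))
```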