Prepare a corpus for publication or use in Oni by generating OCFL objects in an OCFL storage root with a specific layout.
This tool requires an RO-Crate directory as input, containing all the required data:
- ro-crate-metadata.json metadata file as per the specification
- any other files referenced in the metadata (e.g. data files)
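The input requirements above can be checked before running the tool. The following is only a minimal sketch, not part of this tool: `checkCrateDir` is a hypothetical helper, and it assumes that every `File` entity with a relative `@id` refers to a plain (unencoded) path inside the crate directory.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: returns a list of problems found in an RO-Crate input directory.
function checkCrateDir(dataDir) {
  const metadataPath = path.join(dataDir, 'ro-crate-metadata.json');
  if (!fs.existsSync(metadataPath)) {
    return ['missing ro-crate-metadata.json'];
  }
  const crate = JSON.parse(fs.readFileSync(metadataPath, 'utf8'));
  const problems = [];
  for (const entity of crate['@graph'] || []) {
    // @type may be a string or an array; normalise it to an array.
    const types = [].concat(entity['@type'] || []);
    const id = entity['@id'];
    // Assumption: an @id without a URI scheme is a path relative to the crate directory.
    const isLocal = typeof id === 'string' && !/^[a-z][a-z0-9+.-]*:/i.test(id);
    if (types.includes('File') && isLocal && !fs.existsSync(path.join(dataDir, id))) {
      problems.push(`missing file: ${id}`);
    }
  }
  return problems;
}
```

An empty array means the metadata file is present and every locally referenced file was found.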
Clone the repo then install:
npm install
Either set the environment variables as described below or replace them with the proper values.
node index.js \
-r "${REPO_OUT_DIR}" \
-d "${DATA_DIR}" \
-s "${NAMESPACE}" \
--multiple \
--sf \
--vm "${MODEFILE}"
Specify the output directory ${REPO_OUT_DIR}, which is the path to the OCFL repository or storage root.
Specify the input directory ${DATA_DIR}, which is the path to the RO-Crate directory containing the ro-crate-metadata.json file and the data files.
${NAMESPACE} is a name for the top-level collection, which must be unique within the repository. It is used to create an ARCP identifier arcp://name,<namespace> that makes the @id of the Root Data Entity a valid absolute IRI.
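To illustrate the identifier format, here is a minimal sketch. The helper names `arcpId` and `withAbsoluteRootId` are hypothetical, and the sketch assumes the root entity's `@id` is simply replaced by the ARCP identifier; the tool itself may handle this differently.

```javascript
// Hypothetical helper: build the ARCP identifier for a namespace.
function arcpId(namespace) {
  return `arcp://name,${namespace}`;
}

// Hypothetical helper: return a copy of the Root Data Entity with an absolute @id.
// Assumption: the existing @id (often "./") is replaced outright.
function withAbsoluteRootId(rootEntity, namespace) {
  return { ...rootEntity, '@id': arcpId(namespace) };
}

const root = { '@id': './', '@type': 'Dataset', name: 'Example corpus' };
console.log(withAbsoluteRootId(root, 'my-corpus')['@id']);
```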
If --multiple is specified, a distributed crate will be created: the input crate is split into multiple output crates, and each RepositoryObject and RepositoryCollection in the input crate is put into its own OCFL storage object.
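As a rough illustration of which entities would be split out, the sketch below selects every entity typed as RepositoryObject or RepositoryCollection from a crate's `@graph`. The function name `entitiesToSplit` is hypothetical, and this is not the tool's actual splitting logic, which also has to build a crate for each selected entity.

```javascript
// Entity types that get their own OCFL storage object, per the description above.
const SPLIT_TYPES = ['RepositoryObject', 'RepositoryCollection'];

// Hypothetical helper: pick the entities that would each become a separate crate.
function entitiesToSplit(crate) {
  return (crate['@graph'] || []).filter((entity) => {
    // @type may be a string or an array; normalise it to an array.
    const types = [].concat(entity['@type'] || []);
    return types.some((type) => SPLIT_TYPES.includes(type));
  });
}
```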
Using the --sf flag requires Siegfried to be installed. The tool runs Siegfried and caches the output in .siegfried.json. Delete the file .siegfried.json to force it to rerun Siegfried.
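The caching behaviour described above follows a common "load or run" pattern, sketched below. This is an illustration only: `loadOrRun` and `clearCache` are hypothetical helpers, `identify` is a stand-in callback for the actual Siegfried invocation, and the location of the cache file (here, a directory passed in by the caller) is an assumption.

```javascript
const fs = require('node:fs');
const path = require('node:path');

const CACHE_FILE = '.siegfried.json';

// Hypothetical helper: reuse the cached output if present, otherwise call
// `identify` (a stand-in for running Siegfried) and cache its result.
function loadOrRun(dir, identify) {
  const cachePath = path.join(dir, CACHE_FILE);
  if (fs.existsSync(cachePath)) {
    return JSON.parse(fs.readFileSync(cachePath, 'utf8'));
  }
  const result = identify();
  fs.writeFileSync(cachePath, JSON.stringify(result));
  return result;
}

// Hypothetical helper: deleting the cache file forces the next call to rerun.
function clearCache(dir) {
  fs.rmSync(path.join(dir, CACHE_FILE), { force: true });
}
```

Passing the identification step in as a callback keeps the sketch independent of how Siegfried is actually invoked.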
Using the --vm "${MODEFILE}" argument enables validation against the mode file ${MODEFILE}, which can be a file path or a URL.
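One way to accept either a file path or a URL is sketched below. The helpers `isHttpUrl` and `readModeFile` are hypothetical, the sketch assumes only http and https URLs are treated as remote, and it relies on the global `fetch` available in Node.js 18 and later. It returns the raw text without assuming anything about the mode file's format.

```javascript
const fsPromises = require('node:fs/promises');

// Hypothetical helper: true only for values that parse as http(s) URLs.
function isHttpUrl(value) {
  try {
    const { protocol } = new URL(value);
    return protocol === 'http:' || protocol === 'https:';
  } catch {
    // Relative and most absolute file paths do not parse as URLs.
    return false;
  }
}

// Hypothetical helper: read the mode file as text from a URL or from disk.
async function readModeFile(modeFile) {
  if (isHttpUrl(modeFile)) {
    const response = await fetch(modeFile);
    if (!response.ok) {
      throw new Error(`Failed to fetch ${modeFile}: ${response.status}`);
    }
    return response.text();
  }
  return fsPromises.readFile(modeFile, 'utf8');
}
```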
The directory ${REPO_OUT_DIR} will be created and will contain all the OCFL objects. If a distributed crate is created, the OCFL storage layout will look something like this:
- arcp://name,<${NAMESPACE}>
  - __object__
  - collection1
    - __object__
    - object1
    - object2