4 changes: 2 additions & 2 deletions .github/workflows/spi_release.yml
@@ -42,7 +42,7 @@ jobs:
with:
java-version: '17'
distribution: 'adopt'
server-id: ossrh
server-id: central
server-username: MAVEN_USERNAME
server-password: MAVEN_PASSWORD
- uses: actions/cache@v4
@@ -80,7 +80,7 @@ jobs:
with:
java-version: '17'
distribution: 'adopt'
server-id: ossrh
server-id: central
server-username: MAVEN_USERNAME
server-password: MAVEN_PASSWORD
gpg-private-key: ${{ secrets.DATAVERSEBOT_GPG_KEY }}
2 changes: 2 additions & 0 deletions doc/release-notes/11766-new-io.gdcc.dataverse-spi.md
@@ -0,0 +1,2 @@
The ExportDataProvider framework in the dataverse-spi package has been extended, adding some extra options for developers of metadata exporter plugins.
See the [documentation](https://guides.dataverse.org/en/latest/developers/metadataexport.html#building-an-exporter) in the Metadata Export guide for details.
26 changes: 26 additions & 0 deletions doc/sphinx-guides/source/developers/making-library-releases.rst
@@ -36,6 +36,32 @@ Releasing a Snapshot Version to Maven Central

That is to say, to make a snapshot release, you only need to get one or more commits into the default branch.

It's possible, of course, to make snapshot releases outside of GitHub Actions, from environments such as your laptop. Generally, you'll want to look at the GitHub Action and try to do the equivalent. You'll need a file set up locally at ``~/.m2/settings.xml`` with the following (contact a core developer for the redacted bits):

.. code-block:: xml

    <settings>
      <servers>
        <server>
          <id>central</id>
          <username>REDACTED</username>
          <password>REDACTED</password>
        </server>
      </servers>
    </settings>

Then, study the GitHub Action and perform similar commands from your local environment. For example, as of this writing, for the dataverse-spi project, you can run the following commands, substituting the suffix you need:

``mvn -f modules/dataverse-spi -Dproject.version.suffix="2.1.0-PR11767-SNAPSHOT" verify``

``mvn -f modules/dataverse-spi -Dproject.version.suffix="2.1.0-PR11767-SNAPSHOT" deploy``

This will upload the snapshot here, for example: https://central.sonatype.com/repository/maven-snapshots/io/gdcc/dataverse-spi/2.1.02.1.0-PR11767-SNAPSHOT/dataverse-spi-2.1.02.1.0-PR11767-20250827.182026-1.jar

Before OSSRH was retired, you could browse through snapshot jars you published at https://s01.oss.sonatype.org/content/repositories/snapshots/io/gdcc/dataverse-spi/2.0.0-PR9685-SNAPSHOT/, for example. Now, even though you may see the URL of the jar as shown above during the "deploy" step, if you try to browse the various snapshot jars at https://central.sonatype.com/repository/maven-snapshots/io/gdcc/dataverse-spi/2.1.02.1.0-PR11767-SNAPSHOT/ you'll see "This maven2 hosted repository is not directly browseable at this URL. Please use the browse or HTML index views to inspect the contents of this repository." Sadly, the "browse" and "HTML index" links don't work, as noted in a `question <https://community.sonatype.com/t/this-maven2-group-repository-is-not-directly-browseable-at-this-url/8991>`_ on the Sonatype Community forum. To confirm that the jar was uploaded properly, you can instead use Maven to copy it to a local directory and then compare checksums:

``mvn dependency:copy -DrepoUrl=https://central.sonatype.com/repository/maven-snapshots/ -Dartifact=io.gdcc:dataverse-spi:2.1.02.1.0-PR11767-SNAPSHOT -DoutputDirectory=.``

Releasing a Release (Non-Snapshot) Version to Maven Central
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

10 changes: 7 additions & 3 deletions modules/dataverse-spi/pom.xml
@@ -13,7 +13,7 @@

<groupId>io.gdcc</groupId>
<artifactId>dataverse-spi</artifactId>
<version>2.0.0${project.version.suffix}</version>
<version>2.1.0${project.version.suffix}</version>
<packaging>jar</packaging>

<name>Dataverse SPI Plugin API</name>
@@ -64,11 +64,13 @@

<distributionManagement>
<snapshotRepository>
<id>ossrh</id>
<url>https://s01.oss.sonatype.org/content/repositories/snapshots</url>
<id>central</id>
<url>https://central.sonatype.com/repository/maven-snapshots/</url>
</snapshotRepository>
<repository>
<!--TODO: change this from ossrh to central?-->
<id>ossrh</id>
<!--TODO: change this url?-->
<url>https://s01.oss.sonatype.org/service/local/staging/deploy/maven2/</url>
</repository>
</distributionManagement>
@@ -110,7 +112,9 @@
<artifactId>nexus-staging-maven-plugin</artifactId>
<extensions>true</extensions>
<configuration>
<!--TODO: change this from ossrh to central?-->
<serverId>ossrh</serverId>
<!--TODO: change this URL?-->
<nexusUrl>https://s01.oss.sonatype.org</nexusUrl>
<autoReleaseAfterClose>true</autoReleaseAfterClose>
</configuration>
@@ -0,0 +1,61 @@
package io.gdcc.spi.export;

/**
 * Provides an optional mechanism for defining various data retrieval options
 * for the export subsystem in a way that should allow us to add support for
 * more options going forward with minimal or no changes to the already
 * implemented export plugins.
 *
 * @author landreev
 */
public class ExportDataContext {
    private boolean datasetMetadataOnly = false;
    private boolean publicFilesOnly = false;
    private Integer offset = null;
    private Integer length = null;

    private ExportDataContext() {
    }

    public static ExportDataContext context() {
        return new ExportDataContext();
    }

    public ExportDataContext withDatasetMetadataOnly() {
        this.datasetMetadataOnly = true;
        return this;
    }

    public ExportDataContext withPublicFilesOnly() {
        this.publicFilesOnly = true;
        return this;
    }

    public ExportDataContext withOffset(Integer offset) {
        this.offset = offset;
        return this;
    }

    public ExportDataContext withLength(Integer length) {
        this.length = length;
        return this;
    }

    public boolean isDatasetMetadataOnly() {
        return datasetMetadataOnly;
    }

    public boolean isPublicFilesOnly() {
        return publicFilesOnly;
    }

    public Integer getOffset() {
        return offset;
    }

    public Integer getLength() {
        return length;
    }
}
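
For illustration, here is a minimal usage sketch (not part of the diff) of how a caller might build a context with the fluent API above; the class name and the offset/length values are hypothetical:

import io.gdcc.spi.export.ExportDataContext;

public class ExportDataContextExample {
    public static void main(String[] args) {
        // Ask for dataset-level metadata only, restricted to public files,
        // and request a batch of at most 100 entries starting at offset 0.
        ExportDataContext context = ExportDataContext.context()
                .withDatasetMetadataOnly()
                .withPublicFilesOnly()
                .withOffset(0)
                .withLength(100);

        System.out.println(context.isDatasetMetadataOnly());                  // true
        System.out.println(context.getOffset() + "/" + context.getLength());  // 0/100
    }
}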
@@ -0,0 +1,51 @@
package io.gdcc.spi.export;

/**
 * Provides a mechanism for defining various data retrieval options for the
 * export subsystem in a way that should allow us to add support for more
 * options going forward with minimal or no changes to the existing code in
 * export plugins.
 *
 * @author landreev
 */
@Deprecated
public class ExportDataOption {
stevenwinship (Contributor) commented on Oct 16, 2025:

Looks like this is a new unused class. Instead of deprecating it, why not just remove it?

Contributor:

I just saw your comment (ExportDataOption.java is still checked in for reference). I don't think checking in unused code is good. Also, there are no tests included. Does this lower the code coverage? I'm not sure whether code that is not under src/ is included in code coverage.

Contributor (PR author):

@stevenwinship Thanks for reviewing this. Unfortunately, I missed your comments above yesterday. Yes, it would have been ideal to remove this before merging. I may make a quick follow-up PR removing it before the interface jar is published on Maven Central.

And yes, I only left it in place for the sake of showing the reviewers an alternative implementation I first tried.


    public enum SupportedOptions {
        DatasetMetadataOnly,
        PublicFilesOnly;
    }

    private SupportedOptions optionType;

    /*public static ExportDataOption addOption(String option) {
        ExportDataOption ret = new ExportDataOption();

        for (SupportedOptions supported : SupportedOptions.values()) {
            if (supported.toString().equals(option)) {
                ret.optionType = supported;
            }
        }
        return ret;
    }*/

    public static ExportDataOption addDatasetMetadataOnly() {
        ExportDataOption ret = new ExportDataOption();
        ret.optionType = SupportedOptions.DatasetMetadataOnly;
        return ret;
    }

    public static ExportDataOption addPublicFilesOnly() {
        ExportDataOption ret = new ExportDataOption();
        ret.optionType = SupportedOptions.PublicFilesOnly;
        return ret;
    }

    public boolean isDatasetMetadataOnly() {
        return SupportedOptions.DatasetMetadataOnly.equals(optionType);
    }

    public boolean isPublicFilesOnly() {
        return SupportedOptions.PublicFilesOnly.equals(optionType);
    }
}
@@ -21,8 +21,14 @@ public interface ExportDataProvider {
* OAI_ORE export are the only two that provide 'complete'
* dataset-level metadata along with basic file metadata for each file
* in the dataset.
* @param context - supplies optional parameters. Needs to support
* context.isDatasetMetadataOnly(). In a situation where we
* need to generate a format like DC that has no use for the
* file-level metadata, it makes sense to skip retrieving and
* formatting it, since there can be a very large number of
* files in a dataset.
*/
JsonObject getDatasetJson();
JsonObject getDatasetJson(ExportDataContext... context);

/**
*
@@ -32,24 +38,42 @@ public interface ExportDataProvider {
* @apiNote - This, and the JSON format, are the only two that provide complete
* dataset-level metadata along with basic file metadata for each file
* in the dataset.
* @param context - supplies optional parameters.
*/
JsonObject getDatasetORE();
JsonObject getDatasetORE(ExportDataContext... context);

/**
* Dataverse is capable of extracting DDI-centric metadata from tabular
* datafiles. This detailed metadata, which is only available for successfully
* "ingested" tabular files, is not included in the output of any other methods
* in this interface.
*
* @return - a JSONArray with one entry per ingested tabular dataset file.
* @apiNote - there is no JSON schema available for this output and the format
* is not well documented. Implementers may wish to explore the @see
* edu.harvard.iq.dataverse.export.DDIExporter and the @see
* edu.harvard.iq.dataverse.util.json.JSONPrinter classes where this
* output is used/generated (respectively).
* @param context - supplies optional parameters.
*/
JsonArray getDatasetFileDetails();
JsonArray getDatasetFileDetails(ExportDataContext... context);

/**
* Similar to the above, but
* a) retrieves the information for the ingested/tabular data files _only_,
* b) provides an option for retrieving this information in batches, and
* c) provides an option for skipping restricted/embargoed etc. files.
* Intended for datasets with massive numbers of tabular files and data variables.
* @param context - supplies optional parameters; current (2.1.0) known use cases:
* context.isPublicFilesOnly();
* context.getOffset();
* context.getLength();
* @return json array containing the datafile/filemetadata->datatable->datavariable metadata
* @throws ExportException
*/
JsonArray getTabularDataDetails(ExportDataContext... context) throws ExportException;

/**
*
* @return - the subset of metadata conforming to the schema.org standard as
@@ -58,8 +82,9 @@
* @apiNote - as this metadata export is not complete, it should only be used as
* a starting point for an Exporter if it simplifies your exporter
* relative to using the JSON or OAI_ORE exports.
* @param context - supplies optional parameters.
*/
JsonObject getDatasetSchemaDotOrg();
JsonObject getDatasetSchemaDotOrg(ExportDataContext... context);

/**
*
@@ -68,8 +93,9 @@
* @apiNote - as this metadata export is not complete, it should only be used as
* a starting point for an Exporter if it simplifies your exporter
* relative to using the JSON or OAI_ORE exports.
* @param context - supplies optional parameters.
*/
String getDataCiteXml();
String getDataCiteXml(ExportDataContext... context);

/**
* If an Exporter has specified a prerequisite format name via the
@@ -88,9 +114,10 @@
* malfunction, e.g. if you depend on format "ddi" and a third party
* Exporter is configured to replace the internal ddi Exporter in
* Dataverse.
* @param context - supplies optional parameters.
*/
default Optional<InputStream> getPrerequisiteInputStream() {
default Optional<InputStream> getPrerequisiteInputStream(ExportDataContext... context) {
return Optional.empty();
}

}
}
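
As a hedged sketch (not part of the diff), this is roughly how an exporter plugin might take advantage of the extended interface; the class below is hypothetical, the batch size is arbitrary, and it assumes the jakarta.json types used elsewhere in the SPI:

import io.gdcc.spi.export.ExportDataContext;
import io.gdcc.spi.export.ExportDataProvider;
import io.gdcc.spi.export.ExportException;
import jakarta.json.JsonArray;
import jakarta.json.JsonObject;

// Hypothetical exporter logic: dataset-level metadata only, plus tabular
// details for public files retrieved in batches of 500.
public class ExampleExporterLogic {

    public void export(ExportDataProvider dataProvider) throws ExportException {
        // Skip per-file metadata when the output format doesn't need it:
        JsonObject datasetJson = dataProvider.getDatasetJson(
                ExportDataContext.context().withDatasetMetadataOnly());

        // Retrieve tabular/variable metadata in batches, public files only:
        final int batchSize = 500;
        int offset = 0;
        JsonArray batch;
        do {
            batch = dataProvider.getTabularDataDetails(
                    ExportDataContext.context()
                            .withPublicFilesOnly()
                            .withOffset(offset)
                            .withLength(batchSize));
            // ... format datasetJson and this batch into the target output here ...
            offset += batchSize;
        } while (batch != null && batch.size() == batchSize);
    }
}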
@@ -85,7 +85,6 @@ default Optional<String> getPrerequisiteFormatName() {
return Optional.empty();
}


/**
* Harvestable Exporters will be available as options in Dataverse's Harvesting mechanism.
* @return true to make this exporter available as a harvesting option.