3 changes: 3 additions & 0 deletions doc/release-notes/11747-review-dataset-type.md
@@ -0,0 +1,3 @@
### New Dataset Type: Review

A new, experimental dataset type called "review" has been added. When this type is published, it will be sent to DataCite as "Other" for resourceTypeGeneral. See #11747.
30 changes: 25 additions & 5 deletions doc/sphinx-guides/source/api/native-api.rst
@@ -1015,8 +1015,8 @@ You should expect an HTTP 200 ("OK") response and JSON indicating the database I

.. _api-create-dataset-with-type:

Create a Dataset with a Dataset Type (Software, etc.)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Create a Dataset with a Dataset Type (Software, Workflow, Review, etc.)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, datasets are given the type "dataset" but if your installation has added additional types (see :ref:`api-add-dataset-type`), you can specify the type.

@@ -1070,8 +1070,8 @@ Before calling the API, make sure the data files referenced by the ``POST``\ ed

.. _import-dataset-with-type:

Import a Dataset with a Dataset Type (Software, etc.)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Import a Dataset with a Dataset Type (Software, Workflow, Review, etc.)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

By default, datasets are given the type "dataset" but if your installation has added additional types (see :ref:`api-add-dataset-type`), you can specify the type.

@@ -4193,7 +4193,27 @@ The fully expanded example above (without environment variables) looks like this
Add Dataset Type
^^^^^^^^^^^^^^^^

Note: Before you add any types of your own, there should be a single type called "dataset". If you add "software" or "workflow", these types will be sent to DataCite (if you use DataCite). Otherwise, the only functionality you gain currently from adding types is an entry in the "Dataset Type" facet but be advised that if you add a type other than "software" or "workflow", you will need to add your new type to your Bundle.properties file for it to appear in Title Case rather than lower case in the "Dataset Type" facet.
Note: Before you add any types of your own, there should be a single type called "dataset".

Adding certain dataset types will result in a value other than "Dataset" being sent to DataCite (if you use DataCite) as shown in the table below.

.. list-table:: Values sent to DataCite for resourceTypeGeneral by Dataset Type
:header-rows: 1
:stub-columns: 1
:align: left

* - Dataset Type
- Value sent to DataCite
* - dataset
- Dataset
* - software
- Software
* - workflow
- Workflow
* - review
- Other

Other than sending a different resourceTypeGeneral to DataCite, the only functionality you currently gain from adding types is an entry in the "Dataset Type" facet. Be advised that if you add a type other than "software", "workflow", or "review", you will need to add your new type to your Bundle.properties file for it to appear in Title Case rather than lower case in the "Dataset Type" facet.
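The table above can be expressed as a small lookup. The sketch below is a hypothetical helper, not Dataverse's actual implementation, and it assumes that custom types with no known mapping fall back to "Dataset":

```java
import java.util.Map;

public class ResourceTypeMapping {
    // Hypothetical mapping of dataset type names to DataCite resourceTypeGeneral
    // values, mirroring the table above; Dataverse's real code may differ.
    private static final Map<String, String> TYPE_TO_RESOURCE_TYPE_GENERAL = Map.of(
            "dataset", "Dataset",
            "software", "Software",
            "workflow", "Workflow",
            "review", "Other");

    public static String resourceTypeGeneralFor(String datasetTypeName) {
        // Assumption: unknown custom types are reported to DataCite as "Dataset".
        return TYPE_TO_RESOURCE_TYPE_GENERAL.getOrDefault(datasetTypeName, "Dataset");
    }

    public static void main(String[] args) {
        System.out.println(resourceTypeGeneralFor("review")); // Other
    }
}
```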

With all that said, we'll add a "software" type in the example below. This API endpoint is superuser only. The "name" of a type cannot be only digits. Note that this endpoint also allows you to add metadata blocks and available licenses for your new dataset type by adding "linkedMetadataBlocks" and/or "availableLicenses" arrays to your JSON.
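As a sketch, a request body for adding a hypothetical "review" type might look like the following. The linked block and license names are illustrative only; note that this PR also makes "displayName" required:

```json
{
  "name": "review",
  "displayName": "Review",
  "description": "A review of some other item.",
  "linkedMetadataBlocks": ["review"],
  "availableLicenses": ["CC0 1.0"]
}
```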

2 changes: 2 additions & 0 deletions doc/sphinx-guides/source/user/appendix.rst
@@ -43,6 +43,8 @@ Unlike supported metadata, experimental metadata is not enabled by default in a
- Computational Workflow Metadata (`see .tsv <https://github.com/IQSS/dataverse/blob/master/scripts/api/data/metadatablocks/computational_workflow.tsv>`__): adapted from `Bioschemas Computational Workflow Profile, version 1.0 <https://bioschemas.org/profiles/ComputationalWorkflow/1.0-RELEASE>`__ and `Codemeta <https://codemeta.github.io/terms/>`__.
- Archival Metadata (`see .tsv <https://github.com/IQSS/dataverse/blob/master/scripts/api/data/metadatablocks/archival.tsv>`__): Enables repositories to register metadata relating to the potential archiving of the dataset at a depositor archive, whether that be your own institutional archive or an external archive, i.e. a historical archive.
- Local Contexts Metadata (`see .tsv <https://github.com/gdcc/dataverse-external-vocab-support/blob/main/packages/local_contexts/cvocLocalContexts.tsv>`__): Supports integration with the `Local Contexts <https://localcontexts.org/>`__ platform, enabling the use of Traditional Knowledge and Biocultural Labels, and Notices. For more information on setup and configuration, see :doc:`../installation/localcontexts`.
- Trusted Data Dimensions and Intensities (`see .tsv <https://github.com/IQSS/dataverse/blob/master/scripts/api/data/metadatablocks/trusteddatadimensionsintensities.tsv>`__): Enables repositories to indicate dimensions of trust.
- Repository Characteristics (`see .tsv <https://github.com/IQSS/dataverse/blob/master/scripts/api/data/metadatablocks/repositorycharacteristics.tsv>`__): Details related to the security, sustainability, and certifications of the repository.

Please note: these custom metadata schemas are not included in the Solr schema for indexing by default; you will need
to add them as necessary for your custom metadata blocks. See "Update the Solr Schema" in :doc:`../admin/metadatacustomization`.
4 changes: 2 additions & 2 deletions doc/sphinx-guides/source/user/dataset-management.rst
@@ -847,11 +847,11 @@ Dataset Types

.. note:: Development of the dataset types feature is ongoing. Please see https://github.com/IQSS/dataverse-pm/issues/307 for details.

Out of the box, all datasets have a dataset type of "dataset". Superusers can add additional types such as "software" or "workflow" using the :ref:`api-add-dataset-type` API endpoint.
Out of the box, all datasets have a dataset type of "dataset". Superusers can add additional types such as "software", "workflow", or "review" using the :ref:`api-add-dataset-type` API endpoint.

Once more than one type appears in search results, a facet called "Dataset Type" will appear allowing you to filter down to a certain type.

If your installation is configured to use DataCite as a persistent ID (PID) provider, the appropriate type ("Dataset", "Software", "Workflow") will be sent to DataCite when the dataset is published for those three types.
If your installation is configured to use DataCite as a persistent ID (PID) provider, the appropriate type ("Dataset", "Software", "Workflow", "Review") will be sent to DataCite when the dataset is published for those types.
Contributor:

Shouldn't this say ("Dataset", "Software", "Workflow", "Other")? "review" is sent to DataCite as "Other".

Member Author:

Hmm, good point. Maybe I'll change this to "certain types" and link to the table I made elsewhere, or something. Thanks! 😅


Currently, specifying a type for a dataset can only be done via API and only when the dataset is created. The type can't currently be changed afterward. For details, see the following sections of the API guide:

40 changes: 40 additions & 0 deletions scripts/api/data/metadatablocks/review.tsv
@@ -0,0 +1,40 @@
#metadataBlock name dataverseAlias displayName
review Review Metadata
#datasetField name title description watermark fieldType displayOrder displayFormat advancedSearchField allowControlledVocabulary allowmultiples facetable displayoncreate required parent metadatablock_id termURI
itemReviewed Item Reviewed The item being reviewed none 1 FALSE FALSE FALSE FALSE FALSE FALSE review
itemReviewedUrl URL The URL of the item being reviewed url 2 FALSE FALSE FALSE FALSE FALSE FALSE itemReviewed review
itemReviewedType Type The type of the item being reviewed text 3 FALSE TRUE FALSE FALSE FALSE FALSE itemReviewed review
itemReviewedCitation Citation The full bibliographic citation of the item being reviewed textbox 4 FALSE FALSE FALSE FALSE FALSE FALSE itemReviewed review
#controlledVocabulary DatasetField Value identifier displayOrder
itemReviewedType Audiovisual 0
itemReviewedType Award 1
itemReviewedType Book 2
itemReviewedType Book Chapter 3
itemReviewedType Collection 4
itemReviewedType Computational Notebook 5
itemReviewedType Conference Paper 6
itemReviewedType Conference Proceeding 7
itemReviewedType DataPaper 8
itemReviewedType Dataset 9
itemReviewedType Dissertation 10
itemReviewedType Event 11
itemReviewedType Image 12
itemReviewedType Interactive Resource 13
itemReviewedType Instrument 14
itemReviewedType Journal 15
itemReviewedType Journal Article 16
itemReviewedType Model 17
itemReviewedType Output Management Plan 18
itemReviewedType Peer Review 19
itemReviewedType Physical Object 20
itemReviewedType Preprint 21
itemReviewedType Project 22
itemReviewedType Report 23
itemReviewedType Service 24
itemReviewedType Software 25
itemReviewedType Sound 26
itemReviewedType Standard 27
itemReviewedType Study Registration 28
itemReviewedType Text 29
itemReviewedType Workflow 30
itemReviewedType Other 31
3 changes: 3 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/DataCitation.java
@@ -740,8 +740,11 @@ public Map<String, String> getDataCiteMetadata() {

public JsonObject getCSLJsonFormat() {
CSLItemDataBuilder itemBuilder = new CSLItemDataBuilder();
// TODO consider making this a switch
if (type.equals(DatasetType.DATASET_TYPE_SOFTWARE)) {
itemBuilder.type(CSLType.SOFTWARE);
} else if (type.equals(DatasetType.DATASET_TYPE_REVIEW)) {
itemBuilder.type(CSLType.REVIEW);
Comment on lines +746 to +747
Member Author (@pdurbin, Aug 21, 2025):
@qqmyers I made this change without testing anything because I wasn't sure how to. Can you please advise? I'd like to add something under "how to test" about it.

Member:

I guess you can unit test w.r.t. making sure the CSLJson output has the review type if your datasettype is review. Beyond that, the CSL Json is used in front side JavaScript to generate any of thousands of citation formats, but any given format may or may not use CSLType in generating its output. I can't think of any easy way to test that (but that would really be testing the CSL Java and JavaScript libraries - if our code gets the CSLType into the CSL Json output here, things should work).

Member Author:

Ok, thanks. Do you know of any format that uses CSLType? Maybe we can test that type by manually inspecting its output?

Member:

I don't know. AI says APA for the social sciences, Chicago for history and the arts, and MLA for the humanities are popular for reviews - hopefully that means they check the type.

Member Author:

Ok, I'm still confused. Still not sure how to test this from JSF. I did look at APA and it says [dataset] (not [review]) like this:

(Screenshot, 2025-08-21: APA citation output showing [dataset].)

But I'm not sure if I'm looking in the right place.

If you think I should simply back out of this change, I'm ok with that. Less to mention in the release note. 😄

Member:

You can get the CSL format itself via api, e.g. curl https://demo.dataverse.org/api/datasets/:persistentId/versions/1.0/citation/CSL?persistentId=doi:10.70122/FK2/GJF7SF - note the native api guide appears to have an error and lists the format as "CSLJson" rather than "CSL".

Member Author:

Thanks, at https://dev1.dataverse.org/api/datasets/:persistentId/versions/1.0/citation/CSL?persistentId=doi%3A10.5072/FK2/RMEZNL I'm getting this:

{
    "id": "-GEN-2mqfdnmq0r",
    "type": "dataset",
    "categories": [
    ],
    "author": [
        {
            "family": "Simpson",
            "given": "Homer",
            "isInstitution": false
        }
    ],
    "issued": {
        "date-parts": [
            [
                2025
            ]
        ]
    },
    "DOI": "10.5072/FK2/RMEZNL",
    "publisher": "Root",
    "title": "Review of Darwin's Finches",
    "URL": "http://ec2-44-214-43-137.compute-1.amazonaws.com/citation?persistentId=doi:10.5072/FK2/RMEZNL",
    "version": "V1"
}

(I need to fix my siteUrl, obviously.) 😅

I guess you're saying ideally "type" would be "review" or at least not "dataset" in that output. It makes me wonder what my change did, if anything. 🤔

Also, thanks for the heads up about the typo in the guides.

Member Author:

I meant to say you've given me an API endpoint to dig into. Thanks!

} else {
itemBuilder.type(CSLType.DATASET);
}
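Per the existing TODO, the branching above could become a switch. The sketch below uses plain strings in place of the DatasetType constants and the citeproc-java CSLType enum so it runs standalone; the real code calls itemBuilder.type(...) on a CSLItemDataBuilder:

```java
public class CslTypeSelector {
    // Stand-ins for DatasetType.DATASET_TYPE_SOFTWARE / DATASET_TYPE_REVIEW and
    // CSLType.SOFTWARE / REVIEW / DATASET, assumed here to be the lowercase names.
    static String cslTypeFor(String datasetType) {
        switch (datasetType) {
            case "software":
                return "software";
            case "review":
                return "review";
            default:
                // "dataset" and any custom types fall back to the CSL "dataset" type.
                return "dataset";
        }
    }

    public static void main(String[] args) {
        System.out.println(cslTypeFor("review")); // review
    }
}
```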
28 changes: 21 additions & 7 deletions src/main/java/edu/harvard/iq/dataverse/api/Datasets.java
@@ -35,6 +35,7 @@
import edu.harvard.iq.dataverse.externaltools.ExternalToolHandler;
import edu.harvard.iq.dataverse.globus.GlobusServiceBean;
import edu.harvard.iq.dataverse.globus.GlobusUtil;
import edu.harvard.iq.dataverse.i18n.i18nUtil;
import edu.harvard.iq.dataverse.ingest.IngestServiceBean;
import edu.harvard.iq.dataverse.ingest.IngestUtil;
import edu.harvard.iq.dataverse.makedatacount.*;
@@ -100,13 +101,12 @@
import java.util.stream.Collectors;
import static edu.harvard.iq.dataverse.api.ApiConstants.*;

import edu.harvard.iq.dataverse.dataset.DatasetType;
import edu.harvard.iq.dataverse.dataset.DatasetTypeServiceBean;
import edu.harvard.iq.dataverse.license.License;

import static edu.harvard.iq.dataverse.util.json.JsonPrinter.*;
import static edu.harvard.iq.dataverse.util.json.NullSafeJsonBuilder.jsonObjectBuilder;

import static jakarta.ws.rs.core.HttpHeaders.ACCEPT_LANGUAGE;
import static jakarta.ws.rs.core.Response.Status.BAD_REQUEST;
import static jakarta.ws.rs.core.Response.Status.NOT_FOUND;
import static jakarta.ws.rs.core.Response.Status.FORBIDDEN;
@@ -5720,17 +5720,19 @@ public Response resetPidGenerator(@Context ContainerRequestContext crc, @PathPar

@GET
@Path("datasetTypes")
public Response getDatasetTypes() {
public Response getDatasetTypes(@HeaderParam(ACCEPT_LANGUAGE) String acceptLanguage) {
Locale locale = i18nUtil.parseAcceptLanguageHeader(acceptLanguage);
JsonArrayBuilder jab = Json.createArrayBuilder();
for (DatasetType datasetType : datasetTypeSvc.listAll()) {
jab.add(datasetType.toJson());
jab.add(datasetType.toJson(locale));
}
return ok(jab);
}

@GET
@Path("datasetTypes/{idOrName}")
public Response getDatasetTypes(@PathParam("idOrName") String idOrName) {
public Response getDatasetTypes(@PathParam("idOrName") String idOrName, @HeaderParam(ACCEPT_LANGUAGE) String acceptLanguage) {
Locale locale = i18nUtil.parseAcceptLanguageHeader(acceptLanguage);
DatasetType datasetType = null;
if (StringUtils.isNumeric(idOrName)) {
try {
@@ -5743,7 +5745,7 @@ public Response getDatasetTypes(@PathParam("idOrName") String idOrName) {
datasetType = datasetTypeSvc.getByName(idOrName);
}
if (datasetType != null) {
return ok(datasetType.toJson());
return ok(datasetType.toJson(locale));
} else {
return error(NOT_FOUND, "Could not find a dataset type with name " + idOrName);
}
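The `parseAcceptLanguageHeader` helper itself isn't shown in this diff. A plausible stdlib-only sketch, which is an assumption rather than the actual i18nUtil code, would use `Locale.LanguageRange`:

```java
import java.util.List;
import java.util.Locale;

public class AcceptLanguage {
    // Parse an Accept-Language header like "fr-CA,fr;q=0.9,en;q=0.5" and pick
    // the best match among the locales the installation supports.
    static Locale parse(String header, List<Locale> supported) {
        if (header == null || header.isBlank()) {
            return null; // caller would fall back to the default locale
        }
        try {
            List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(header);
            return Locale.lookup(ranges, supported);
        } catch (IllegalArgumentException e) {
            return null; // malformed header: behave as if none was sent
        }
    }

    public static void main(String[] args) {
        List<Locale> supported = List.of(Locale.ENGLISH, Locale.FRENCH);
        System.out.println(parse("fr-CA,fr;q=0.9,en;q=0.5", supported)); // fr
    }
}
```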
@@ -5768,6 +5770,8 @@ public Response addDatasetType(@Context ContainerRequestContext crc, String json
}

String nameIn = null;
String displayNameIn = null;
String descriptionIn = null;

JsonArrayBuilder datasetTypesAfter = Json.createArrayBuilder();
List<MetadataBlock> metadataBlocksToSave = new ArrayList<>();
@@ -5776,6 +5780,8 @@
try {
JsonObject datasetTypeObj = JsonUtil.getJsonObject(jsonIn);
nameIn = datasetTypeObj.getString("name");
displayNameIn = datasetTypeObj.getString("displayName", null);
descriptionIn = datasetTypeObj.getString("description", null);

JsonArray arr = datasetTypeObj.getJsonArray("linkedMetadataBlocks");
if (arr != null && !arr.isEmpty()) {
@@ -5813,6 +5819,9 @@
if (nameIn == null) {
return error(BAD_REQUEST, "A name for the dataset type is required");
}
if (displayNameIn == null) {
return error(BAD_REQUEST, "A displayName for the dataset type is required");
}
if (StringUtils.isNumeric(nameIn)) {
// getDatasetTypes supports id or name so we don't want a name that looks like an id
return error(BAD_REQUEST, "The name of the type cannot be only digits.");
@@ -5821,12 +5830,17 @@
try {
DatasetType datasetType = new DatasetType();
datasetType.setName(nameIn);
datasetType.setDisplayName(displayNameIn);
datasetType.setDescription(descriptionIn);
datasetType.setMetadataBlocks(metadataBlocksToSave);
datasetType.setLicenses(licensesToSave);
DatasetType saved = datasetTypeSvc.save(datasetType);
Long typeId = saved.getId();
String name = saved.getName();
return ok(saved.toJson());
// Locale is null because when creating the dataset type we are relying entirely
// on the database. The new dataset type has not yet been localized in a
// properties file.
return ok(saved.toJson(null));
} catch (WrappedResponse ex) {
return error(BAD_REQUEST, ex.getMessage());
}
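The validation steps above can be sketched as a single helper that returns the first error message, or null when the input is acceptable. This is an illustration only, and slightly stricter than the endpoint (it also rejects blank strings, where the real code uses StringUtils.isNumeric and null checks):

```java
public class DatasetTypeValidation {
    // Mirrors the checks in addDatasetType: a name is required, a displayName
    // is required, and the name cannot be only digits (because the lookup
    // endpoint accepts either an id or a name).
    static String validate(String name, String displayName) {
        if (name == null || name.isBlank()) {
            return "A name for the dataset type is required";
        }
        if (displayName == null || displayName.isBlank()) {
            return "A displayName for the dataset type is required";
        }
        if (name.chars().allMatch(Character::isDigit)) {
            return "The name of the type cannot be only digits.";
        }
        return null; // valid
    }

    public static void main(String[] args) {
        System.out.println(validate("review", "Review")); // null
    }
}
```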