NullPointerException in DOI creation causes silent registration failures during dataset creation #11941

@jamessi1989

What steps does it take to reproduce the issue?

  • When does this issue occur?

This issue occurs intermittently during dataset creation when author affiliations are selected from the ROR (Research Organization Registry) autocomplete dropdown. The failure is timing-dependent, occurring approximately 10-50% of the time depending on system load.

  • Which page(s) does it occur on?

Dataset creation page (dataset.xhtml?ownerId=X&editMode=CREATE)

  • What happens?
  1. User creates a new dataset and fills in metadata
  2. User adds an author and types in the "Author Affiliation" field
  3. User selects an organization from the ROR autocomplete dropdown (e.g., "UPC - Universidad Politécnica de Cataluña")
  4. User clicks "Save"
  5. Dataset saves successfully and DOI identifier is assigned (e.g., doi:10.82201/XXXXXX)
  6. However, the DOI is NOT registered with DataCite
  7. Database shows: identifierregistered=false and globalidcreatetime=NULL
  8. User sees "DOI not reserved" banner on the dataset page
  9. Logs show: java.lang.NullPointerException in AbstractPidProvider.getTargetUrl()

This is a silent data integrity failure - the dataset appears to save successfully, but the DOI is not actually registered with the PID provider.

  • To whom does it occur (all users, curators, superusers)?

All users who create datasets. The issue is not permission-based or role-based.

  • What did you expect to happen?

Expected behavior:

  • Dataset saves successfully
  • DOI is registered with DataCite
  • Database shows identifierregistered=true and globalidcreatetime has a timestamp
  • No "DOI not reserved" banner appears
  • DOI resolves correctly

Which version of Dataverse are you using?

Confirmed on both Dataverse 6.5 AND 6.8

Production environment (6.5):

  • Dataverse: 6.5
  • Payara: 6.2023.8
  • PostgreSQL: 16.6
  • OS: Rocky Linux 9

Test environment (6.8):

  • Dataverse: 6.8
  • Payara: 6.2023.8+
  • Same bug occurs

Configuration:

  • DOI Provider: DataCite (REST API)
  • External Vocab Support: Enabled (ROR + ORCID)
  • JVM properties: dataverse.pid.default-provider=datacite-prod

Any related open or closed issues to this bug report?

None found.

This appears to be an unreported race condition in the DOI creation transaction flow.

Screenshots:

Log excerpt showing the NullPointerException pattern:

[2025-10-30T08:23:36.855] FINE: DataCiteDOIProvider.createIdentifier
[2025-10-30T08:23:36.856] FINE: AbstractPidProvider.getMetadataForCreateIndicator
[2025-10-30T08:23:36.857] FINE: AbstractPidProvider.getTargetUrl
[2025-10-30T08:23:36.860] WARNING: DataCiteDOIProvider
  Identifier not created: create failed
java.lang.NullPointerException
[2025-10-30T08:23:36.860] INFO: AbstractDatasetCommand
  Call to globalIdServiceBean.createIdentifier failed: java.lang.NullPointerException
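
To find other affected saves in a larger log, the failure signature shown in the excerpt above can be matched mechanically. The following is a minimal sketch, assuming the log lines look like the simplified excerpt (a bracketed timestamp on the header line, the message on the following line). The class name PidFailureScan and the method failureTimestamps are hypothetical, and a real Payara server.log may need a different timestamp pattern.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PidFailureScan {

    // Assumption: each log record starts with "[timestamp]" as in the excerpt.
    private static final Pattern TIMESTAMP = Pattern.compile("^\\[([^\\]]+)\\]");

    // Message text copied from the excerpt in this report.
    private static final String SIGNATURE =
            "Call to globalIdServiceBean.createIdentifier failed";

    /** Returns the timestamp of the record that carries each failure message. */
    static List<String> failureTimestamps(List<String> lines) {
        List<String> hits = new ArrayList<>();
        String lastTimestamp = null;
        for (String line : lines) {
            Matcher m = TIMESTAMP.matcher(line);
            if (m.find()) {
                lastTimestamp = m.group(1); // remember the most recent record header
            }
            if (line.contains(SIGNATURE)) {
                hits.add(lastTimestamp == null ? "unknown" : lastTimestamp);
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java PidFailureScan <path-to-log>");
            return;
        }
        for (String ts : failureTimestamps(Files.readAllLines(Paths.get(args[0])))) {
            System.out.println(ts);
        }
    }
}
```

The timestamps it prints can then be compared against the create times of datasets whose identifierregistered value is false.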

Database state after failure:

SELECT identifier, identifierregistered, globalidcreatetime
FROM dvobject WHERE identifier = 'EWDNT1';

identifier | identifierregistered | globalidcreatetime
-----------+----------------------+-------------------
EWDNT1     | false                | NULL

Root Cause:

The getTargetUrl() method in AbstractPidProvider attempts to call dvObjectIn.getGlobalId().asString(), but getGlobalId() returns null because the GlobalId hasn't been assigned yet during the DOI creation process. This is a race condition in the transaction flow.

Code location: src/main/java/edu/harvard/iq/dataverse/pidproviders/AbstractPidProvider.java

Failing line (approximate):

public String getTargetUrl(DvObject dvObjectIn) {
    return SystemConfig.getDataverseSiteUrlStatic()
         + dvObjectIn.getTargetUrl()
         + dvObjectIn.getGlobalId().asString();  // ← NPE: getGlobalId() returns null
}
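
One possible mitigation, sketched here under stated assumptions, is to check for a missing GlobalId explicitly and fail with a descriptive exception, so the cause is visible in the logs instead of a bare NullPointerException. DvObjectStub, GlobalIdStub, the siteUrl parameter and the example URL path are hypothetical stand-ins for illustration, not the actual Dataverse classes. This guard only makes the failure explicit; it does not explain why the GlobalId is still unassigned at this point.

```java
import java.util.Objects;

public class TargetUrlSketch {

    /** Hypothetical stand-in for Dataverse's GlobalId. */
    static final class GlobalIdStub {
        private final String value;

        GlobalIdStub(String value) {
            this.value = Objects.requireNonNull(value, "value");
        }

        String asString() {
            return value;
        }
    }

    /** Hypothetical stand-in for DvObject; globalId may be null before assignment. */
    static final class DvObjectStub {
        private final String targetUrl;
        private final GlobalIdStub globalId;

        DvObjectStub(String targetUrl, GlobalIdStub globalId) {
            this.targetUrl = targetUrl;
            this.globalId = globalId;
        }

        String getTargetUrl() {
            return targetUrl;
        }

        GlobalIdStub getGlobalId() {
            return globalId;
        }
    }

    /**
     * Builds the PID target URL, rejecting objects whose GlobalId has not
     * been assigned yet instead of dereferencing null.
     */
    static String getTargetUrl(String siteUrl, DvObjectStub dvObject) {
        if (dvObject == null) {
            throw new IllegalArgumentException("dvObject must not be null");
        }
        GlobalIdStub globalId = dvObject.getGlobalId();
        if (globalId == null) {
            throw new IllegalStateException(
                    "GlobalId not yet assigned; cannot build PID target URL");
        }
        return siteUrl + dvObject.getTargetUrl() + globalId.asString();
    }

    public static void main(String[] args) {
        DvObjectStub assigned = new DvObjectStub(
                "/dataset.xhtml?persistentId=", new GlobalIdStub("doi:10.5072/FK2/EXAMPLE"));
        System.out.println(getTargetUrl("https://dataverse.example.org", assigned));

        DvObjectStub unassigned = new DvObjectStub("/dataset.xhtml?persistentId=", null);
        try {
            getTargetUrl("https://dataverse.example.org", unassigned);
        } catch (IllegalStateException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

With a check like this, the caller could log which object lacked a GlobalId or retry once the identifier has been assigned. Whether retrying is appropriate depends on the actual cause of the timing issue.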

Reproduction Steps:

  1. Enable debug logging:

    asadmin set-log-levels edu.harvard.iq.dataverse.pidproviders=FINE
    asadmin set-log-levels edu.harvard.iq.dataverse.pidproviders.doi.datacite=FINEST
  2. Create new dataset with:

    • Title: "Test Dataset"
    • Description: "Testing DOI bug"
    • Author Name: "Test Author"
    • Author Affiliation: Select an organization from ROR dropdown (don't just type text)
      • Type "Universidad" or "UPC" or "Catalan"
      • Select from autocomplete dropdown
  3. Click Save

  4. Check logs for NullPointerException

  5. Check database:

    SELECT identifier, identifierregistered, globalidcreatetime
    FROM dvobject
    WHERE protocol = 'doi'
    ORDER BY createdate DESC
    LIMIT 1;

If identifierregistered = false and globalidcreatetime IS NULL, the bug occurred.

Workaround (100% Success Rate):

Users can avoid this issue by saving in two steps:

  1. Create dataset with minimal metadata (title + description only, no author)
  2. Save → DOI registers successfully ✅
  3. Edit dataset to add author with ROR affiliation
  4. Save changes ✅

This works because the DOI is already registered during the initial save, and the edit operation doesn't trigger DOI recreation.

Impact:

In our production environment:

  • 10 datasets affected since October 27, 2025
  • Failure rate: 10-50% when ROR affiliations are selected
  • 6 different institutions affected (not institution-specific)
  • All required manual remediation

Severity: High - Silent data integrity failure affecting persistent identifiers

Are you thinking about creating a pull request for this issue?

Not currently feasible

nullpointer-example-log.txt

We have extensive debug logs and can provide additional information as needed. We're committed to helping resolve this issue.
