A fix for a recent bug in harvesting from some generic OAI archives #11576

Merged
ofahimIQSS merged 3 commits into develop from 11479-harvester-broken-generic-oai on Jun 23, 2025

@landreev
Contributor

What this PR does / why we need it:

This fixes an annoying harvesting bug that crept into 6.6. While we were focused on new harvesting features, such as harvesting from DataCite and other archives with unique and idiosyncratic features, something very basic got broken: harvesting from the most basic, generic non-Dataverse OAI archives (for example DSpace, as was reported in the users group in recent weeks).

The fix is extremely straightforward - 5 or so lines of code that needed to be dropped.

It is quite late in the release cycle, but because the fix is so trivial - and also quite useful to many users, apparently - I strongly suggest that we add it to 6.7.

Which issue(s) this PR closes:

Special notes for your reviewer:

Suggestions on how to test this:

The reproduction scenario submitted by the user in the issue above can be used to verify the fix (harvesting from a DSpace OAI archive):

Create a harvesting client with the following settings:

oai server: https://darchive.mblwhoilibrary.org/server/oai/request
oai set: col_1912_4841
metadata format: oai_dc
archive type: "Generic OAI archive"

Harvesting will fail on the develop branch.
With this build it should succeed and harvest 7 records. (It may be necessary to delete the harvesting client and re-create it from scratch before retesting.)
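
To sanity-check the archive independently of Dataverse, the settings above can be turned into a plain OAI-PMH request. The following is a minimal sketch, not part of this PR: the helper names `build_list_records_url` and `count_records` are made up for illustration, and only the standard OAI-PMH 2.0 `ListRecords` verb and namespace are assumed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_list_records_url(base_url, metadata_prefix, set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def count_records(response_xml):
    """Count <record> elements in ONE page of a ListRecords response.

    Larger sets are paginated with a resumptionToken, which this sketch
    does not follow.
    """
    root = ET.fromstring(response_xml)
    return len(root.findall("./oai:ListRecords/oai:record", OAI_NS))


# Values taken from the harvesting client settings in this PR.
url = build_list_records_url(
    "https://darchive.mblwhoilibrary.org/server/oai/request",
    "oai_dc",
    "col_1912_4841",
)
print(url)
```

Fetching the printed URL (in a browser, or with `urllib.request.urlopen(url).read()`) and passing the body to `count_records` gives a rough count to compare against the harvest result. The count may differ from what Dataverse imports if the archive reports deleted records or splits the set across several pages.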

Does this PR introduce a user interface change? If mockups are available, please link/include them here:

Is there a release notes update needed for this change?:

Additional documentation:

@landreev landreev moved this to Ready for Review ⏩ in IQSS Dataverse Project Jun 12, 2025
@landreev landreev added this to the 6.7 milestone Jun 12, 2025
@landreev landreev changed the title from "11479 harvester broken generic oai" to "A fix for a recent bug in harvesting from some generic OAI archives" Jun 12, 2025
@github-actions

📦 Pushed preview images as

ghcr.io/gdcc/dataverse:11479-harvester-broken-generic-oai
ghcr.io/gdcc/configbaker:11479-harvester-broken-generic-oai

🚢 See on GHCR. Use by referencing with full name as printed above, mind the registry name.

@cmbz cmbz added the FY25 Sprint 25 (2025-06-04 - 2025-06-18) label Jun 16, 2025
@pdurbin pdurbin added the Size: 3 (A percentage of a sprint. 2.1 hours.) label Jun 16, 2025
@stevenwinship stevenwinship self-assigned this Jun 16, 2025
@stevenwinship stevenwinship moved this from Ready for Review ⏩ to In Review 🔎 in IQSS Dataverse Project Jun 16, 2025
@github-project-automation github-project-automation bot moved this from In Review 🔎 to Ready for QA ⏩ in IQSS Dataverse Project Jun 16, 2025
@stevenwinship stevenwinship removed their assignment Jun 16, 2025
@ofahimIQSS ofahimIQSS self-assigned this Jun 16, 2025
@ofahimIQSS ofahimIQSS moved this from Ready for QA ⏩ to QA ✅ in IQSS Dataverse Project Jun 16, 2025
@ofahimIQSS
Contributor

Before:
image

After:
image

PR validated - no issues to report.

@landreev It looks like continuous integration is failing (ansible).

@cmbz cmbz added the FY25 Sprint 26 (2025-06-18 - 2025-07-02) label Jun 19, 2025
@landreev
Contributor Author

I never got around to looking into the continuous integration issue last week, but on it now.

@landreev landreev self-assigned this Jun 23, 2025
@landreev
Contributor Author

P.S. Actually, the last 3 failures are all 4-second "ansible job terminated abnormally" errors, i.e., they look like Jenkins flukes rather than actual test failures. Let me re-run and see what happens.

@landreev
Contributor Author

Once the tests actually ran, everything apparently passed.

@landreev landreev removed their assignment Jun 23, 2025
@ofahimIQSS
Contributor

Awesome! Merging.

@ofahimIQSS ofahimIQSS merged commit 9e63eea into develop Jun 23, 2025
23 checks passed
@ofahimIQSS ofahimIQSS deleted the 11479-harvester-broken-generic-oai branch June 23, 2025 18:59
@github-project-automation github-project-automation bot moved this from QA ✅ to Merged 🚀 in IQSS Dataverse Project Jun 23, 2025
@ofahimIQSS ofahimIQSS removed their assignment Jun 23, 2025
@scolapasta scolapasta moved this from Merged 🚀 to Done 🧹 in IQSS Dataverse Project Jun 24, 2025
@cmbz cmbz added the FY26 Sprint 1 (2025-07-02 - 2025-07-16) label Jul 2, 2025

Labels

Feature: Harvesting
FY25 Sprint 25 (2025-06-04 - 2025-06-18)
FY25 Sprint 26 (2025-06-18 - 2025-07-02)
FY26 Sprint 1 (2025-07-02 - 2025-07-16)
Size: 3 (A percentage of a sprint. 2.1 hours.)
Type: Bug (a defect)

Projects

Status: Done 🧹

Development

Successfully merging this pull request may close these issues.

Record import from DSpace via OAI fails

6 participants