
Fixed the Problem (Step 2): Unable to Identifying Certain Data File T…#2048

Open
Meghanxuxx wants to merge 1 commit into artefactual:qa/1.x from Meghanxuxx:dev/issue-1745-Parse-Dataverse-METS-XML-Fails

Conversation

@Meghanxuxx

Related issue: archivematica/Issues#1745

This PR continues the work on issue #1745. The solution is split into two PRs because both the storage service files and the archivematica files need to be modified. The first part is here: artefactual/archivematica-storage-service#772

If the step 1 PR is accepted, the only remaining issue is that files extracted using extract_contents.py end up with different file paths than they had with the extract_and_remove_bundle function. The new paths include dates like:


So we need to fix the get_db_objects function in parse_dataverse_mets.py so that all files can be correctly matched against the pattern "[original_directory].zip-[timestamp]/[original_file_path]"

This PR modifies the get_db_objects function, which now searches for files in the database using three methods, tried in order:

  1. Look up the file using the complete METS path (existing method)
  2. Match against the timestamp pattern. If multiple matches are found, select the top-level path (newly added)
  3. Match by filename (existing method)
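
The lookup order above can be sketched roughly as follows. This is only an illustration, not the code in this PR: find_db_path and EXTRACTED_DIR are hypothetical names, the database is replaced by a plain list of path strings, the timestamp format is assumed to be any run of non-slash characters, and "top-level" is assumed to mean the match with the fewest path segments.

```python
import posixpath
import re

# Assumed shape of an extracted directory: "<name>.zip-<timestamp>/".
# The real timestamp format is not shown in this PR, so any run of
# non-slash characters after ".zip-" is accepted here.
EXTRACTED_DIR = re.compile(r"(^|/)[^/]+\.zip-[^/]+/")


def find_db_path(mets_path, db_paths):
    """Hypothetical helper: return the stored path matching mets_path, or None."""
    if not mets_path:
        return None

    # 1. Exact match on the complete METS path.
    for path in db_paths:
        if path == mets_path:
            return path

    # 2. Timestamp pattern: the stored path ends with the METS path and has
    #    a "<name>.zip-<timestamp>/" directory somewhere before it.
    suffix = "/" + mets_path
    candidates = [
        path
        for path in db_paths
        if path.endswith(suffix)
        and EXTRACTED_DIR.search(path[: -len(mets_path)])
    ]
    if candidates:
        # Prefer the shallowest path; break ties alphabetically.
        return min(candidates, key=lambda p: (p.count("/"), p))

    # 3. Fall back to matching on the filename alone.
    name = posixpath.basename(mets_path)
    for path in db_paths:
        if posixpath.basename(path) == name:
            return path

    return None


paths = [
    "objects/readme.txt",
    "objects/data.zip-2025-04-01/data/table.csv",
    "objects/data.zip-2025-04-01/nested/inner.zip-2025-04-01/data/table.csv",
    "objects/other/notes.md",
]
```

With this sample list, "data/table.csv" has two timestamp-pattern matches and the shallower one is chosen, while "somewhere/notes.md" is only found by the filename fallback. The sample paths and dates are made up for the example.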

@replaceafill
Member

@Meghanxuxx I think this has similar problems to artefactual/archivematica-storage-service#772 (comment). The Test mpc-client jobs are also failing because of the import changes in your branch.

@sarah-mason added the "Community" label (Pull requests that have been contributed from community members outside Artefactual) on Apr 28, 2025
@sarah-mason
Contributor

Hi @Meghanxuxx - just letting you know we're going to test 1.18 in a few weeks and will stop accepting contributions for it. If you have any questions about the failing tests, do let us know. Thank you!
