added logic to create file references and added logging by sherwoodf · Pull Request #4 · BioImage-Archive/BIA-ro-crate

sherwoodf · 2025-04-15T10:48:14Z

ticket: https://app.clickup.com/t/8698qrqq9

sherwoodf · 2025-04-15T10:59:10Z

bia_ro_crate/ro_crate_to_bia/entity_conversion/FileReference.py

+    file_path: str, study_uuid: str, dataset_uuid: str, crate_path: pathlib.Path
+) -> list[APIModels.FileReference]:
+
+    relative_path = pathlib.Path(file_path).relative_to(crate_path).as_posix()


Using as_posix() because I'm assuming our system will be POSIX, and we don't want this path (and therefore the UUID derived from it) to vary depending on whether a Windows system was used to ingest the objects.
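To illustrate the point above, here is a minimal sketch of why the separator matters. It uses the pure path classes so it behaves the same on any OS. The namespace UUID, the example_uuid helper, and the example paths are hypothetical and only stand in for however the project actually derives UUIDs.

```python
import uuid
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical namespace; the real project may derive UUIDs differently.
EXAMPLE_NAMESPACE = uuid.UUID("12345678-1234-5678-1234-567812345678")


def relative_posix(file_path, crate_path) -> str:
    """Return file_path relative to crate_path, always with forward slashes."""
    return file_path.relative_to(crate_path).as_posix()


def example_uuid(study_uuid: str, relative_path: str) -> uuid.UUID:
    """Illustrative deterministic UUID built from a study UUID and a path."""
    return uuid.uuid5(EXAMPLE_NAMESPACE, f"{study_uuid} {relative_path}")


# The same file in the same crate, as seen on Windows and on a POSIX system.
win_file = PureWindowsPath(r"C:\crates\S-BIAD1494\images\cell.tif")
win_crate = PureWindowsPath(r"C:\crates\S-BIAD1494")
posix_file = PurePosixPath("/crates/S-BIAD1494/images/cell.tif")
posix_crate = PurePosixPath("/crates/S-BIAD1494")

win_rel = relative_posix(win_file, win_crate)
posix_rel = relative_posix(posix_file, posix_crate)

# Without as_posix(), the Windows path keeps backslashes and would hash differently.
win_native = str(win_file.relative_to(win_crate))
```

With as_posix() both systems produce the same relative path string, so any identifier computed from it is stable; the native Windows string is not.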

bia_ro_crate/cli.py

bia_ro_crate/model/example/S-BIAD1494/ro-crate-version/ro-crate-metadata.json

kbab

Left some comments. Also, I'm not sure whether you are currently writing to an API. I think the API versions start from 0, not 1. If that is the case, the version set in the functions converting ROCrate models to API ones needs to change.

bia_ro_crate/ro_crate_to_bia/crate_reader.py

bia_ro_crate/ro_crate_to_bia/entity_conversion/AnnotationMethod.py

bia_ro_crate/ro_crate_to_bia/entity_conversion/Dataset.py

kbab · 2025-04-16T14:59:24Z

bia_ro_crate/ro_crate_to_bia/entity_conversion/FileReference.py

+        "file_path": str(relative_path),
+        "version": 1,
+        "size_in_bytes": pathlib.Path(file_path).stat().st_size,
+        "format": pathlib.Path(file_path).suffix,


This is interesting. I think in the original bia_shared_models this property (format) came from BioStudies, which distinguished between file and directory (and I think we also used it in the past to flag files in zip archives).

During BioStudies ingest we take this from the BioStudiesAPIFile object, and its value is usually 'file'. However, EMPIAR ingest populates it with the suffix of the file path (compare the value of format in the example from BioStudies ingest with the one from EMPIAR ingest).

If we decide to go this way, it would be useful to know whether to identify special suffixes (e.g. ome.zarr, nii.gz, ome.zarr.zip, zarr.zip, ome.tiff) and whether to standardise suffixes (e.g. TIF vs tif vs tiff vs TIFF). We have a function for this when creating images/image_representations.
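As a rough sketch of what identifying special suffixes and standardising them could look like: pathlib's suffix only returns the last component (".gz" for "scan.nii.gz"), so compound suffixes need an explicit check. The function name, the suffix list, and the alias table below are assumptions for illustration. This is not the existing function used for images/image_representations.

```python
from pathlib import PurePosixPath

# Assumed list, taken from the examples in the comment above; not an agreed set.
COMPOUND_SUFFIXES = (".ome.zarr.zip", ".zarr.zip", ".ome.zarr", ".ome.tiff", ".nii.gz")

# Assumed aliasing rule: treat .tif as .tiff. Whether to do this is still undecided.
ALIASES = {".tif": ".tiff"}


def normalised_format(file_path: str) -> str:
    """Return a lower-cased suffix, preferring known compound suffixes."""
    name = PurePosixPath(file_path).name.lower()
    # Check longer suffixes first so ".ome.zarr.zip" wins over ".zarr.zip".
    for suffix in sorted(COMPOUND_SUFFIXES, key=len, reverse=True):
        if name.endswith(suffix):
            return ALIASES.get(suffix, suffix)
    # Fall back to the last suffix only, as pathlib reports it.
    suffix = PurePosixPath(name).suffix
    return ALIASES.get(suffix, suffix)
```

Whether the format field should carry a suffix at all, rather than 'file' as in BioStudies ingest, is the open question in this thread; the sketch only covers the suffix option.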

bia_ro_crate/ro_crate_to_bia/entity_conversion/ImageAcquisitionProtocol.py

bia_ro_crate/ro_crate_to_bia/entity_conversion/Protocol.py

test/ro_crate_to_bia/output.json

kbab · 2025-04-16T15:23:18Z

test/test_ro_crate_to_bia.py

-    assert cli_out == expected_out
+    # Account for different ordering of JSON objects due to file reference order being somewhat arbitrary.
+    assert len(cli_out) == len(expected_out)
+    for json_obj in cli_out:


Could there be a scenario where two identical objects are created instead of two distinct ones, e.g. if we have FileReference1 twice instead of FileReference1 and FileReference2? In such a scenario the above will pass the test. However, if the for loop and the following assertion explicitly go through all expected objects, the test will fail.

Oh interesting edge case - will update as you suggest
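One way to make the comparison order-insensitive while still catching the duplicate case is to compare the two lists as multisets. The body of the loop in the diff is not shown here, so this is a sketch under the assumption that it checks each output object for membership in expected_out. The helper names are hypothetical, and this is not necessarily the fix that was merged.

```python
import json
from collections import Counter


def canonical(obj) -> str:
    """Serialise a JSON-like object with sorted keys so equal objects match."""
    return json.dumps(obj, sort_keys=True)


def assert_same_objects(actual: list, expected: list) -> None:
    """Assert both lists hold the same objects the same number of times, in any order."""
    actual_counts = Counter(canonical(o) for o in actual)
    expected_counts = Counter(canonical(o) for o in expected)
    # Counter subtraction keeps only positive counts, so each side shows what is off.
    missing = expected_counts - actual_counts
    unexpected = actual_counts - expected_counts
    assert not missing and not unexpected, (
        f"missing={dict(missing)} unexpected={dict(unexpected)}"
    )
```

Because counts are compared, an output containing FileReference1 twice fails against an expectation of FileReference1 and FileReference2, even though the lengths match and every output object appears in the expected list.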


kbab

LGTM

sherwoodf requested a review from kbab April 15, 2025 10:48

sherwoodf force-pushed the file_ref_maker branch from f16db24 to d3c5ad1 Compare April 15, 2025 10:52

added logic to create file references and added logging

b3f995e

sherwoodf force-pushed the file_ref_maker branch from d3c5ad1 to b3f995e Compare April 15, 2025 10:57

sherwoodf commented Apr 15, 2025


kbab reviewed Apr 16, 2025


bia_ro_crate/cli.py (resolved)

kbab reviewed Apr 16, 2025


bia_ro_crate/model/example/S-BIAD1494/ro-crate-version/ro-crate-metadata.json (outdated, resolved)

kbab reviewed Apr 16, 2025


updates to functions with unnecessary variables, added todos, and make testing more resistant to edge cases

96c6aef

sherwoodf requested a review from kbab April 17, 2025 13:57

kbab approved these changes Apr 17, 2025


sherwoodf merged commit 9f18380 into main Apr 17, 2025
4 of 6 checks passed

sherwoodf merged 2 commits into main from file_ref_maker


2 participants
