Support mixed-media manifests for items with both images and videos#158
Support mixed-media manifests for items with both images and videos#158
Conversation
Co-authored-by: mekarpeles <978325+mekarpeles@users.noreply.github.com>
Co-authored-by: mekarpeles <978325+mekarpeles@users.noreply.github.com>
Co-authored-by: mekarpeles <978325+mekarpeles@users.noreply.github.com>
There was a problem hiding this comment.
Pull request overview
This pull request adds support for mixed-media IIIF manifests that can handle items containing both images and videos, addressing a limitation where the manifest generator assumed homogeneous media types per item.
Changes:
- Introduced
check_mixed_media()function to detect items with both original images and videos, excluding thumbnails - Added
create_image_canvas()andcreate_video_canvas()helper functions to generate appropriate IIIF v3 canvases for each media type - Modified
create_manifest3()to check for mixed-media items before routing to mediatype-specific logic, processing all originals when both types are detected
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
iiify/resolver.py |
Added IMAGE_FORMATS constant, mixed-media detection function, canvas creation helpers, and integrated mixed-media handling into manifest generation |
tests/test_mixed_media.py |
Added unit tests for mixed-media detection logic and integration test for manifest structure |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| slugged_id = normalised_id.replace(" ", "-") | ||
|
|
||
| # Create canvas ID | ||
| c_id = f"{URI_PRIFIX}/{identifier}/{slugged_id}/canvas" |
There was a problem hiding this comment.
Canvas IDs for images in mixed-media items don't include a sequential number (line 783 generates {identifier}/{slugged_id}/canvas), but the existing multi-file image handling includes pageCount in the canvas ID (line 1079: {identifier}${pageCount}/canvas). This inconsistency could cause issues if canvas IDs are expected to follow a specific pattern. Consider using a consistent ID pattern, either including the canvas number or ensuring the slugged filename is sufficient for uniqueness.
| @@ -0,0 +1,119 @@ | |||
| import os | |||
There was a problem hiding this comment.
The os module is imported but never used in this test file. Consider removing this unused import.
| import os |
| try: | ||
| canvas = create_image_canvas(identifier, file, metadata, domain, canvas_number) | ||
| manifest.add_item(canvas) | ||
| canvas_number += 1 | ||
| except Exception as e: | ||
| print(f'Failed to create image canvas for {file_name}: {e}') |
There was a problem hiding this comment.
The exception handling catches all exceptions broadly (except Exception as e:), which could mask unexpected errors like KeyError or AttributeError from bugs in the code. The existing image handling at lines 1084-1089 catches specific exceptions (requests.exceptions.HTTPError) and creates a fallback canvas with error information. Consider either: (1) catching specific expected exceptions, or (2) creating a fallback canvas similar to the existing pattern, or (3) logging the error more robustly rather than just printing.
| try: | ||
| canvas = create_video_canvas(identifier, file, metadata, domain, derivatives, vttfiles) | ||
| manifest.add_item(canvas) | ||
| canvas_number += 1 | ||
| except Exception as e: | ||
| print(f'Failed to create video canvas for {file_name}: {e}') |
There was a problem hiding this comment.
The exception handling catches all exceptions broadly (except Exception as e:), which could mask unexpected errors. Consider catching specific expected exceptions or logging errors more robustly. The existing image handling at lines 1084-1089 catches specific exceptions and creates fallback canvases.
|
|
||
| return (has_images, has_videos) | ||
|
|
||
| def create_image_canvas(identifier, file, metadata, domain, canvas_number): |
There was a problem hiding this comment.
The canvas_number parameter is defined but never used in the function body. This parameter appears to be unnecessary since the canvas ID is generated from the file name, not from a sequential number.
| def create_image_canvas(identifier, file, metadata, domain, canvas_number): | |
| def create_image_canvas(identifier, file, metadata, domain, _canvas_number): |
| # Handle both direct body and Choice bodies | ||
| if isinstance(body, dict): | ||
| canvas_types.add(body.get('type', 'Unknown')) | ||
| elif hasattr(body, 'items'): | ||
| # Choice body | ||
| for item in body.items: | ||
| if hasattr(item, 'type'): | ||
| canvas_types.add(item.type) |
There was a problem hiding this comment.
The test is using hasattr(body, 'items') on line 107, but body is a dict (as checked on line 105). The hasattr check will always return True because dicts have an 'items' method. This should check if 'items' is a key in the dict using 'items' in body or check if body is a Choice object. Similarly, line 110 checks hasattr(item, 'type') but item could be a dict from body.items (the dict method), not from the Choice items list.
| # Handle both direct body and Choice bodies | |
| if isinstance(body, dict): | |
| canvas_types.add(body.get('type', 'Unknown')) | |
| elif hasattr(body, 'items'): | |
| # Choice body | |
| for item in body.items: | |
| if hasattr(item, 'type'): | |
| canvas_types.add(item.type) | |
| # Handle both direct body and Choice bodies (IIIF v3 JSON is dict-based) | |
| if isinstance(body, dict): | |
| body_type = body.get('type') | |
| if body_type == 'Choice' and 'items' in body: | |
| # Choice body: iterate through the list of items | |
| for item in body['items']: | |
| if isinstance(item, dict) and 'type' in item: | |
| canvas_types.add(item['type']) | |
| else: | |
| # Direct body: use its type | |
| canvas_types.add(body.get('type', 'Unknown')) |
| if (file_format in IMAGE_FORMATS and | ||
| 'thumb' not in file_name.lower() and | ||
| file_format not in ['JPEG Thumb', 'Thumbnail']): |
There was a problem hiding this comment.
The thumbnail exclusion logic is duplicated between check_mixed_media (lines 762-764) and the mixed-media processing loop (lines 957-959). This creates a maintenance risk if the criteria for identifying thumbnails changes. Consider extracting this into a helper function like is_thumbnail_image(file_format, file_name) to ensure consistency.
IIIF manifest generation assumed homogeneous media types per item. Items like
2025-highland-house-walkthrough-macontaining both images and videos only rendered one type in the manifest.Changes
Added mixed-media detection (
check_mixed_media)(has_images, has_videos)tupleAdded canvas creation helpers
create_image_canvas()- IIIF Image API v3 canvases with Cantaloupe service integrationcreate_video_canvas()- Video canvases with derivative choices and VTT subtitle supportModified
create_manifest3()Added
IMAGE_FORMATSconstantExample
Before: Video-only manifest despite 30+ images present
{ "items": [ {"type": "Canvas", "duration": 123.45, ...} ] }After: Complete manifest with all media
{ "items": [ {"type": "Canvas", "height": 1800, "width": 1200, ...}, // Image {"type": "Canvas", "height": 1800, "width": 1200, ...}, // Image {"type": "Canvas", "duration": 123.45, ...} // Video ] }Backward compatible - only affects items with both image and video originals.
Warning
Firewall rules blocked me from connecting to one or more addresses (expand for details)
I tried to connect to the following addresses, but was blocked by firewall rules:
archive.org/usr/bin/curl curl -s REDACTED(dns block)/usr/bin/python3 python3 -m unittest tests.test_mixed_media -v(dns block)/usr/bin/python3 python3 -m unittest tests.test_mixed_media tests.test_resolver -v(dns block)If you need me to access, download, or install something from one of these locations, you can either:
Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.