-
Notifications
You must be signed in to change notification settings - Fork 7
Support mixed-media manifests for items with both images and videos #158
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
|
|
@@ -33,6 +33,7 @@ class MaxLimitException(Exception): | |||||
| valid_filetypes = ['jpg', 'jpeg', 'png', 'gif', 'tif', 'jp2', 'pdf', 'tiff'] | ||||||
| AUDIO_FORMATS = ['VBR MP3', '32Kbps MP3', '56Kbps MP3', '64Kbps MP3', '96Kbps MP3', '128Kbps MP3', 'MPEG-4 Audio', 'Flac', 'AIFF', 'Apple Lossless Audio', 'Ogg Vorbis', 'WAVE', '24bit Flac', 'Shorten'] | ||||||
| VIDEO_FORMATS = ['MPEG4', 'h.264 HD', 'h.264 MPEG4', '512Kb MPEG4', 'HiRes MPEG4', 'MPEG2', 'h.264', 'Matroska', 'Ogg Video', 'Ogg Theora', 'WebM', 'Windows Media', 'Cinepack','QuickTime'] | ||||||
| IMAGE_FORMATS = ['JPEG', 'PNG', 'GIF', 'TIFF'] | ||||||
|
|
||||||
| class IsCollection(Exception): | ||||||
| # Used for when we need to raise to the route handler from inside the manifest function | ||||||
|
|
@@ -737,6 +738,178 @@ def create_canvas_from_br(br_page, zipFile, identifier, pageCount, uri): | |||||
|
|
||||||
| return canvas | ||||||
|
|
||||||
| def check_mixed_media(metadata): | ||||||
| """ | ||||||
| Check if an item contains both original images and videos. | ||||||
| Returns (has_images, has_videos) tuple. | ||||||
| Excludes thumbnail images to avoid false positives. | ||||||
| """ | ||||||
| has_images = False | ||||||
| has_videos = False | ||||||
|
|
||||||
| for file in metadata.get('files', []): | ||||||
| if file.get('source') != 'original': | ||||||
| continue | ||||||
|
|
||||||
| file_format = file.get('format', '') | ||||||
| file_name = file.get('name', '') | ||||||
|
|
||||||
| # Check for video formats | ||||||
| if file_format in VIDEO_FORMATS: | ||||||
| has_videos = True | ||||||
|
|
||||||
| # Check for image formats (excluding thumbnails) | ||||||
| if (file_format in IMAGE_FORMATS and | ||||||
| 'thumb' not in file_name.lower() and | ||||||
| file_format not in ['JPEG Thumb', 'Thumbnail']): | ||||||
| has_images = True | ||||||
|
|
||||||
| # Early exit if we've found both | ||||||
| if has_images and has_videos: | ||||||
| return (True, True) | ||||||
|
|
||||||
| return (has_images, has_videos) | ||||||
|
|
||||||
| def create_image_canvas(identifier, file, metadata, domain, canvas_number): | ||||||
|
||||||
| def create_image_canvas(identifier, file, metadata, domain, canvas_number): | |
| def create_image_canvas(identifier, file, metadata, domain, _canvas_number): |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Canvas IDs for images in mixed-media items don't include a sequential number (line 783 generates {identifier}/{slugged_id}/canvas), but the existing multi-file image handling includes pageCount in the canvas ID (line 1079: {identifier}${pageCount}/canvas). This inconsistency could cause issues if canvas IDs are expected to follow a specific pattern. Consider using a consistent ID pattern, either including the canvas number or ensuring the slugged filename is sufficient for uniqueness.
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exception handling catches all exceptions broadly (except Exception as e:), which could mask unexpected errors like KeyError or AttributeError from bugs in the code. The existing image handling at lines 1084-1089 catches specific exceptions (requests.exceptions.HTTPError) and creates a fallback canvas with error information. Consider either: (1) catching specific expected exceptions, or (2) creating a fallback canvas similar to the existing pattern, or (3) logging the error more robustly rather than just printing.
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The exception handling catches all exceptions broadly (except Exception as e:), which could mask unexpected errors. Consider catching specific expected exceptions or logging errors more robustly. The existing image handling at lines 1084-1089 catches specific exceptions and creates fallback canvases.
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| @@ -0,0 +1,119 @@ | ||||||||||||||||||||||||||||||||||||||||
| import os | ||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||
| import os |
Copilot
AI
Feb 3, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is using hasattr(body, 'items') on line 107, but body is a dict (as checked on line 105). The hasattr check will always return True because dicts have an 'items' method. This should check if 'items' is a key in the dict using 'items' in body or check if body is a Choice object. Similarly, line 110 checks hasattr(item, 'type') but item could be a dict from body.items (the dict method), not from the Choice items list.
| # Handle both direct body and Choice bodies | |
| if isinstance(body, dict): | |
| canvas_types.add(body.get('type', 'Unknown')) | |
| elif hasattr(body, 'items'): | |
| # Choice body | |
| for item in body.items: | |
| if hasattr(item, 'type'): | |
| canvas_types.add(item.type) | |
| # Handle both direct body and Choice bodies (IIIF v3 JSON is dict-based) | |
| if isinstance(body, dict): | |
| body_type = body.get('type') | |
| if body_type == 'Choice' and 'items' in body: | |
| # Choice body: iterate through the list of items | |
| for item in body['items']: | |
| if isinstance(item, dict) and 'type' in item: | |
| canvas_types.add(item['type']) | |
| else: | |
| # Direct body: use its type | |
| canvas_types.add(body.get('type', 'Unknown')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The thumbnail exclusion logic is duplicated between
check_mixed_media(lines 762-764) and the mixed-media processing loop (lines 957-959). This creates a maintenance risk if the criteria for identifying thumbnails changes. Consider extracting this into a helper function likeis_thumbnail_image(file_format, file_name)to ensure consistency.