Support mixed-media manifests for items with both images and videos #158

Draft
Copilot wants to merge 4 commits into main from copilot/support-mixed-media-manifests

Copilot AI commented Feb 3, 2026

IIIF manifest generation assumed a single media type per item. Items such as 2025-highland-house-walkthrough-ma, which contain both images and videos, rendered only one of the two types in the manifest.

Changes

Added mixed-media detection (check_mixed_media)

  • Scans for original images (JPEG, PNG, GIF, TIFF) and videos (MPEG4, h.264, etc.)
  • Excludes thumbnails by format and filename patterns
  • Returns (has_images, has_videos) tuple
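A minimal sketch of how this detection could work, assuming each file entry is a dict with format, name, and source keys as in the archive.org metadata quoted in the issue. The contents of IMAGE_FORMATS and VIDEO_FORMATS below are assumptions for illustration and may not match the constants in this PR.

```python
# Assumed contents; the real constants in iiify/resolver.py may differ.
IMAGE_FORMATS = ("JPEG", "PNG", "GIF", "TIFF")
VIDEO_FORMATS = ("MPEG4", "h.264", "h.264 IA")


def check_mixed_media(metadata):
    """Return (has_images, has_videos) for the original files of an item."""
    has_images = False
    has_videos = False
    for f in metadata.get("files", []):
        # Only originals count; derivatives are ignored.
        if f.get("source") != "original":
            continue
        file_format = f.get("format") or ""
        file_name = f.get("name") or ""
        # Thumbnail formats such as "JPEG Thumb" are not in IMAGE_FORMATS,
        # so only the filename pattern needs an explicit check here.
        if file_format in IMAGE_FORMATS and "thumb" not in file_name.lower():
            has_images = True
        elif file_format in VIDEO_FORMATS:
            has_videos = True
    return (has_images, has_videos)
```

An item is treated as mixed-media only when both values in the returned tuple are True.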

Added canvas creation helpers

  • create_image_canvas() - IIIF Image API v3 canvases with Cantaloupe service integration
  • create_video_canvas() - Video canvases with derivative choices and VTT subtitle support
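For orientation, the shape of an image canvas in the IIIF Presentation API 3.0 can be sketched with plain dicts. The function name build_image_canvas, the URL layout, and the level2 profile are illustrative assumptions; the helper in this PR may build canvases differently, for example through a manifest library.

```python
def build_image_canvas(canvas_id, image_url, service_url, width, height,
                       mime="image/jpeg"):
    """Return a Presentation 3 Canvas dict that paints one image (hypothetical helper)."""
    return {
        "id": canvas_id,
        "type": "Canvas",
        "width": width,
        "height": height,
        "items": [{
            "id": f"{canvas_id}/page",
            "type": "AnnotationPage",
            "items": [{
                "id": f"{canvas_id}/annotation",
                "type": "Annotation",
                "motivation": "painting",
                "body": {
                    "id": image_url,
                    "type": "Image",
                    "format": mime,
                    "width": width,
                    "height": height,
                    # Image API 3 service, e.g. one served by Cantaloupe.
                    "service": [{
                        "id": service_url,
                        "type": "ImageService3",
                        "profile": "level2",
                    }],
                },
                # The annotation targets the canvas it is painted onto.
                "target": canvas_id,
            }],
        }],
    }
```

A video canvas follows the same nesting, with a body of type Video and a duration property on the canvas.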

Modified create_manifest3()

  • Checks for mixed media before mediatype-based routing
  • When detected, processes all originals sorted by name
  • Creates appropriate canvas types per file format
  • Falls through to existing logic for single-mediatype items
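The routing order described above could look roughly like this. plan_canvases is a hypothetical stand-in that only reports which kind of canvas each original would receive; the format tuples are assumed, and thumbnail filtering is left out to keep the sketch short.

```python
IMAGE_FORMATS = ("JPEG", "PNG", "GIF", "TIFF")  # assumed contents
VIDEO_FORMATS = ("MPEG4", "h.264", "h.264 IA")  # assumed contents


def plan_canvases(metadata):
    """Return [(file_name, kind)] for a mixed-media item, or None to fall through."""
    originals = [f for f in metadata.get("files", [])
                 if f.get("source") == "original"]
    has_images = any(f.get("format") in IMAGE_FORMATS for f in originals)
    has_videos = any(f.get("format") in VIDEO_FORMATS for f in originals)
    if not (has_images and has_videos):
        # Caller continues with the existing mediatype-based routing.
        return None
    plan = []
    for f in sorted(originals, key=lambda x: x.get("name", "")):
        if f.get("format") in IMAGE_FORMATS:
            plan.append((f.get("name"), "image"))
        elif f.get("format") in VIDEO_FORMATS:
            plan.append((f.get("name"), "video"))
        # Other originals (metadata files, etc.) get no canvas.
    return plan
```

Returning None for single-mediatype items is what keeps the change backward compatible in this sketch: nothing else runs unless both media types are present.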

Added IMAGE_FORMATS constant

  • Eliminates duplicate format lists across detection and processing logic

Example

Before: Video-only manifest despite 30+ images present

{
  "items": [
    {"type": "Canvas", "duration": 123.45, ...}
  ]
}

After: Complete manifest with all media

{
  "items": [
    {"type": "Canvas", "height": 1800, "width": 1200, ...},  // Image
    {"type": "Canvas", "height": 1800, "width": 1200, ...},  // Image
    {"type": "Canvas", "duration": 123.45, ...}              // Video
  ]
}

Backward compatible - only affects items with both image and video originals.

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • archive.org
    • Triggering command: /usr/bin/curl curl -s REDACTED (dns block)
    • Triggering command: /usr/bin/python3 python3 -m unittest tests.test_mixed_media -v (dns block)
    • Triggering command: /usr/bin/python3 python3 -m unittest tests.test_mixed_media tests.test_resolver -v (dns block)

Original prompt

This section details the original issue you should resolve

<issue_title>Support Mixed-Media Manifests for Items with both Images and Video</issue_title>
<issue_description># Issue: Support Multi-media manifests for items containing both Images and Videos

Description

Currently, IIIF manifest generation (specifically in iiify/resolver.py) appears to assume a single mediatype (often video) for items that actually contain both images and videos. As noted by @mekarpeles, multi-media archive.org items don't produce complete manifests that enable access to all relevant files.

The goal is to move away from strictly mirroring the IA "Theater" view and instead present all original files as canvases within the manifest, ensuring viewers like Clover or Theseus can render the full collection.

Proposed Implementation

In iiify/resolver.py, we should extend create_manifest3 to check for the presence of both images and videos within the metadata's file list. A sketch proposed by an AI agent:

def create_manifest3(identifier, domain=None, page=None):
    # ... existing setup code ...
    
    mediatype = metadata.get("mediatype")
    
    # NEW: Check if this is a mixed-media item
    has_images = any(f.get('format') == 'JPEG' and f.get('source') == 'original' 
                     for f in metadata.get('files', []))
    has_videos = any(f.get('format') in ['MPEG4', 'h.264 IA'] 
                     for f in metadata.get('files', []))
    
    if has_images and has_videos:
        # Handle mixed media
        for file in sorted(metadata.get('files', []), key=lambda x: x.get('name', '')):
            if file.get('source') != 'original':
                continue
                
            file_format = file.get('format')
            file_name = file.get('name')
            
            if file_format == 'JPEG':
                # Add image canvas
                canvas = create_image_canvas(identifier, file, metadata, domain)
                manifest.add_item(canvas)
                
            elif file_format in ['MPEG4', 'h.264 IA']:
                # Add video canvas
                canvas = create_video_canvas(identifier, file, metadata, domain)
                manifest.add_item(canvas)
    
    elif (mediatype or '').lower() == 'movies':
        # ... existing video-only logic ...
        pass

    elif (mediatype or '').lower() == 'image':
        # ... existing image-only logic ...
        pass

Key Considerations from Discussion

  • Thumbnail Case: We should distinguish between primary media and supporting thumbnails. If we set aside the thumbnail case (where a video just has associated frames), we can treat items with non-thumbnail images and videos as "multi-media" and fall back to the existing logic otherwise.
  • IA Theater Parity: As discussed with Mark Baggett, this involves a choice to ignore the current IA theater display logic in favor of showing all original files together.
  • Viewer Compatibility: David Schober noted that while RAMP may struggle, viewers like Clover handle multiple content types on a canvas well, though video can occasionally hit CORS issues.
  • Reference Item: 2025-highland-house-walkthrough-ma
    • This item has ~30 unique files (images and video) but the current manifest defaults to a single item/video view.

Related Issues

Comments on the Issue (you are @copilot in this section)



Copilot AI and others added 3 commits February 3, 2026 06:14
Co-authored-by: mekarpeles <978325+mekarpeles@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Support mixed-media manifests for items with images and video" to "Support mixed-media manifests for items with both images and videos" on Feb 3, 2026
Copilot AI requested a review from mekarpeles on February 3, 2026 06:19
mekarpeles requested a review from Copilot on February 3, 2026 20:56

Copilot AI left a comment

Pull request overview

This pull request adds support for mixed-media IIIF manifests that can handle items containing both images and videos, addressing a limitation where the manifest generator assumed homogeneous media types per item.

Changes:

  • Introduced check_mixed_media() function to detect items with both original images and videos, excluding thumbnails
  • Added create_image_canvas() and create_video_canvas() helper functions to generate appropriate IIIF v3 canvases for each media type
  • Modified create_manifest3() to check for mixed-media items before routing to mediatype-specific logic, processing all originals when both types are detected

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.

File | Description
iiify/resolver.py | Added IMAGE_FORMATS constant, mixed-media detection function, canvas creation helpers, and integrated mixed-media handling into manifest generation
tests/test_mixed_media.py | Added unit tests for mixed-media detection logic and integration test for manifest structure


slugged_id = normalised_id.replace(" ", "-")

# Create canvas ID
c_id = f"{URI_PRIFIX}/{identifier}/{slugged_id}/canvas"

Copilot AI Feb 3, 2026

Canvas IDs for images in mixed-media items don't include a sequential number (line 783 generates {identifier}/{slugged_id}/canvas), but the existing multi-file image handling includes pageCount in the canvas ID (line 1079: {identifier}${pageCount}/canvas). This inconsistency could cause issues if canvas IDs are expected to follow a specific pattern. Consider using a consistent ID pattern, either including the canvas number or ensuring the slugged filename is sufficient for uniqueness.
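If the slugged filename is kept as the distinguishing part of the ID, uniqueness could be enforced along these lines. This is a sketch only: unique_canvas_ids and its suffixing rule are hypothetical, and the slug derivation here (drop the extension, replace spaces with hyphens) is an assumption that may differ from how normalised_id is computed in the PR.

```python
def unique_canvas_ids(base_uri, identifier, file_names):
    """Map each file name to a canvas ID, adding a suffix when slugs collide."""
    seen = {}  # slug -> number of times it has been used so far
    ids = {}
    for name in file_names:
        # Assumed slug rule: strip the extension, replace spaces with hyphens.
        slug = name.rsplit(".", 1)[0].replace(" ", "-")
        count = seen.get(slug, 0)
        seen[slug] = count + 1
        suffix = f"-{count}" if count else ""
        ids[name] = f"{base_uri}/{identifier}/{slug}{suffix}/canvas"
    return ids
```

For example, files named "a b.jpg" and "a-b.png" share the slug a-b, so the second one receives the ID ending in a-b-1/canvas. A suffixed slug can still clash with a file whose own name already ends in the same suffix, so a production version would need a stricter rule.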

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,119 @@
import os

Copilot AI Feb 3, 2026

The os module is imported but never used in this test file. Consider removing this unused import.

Suggested change
- import os

Comment on lines +960 to +965
try:
    canvas = create_image_canvas(identifier, file, metadata, domain, canvas_number)
    manifest.add_item(canvas)
    canvas_number += 1
except Exception as e:
    print(f'Failed to create image canvas for {file_name}: {e}')

Copilot AI Feb 3, 2026

The exception handling catches all exceptions broadly (except Exception as e:), which could mask unexpected errors like KeyError or AttributeError from bugs in the code. The existing image handling at lines 1084-1089 catches specific exceptions (requests.exceptions.HTTPError) and creates a fallback canvas with error information. Consider either: (1) catching specific expected exceptions, or (2) creating a fallback canvas similar to the existing pattern, or (3) logging the error more robustly rather than just printing.
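One way to act on options (1) to (3), sketched with the standard logging module. The add_canvas_safely helper, its build and fallback callables, and the set of expected exceptions are hypothetical. The existing code is described above as catching requests.exceptions.HTTPError; built-in exceptions stand in for it here so the sketch has no third-party dependency.

```python
import logging

logger = logging.getLogger(__name__)

# Assumed set of "expected" failures; the real code would more likely
# list requests.exceptions.HTTPError here.
EXPECTED_ERRORS = (ConnectionError, TimeoutError)


def add_canvas_safely(items, build, fallback, file_name):
    """Append build() to items; on an expected error append fallback(error) instead."""
    try:
        items.append(build())
    except EXPECTED_ERRORS as error:
        # Log with context instead of printing, then keep a placeholder canvas.
        logger.warning("Failed to create canvas for %s: %s", file_name, error)
        items.append(fallback(error))
    # Anything else (KeyError, AttributeError, ...) propagates so bugs stay visible.
```

With this shape, a network failure still yields a canvas slot that carries error information, while a programming error in the canvas builder raises immediately instead of being swallowed.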

Comment on lines +969 to +974
try:
    canvas = create_video_canvas(identifier, file, metadata, domain, derivatives, vttfiles)
    manifest.add_item(canvas)
    canvas_number += 1
except Exception as e:
    print(f'Failed to create video canvas for {file_name}: {e}')

Copilot AI Feb 3, 2026

The exception handling catches all exceptions broadly (except Exception as e:), which could mask unexpected errors. Consider catching specific expected exceptions or logging errors more robustly. The existing image handling at lines 1084-1089 catches specific exceptions and creates fallback canvases.


return (has_images, has_videos)

def create_image_canvas(identifier, file, metadata, domain, canvas_number):

Copilot AI Feb 3, 2026

The canvas_number parameter is defined but never used in the function body. This parameter appears to be unnecessary since the canvas ID is generated from the file name, not from a sequential number.

Suggested change
- def create_image_canvas(identifier, file, metadata, domain, canvas_number):
+ def create_image_canvas(identifier, file, metadata, domain, _canvas_number):

Comment on lines +104 to +111
# Handle both direct body and Choice bodies
if isinstance(body, dict):
    canvas_types.add(body.get('type', 'Unknown'))
elif hasattr(body, 'items'):
    # Choice body
    for item in body.items:
        if hasattr(item, 'type'):
            canvas_types.add(item.type)

Copilot AI Feb 3, 2026

The test calls hasattr(body, 'items') on line 107, but that elif branch only runs when body is not a dict (line 105 handles dicts). If the manifest is parsed JSON, every body is a dict, so a Choice body is recorded only as 'Choice' and its nested items are never inspected. hasattr(body, 'items') would also be unreliable as a Choice test, because every dict has an items method. Check for a Choice explicitly with body.get('type') == 'Choice' and 'items' in body. Similarly, line 110 checks hasattr(item, 'type'), which is False for dict items; use 'type' in item instead.

Suggested change
- # Handle both direct body and Choice bodies
- if isinstance(body, dict):
-     canvas_types.add(body.get('type', 'Unknown'))
- elif hasattr(body, 'items'):
-     # Choice body
-     for item in body.items:
-         if hasattr(item, 'type'):
-             canvas_types.add(item.type)
+ # Handle both direct body and Choice bodies (IIIF v3 JSON is dict-based)
+ if isinstance(body, dict):
+     body_type = body.get('type')
+     if body_type == 'Choice' and 'items' in body:
+         # Choice body: iterate through the list of items
+         for item in body['items']:
+             if isinstance(item, dict) and 'type' in item:
+                 canvas_types.add(item['type'])
+     else:
+         # Direct body: use its type
+         canvas_types.add(body.get('type', 'Unknown'))

Comment on lines +762 to +764
if (file_format in IMAGE_FORMATS and
        'thumb' not in file_name.lower() and
        file_format not in ['JPEG Thumb', 'Thumbnail']):

Copilot AI Feb 3, 2026

The thumbnail exclusion logic is duplicated between check_mixed_media (lines 762-764) and the mixed-media processing loop (lines 957-959). This creates a maintenance risk if the criteria for identifying thumbnails changes. Consider extracting this into a helper function like is_thumbnail_image(file_format, file_name) to ensure consistency.
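A possible shape for that helper, using the same three conditions as the snippet under review. The contents of IMAGE_FORMATS are assumed, and is_primary_image is a hypothetical convenience wrapper that is not part of this PR.

```python
IMAGE_FORMATS = ("JPEG", "PNG", "GIF", "TIFF")  # assumed contents of the constant
THUMBNAIL_FORMATS = ("JPEG Thumb", "Thumbnail")


def is_thumbnail_image(file_format, file_name):
    """True when a file looks like a thumbnail by format or by filename."""
    return (file_format in THUMBNAIL_FORMATS
            or "thumb" in (file_name or "").lower())


def is_primary_image(file_format, file_name):
    """True for images that should count as media and get their own canvas."""
    return (file_format in IMAGE_FORMATS
            and not is_thumbnail_image(file_format, file_name))
```

Both check_mixed_media and the processing loop could then call is_primary_image, so a change to the thumbnail criteria is made in one place.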


Development

Successfully merging this pull request may close these issues.

Support Mixed-Media Manifests for Items with both Images and Video
