@yyuuttaaoo
Contributor

@yyuuttaaoo yyuuttaaoo commented Jan 3, 2026

  • Introduced MultimodalProcessingMixin for async processing of multimodal data in telemetry handlers.
  • Added environment variables for configuring multimodal upload behavior.
  • Implemented FsUploader for file uploads using fsspec, supporting various storage backends.
  • Created MultimodalPreUploader for preprocessing multimodal data before upload.
  • Enhanced ExtendedTelemetryHandler to utilize multimodal processing features.
  • Added new types and utilities for handling multimodal data, including Blob and Uri classes.

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Add unit tests

Does This PR Require a Core Repo Change?

  • Yes. - Link to PR:
  • No.

Checklist:

See contributing.md for styleguide, changelog guidelines, and more.

  • Followed the style guidelines of this project
  • Changelogs have been updated
  • Unit tests have been added
  • Documentation has been updated

…ndling

- Replaced `asdict` with `obj_to_dict` for converting dataclass instances to dictionaries across multiple files.
- Improved type hints and annotations for better clarity and type safety.
- Enhanced `MultimodalProcessingMixin` and related classes to streamline async processing of multimodal data.
- Updated `ExtendedTelemetryHandler` to leverage new utilities for handling multimodal metadata.
- Refactored upload handling in `FsUploader` and `MultimodalPreUploader` to ensure consistent data processing and error handling.
…ils feature

- Adjusted import statement in `patch.py` to disable pylint warning for no-name-in-module.
- Updated `CHANGELOG-loongsuite.md` to reflect the correct pull request number for multimodal separation and upload support.
- Replaced `obj_to_dict` with `asdict` for converting dataclass instances to dictionaries in multiple files.
- Updated `MultimodalPreUploader` to support `Base64Blob` alongside `Blob` and `Uri`.
- Enhanced type hints and improved error handling for multimodal data processing.
- Streamlined attribute setting in telemetry spans for input and output messages.
- Removed specific version constraint for `httpx` in `pyproject.toml` for `multimodal_upload`.
- Added explicit rejection of 3xx redirects in `FsUploader` to prevent incorrect body retrieval with older `httpx` versions.
- Cleaned up import statements in `pre_uploader.py` for better readability.
- Ensured consistent formatting and added a comment for clarity in `MultimodalPreUploader` class.
- Changed package source URLs to use mirrors.aliyun.com for better accessibility.
- Updated the revision number in `uv.lock` to reflect the latest changes.
- Cleaned up import statements in `extended_handler.py` and `pre_uploader.py` for improved readability and consistency.
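
The explicit 3xx rejection mentioned in the list above could look roughly like the following. This is a minimal sketch, not the code in `FsUploader`: the function name `ensure_not_redirect` and the exception `RedirectNotAllowedError` are hypothetical, and only the status code is inspected so the example does not depend on `httpx`.

```python
class RedirectNotAllowedError(Exception):
    """Hypothetical error raised when a download answers with a 3xx status."""


def ensure_not_redirect(status_code: int, url: str) -> None:
    """Raise if the response is a redirect rather than the requested body.

    Assumption: the HTTP client is not following redirects, so the body of
    a 3xx response would be the redirect page, not the media bytes.
    """
    if 300 <= status_code < 400:
        raise RedirectNotAllowedError(
            f"refusing to read body of {status_code} response for {url}"
        )
```

Failing fast here keeps a redirect page from being stored as if it were the image or audio payload.
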
@Cirilla-zmh Cirilla-zmh changed the title from "[feat] Add multimodal processing capabilities and upload support" to "[feat] Add support for processing and uploading multimodal messages" on Jan 5, 2026
… code

- Convert multimodal processing related comments from Chinese to English to improve code consistency
- Improve async task processing, add exception handling to ensure worker stability
- Optimize multimodal upload component FsUploader for thread safety and restart logic
- Enhance task download, upload, and retry mechanisms with detailed exception categorization and logging
- Update multimodal preprocessor MultimodalPreUploader event loop management
- Implement graceful event loop shutdown with active task counting and timeout control
- Improve multimodal metadata extraction and upload process to ensure data integrity
- Remove unnecessary circular imports to improve module decoupling and performance
- Enhance concurrency control to ensure proper resource reconstruction after multi-process fork
- Fix file write sections to use variable names, making code more readable and maintainable
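
The "active task counting and timeout control" item above can be illustrated with a small helper. This is a sketch under assumptions, not the PR's event loop code: the class name `ActiveTaskCounter` and its methods are hypothetical, and it uses plain threads rather than an asyncio loop.

```python
import threading


class ActiveTaskCounter:
    """Hypothetical helper: counts in-flight tasks so shutdown can wait for them."""

    def __init__(self) -> None:
        self._count = 0
        self._cond = threading.Condition()

    def __enter__(self) -> "ActiveTaskCounter":
        with self._cond:
            self._count += 1
        return self

    def __exit__(self, *exc_info) -> None:
        with self._cond:
            self._count -= 1
            if self._count == 0:
                # Wake anyone blocked in wait_idle().
                self._cond.notify_all()

    def wait_idle(self, timeout: float) -> bool:
        """Block until no task is active or the timeout expires.

        Returns True if all tasks drained, False if the timeout was hit.
        """
        with self._cond:
            return self._cond.wait_for(lambda: self._count == 0, timeout=timeout)
```

A shutdown routine would wrap each task in `with counter:` and call `counter.wait_idle(timeout)` before stopping the loop, so a slow upload delays shutdown by at most the timeout.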

Copilot AI left a comment

Pull request overview

This PR adds comprehensive support for processing and uploading multimodal messages (images, audio, video) in the GenAI telemetry utilities. The implementation introduces async processing to avoid blocking user applications during upload operations.

Key Changes

  • Introduces async multimodal processing via MultimodalProcessingMixin with queue-based background processing
  • Adds FsUploader for generic file uploads using fsspec (supports OSS, SLS, local filesystems)
  • Implements MultimodalPreUploader for preprocessing multimodal data, including audio format detection/conversion and URI metadata fetching
  • Adds new data types (Base64Blob, Uri) and environment variables for configuring multimodal behavior
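
As an illustration of the queue-based background processing described above, here is a minimal sketch. It is not the PR's `MultimodalProcessingMixin`: the class `BackgroundProcessor`, its method names, and the drop-when-full policy are all assumptions made for the example.

```python
import queue
import threading
from typing import Callable, Optional


class BackgroundProcessor:
    """Hypothetical stand-in for a queue-backed worker; not the PR's mixin."""

    def __init__(self, handle: Callable[[object], None], maxsize: int = 100) -> None:
        self._handle = handle
        self._queue: "queue.Queue[Optional[object]]" = queue.Queue(maxsize=maxsize)
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def submit(self, item: object) -> bool:
        """Enqueue without blocking the caller; drop the item if the queue is full."""
        try:
            self._queue.put_nowait(item)
            return True
        except queue.Full:
            return False

    def _run(self) -> None:
        while True:
            item = self._queue.get()
            if item is None:  # sentinel: stop the worker
                break
            try:
                self._handle(item)
            except Exception:
                # Swallow errors so one bad item does not kill the worker.
                pass

    def shutdown(self, timeout: float = 5.0) -> None:
        """Stop the worker once the items already queued have been handled."""
        self._queue.put(None)
        self._worker.join(timeout=timeout)
```

The caller only pays for a non-blocking `put_nowait`, which is the property the overview attributes to the async design. Note that `None` is reserved as the stop sentinel in this sketch, so it cannot be submitted as a work item.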

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| `types.py` | Adds Base64Blob dataclass for inline binary data and monotonic_end_s field to LLMInvocation for async processing timing |
| `extended_metrics.py` | Removes early return to allow extended metrics processing flow-through |
| `extended_handler.py` | Integrates MultimodalProcessingMixin and overrides stop_llm/fail_llm for async multimodal processing |
| `extended_environment_variables.py` | Defines environment variables for multimodal storage path, upload mode, download settings, and SSL verification |
| `pre_uploader.py` | Implements preprocessing logic with audio format detection, PCM-to-WAV conversion, and concurrent URI metadata fetching |
| `fs_uploader.py` | Provides queue-based async uploader with LRU cache, retry logic, and support for multiple storage backends |
| `_base.py` | Defines abstract interfaces Uploader, PreUploader, and data types UploadItem/PreUploadItem |
| `__init__.py` | Manages global singleton instances of uploader and pre-uploader with set/get interfaces |
| `_multimodal_processing.py` | Implements mixin for async multimodal processing with worker thread, queue management, and graceful shutdown |
| `gen_ai_extended_attributes.py` | Adds semantic convention attributes for input/output multimodal metadata |
| `pyproject.toml` | Adds multimodal_upload optional dependency group with httpx |
| `CHANGELOG-loongsuite.md` | Documents the new multimodal feature |
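
The PCM-to-WAV conversion listed for `pre_uploader.py` can be sketched with the standard library alone. This is an illustrative example, not the PR's implementation: the function name `pcm_to_wav` and the default parameters (16 kHz, mono, 16-bit) are assumptions.

```python
import io
import wave


def pcm_to_wav(
    pcm: bytes,
    sample_rate: int = 16000,
    channels: int = 1,
    sample_width: int = 2,
) -> bytes:
    """Wrap raw PCM samples in a WAV container.

    Assumptions: the input is little-endian signed 16-bit PCM, and the
    default sample rate and channel count are illustrative only.
    """
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    # wave does not close a file object it did not open, so the buffer is still readable.
    return buffer.getvalue()
```

Raw PCM carries no header, so the sample rate and width have to come from the caller or from metadata; the WAV header written here is what lets downstream players interpret the bytes.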

Collaborator

@Cirilla-zmh Cirilla-zmh left a comment

Great job! There are several comments that need to be addressed.

…t test coverage

- Replace json serialization in _multimodal_processing.py with gen_ai_json_dumps to optimize serialization performance
- Adjust import exception handling in _multimodal_upload/__init__.py to prevent errors when dependencies are not installed
- Attempt to import audio processing dependencies and add missing warnings in pre_uploader.py, improving audio preprocessing robustness
- Fix comments in types.py to accurately reflect LoongSuite extensions
- Update test-requirements.txt to include async and audio dependency library versions, ensuring complete test environment
- Add comprehensive unit tests for FsUploader and MultimodalPreUploader in the _multimodal_upload/tests directory
- Supplement extensive test scenarios for pre-upload modules including URI processing, metadata retrieval, extension mapping, exception handling, etc., enhancing code quality and stability
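
The import handling described in the list above (tolerating missing optional dependencies and warning about them) commonly follows a pattern like the one below. This is a generic sketch, not the code in `_multimodal_upload/__init__.py` or `pre_uploader.py`: the helper name `optional_import` and the log message are hypothetical.

```python
import importlib
import logging
from types import ModuleType
from typing import Optional

logger = logging.getLogger(__name__)


def optional_import(module_name: str, feature: str) -> Optional[ModuleType]:
    """Import a module if it is installed; otherwise warn and return None."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        logger.warning(
            "%s is not installed; %s is disabled", module_name, feature
        )
        return None
```

Callers then check for `None` before using the module, so a missing audio or HTTP library degrades one feature instead of breaking the import of the whole package.
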
Collaborator

@Cirilla-zmh Cirilla-zmh left a comment

LGTM. @123liuziming Could you please take another look at this PR?

@Cirilla-zmh Cirilla-zmh self-assigned this Jan 7, 2026
@Cirilla-zmh Cirilla-zmh merged commit 3cf34b1 into alibaba:main Jan 7, 2026
676 checks passed