Commit f3b84f3

📝 Multimodal sdk document
2 parents 28b21df + d8553ef commit f3b84f3

2 files changed: +639 -0 lines changed

Lines changed: 327 additions & 0 deletions
@@ -0,0 +1,327 @@
# Multimodal Module

This module provides a native multimodal data processing bus designed for agents. With the `@load_object` and `@save_object` decorators, it supports real-time transmission and processing of text, images, audio, video, and other data formats, enabling seamless cross-modal data flow.

## 📋 Table of Contents

- [LoadSaveObjectManager Initialization](#loadsaveobjectmanager-initialization)
- [@load_object Decorator](#load_object-decorator)
- [@save_object Decorator](#save_object-decorator)
- [Combined Usage Example](#combined-usage-example)

## LoadSaveObjectManager Initialization

Before using the decorators, create a `LoadSaveObjectManager` instance and pass in a storage client (for example, a MinIO client):

```python
from sdk.nexent.multi_modal.load_save_object import LoadSaveObjectManager
from backend.database.client import minio_client


# Create manager instance
Multimodal = LoadSaveObjectManager(storage_client=minio_client)
```

You can also implement your own storage client based on the `StorageClient` base class in `sdk.nexent.storage.storage_client_base`.
The storage client must implement:

- `get_file_stream(object_name, bucket)`: get a file stream from storage (for download)
- `upload_fileobj(file_obj, object_name, bucket)`: upload a file-like object to storage (for save)
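
As a rough illustration of the two methods listed above, here is a minimal in-memory client sketch. It is not part of the SDK: the class name `InMemoryStorageClient`, the type hints, and the return values are assumptions. A real client should subclass `StorageClient` and follow whatever return contract that base class defines.

```python
import io
from typing import BinaryIO, Dict, Tuple


class InMemoryStorageClient:
    """Hypothetical toy client that keeps objects in a dict (tests only)."""

    def __init__(self) -> None:
        # Objects are keyed by (bucket, object_name).
        self._objects: Dict[Tuple[str, str], bytes] = {}

    def upload_fileobj(self, file_obj: BinaryIO, object_name: str, bucket: str) -> str:
        # Read the whole file-like object and store its bytes.
        self._objects[(bucket, object_name)] = file_obj.read()
        # Returning an S3-style URL is an assumption, not the SDK contract.
        return f"s3://{bucket}/{object_name}"

    def get_file_stream(self, object_name: str, bucket: str) -> BinaryIO:
        # Hand back a fresh stream over the stored bytes.
        return io.BytesIO(self._objects[(bucket, object_name)])


client = InMemoryStorageClient()
client.upload_fileobj(io.BytesIO(b"hello"), "greeting.txt", "nexent")
data = client.get_file_stream("greeting.txt", "nexent").read()
```

A client like this can be handy in unit tests, where talking to a real MinIO server is undesirable.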

## @load_object Decorator

The `@load_object` decorator downloads files from URLs (S3 / HTTP / HTTPS) **before** the wrapped function is executed, and passes the file content (or transformed data) into the wrapped function.

### Features

- **Automatic download**: Detects and downloads files referenced by S3, HTTP, or HTTPS URLs.
- **Data transformation**: Uses custom transformer functions to convert downloaded bytes into the types required by the wrapped function (for example, `PIL.Image` or text).
- **Batch processing**: Supports a single URL or a list of URLs.
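
To make the download-then-transform flow concrete, the following toy decorator mimics the behaviour described above. It is a simplified sketch, not the SDK implementation: `load_inputs` and `fake_fetch` are hypothetical names, the download step is replaced by a stand-in, and only keyword arguments are handled.

```python
import functools
from typing import Any, Callable, List, Optional


def fake_fetch(url: str) -> bytes:
    """Stand-in for the real download step; returns the URL text as bytes."""
    return url.encode("utf-8")


def load_inputs(
    input_names: List[str],
    transformers: Optional[List[Callable[[bytes], Any]]] = None,
    fetch: Callable[[str], bytes] = fake_fetch,
):
    """Replace URL arguments with fetched (and optionally transformed) data."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(**kwargs):
            for index, name in enumerate(input_names):
                if name not in kwargs:
                    continue
                # Without transformers, the raw bytes are passed through.
                convert = transformers[index] if transformers else (lambda raw: raw)
                value = kwargs[name]
                if isinstance(value, list):
                    # A list of URLs becomes a list of converted objects.
                    kwargs[name] = [convert(fetch(url)) for url in value]
                else:
                    kwargs[name] = convert(fetch(value))
            return func(**kwargs)

        return wrapper

    return decorator


@load_inputs(input_names=["doc_url"], transformers=[lambda raw: raw.decode("utf-8")])
def count_chars(doc_url: str) -> int:
    # By the time the body runs, doc_url holds the decoded text, not a URL.
    return len(doc_url)


n = count_chars(doc_url="http://example.com/a.txt")
```

The key point is that the wrapped function never sees the URL itself, only the data produced by the fetch and transform steps.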

### Parameters

- `input_names` (`List[str]`): names of the function parameters whose URL values are downloaded and transformed.
- `input_data_transformer` (`Optional[List[Callable[[bytes], Any]]]`): optional list of transformers, matched by position to `input_names`; each transformer converts raw `bytes` into the target type for the corresponding parameter.

### Supported URL Formats

The decorator supports:

- **S3 URLs**
  - `s3://bucket-name/object/file.jpg`
  - `/bucket-name/object/file.jpg` (short form)
- **HTTP / HTTPS URLs**
  - `http://example.com/file.jpg`
  - `https://example.com/file.jpg`

URL type detection:

- Starts with `http://` → HTTP URL
- Starts with `https://` → HTTPS URL
- Starts with `s3://` or looks like `/bucket/object` → S3 URL
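
The detection rules above can be sketched as a small classifier. This is only an illustration: `classify_url` is a hypothetical helper, and raising `ValueError` for unrecognized input is an assumption, since the behaviour of the SDK in that case is not described here.

```python
def classify_url(url: str) -> str:
    """Return "http", "https", or "s3" following the documented prefix rules."""
    if url.startswith("https://"):
        return "https"
    if url.startswith("http://"):
        return "http"
    if url.startswith("s3://"):
        return "s3"
    if url.startswith("/"):
        # Short form "/bucket/object": require a bucket plus an object path.
        parts = url.strip("/").split("/", 1)
        if len(parts) == 2 and all(parts):
            return "s3"
    # Assumption: anything else is rejected.
    raise ValueError(f"Unsupported URL: {url!r}")


kinds = [
    classify_url("http://example.com/file.jpg"),
    classify_url("https://example.com/file.jpg"),
    classify_url("s3://bucket-name/object/file.jpg"),
    classify_url("/bucket-name/object/file.jpg"),
]
```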

### Examples

#### Basic: download as bytes

```python
@Multimodal.load_object(input_names=["image_url"])
def process_image(image_url: bytes):
    """image_url will be replaced with downloaded bytes."""
    print(f"File size: {len(image_url)} bytes")
    return image_url


# Call process_image
result = process_image(image_url="http://example.com/pic.PNG")
```

#### Advanced: convert bytes to PIL Image

If the function parameter is not `bytes` (for example, it expects `PIL.Image.Image`), define a converter (such as `bytes_to_pil`) and pass it to the decorator.

```python
import io
from PIL import Image


def bytes_to_pil(binary_data: bytes) -> Image.Image:
    image_stream = io.BytesIO(binary_data)
    img = Image.open(image_stream)
    return img


@Multimodal.load_object(
    input_names=["image_url"],
    input_data_transformer=[bytes_to_pil],
)
def process_image(image_url: Image.Image) -> Image.Image:
    """image_url will be converted into a PIL Image object."""
    resized = image_url.resize((800, 600))
    return resized


result = process_image(image_url="http://example.com/pic.PNG")
```

#### Multiple inputs

```python
from PIL import Image


# bytes_to_pil is the converter defined in the previous example.
@Multimodal.load_object(
    input_names=["image_url1", "image_url2"],
    input_data_transformer=[bytes_to_pil, bytes_to_pil],
)
def process_two_images(image_url1: Image.Image, image_url2: Image.Image) -> Image.Image:
    """Both image URLs will be downloaded and converted into PIL Images."""
    combined = Image.new("RGB", (1600, 600))
    combined.paste(image_url1, (0, 0))
    combined.paste(image_url2, (800, 0))
    return combined


result = process_two_images(
    image_url1="http://example.com/pic1.PNG",
    image_url2="http://example.com/pic2.PNG",
)
```

#### List of URLs

```python
from typing import List
from PIL import Image


@Multimodal.load_object(
    input_names=["image_urls"],
    input_data_transformer=[bytes_to_pil],
)
def process_image_list(image_urls: List[Image.Image]) -> List[Image.Image]:
    """Support a list of URLs, each will be downloaded and converted."""
    results: List[Image.Image] = []
    for img in image_urls:
        results.append(img.resize((200, 200)))
    return results


result = process_image_list(
    image_urls=[
        "http://example.com/pic1.PNG",
        "http://example.com/pic2.PNG",
    ]
)
```

## @save_object Decorator

The `@save_object` decorator uploads return values to storage (MinIO) **after** the wrapped function finishes, and returns S3 URLs.

### Features

- **Automatic upload**: Uploads function return values to MinIO.
- **Data transformation**: Uses transformers to convert return values into `bytes` (for example, `PIL.Image` → `bytes`).
- **Batch processing**: Supports a single return value or multiple values (tuple).
- **URL return**: Returns S3 URLs of the form `s3://bucket/object_name`.

### Parameters

- `output_names` (`List[str]`): logical names for each return value.
- `output_transformers` (`Optional[List[Callable[[Any], bytes]]]`): transformers that convert each return value into `bytes`.
- `bucket` (`str`): target bucket name, default `"nexent"`.

### Examples

#### Basic: save raw bytes

```python
@Multimodal.save_object(
    output_names=["content"],
)
def generate_file() -> bytes:
    """Returned bytes will be uploaded to MinIO automatically."""
    content = b"Hello, World!"
    return content
```

#### Advanced: convert PIL Image to bytes before upload

If the function does not return `bytes` (for example, it returns `PIL.Image.Image`), define a converter such as `pil_to_bytes` and pass it to the decorator.

```python
import io
from typing import Optional
from PIL import Image, ImageFilter


def pil_to_bytes(img: Image.Image, format: Optional[str] = None) -> bytes:
    """
    Convert a PIL Image to binary data (bytes).
    """
    if img is None:
        raise ValueError("Input image cannot be None")

    buffer = io.BytesIO()

    # Decide which format to use
    if format is None:
        # Use original format if available, otherwise default to PNG
        format = img.format if img.format else "PNG"

    # For JPEG, ensure RGB (no alpha channel)
    if format.upper() == "JPEG" and img.mode in ("RGBA", "LA", "P"):
        rgb_img = Image.new("RGB", img.size, (255, 255, 255))
        if img.mode == "P":
            img = img.convert("RGBA")
        rgb_img.paste(
            img,
            mask=img.split()[-1] if img.mode in ("RGBA", "LA") else None,
        )
        rgb_img.save(buffer, format=format)
    else:
        img.save(buffer, format=format)

    data = buffer.getvalue()
    buffer.close()
    return data


@Multimodal.save_object(
    output_names=["processed_image"],
    output_transformers=[pil_to_bytes],
)
def process_image(image: Image.Image) -> Image.Image:
    """Returned PIL Image will be converted to bytes and uploaded."""
    blurred = image.filter(ImageFilter.GaussianBlur(radius=5))
    return blurred
```

#### Multiple files

```python
from typing import Tuple
from PIL import Image


# pil_to_bytes is the converter defined in the previous example.
@Multimodal.save_object(
    output_names=["resized1", "resized2"],
    output_transformers=[pil_to_bytes, pil_to_bytes],
)
def process_two_images(
    img1: Image.Image,
    img2: Image.Image,
) -> Tuple[Image.Image, Image.Image]:
    """Both returned images will be uploaded and return corresponding S3 URLs."""
    resized1 = img1.resize((800, 600))
    resized2 = img2.resize((800, 600))
    return resized1, resized2
```

### Return Format

- **Single return value**: a single S3 URL string, `s3://bucket/object_name`.
- **Multiple return values (tuple)**: a tuple where each element is the corresponding S3 URL.
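
Callers that need the bucket and object name separately can split a returned URL themselves. The helper below is a hypothetical sketch (`split_s3_url` is not an SDK function). It also accepts the `/bucket/object` short form mentioned earlier, and the choice of `ValueError` for malformed input is an assumption.

```python
from typing import Tuple


def split_s3_url(url: str) -> Tuple[str, str]:
    """Split "s3://bucket/object_name" (or "/bucket/object_name") into parts."""
    if url.startswith("s3://"):
        path = url[len("s3://"):]
    elif url.startswith("/"):
        path = url.lstrip("/")
    else:
        raise ValueError(f"Not an S3 URL: {url!r}")

    # Everything before the first "/" is the bucket; the rest is the object name.
    bucket, _, object_name = path.partition("/")
    if not bucket or not object_name:
        raise ValueError(f"Missing bucket or object name: {url!r}")
    return bucket, object_name


bucket, object_name = split_s3_url("s3://nexent/attachments/xxx.png")
```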

### Notes

- If you do **not** provide a transformer, the function return value must be `bytes`.
- If you provide a transformer, the transformer **must** return `bytes`.
- The number of return values must match the length of `output_names`.
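
The three rules above can be expressed as a small pre-upload check. This is an illustrative sketch rather than the SDK's own validation: `to_upload_payloads` is a hypothetical helper, and the exception types are assumptions.

```python
from typing import Any, Callable, List, Optional


def to_upload_payloads(
    result: Any,
    output_names: List[str],
    output_transformers: Optional[List[Callable[[Any], bytes]]] = None,
) -> List[bytes]:
    """Normalize a function result into one bytes payload per output name."""
    # A tuple means multiple return values; anything else is a single value.
    values = list(result) if isinstance(result, tuple) else [result]
    if len(values) != len(output_names):
        raise ValueError(
            f"Expected {len(output_names)} return values, got {len(values)}"
        )

    payloads: List[bytes] = []
    for index, (name, value) in enumerate(zip(output_names, values)):
        if output_transformers is not None:
            value = output_transformers[index](value)
        # With or without a transformer, the final payload must be bytes.
        if not isinstance(value, bytes):
            raise TypeError(f"Output {name!r} is not bytes after transformation")
        payloads.append(value)
    return payloads


single = to_upload_payloads(b"Hello, World!", ["content"])
pair = to_upload_payloads(("a", "b"), ["first", "second"], [str.encode, str.encode])
```

Checking these conditions before the upload step gives a clear error at the call site instead of a failure inside the storage client.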

## Combined Usage Example

In practice, `@load_object` and `@save_object` are often used together to build a full **download → process → upload** pipeline:

```python
from typing import Union, List
from PIL import Image, ImageFilter

from backend.database.client import minio_client
from sdk.nexent.multi_modal.load_save_object import LoadSaveObjectManager


Multimodal = LoadSaveObjectManager(storage_client=minio_client)


# bytes_to_pil and pil_to_bytes are the converters defined in the earlier examples.
@Multimodal.load_object(
    input_names=["image_url"],
    input_data_transformer=[bytes_to_pil],
)
@Multimodal.save_object(
    output_names=["blurred_image"],
    output_transformers=[pil_to_bytes],
)
def blur_image_tool(
    image_url: Union[str, List[str]],
    blur_radius: int = 5,
) -> Image.Image:
    """
    Apply a Gaussian blur filter to an image.

    Args:
        image_url: S3 URL or HTTP/HTTPS URL of the image.
        blur_radius: Blur radius (default 5, valid range 1–50).

    Returns:
        Processed PIL Image object (it will be uploaded and returned as an S3 URL).
    """
    # At this point, image_url has already been converted to a PIL Image
    if image_url is None:
        raise ValueError("Failed to load image")

    # Clamp blur radius
    blur_radius = max(1, min(50, blur_radius))

    # Apply blur
    blurred_image = image_url.filter(ImageFilter.GaussianBlur(radius=blur_radius))
    return blurred_image


# Example usage
result_url = blur_image_tool(
    image_url="s3://nexent/images/input.png",
    blur_radius=10,
)
# result_url is something like "s3://nexent/attachments/xxx.png"
```
