Commit ccce9bf

Merge pull request #68 from jgwill/claude/langfuse-media-support-01DKbZAHAypsV2cbMgs2Joqf
Claude/langfuse media support 01 d kb zah ayps v2cb mgs2 joqf
2 parents 4fe73fe + ad577a7 commit ccce9bf

21 files changed: +18532 -10 lines changed
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
. _env.sh
export session_id__parent_jgwill_Miadi_STCMastery_Claude_Agent_SDK_and_PR87_2510311507
export session_id__issue_65__Medias_Support_2511161357

claude " in @references/openapi.json you will find the media endpoints (operations) we will want both @coaiapy/cofuse.py and the WHOLE CLI wrapper exposed in @pyproject.yaml and the @coaiapy-mcp/ to support the medias operations. we will also need to make sure that we have clear testing for that, the @tests/.env has both Langfuse and redis environment for you to prepare adequate testing ground.
@tests/image_medias.txt @tests/dropbox_shared.txt are potential medias URL to embed or something, I never achived to make this working so... there is also : tests/notebook_graph.jpg that I dont know if we can upload a jpg to langfuse, you will investigate that.

the ./references/ has other files that might help you.

Analyze first before you prepare your plan.

You might also plan at how this will fits in the 'coaia pipeline' subcommands as we might be adding medias somehow in a sequence of action that are part of the pipeline.

ADDITIONAL INFO:

the tests/dropbox_shared.txt was created with the CLI command 'droxul' using 'droxul upload <file.jpg> /<derired_full_path_on_dropbox>' then 'droxul share <derired_full_path_on_dropbox>' which outputs that shared URL which we will want to know if they can be use within a Trace where we probably add a media somewhere in the trace, I dont know if that is in an observation or if that it has its own media entity, you will make this happen and known.

You will add a new trace using the 'coaiapy_aetherial' (that uses this very tool that you are working on) and in the Input is the context, my request etc and you will hopefully Patch the trace 'Output' at the end and during your process, you will add Observations to that trace_id : $session_id__issue_65__Medias_Support_2511161357 within the session_id : $session_id__parent_jgwill_Miadi_STCMastery_Claude_Agent_SDK_and_PR87_2510311507

" \
  --session-id ${session_id__issue_65__Medias_Support_2511161357} \
  --mcp-config /src/.mcp.coaiapy.env.aetherial.json /src/.mcp.github.json \
  --permission-mode plan

MEDIA_UPLOAD_GUIDE.md

Lines changed: 373 additions & 0 deletions
@@ -0,0 +1,373 @@
# Langfuse Media Upload Guide

## Overview

CoaiaPy now supports uploading media files (images, videos, audio, documents) to Langfuse traces and observations with automatic token attachment for inline rendering in the Langfuse UI.

## Quick Start

### CLI Usage

```bash
# Upload image to trace input
coaia fuse media upload photo.jpg trace_abc123

# Upload video to observation output
coaia fuse media upload video.mp4 trace_abc123 \
  --observation-id obs_456 \
  --field output

# Upload audio with explicit content type
coaia fuse media upload recording.wav trace_abc123 \
  --content-type audio/wav \
  --field metadata

# Get media details
coaia fuse media get media_xyz789
```

### Python API Usage

```python
from coaiapy.cofuse import upload_and_attach_media, get_media, format_media_display

# Upload image to trace
result = upload_and_attach_media(
    file_path="screenshot.png",
    trace_id="trace_abc123",
    field="input"
)

if result["success"]:
    print(f"✅ Media ID: {result['media_id']}")
    print(f"📎 Token: {result['media_token']}")
    print(format_media_display(result['media_data']))
else:
    print(f"❌ Error: {result['error']}")

# Upload video to observation
result = upload_and_attach_media(
    file_path="demo.mp4",
    trace_id="trace_abc123",
    observation_id="obs_456",
    field="output"
)
```

### MCP Tool Usage

```python
# In MCP client/server
result = await coaia_fuse_media_upload(
    file_path="diagram.pdf",
    trace_id="trace_abc123",
    field="input"
)

# Get media details
media = await coaia_fuse_media_get(
    media_id=result["media_id"]
)
```

## Complete Workflow

The media upload process follows these steps:

### 1. File Validation
- Validates file exists
- Auto-detects MIME type from extension
- Validates against 52 supported content types
- Calculates SHA-256 hash for deduplication
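
The validation step above can be sketched roughly as follows. This is a minimal illustration, not the code in `coaiapy/cofuse.py`: the function name `validate_media_file` and the shortened `SUPPORTED_TYPES` set are hypothetical, and only the standard library is used.

```python
import hashlib
import mimetypes
from pathlib import Path

# Hypothetical subset for illustration; the guide lists 52 supported types.
SUPPORTED_TYPES = {"image/jpeg", "image/png", "video/mp4", "audio/mpeg", "application/pdf"}


def validate_media_file(path, content_type=None):
    """Return (content_type, sha256_hex) or raise if the file is unusable."""
    file_path = Path(path)
    if not file_path.is_file():
        raise FileNotFoundError(f"File not found: {path}")

    # Guess the MIME type from the extension unless the caller supplied one.
    detected = content_type or mimetypes.guess_type(file_path.name)[0]
    if detected not in SUPPORTED_TYPES:
        raise ValueError(f"Unsupported content type: {detected}")

    # Hash in chunks so large videos are not read into memory at once.
    digest = hashlib.sha256()
    with file_path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return detected, digest.hexdigest()
```

Passing `content_type` explicitly skips the extension lookup, which mirrors the `--content-type` flag shown in the CLI examples.
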

### 2. Upload Initialization
- POST to `/api/public/media`
- Receives `mediaId` and presigned S3 `uploadUrl`

### 3. File Upload
- PUT file to presigned S3 URL
- **Security**: Validates URL domain is from trusted cloud storage providers:
  - AWS S3 (`amazonaws.com`, `s3.amazonaws.com`)
  - Google Cloud Storage (`storage.googleapis.com`)
  - Azure Blob Storage (`blob.core.windows.net`)
  - Cloudflare R2 (`r2.cloudflarestorage.com`)
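
One way to implement the domain check described above is a suffix match on the parsed hostname. The sketch below is an assumption about how such a check could look, not the project's actual code; `is_trusted_upload_url` is a hypothetical name, and requiring `https` is an extra assumption the guide does not state.

```python
from urllib.parse import urlparse

# Domain suffixes taken from the list above.
TRUSTED_SUFFIXES = (
    "amazonaws.com",
    "storage.googleapis.com",
    "blob.core.windows.net",
    "r2.cloudflarestorage.com",
)


def is_trusted_upload_url(url):
    """Return True if the URL host is a trusted domain or a subdomain of one."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    # Assumption: presigned upload URLs are always served over HTTPS.
    if parsed.scheme != "https" or not host:
        return False
    # Match on a dot boundary so "evilamazonaws.com" does not pass.
    return any(host == suffix or host.endswith("." + suffix) for suffix in TRUSTED_SUFFIXES)
```

Matching on the hostname rather than the raw URL string avoids being fooled by a trusted domain that appears only in the path or query string.
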

### 4. Status Update
- PATCH to `/api/public/media/{mediaId}`
- Reports upload success/failure
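
To make steps 2 and 4 more concrete, here is a hedged sketch of the JSON bodies the two requests might carry. The helper names are hypothetical. The initialization fields reuse the keys shown in this guide's `media_data` response, while `uploadHttpStatus`, `uploadHttpError`, `uploadTimeMs`, and the base64 encoding of `sha256Hash` are assumptions that should be checked against `references/openapi.json` before relying on them.

```python
import base64
import hashlib
from datetime import datetime, timezone


def build_init_payload(data, content_type, trace_id, field, observation_id=None):
    """Body for POST /api/public/media (field names are assumptions)."""
    payload = {
        "traceId": trace_id,
        "contentType": content_type,
        "contentLength": len(data),
        # Assumption: the API expects the raw SHA-256 digest, base64-encoded.
        "sha256Hash": base64.b64encode(hashlib.sha256(data).digest()).decode("ascii"),
        "field": field,
    }
    if observation_id is not None:
        payload["observationId"] = observation_id
    return payload


def build_status_payload(http_status, upload_time_ms, error=None):
    """Body for PATCH /api/public/media/{mediaId} (field names are assumptions)."""
    payload = {
        "uploadedAt": datetime.now(timezone.utc).isoformat(),
        "uploadHttpStatus": http_status,
        "uploadTimeMs": int(upload_time_ms),
    }
    if error:
        payload["uploadHttpError"] = error
    return payload
```

Keeping payload construction separate from the HTTP calls makes both requests easy to unit-test without network access.
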

### 5. Token Attachment (NEW!)
- Generates Langfuse Media Token:
  ```
  @@@langfuseMedia:type={MIME_TYPE}|id={MEDIA_ID}|source=file@@@
  ```
- Attaches token to trace or observation field
- Enables inline rendering in Langfuse UI
107+
## Langfuse Media Token Format
108+
109+
The media token is a standardized string that Langfuse UI automatically detects and renders:
110+
111+
```
112+
@@@langfuseMedia:type={MIME_TYPE}|id={MEDIA_ID}|source={SOURCE_TYPE}@@@
113+
```
114+
115+
### Components
116+
- **MIME_TYPE**: Content type (e.g., `image/jpeg`, `video/mp4`, `audio/mp3`)
117+
- **MEDIA_ID**: Langfuse media ID from upload (e.g., `media_xyz789`)
118+
- **SOURCE_TYPE**: Source of media - `file`, `base64_data_uri`, or `bytes`
119+
120+
### Examples
121+
```
122+
@@@langfuseMedia:type=image/png|id=media_abc123|source=file@@@
123+
@@@langfuseMedia:type=video/mp4|id=media_xyz789|source=file@@@
124+
@@@langfuseMedia:type=application/pdf|id=media_def456|source=file@@@
125+
```
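
Because the token is plain text with a fixed layout, it can be built with string formatting and recovered with a regular expression. The sketch below illustrates that; `build_media_token` and `parse_media_tokens` are hypothetical helpers and are not the `create_langfuse_media_token` function shipped in `coaiapy.cofuse`.

```python
import re

# Mirrors the documented layout: type, id and source separated by "|".
TOKEN_PATTERN = re.compile(
    r"@@@langfuseMedia:type=(?P<type>[^|]+)\|id=(?P<id>[^|]+)\|source=(?P<source>[^@]+)@@@"
)


def build_media_token(media_id, content_type, source="file"):
    """Format a media token string from its three components."""
    return f"@@@langfuseMedia:type={content_type}|id={media_id}|source={source}@@@"


def parse_media_tokens(text):
    """Return one dict per token found in text, with keys type, id and source."""
    return [match.groupdict() for match in TOKEN_PATTERN.finditer(text)]
```

A parser like this can help when debugging, for example to confirm that a trace field really contains the token that the upload reported.
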

## Supported Content Types (52 total)

### Images (16)
- image/jpeg, image/jpg, image/png, image/gif, image/webp
- image/bmp, image/tiff, image/svg+xml, image/heic, image/heif
- image/avif, image/x-icon, image/vnd.microsoft.icon
- image/apng, image/jxl, image/x-png

### Videos (13)
- video/mp4, video/mpeg, video/quicktime, video/x-msvideo
- video/x-ms-wmv, video/x-flv, video/webm, video/3gpp, video/3gpp2
- video/x-matroska, video/ogg, video/mp2t, video/x-m4v

### Audio (13)
- audio/mpeg, audio/mp3, audio/wav, audio/x-wav, audio/wave
- audio/ogg, audio/webm, audio/aac, audio/x-aac, audio/flac
- audio/x-flac, audio/mp4, audio/m4a

### Documents (7)
- application/pdf, application/msword
- application/vnd.openxmlformats-officedocument.wordprocessingml.document
- application/vnd.ms-excel
- application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
- application/vnd.ms-powerpoint
- application/vnd.openxmlformats-officedocument.presentationml.presentation

### Archives (3)
- application/zip, application/x-rar-compressed, application/x-7z-compressed

## Field Options

Media can be attached to three semantic fields:

- **input**: Media as input to the trace/observation
- **output**: Media as output from the trace/observation
- **metadata**: Media as metadata/context

## Response Format

Successful upload returns:

```python
{
    "success": True,
    "media_id": "media_xyz789",
    "media_token": "@@@langfuseMedia:type=image/jpeg|id=media_xyz789|source=file@@@",
    "media_data": {
        "id": "media_xyz789",
        "traceId": "trace_abc123",
        "observationId": None,
        "field": "input",
        "contentType": "image/jpeg",
        "contentLength": 193424,
        "sha256Hash": "a1b2c3...",
        "uploadedAt": "2025-11-22T12:34:56Z"
    },
    "message": "Successfully uploaded photo.jpg (193424 bytes)",
    "upload_time_ms": 1234.56
}
```

Error returns:

```python
{
    "success": False,
    "error": "File not found: missing.jpg"
}
```

## Troubleshooting

### Upload fails with "Security error: Upload URL domain..."

The presigned URL is not from a trusted cloud storage provider. This is a security measure to prevent data exfiltration. Contact your Langfuse administrator to verify the storage configuration.

**Trusted domains:**
- AWS S3: `amazonaws.com`, `s3.amazonaws.com`
- Google Cloud: `storage.googleapis.com`
- Azure: `blob.core.windows.net`
- Cloudflare R2: `r2.cloudflarestorage.com`

### Upload succeeds but media not visible in Langfuse UI

1. Verify the media token was attached to the correct field
2. Check the trace/observation in Langfuse UI
3. Ensure your Langfuse version supports media rendering
4. Use `get_media()` or the equivalent CLI command to verify the upload completed:
   ```bash
   coaia fuse media get media_xyz789
   ```

### File validation fails

Check:
- File exists at the specified path
- File extension matches a supported content type
- Content type is in the list of 52 supported types

### SHA-256 hash calculation fails

The file may be locked by another process or you may not have read permissions. Verify:
```bash
ls -la /path/to/file
```

## Advanced Usage

### Manual Token Generation

```python
from coaiapy.cofuse import create_langfuse_media_token

# Generate token manually
token = create_langfuse_media_token(
    media_id="media_xyz789",
    content_type="image/jpeg",
    source="file"
)
print(token)
# @@@langfuseMedia:type=image/jpeg|id=media_xyz789|source=file@@@
```

### Manual Token Attachment

```python
from coaiapy.cofuse import (
    create_langfuse_media_token,
    attach_media_token_to_trace,
    attach_media_token_to_observation
)

# Create token
token = create_langfuse_media_token("media_xyz789", "image/png")

# Attach to trace
attach_media_token_to_trace(
    trace_id="trace_abc123",
    media_token=token,
    field="output"
)

# Attach to observation
attach_media_token_to_observation(
    observation_id="obs_456",
    trace_id="trace_abc123",
    media_token=token,
    field="input"
)
```

## Best Practices

1. **Use appropriate fields**:
   - `input` for media provided TO the system
   - `output` for media generated BY the system
   - `metadata` for contextual media

2. **Let auto-detection work**: Don't specify `content_type` unless necessary

3. **Check return values**: Always verify `success` before using `media_id`

4. **Handle errors gracefully**: Upload failures should not crash your application

5. **Use SHA-256 deduplication**: Langfuse automatically deduplicates identical files
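
Practices 3 and 4 can be combined in a small wrapper. The following is only a sketch: `safe_upload` is a hypothetical helper, and it takes the upload function as a parameter so that it makes no assumptions about the signature of `upload_and_attach_media` beyond the `success`, `error`, and `media_id` keys documented in this guide.

```python
import logging

logger = logging.getLogger(__name__)


def safe_upload(upload_fn, **kwargs):
    """Call upload_fn(**kwargs) and return the media ID, or None on any failure."""
    try:
        result = upload_fn(**kwargs)
    except Exception as exc:  # e.g. network or permission errors
        logger.warning("Media upload raised: %s", exc)
        return None
    # Only trust media_id once the result reports success.
    if not result.get("success"):
        logger.warning("Media upload failed: %s", result.get("error"))
        return None
    return result["media_id"]
```

A call such as `safe_upload(upload_and_attach_media, file_path="photo.jpg", trace_id="trace_abc123", field="input")` would then log a warning and return `None` instead of interrupting the surrounding workflow.
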

## Examples

### Upload Screenshot from Test Run

```python
import os
from coaiapy.cofuse import add_trace, upload_and_attach_media

# Create trace
trace_id = "test-run-" + os.environ.get("CI_BUILD_ID", "local")
add_trace(
    trace_id=trace_id,
    name="E2E Test Run",
    input={"test_suite": "checkout_flow"}
)

# Attach failure screenshot
result = upload_and_attach_media(
    file_path="/tmp/failure_screenshot.png",
    trace_id=trace_id,
    field="output"
)

if result["success"]:
    print(f"Screenshot attached: {result['media_id']}")
```

### Upload Audio Recording

```python
from coaiapy.cofuse import add_observation, upload_and_attach_media

# Create observation
obs_id = "voice-input-001"
add_observation(
    observation_id=obs_id,
    trace_id="conversation-123",
    observation_type="SPAN",
    name="Voice Input Processing"
)

# Attach audio
result = upload_and_attach_media(
    file_path="user_recording.mp3",
    trace_id="conversation-123",
    observation_id=obs_id,
    field="input"
)
```

### Upload Multiple Images

```python
import glob

from coaiapy.cofuse import upload_and_attach_media

trace_id = "image-processing-001"
media_ids = []

for img_path in glob.glob("screenshots/*.png"):
    result = upload_and_attach_media(
        file_path=img_path,
        trace_id=trace_id,
        field="input"
    )

    if result["success"]:
        media_ids.append(result["media_id"])
        print(f"{img_path}: {result['media_id']}")
    else:
        print(f"{img_path}: {result['error']}")

print(f"\nUploaded {len(media_ids)} images")
```

## See Also

- [Langfuse Multi-Modality Documentation](https://langfuse.com/docs/observability/features/multi-modality)
- [Langfuse Public API Reference](https://langfuse.com/docs/api)
- CoaiaPy Media API: `coaiapy/cofuse.py` lines 3640-4350
- CLI Reference: `coaia fuse media --help`
