Open
Labels
enhancement (New feature or request)
Description
What
Add a YouTubeFetcher that matches youtube.com/watch?v={id} and youtu.be/{id} URLs, returning video metadata and transcript text.
Why
Agents can't watch video, but they frequently encounter YouTube links in research, documentation, and discussions. Extracting the transcript turns video content into LLM-consumable text. This is a differentiator: most tools simply return the noisy YouTube HTML page.
Requirements
- Match: https://youtube.com/watch?v={id}, https://www.youtube.com/watch?v={id}, https://youtu.be/{id}
- Return: title, channel name, description, duration, view count, publish date, tags
- Extract transcript/captions via timedtext API (when available)
- Indicate transcript language and whether auto-generated
- Format field: "youtube_video"
- Handle videos without transcripts gracefully (return metadata only)
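The matching and output requirements above could be sketched roughly as follows. This is a minimal illustration, not the project's actual fetcher interface: `extract_video_id` and `build_result` are hypothetical names, the 11-character ID pattern is an assumption based on commonly observed IDs, and accepting `http` alongside `https` is a choice the requirements do not state.

```python
from __future__ import annotations

import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from [A-Za-z0-9_-].
# The issue does not specify an ID format.
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID for the URL shapes in the requirements, else None."""
    parsed = urlparse(url)
    # Assumption: plain http is accepted as well as https.
    if parsed.scheme not in ("http", "https"):
        return None
    host = (parsed.hostname or "").lower()
    if host in ("youtube.com", "www.youtube.com") and parsed.path == "/watch":
        # Extra query parameters (e.g. a timestamp) are ignored.
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    elif host == "youtu.be":
        candidate = parsed.path.lstrip("/")
    else:
        return None
    return candidate if _VIDEO_ID.fullmatch(candidate) else None


def build_result(metadata: dict, transcript: Optional[dict] = None) -> dict:
    """Assemble the output; transcript stays None when no captions exist."""
    return {**metadata, "format": "youtube_video", "transcript": transcript}
```

Returning `None` for unmatched URLs would let a dispatcher fall through to other fetchers, and a `None` transcript covers the metadata-only case.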
Design Notes
- YouTube oEmbed API provides basic metadata: https://www.youtube.com/oembed?url={url}
- Transcript extraction is the key differentiator; investigate timedtext/innertube APIs
- Similar dual-API strategy to existing TwitterFetcher (primary + fallback)
- Transcripts can be long; consider truncation with truncated: true for very long videos
- No official public transcript API; may need to parse from page data or use undocumented endpoints
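A rough sketch of how these notes might fit together, kept free of network calls so it stays independent of any undocumented endpoint. Only the oEmbed URL comes from the notes above. The helper names, the `MAX_TRANSCRIPT_CHARS` limit, and the shape of the transcript sources are all hypothetical, and the real TwitterFetcher fallback logic may differ.

```python
from __future__ import annotations

from typing import Callable, Optional, Sequence
from urllib.parse import urlencode

OEMBED_ENDPOINT = "https://www.youtube.com/oembed"  # from the design notes
MAX_TRANSCRIPT_CHARS = 50_000  # hypothetical limit; tune as needed

# A transcript source takes a video ID and returns text, or None if unavailable.
TranscriptSource = Callable[[str], Optional[str]]


def oembed_url(video_url: str) -> str:
    """Build the oEmbed request URL with the video URL percent-encoded."""
    return f"{OEMBED_ENDPOINT}?{urlencode({'url': video_url})}"


def truncate_transcript(text: str, limit: int = MAX_TRANSCRIPT_CHARS) -> dict:
    """Cap transcript length and flag whether anything was cut."""
    if len(text) <= limit:
        return {"text": text, "truncated": False}
    return {"text": text[:limit], "truncated": True}


def first_successful(
    sources: Sequence[TranscriptSource], video_id: str
) -> Optional[str]:
    """Try each source in order (primary first, then fallbacks)."""
    for source in sources:
        try:
            text = source(video_id)
        except Exception:
            continue  # a failing source just means "try the next one"
        if text:
            return text
    return None  # caller returns metadata only
```

Passing the sources as plain callables keeps the primary and fallback strategies swappable, which matters here because any transcript endpoint may change without notice.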
Tier
3 — Differentiated capability