
Feature: Programmatic motion-graphics video pipeline (HTML/GSAP → MP4 + GitTools + RenderBackendProtocol) #26

Description

Overview

Add a programmatic motion-graphics video pipeline to praisonai_tools so users can turn natural-language prompts into short explainer MP4s (algorithms, concepts, code walkthroughs) without depending on paid generative video APIs (Sora/Veo/Runway). The pipeline is an agent-centric team — a coordinator routes to specialists that research, read code, author an HTML/CSS/JS (GSAP-style) composition, and render to MP4 via a headless browser, iterating on render failures.

This extends the existing praisonai_tools/video/ module with a code-based, deterministic, open-source rendering path that complements the generative VideoAgent in praisonaiagents.

End state:

from praisonai_tools.video.motion_graphics import (
    create_motion_graphics_agent,
    motion_graphics_team,
)
from praisonai_tools.tools.git_tools import GitTools

team = motion_graphics_team()  # researcher + code_explorer + animator
team.start("Animate Dijkstra's algorithm on a small weighted graph, 30s.")
# => Returns rendered MP4 bytes + path; streamed/inline in UI.
# CLI equivalent (wrapper repo):
# praisonai video "Explain CAP theorem with a worked partition example" --duration 45

Background

What this enables

A coordinator team routes every substantive request to one of three specialists:

  • Animator — owns the authoring loop. Writes a single-file HTML composition with inline CSS + a GSAP paused timeline, runs a lint pass, renders to MP4 via a headless browser, and iterates (up to N attempts) on render failures. Returns an MP4 inline so the web UI can <video> it.
  • CodeExplorer — on-demand git clone (shorthand owner/repo or full URL), read-only inspection (read_file, grep, find, git_log, git_blame, git_diff, git_show). Never writes. Path-escape protected.
  • Researcher — optional web-search specialist gated on an API key; compacts findings into a brief the Animator can turn into on-screen captions.

Design characteristics worth applying:

  1. Animator's skill is a compact inlined authoring guide (~4KB in the system prompt) — not a large external skill file. Kept deliberately small because large skill docs confuse the agent into outputting HTML-as-text instead of calling tools.
  2. Render tool returns Video(content=bytes, mime_type=…) so the SSE stream carries a self-contained blob the UI renders inline.
  3. Anthropic prompt caching on the stable system block — Animator iterates 6–10 turns per run, so cache_control: ephemeral makes every turn after the first ~10% cost.
  4. Team leader has strict output-validation rules: "A render succeeded ONLY IF reply contains a concrete /renders/... path AND no error indicators. Never fabricate a file path."
  5. Bounded retry budget with escalation to user on failure.
  6. Session-scoped scratch directories (/renders, /repos).
  7. Agentic memory captures user style preferences across sessions.
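
Item 4's output-validation rule can be sketched as a simple reply check. The function name, path regex, and error markers below are illustrative assumptions, not taken from the codebase:

```python
import re

# Assumed error indicators; the real leader prompt would enumerate its own list.
ERROR_MARKERS = ("traceback", "error", "render failed", "stderr:")

def validate_render_reply(reply: str) -> bool:
    """Accept a render only if the reply names a concrete /renders/... file
    AND contains no error indicators; otherwise treat it as a failure."""
    has_path = re.search(r"/renders/[\w./-]+\.(?:mp4|webm|mov)", reply) is not None
    has_error = any(marker in reply.lower() for marker in ERROR_MARKERS)
    return has_path and not has_error
```

A prose-only reply ("I produced a beautiful animation!") fails the path check, and a reply that names a path but also reports a failure is rejected too, which is exactly the fabrication case the guard exists for.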

Why this matters

  • praisonai_tools/video/ already ships probe_video, transcribe_video, create_edit_plan, render_video, edit_video (ffmpeg-based). A code-based motion-graphics authoring path is the natural next addition — it completes the video toolkit.
  • VideoAgent in core only wraps generative video APIs ($$/clip, non-deterministic). A programmatic motion-graphics pipeline is zero-cost-per-render and fully deterministic.
  • Building blocks are reusable across other features: GitTools for code-aware agents, RenderBackendProtocol for future Manim/Remotion adapters.

Placement decision (important — don't deviate)

After repeated audit:

What does NOT belong in Core SDK

  • RenderBackendProtocol — no core consumer. Core protocols (MemoryProtocol, SessionStoreProtocol, ApprovalProtocol) exist because Agent itself consumes them. Nothing in praisonaiagents/ renders video. Plugin contract belongs with its consumer.
  • GitTools — concrete subprocess impl, not a protocol. Existing github_tools.py / shell_tools.py in core are legacy; don't extend that pattern.
  • MotionGraphicsAgent class — an Animator is just Agent(instructions=..., tools=[...]). Factory function suffices.

Correct placement — zero new files in Core SDK

PraisonAI-Tools (all heavy impl, extending existing video/):

praisonai_tools/video/motion_graphics/
├── __init__.py
├── protocols.py          # RenderBackendProtocol, RenderOpts, RenderResult, LintResult
├── backend_html.py       # Playwright + ffmpeg impl of RenderBackendProtocol
├── agent.py              # create_motion_graphics_agent() factory
├── skill.py              # compact HTML/GSAP authoring skill (string constant)
├── team.py               # motion_graphics_team() preset (returns AgentTeam)
└── _render_loop.py       # retry helper (internal)

praisonai_tools/tools/
└── git_tools.py          # GitTools toolkit (read-only, on-demand clone)

Wrapper (praisonai) — thin CLI glue only:

praisonai/cli/commands/video.py   # `praisonai video "..."` delegates to praisonai_tools

Core SDK (praisonaiagents) — zero changes.

Architecture Analysis

Existing praisonai_tools/video/

File           Purpose
probe.py       Video metadata extraction
transcribe.py  Audio transcription w/ word-level timestamps
plan.py        LLM-based edit planning
render.py      FFmpeg rendering
pipeline.py    End-to-end edit orchestration
__main__.py    CLI

New motion-graphics code lives as a sibling package under video/ to keep the ffmpeg-based editing path separate from the HTML/GSAP authoring path while sharing the namespace.

Related agents in Core SDK (no changes required)

File                                  Purpose
praisonaiagents/agent/video_agent.py  Generative video (Sora/Veo/Runway) — polling-based
praisonaiagents/agent/image_agent.py  Generative image
praisonaiagents/agents/agents.py      AgentTeam orchestrator

Gap Analysis

Gap                                                   Impact                                           Effort
No programmatic/code-based video generation           Users pay per-clip; no deterministic rendering   Medium
No GitTools (clone-on-demand + read-only git ops)     Agents can't reason over arbitrary repos safely  Low
No render-backend protocol                            Can't plug future engines (Manim, Remotion)      Low
No render-iterate bounded-retry helper                Every code-author-render agent reinvents it      Low
No team preset (research → code → animate → render)   Users wire it by hand each time                  Low

Proposed Implementation

Phase 1 — Minimal

1. RenderBackendProtocol in praisonai_tools/video/motion_graphics/protocols.py

from typing import Literal, Protocol, runtime_checkable
from dataclasses import dataclass
from pathlib import Path

Quality = Literal["draft", "standard", "high"]
Format = Literal["mp4", "webm", "mov"]

@dataclass
class RenderOpts:
    output_name: str = "video.mp4"
    fps: int = 30
    quality: Quality = "standard"
    format: Format = "mp4"
    strict: bool = False
    timeout: int = 300

@dataclass
class LintResult:
    ok: bool
    messages: list[str]
    raw: str = ""

@dataclass
class RenderResult:
    ok: bool
    output_path: Path | None
    bytes_: bytes | None
    stderr: str = ""
    size_kb: int = 0

@runtime_checkable
class RenderBackendProtocol(Protocol):
    async def lint(self, workspace: Path, strict: bool = False) -> LintResult: ...
    async def render(self, workspace: Path, opts: RenderOpts) -> RenderResult: ...
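
Because the protocol is @runtime_checkable, any object with matching method names passes isinstance, no inheritance required. A self-contained sketch (NullBackend is illustrative; note that runtime checks verify method presence only, not signatures or return types):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class RenderBackendProtocol(Protocol):
    async def lint(self, workspace, strict=False): ...
    async def render(self, workspace, opts): ...

class NullBackend:
    """Structural match: has lint() and render(), never subclasses the protocol."""
    async def lint(self, workspace, strict=False):
        return {"ok": True, "messages": []}
    async def render(self, workspace, opts):
        return {"ok": False, "stderr": "not implemented"}

class NotABackend:
    pass

assert isinstance(NullBackend(), RenderBackendProtocol)
assert not isinstance(NotABackend(), RenderBackendProtocol)
```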

2. HtmlRenderBackend in praisonai_tools/video/motion_graphics/backend_html.py

Playwright + ffmpeg implementation. Runs Chromium headless, loads workspace/index.html, reads window.__timelines, drives a paused GSAP timeline frame-by-frame via page.evaluate, pipes frames to ffmpeg for MP4 encoding. Workspace-escape protected. External network fetches blocked except an allowlisted GSAP CDN.

3. GitTools in praisonai_tools/tools/git_tools.py

from praisonai_tools.tools.git_tools import GitTools

tools = GitTools(base_dir="/tmp/repos")
# Methods: clone_repo, list_repos, repo_summary, git_log, git_diff,
# git_blame, git_show, git_branches, get_github_remote
  • Accepts owner/repo, https://…, git@…
  • PAT via GITHUB_ACCESS_TOKEN
  • No shell pass-through; predefined git subcommands only
  • Path-escape protected via is_relative_to
  • Idempotent: clone-or-pull
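
The last two safety bullets can be sketched as follows. resolve_in_base, run_git, and the whitelist contents are assumed shapes for illustration, not the actual GitTools internals:

```python
import subprocess
from pathlib import Path

def resolve_in_base(base_dir: str, relative: str) -> Path:
    """Path-escape guard via is_relative_to (Python 3.9+): resolve the
    requested path and refuse anything that lands outside base_dir."""
    base = Path(base_dir).resolve()
    target = (base / relative).resolve()
    if not target.is_relative_to(base):
        raise PermissionError(f"path escapes {base_dir}: {relative}")
    return target

# Assumed whitelist; the real toolkit exposes its own fixed method set.
ALLOWED_SUBCOMMANDS = {"log", "diff", "blame", "show", "branch"}

def run_git(repo_dir: str, subcommand: str, *args: str) -> str:
    """No-shell git runner: list-form argv (shell=False by default) plus a
    subcommand whitelist, matching the 'predefined subcommands only' rule."""
    if subcommand not in ALLOWED_SUBCOMMANDS:
        raise ValueError(f"git subcommand not allowed: {subcommand}")
    proc = subprocess.run(
        ["git", "-C", repo_dir, subcommand, *args],
        capture_output=True, text=True, check=True,
    )
    return proc.stdout
```

Passing argv as a list means a hostile filename or ref can never be interpreted by a shell, and the whitelist means write operations (push, reset, clean) are unreachable by construction.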

4. create_motion_graphics_agent() factory in praisonai_tools/video/motion_graphics/agent.py

def create_motion_graphics_agent(
    *,
    backend: RenderBackendProtocol | str = "html",
    workspace: str | Path = "./renders",
    max_retries: int = 3,
    llm: str = "claude-opus-4-7",
    **agent_kwargs,
) -> Agent:
    """Factory — returns a standard Agent wired for motion-graphics authoring."""
    return Agent(
        instructions=_BASE_INSTRUCTIONS + "\n" + _MOTION_GRAPHICS_SKILL,
        tools=[
            FileTools(base_dir=workspace),
            RenderTools(backend=_resolve_backend(backend), workspace=workspace),
        ],
        llm=llm,
        **agent_kwargs,
    )

5. Inlined skill in praisonai_tools/video/motion_graphics/skill.py

Compact ~4KB authoring guide: HTML/GSAP example, required data-* attributes, timeline contract, scene rules, hard don'ts (no Math.random, no repeat: -1, no visibility/display animation, etc.), layout approach.
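
The lint pass can enforce the hard don'ts statically before any render is attempted. A minimal sketch, with patterns assumed from the list above (the real rule set would live alongside skill.py):

```python
import re

# Assumed lint rules derived from the skill's hard don'ts.
HARD_DONTS = [
    (re.compile(r"Math\.random\s*\("), "no Math.random (renders must be deterministic)"),
    (re.compile(r"repeat:\s*-1"), "no infinite repeat (the render must terminate)"),
]

def lint_composition(source: str) -> list[str]:
    """Return the hard-don't violations found in an HTML/JS composition."""
    return [msg for pattern, msg in HARD_DONTS if pattern.search(source)]
```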

6. motion_graphics_team() preset in praisonai_tools/video/motion_graphics/team.py

def motion_graphics_team(
    *,
    research: bool = True,
    code_exploration: bool = True,
    backend: str = "html",
) -> AgentTeam:
    """Preset: Coordinator + Researcher (optional) + CodeExplorer + Animator."""

Leader instructions include output-validation guard ("require concrete file path; never fabricate").

7. render_iterate() helper in praisonai_tools/video/motion_graphics/_render_loop.py

async def render_iterate(
    write_fn, lint_fn, render_fn, patch_fn, max_retries: int = 3,
) -> RenderResult:
    """Bounded write → lint → render → patch loop. Surfaces last stderr on failure."""
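
One possible shape for the loop body, assuming lint_fn/render_fn return dicts with ok/messages/stderr keys (the real helper would return the LintResult/RenderResult dataclasses from protocols.py):

```python
async def render_iterate(write_fn, lint_fn, render_fn, patch_fn, max_retries=3):
    """Bounded write → lint → render → patch loop; surfaces last stderr."""
    await write_fn()                      # initial authoring pass
    last_err = ""
    for _ in range(max_retries):
        lint = await lint_fn()
        if not lint["ok"]:                # lint failure: patch and retry
            last_err = "\n".join(lint["messages"])
            await patch_fn(last_err)
            continue
        result = await render_fn()
        if result["ok"]:
            return result                 # success: stop immediately
        last_err = result["stderr"]       # render failure: patch and retry
        await patch_fn(last_err)
    return {"ok": False, "stderr": last_err}  # budget exhausted: stop, don't loop
```

The key property is the acceptance criterion further down: after max_retries failures the helper stops and hands the last stderr to the agent rather than looping.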

8. CLI in praisonai/cli/commands/video.py (wrapper repo — separate PR)

praisonai video "Animate Dijkstra's algorithm" --duration 30 --backend html
praisonai video "Animate Team.coordinate in agno-agi/agno" --with-code-explorer

Phase 2 — Production hardening

  • ManimBackend adapter (math-first animations) — pluggable via RenderBackendProtocol
  • RemotionBackend adapter (React-based)
  • Docker image with Node + FFmpeg + Chromium baked in
  • Agentic-memory preset for user style preferences
  • Anthropic prompt-cache verification + telemetry

Files to Create / Modify

New files — MervinPraison/PraisonAI-Tools

File                                                    Purpose
praisonai_tools/video/motion_graphics/__init__.py       Lazy exports
praisonai_tools/video/motion_graphics/protocols.py      RenderBackendProtocol + dataclasses
praisonai_tools/video/motion_graphics/backend_html.py   Playwright + ffmpeg backend
praisonai_tools/video/motion_graphics/agent.py          create_motion_graphics_agent() factory
praisonai_tools/video/motion_graphics/skill.py          Compact HTML/GSAP authoring skill (string)
praisonai_tools/video/motion_graphics/team.py           motion_graphics_team() preset
praisonai_tools/video/motion_graphics/_render_loop.py   Retry helper
praisonai_tools/tools/git_tools.py                      GitTools toolkit
tests/unit/video/test_motion_graphics_protocols.py      Protocol conformance tests
tests/unit/video/test_html_backend.py                   Playwright+ffmpeg pipeline tests
tests/unit/video/test_motion_graphics_agent.py          Factory + render-iterate loop tests
tests/unit/tools/test_git_tools.py                      Clone, path-escape, read-only tests
tests/integration/test_motion_graphics_team.py          Team preset integration
tests/smoke/test_motion_graphics_smoke.py               Real agentic smoke test
examples/motion_graphics_example.py                     Minimal working example
examples/motion_graphics_team.yaml                      YAML preset

Modified files — MervinPraison/PraisonAI-Tools

File                               Change
praisonai_tools/video/__init__.py  Re-export motion_graphics submodule lazily
praisonai_tools/__init__.py        Register new lazy exports
pyproject.toml                     Add [video-motion] optional extra: playwright>=1.40, imageio-ffmpeg>=0.5

Wrapper (MervinPraison/PraisonAI) — separate follow-up issue

File                             Change
praisonai/cli/commands/video.py  New CLI subcommand
praisonai/cli/main.py            Wire video subcommand

Technical Considerations

Dependencies (all optional + lazy)

[project.optional-dependencies]
video-motion = ["playwright>=1.40", "imageio-ffmpeg>=0.5"]
  • Playwright needs playwright install chromium post-install
  • FFmpeg via imageio-ffmpeg (bundles binary) avoids system ffmpeg dependency
  • subprocess + system git for GitTools — no new Python deps
  • No new Core SDK dependencies

Safety / approval

  • GitTools.clone_repo must honour ApprovalRegistry from praisonaiagents.approval when an approval backend is configured (network + filesystem write)
  • Render output path confined to workspace via is_relative_to check
  • No shell pass-through in GitTools; all git ops are pre-defined subcommands
  • HtmlRenderBackend runs Chromium with:
    • --disable-dev-shm-usage, with the Chromium sandbox left enabled (i.e. no --no-sandbox)
    • Network request interceptor that blocks everything except allowlisted GSAP CDN
    • Workspace-scoped filesystem access
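
The interceptor's allow/deny decision might look like this (the CDN hostnames are assumptions; the actual allowlist would live in backend_html.py):

```python
from urllib.parse import urlparse

# Assumed allowlist of GSAP CDN hosts; the real backend defines its own.
GSAP_CDN_HOSTS = {"cdn.jsdelivr.net", "cdnjs.cloudflare.com", "unpkg.com"}

def allow_request(url: str) -> bool:
    """Interceptor policy sketch: local workspace files plus an allowlisted
    GSAP CDN over HTTPS; everything else gets aborted."""
    parts = urlparse(url)
    if parts.scheme == "file":
        return True
    return parts.scheme == "https" and parts.hostname in GSAP_CDN_HOSTS
```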

Multi-agent safety

  • Each agent gets its own workspace subdirectory (keyed by agent name or run id)
  • GitTools uses per-instance base_dir
  • Render subprocesses use per-run working dirs
  • No shared mutable state between agents
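
The per-agent isolation can be as simple as a root/agent/run directory layout (the naming scheme here is an assumption, not the shipped convention):

```python
import tempfile
from pathlib import Path

def agent_workspace(root: str, agent_name: str, run_id: str) -> Path:
    """Create and return a scratch directory scoped to one agent and one run,
    so concurrent agents never share a mutable directory."""
    ws = Path(root) / agent_name / run_id
    ws.mkdir(parents=True, exist_ok=True)
    return ws

# Demo: two agents in the same run get disjoint directories.
demo_root = tempfile.mkdtemp()
animator_ws = agent_workspace(demo_root, "animator", "run-1")
explorer_ws = agent_workspace(demo_root, "code_explorer", "run-1")
```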

Performance

  • praisonai_tools import time: unchanged (all new modules lazy via __getattr__)
  • Playwright/ffmpeg only loaded on first backend.render() call
  • GitTools only loads when instantiated
  • Core SDK import time: unchanged (zero touches)

Acceptance Criteria

  • from praisonai_tools.video.motion_graphics import RenderBackendProtocol, create_motion_graphics_agent, motion_graphics_team works
  • from praisonai_tools.tools.git_tools import GitTools works; clone, path-escape, read-only verified
  • HtmlRenderBackend implements RenderBackendProtocol (runtime-checkable isinstance passes)
  • Agent returned by factory can author → lint → render → iterate within max_retries
  • motion_graphics_team() runs end-to-end on a real prompt and returns an MP4 path + inline bytes
  • praisonai_tools import time unchanged (verify < 200 ms — matches current baseline)
  • Leader output-validation guard prevents fabricated file paths (unit test with mock Animator returning prose-only)
  • Bounded retry: after N failures, agent surfaces last stderr and stops (does not loop)
  • Rendered MP4 surfaces inline in streaming UI (Chainlit) — smoke test
  • Example: examples/motion_graphics_example.py runs against real LLM and produces MP4
  • Zero changes to praisonaiagents core SDK
  • New [video-motion] extra documented; installable via pip install praisonai-tools[video-motion]
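
The import-time criterion could be probed roughly like this; a real CI check would spawn a fresh interpreter per measurement so already-cached modules don't skew the number:

```python
import importlib
import time

def import_time_ms(module_name: str) -> float:
    """One-shot, in-process import timing (illustrative; biased low for any
    module the interpreter has already loaded)."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return (time.perf_counter() - start) * 1000.0
```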

Implementation Notes

Key files to read first

  1. praisonai_tools/video/render.py — existing ffmpeg-based render (pattern to match)
  2. praisonai_tools/video/pipeline.py — end-to-end pipeline pattern
  3. praisonai_tools/video/__init__.py — lazy export pattern
  4. praisonaiagents/agent/video_agent.py — specialized agent pattern (reference only)
  5. praisonaiagents/agents/agents.py — AgentTeam construction
  6. praisonaiagents/approval/ — approval registry for gated operations

Design principles (must hold)

  • Zero changes to praisonaiagents core SDK
  • Protocol co-located with its consumer — RenderBackendProtocol in the same module as the agent that uses it
  • Lazy: Playwright/FFmpeg never imported at module load
  • Agent-centric: factory function, not new Agent subclass — reuse Agent() + tools + instructions
  • DRY: reuse existing praisonai_tools/video/ namespace, Video media class, AgentTeam, approval registry
  • Safe by default: workspace escape prevention, bounded retries, no shell in GitTools

Testing commands

# Unit
pytest tests/unit/video/test_motion_graphics_protocols.py -v
pytest tests/unit/video/test_html_backend.py -v
pytest tests/unit/video/test_motion_graphics_agent.py -v
pytest tests/unit/tools/test_git_tools.py -v

# Integration
pytest tests/integration/test_motion_graphics_team.py -v

# Smoke (real LLM)
python examples/motion_graphics_example.py

# Verify zero core changes
cd /Users/praison/praisonai-package/src/praisonai-agents && git diff --stat
# Expected: no changes

Non-goals (v1)

  • Long-form video (>2 minutes)
  • GPU-accelerated effects
  • Audio synthesis (reuse AudioAgent separately)
  • WYSIWYG editor
  • Non-deterministic frame interpolation

References

  • Existing praisonai_tools/video/ — the namespace this extends
  • praisonaiagents/agent/video_agent.py — generative video peer
  • praisonaiagents/agents/agents.py — AgentTeam
  • praisonaiagents/approval/ — approval protocol
  • GSAP authoring library: https://gsap.com
  • Playwright Python: https://playwright.dev/python/

Related tracking issue (now superseded by this one for implementation): MervinPraison/PraisonAI#1452.
