feat(cli): add video2yaml command to generate test scripts from video#2169

Open
quanru wants to merge 5 commits into main from
feat/video-to-script

quanru (Collaborator) commented Mar 17, 2026

Summary

  • Add midscene video2yaml CLI subcommand that generates runnable Midscene test scripts (YAML or Playwright) from screen recording videos
  • Extract frames from video using ffmpeg (npm package @ffmpeg-installer/ffmpeg with system fallback)
  • Send frames to VLM with optimized prompts that enforce Midscene constraints (no browser chrome interaction, proper URL extraction)
  • Support two output formats via --format yaml|playwright
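
A minimal sketch of the frame-extraction step, for illustration only. `buildExtractArgs` and `FrameExtractOptions` are hypothetical names, not taken from this PR; the ffmpeg options used (`-i`, `-vf fps=`, `-frames:v`, `-q:v`) are standard ffmpeg flags, but the actual arguments in `extract-frames.ts` may differ.

```typescript
// Hypothetical helper (not from the PR): builds ffmpeg arguments that
// sample a video at a fixed rate and cap the number of frames written.
interface FrameExtractOptions {
  fps: number; // frames sampled per second of video
  maxFrames: number; // hard cap on the number of output frames
}

function buildExtractArgs(
  inputPath: string,
  outputDir: string,
  opts: FrameExtractOptions,
): string[] {
  return [
    '-i', inputPath,
    '-vf', `fps=${opts.fps}`, // fps filter resamples the stream
    '-frames:v', String(opts.maxFrames), // stop after this many video frames
    '-q:v', '2', // high JPEG quality
    `${outputDir}/frame-%04d.jpg`, // numbered output pattern
  ];
}
```

The resulting array could be passed to `child_process.spawn` together with whichever ffmpeg binary was resolved (npm package or system).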

New files

| File | Description |
| --- | --- |
| packages/core/src/ai-model/prompt/video-to-yaml.ts | VLM prompt templates and generation functions for both YAML and Playwright |
| packages/cli/src/video/extract-frames.ts | Frame extraction from video via ffmpeg with npm/system fallback |
| packages/cli/src/video/index.ts | video2yaml() orchestrator: extract → analyze → write |
| packages/cli/src/video/cli.ts | Argument parser for the video2yaml subcommand |
| packages/cli/tests/unit-test/video-cli.test.ts | Unit tests for CLI argument parsing (8 cases) |
| packages/core/tests/unit-test/video-to-yaml.test.ts | Unit tests for core generation functions |

Usage

# Generate YAML script
midscene video2yaml recording.mp4

# Generate Playwright test
midscene video2yaml recording.mp4 -f playwright -o login.test.ts

# With options
midscene video2yaml demo.webm --fps 2 --max-frames 30 --url https://example.com

Test plan

  • pnpm run lint passes
  • npx nx build @midscene/cli builds successfully
  • npx vitest --run tests/unit-test/video-cli.test.ts — 8 tests pass
  • npx vitest --run tests/unit-test/video-to-yaml.test.ts — 2 tests pass
  • Manual test: generated YAML correctly identifies URL from address bar, no browser chrome actions
  • Manual test: generated Playwright test uses page.goto() for navigation

🤖 Generated with Claude Code

cloudflare-workers-and-pages bot commented Mar 17, 2026

Deploying midscene with Cloudflare Pages

Latest commit: ada2091
Status: ✅  Deploy successful!
Preview URL: https://9750da8a.midscene.pages.dev
Branch Preview URL: https://feat-video-to-script.midscene.pages.dev

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: aa8f72b4d7

format: VideoScriptFormat,
): string {
const ext = format === 'playwright' ? '.test.ts' : '.yaml';
return inputPath.replace(/\.[^.]+$/, ext);

[P1] Preserve input file when deriving default output path

For extensionless video paths (for example midscene video2yaml ./recording), inputPath.replace(/\.[^.]+$/, ext) returns the original path unchanged, so outputPath becomes the same file as the input. The subsequent writeFileSync(outputPath, result.content, 'utf-8') then overwrites the source video with generated script content, which is destructive data loss in a realistic CLI usage scenario.
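
One possible fix, sketched under assumptions: `defaultOutputPath` is a hypothetical name and the `VideoScriptFormat` union is inferred from the `--format yaml|playwright` option rather than copied from the PR. The idea is to build the output name from the parsed base name instead of a regex replace, and to refuse to write when the result would still equal the input.

```typescript
import * as path from 'node:path';

// Assumed shape of the PR's type, inferred from `--format yaml|playwright`.
type VideoScriptFormat = 'yaml' | 'playwright';

// Hypothetical replacement for the regex-based derivation.
function defaultOutputPath(
  inputPath: string,
  format: VideoScriptFormat,
): string {
  const ext = format === 'playwright' ? '.test.ts' : '.yaml';
  const parsed = path.parse(inputPath);
  // parsed.name is the base name without its final extension. For an
  // extensionless input it is the whole base name, so the extension is
  // appended rather than silently skipped.
  const outputPath = path.join(parsed.dir, parsed.name + ext);
  // Last line of defence: never hand back the input file itself.
  if (path.resolve(outputPath) === path.resolve(inputPath)) {
    throw new Error(
      `Refusing to overwrite input file ${inputPath}; pass an explicit output path`,
    );
  }
  return outputPath;
}
```

With this sketch, `./recording` maps to `recording.yaml`, and an input that already carries the target extension fails loudly instead of being overwritten.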

quanru and others added 3 commits March 18, 2026 17:14
…n recordings

Add a new CLI subcommand `midscene video2yaml` that extracts frames from a
video file using ffmpeg and sends them to a VLM to generate runnable Midscene
test scripts in YAML or Playwright format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The npm packages @ffmpeg-installer/ffmpeg and @ffprobe-installer/ffprobe
may install binaries without execute permission. Add chmod +x before use
and verify executability with a test spawn, falling back to system binaries
when the npm binary is not usable.
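
The fallback described above could look roughly like the following. This is a sketch, not the PR's code: `resolveBinary` and `isExecutable` are hypothetical names, and the probe is injectable only so the logic can be exercised without ffmpeg installed. `-version` is a standard flag for both ffmpeg and ffprobe.

```typescript
import { chmodSync } from 'node:fs';
import { spawnSync } from 'node:child_process';

// Returns true when `<binary> -version` can be spawned and exits with 0.
function isExecutable(binary: string): boolean {
  const result = spawnSync(binary, ['-version'], { stdio: 'ignore' });
  return result.error === undefined && result.status === 0;
}

// Hypothetical resolver: prefer the npm-installed binary, fall back to
// the binary of the same name on PATH.
function resolveBinary(
  npmBinaryPath: string | undefined,
  systemName: string,
  probe: (binary: string) => boolean = isExecutable,
): string {
  if (npmBinaryPath) {
    try {
      chmodSync(npmBinaryPath, 0o755); // add execute permission
    } catch {
      // chmod can fail (missing file, read-only install); the probe
      // below decides whether the binary is usable anyway.
    }
    if (probe(npmBinaryPath)) return npmBinaryPath;
  }
  if (probe(systemName)) return systemName;
  throw new Error(`No usable ${systemName} binary found`);
}
```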

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Long videos (>20 frames) are now automatically split into segments
using ffmpeg scene detection, each segment analyzed independently
by the VLM, then merged into a single coherent script.

- Add segment-frames.ts with detectSceneChanges and segmentFrames
- Add segment/merge prompts in video-to-yaml.ts
- Short videos (<= 20 frames) still use single-call path unchanged
- Add --max-frames-per-segment and --scene-threshold CLI options
- Add 19 unit tests covering segmentation and CLI parsing
- Extract prependWebConfigIfMissing to eliminate duplicate code
- Add MAX_TOTAL_FRAMES=600 cap to prevent OOM on very long videos
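
To illustrate the segmentation idea, here is a sketch that splits frames at scene-change timestamps and additionally caps each segment's size. `splitIntoSegments` and the `Frame` shape are hypothetical; the actual `segmentFrames` in `segment-frames.ts` may handle boundaries differently.

```typescript
// Assumed frame shape: an index plus a timestamp in seconds.
interface Frame {
  index: number;
  timestamp: number;
}

// Hypothetical splitter: starts a new segment whenever a frame crosses a
// scene-change timestamp or the current segment reaches the size cap.
function splitIntoSegments(
  frames: Frame[],
  sceneChanges: number[],
  maxFramesPerSegment: number,
): Frame[][] {
  const cuts = [...sceneChanges].sort((a, b) => a - b);
  const segments: Frame[][] = [];
  let current: Frame[] = [];
  let nextCut = 0;

  for (const frame of frames) {
    // Consume every cut at or before this frame's timestamp.
    let crossedCut = false;
    while (nextCut < cuts.length && frame.timestamp >= cuts[nextCut]) {
      crossedCut = true;
      nextCut++;
    }
    const full = current.length >= maxFramesPerSegment;
    if (current.length > 0 && (crossedCut || full)) {
      segments.push(current);
      current = [];
    }
    current.push(frame);
  }
  if (current.length > 0) segments.push(current);
  return segments;
}
```

Each returned segment would then be analyzed by its own VLM call, with the partial scripts merged afterwards.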

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
quanru force-pushed the feat/video-to-script branch from 9f9f47d to eb7eff4 on March 18, 2026 09:15
quanru and others added 2 commits March 18, 2026 17:40
…LI args

- Make maxFrames the global budget for both short and long video paths,
  capped by MAX_TOTAL_FRAMES (600)
- Add validation for all numeric CLI parameters (fps, maxFrames,
  maxFramesPerSegment, sceneThreshold) with clear error messages
- Add 5 new test cases for numeric validation edge cases
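
A sketch of what such validation might look like. The helper names are hypothetical, the error wording is invented for illustration, and the 0 to 1 range for the scene threshold is an assumption based on ffmpeg's scene score; only the `MAX_TOTAL_FRAMES` value of 600 comes from the commit message.

```typescript
const MAX_TOTAL_FRAMES = 600; // cap stated in the commit message

// Hypothetical validator for flags such as --max-frames.
function parsePositiveInt(flag: string, raw: string): number {
  const value = Number(raw);
  if (raw.trim() === '' || !Number.isInteger(value) || value <= 0) {
    throw new Error(`${flag} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Hypothetical validator for flags such as --scene-threshold.
function parseNumberInRange(
  flag: string,
  raw: string,
  min: number,
  max: number,
): number {
  const value = Number(raw);
  // Number('') is 0, so empty input is rejected explicitly.
  if (raw.trim() === '' || !Number.isFinite(value) || value < min || value > max) {
    throw new Error(`${flag} must be a number between ${min} and ${max}, got "${raw}"`);
  }
  return value;
}

// The user-supplied budget never exceeds the global cap.
function effectiveMaxFrames(requested: number): number {
  return Math.min(requested, MAX_TOTAL_FRAMES);
}
```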

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Remove unused `format` parameter from `generateFromVideoSegment`
- Remove redundant second sort in `detectSceneChanges` (already sorted
  in `parseSceneTimestamps`)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>