Added features: local transcription, multi-language support, preview, customizable styles #7
ollisulopuisto wants to merge 40 commits into riseandignite:main
…ting and HDR video support, refactor concurrent task management."
…nd caption burning RPCs with a new CaptionEditor UI.
…nce Whisper word parsing by merging sub-word tokens.
…g, and enhance caption word wrapping with hyphen splitting and improved layout.
…e segment/word time range detection.
…ing new Rust types and RPC, and remove the save icon from the UI.
… the burn command.
…d on length and add a corresponding test case.
…ific sidecar files.
… and add double-click to edit captions.
…w Rust RPC method and Electron UI.
…in preview, and display playback error messages for unsupported formats.
… bump version to 1.0.5
Pull request overview
This pull request adds significant new functionality to CapSlap, a video caption generation tool. The changes introduce local transcription via whisper.cpp (eliminating the cloud API dependency), a caption editor with live preview, multi-language support, an expanded font library (30+ fonts), and various UI/UX improvements.
Changes:
- Local transcription using whisper.cpp with caching and multiple model support (including new "turbo" model)
- Caption editor component with word-level editing, segment shifting, and live preview generation
- Expanded font library organized into 5 categories with 30+ fonts, plus font download automation
- Enhanced CI/CD with release workflow and platform-specific binary management
- UI improvements including font size control, output resolution selection, crop strategies, and progress indicators
Reviewed changes
Copilot reviewed 29 out of 64 changed files in this pull request and generated 15 comments.
| File | Description |
|---|---|
| scripts/download_fonts.sh | New script to download Google Fonts - has syntax error |
| scripts/download-binaries.sh | CI script for downloading platform-specific FFmpeg and whisper binaries - has architecture bug |
| rust/src/whisper.rs | Core transcription logic with local whisper.cpp support, model fallback chain, caching, and word merging - has duplicate comments and unused variable |
| rust/src/types.rs | Extended type definitions for new features (preview, burn, save/load captions) |
| rust/src/rpc.rs | Added comprehensive unit tests for RPC layer |
| rust/src/audio.rs | Minor formatting improvements |
| rust/src/bin/debug_ffmpeg.rs | New debugging utility for FFmpeg detection |
| rust/src/bin/core.rs | Enhanced RPC handler with cancellation support and new methods (transcribe, burn, preview, save/load) |
| electron/package.json | Version bump to 1.0.6, added dependencies for testing and UI components |
| electron/vitest.config.ts | New test configuration with coverage setup |
| electron/test/setup.ts | Test environment setup with mocked Electron APIs |
| electron/lib/utils.test.ts | Comprehensive tests for className utility |
| electron/lib/caption-utils.ts | New utility functions for caption formatting and manipulation |
| electron/lib/caption-utils.test.ts | Tests for caption utilities |
| electron/lib/preload/* | Updated API to support request IDs for cancellation |
| electron/lib/main/sidecar.ts | Enhanced logging and cancellation support |
| electron/lib/main/app.ts | Extended resource protocol to support local files and more video formats |
| electron/eslint.config.mjs | New ESLint configuration |
| electron/electron-builder.yml | Added Linux binary support |
| electron/app/components/ui/* | New UI components (Textarea, Dropdown, updated Slider/Separator) |
| electron/app/components/ModelDownloader.tsx | Added "turbo" model support |
| electron/app/components/CaptionEditor.tsx | New caption editor with segment editing and word shifting - has unused variables |
| electron/app/app.tsx | Major UI overhaul with editor integration, preview generation, and expanded settings - has duplicate field assignments |
| .github/workflows/* | New CI and release workflows for automated builds |
| README.md | Updated documentation highlighting offline capability and new features |
| .gitignore | Expanded to exclude build artifacts and binaries |
scripts/download_fonts.sh
Outdated
| # Cinzel | ||
| download_google_font "Cinzel" "Bold" | ||
| # Bodoni Moda | ||
| download_google_font "Bodoni Moda" "Bold" // Note: Variable font usually |
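Invalid comment syntax in shell script. Shell scripts use '#' for comments, not '//'. The shell does not ignore text after '//': it is parsed as part of the command, so the trailing words are passed to download_google_font as extra arguments (and would trigger a syntax error if they contained unbalanced quotes or parentheses).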
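A minimal sketch of the difference, assuming a hypothetical `count_args` helper as a stand-in for the script's `download_google_font` (whose body is not shown in this review). It only records how many arguments it received, which is enough to show that text after `//` reaches the function while text after `#` is discarded by the shell.

```shell
#!/bin/sh
# Hypothetical stand-in for download_google_font: records its argument count.
count_args() { ARG_COUNT=$#; }

# '//' is an ordinary word, so it and everything after it become arguments.
count_args "Bodoni Moda" "Bold" // Note: Variable font usually
slash_count=$ARG_COUNT

# '#' starts a real comment, so only the two quoted arguments are passed.
count_args "Bodoni Moda" "Bold" # Note: Variable font usually
hash_count=$ARG_COUNT

echo "with //: $slash_count arguments, with #: $hash_count arguments"
```

Whether the extra arguments are harmless depends on how the real helper treats them, which is why switching to `#` is the safer fix.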
rust/src/whisper.rs
Outdated
| /// Get possible project FFmpeg binary paths | ||
| /// Get possible project FFmpeg binary paths |
Duplicate docstring comment. Lines 533 and 534 both have the same comment "/// Get possible project FFmpeg binary paths". Remove the duplicate on line 533.
rust/src/whisper.rs
Outdated
| /// Get possible project ffprobe binary paths | ||
| /// Get possible project ffprobe binary paths |
Duplicate docstring comment. Lines 596 and 597 both have the same comment "/// Get possible project ffprobe binary paths". Remove the duplicate on line 596.
electron/app/app.tsx
Outdated
| fontName: getFontName(videoSettings.selectedFont), | ||
| textColor: videoSettings.textColor, | ||
| highlightWordColor: videoSettings.highlightWordColor, | ||
| outlineColor: videoSettings.outlineColor, | ||
| glowEffect: videoSettings.glowEffect, | ||
| position: videoSettings.captionPosition, | ||
| outputSize: videoSettings.outputSize, | ||
| position: videoSettings.captionPosition, | ||
| outputSize: videoSettings.outputSize, |
Duplicate field assignments. The position field is assigned twice (lines 842-843), and outputSize is repeated in the same object literal. Remove the duplicated assignments.
Suggested change:
| fontName: getFontName(videoSettings.selectedFont), | |
| textColor: videoSettings.textColor, | |
| highlightWordColor: videoSettings.highlightWordColor, | |
| outlineColor: videoSettings.outlineColor, | |
| glowEffect: videoSettings.glowEffect, | |
| position: videoSettings.captionPosition, | |
| outputSize: videoSettings.outputSize, | |
electron/app/app.tsx
Outdated
| glowEffect: videoSettings.glowEffect, | ||
| position: videoSettings.captionPosition, | ||
| outputSize: videoSettings.outputSize, | ||
| outputSize: videoSettings.outputSize, |
Duplicate field assignment. The outputSize field is assigned twice on lines 929-930. Remove the duplicate assignment on line 929.
Suggested change: remove the duplicate line.
scripts/download-binaries.sh
Outdated
| ARCH=$(uname -m) | ||
| if [[ "$ARCH" == "arm64" ]]; then | ||
| WHISPER_ASSET="whisper-cli-macos-arm64" | ||
| else | ||
| WHISPER_ASSET="whisper-cli-macos-x64" | ||
| fi | ||
| | ||
| curl -L "https://github.com/ggerganov/whisper.cpp/releases/download/${WHISPER_VERSION}/${WHISPER_ASSET}" \ | ||
| -o "$BIN_DIR/whisper-cli-macos-arm64" | ||
| chmod +x "$BIN_DIR/whisper-cli-macos-arm64" |
Incorrect output filename for the macOS whisper binary. In the download_macos function (lines 16-52), the whisper binary is always saved as "whisper-cli-macos-arm64" regardless of the detected architecture: the curl command on line 48 writes to the same output filename for both arm64 and x64. The x64 binary should be saved as "whisper-cli-macos-x64" when ARCH is not arm64 (uname -m reports x86_64 on Intel Macs).
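One possible shape for the fix, as a sketch rather than the change actually made in the PR: derive the asset name once and reuse it for the output path. The function name `whisper_asset_for_arch` is hypothetical; the asset names, `BIN_DIR`, and `WHISPER_VERSION` are taken from the snippet above.

```shell
#!/bin/sh
# Hypothetical helper: map a `uname -m` value to the whisper asset name
# used in the reviewed script.
whisper_asset_for_arch() {
  case "$1" in
    arm64) echo "whisper-cli-macos-arm64" ;;
    *)     echo "whisper-cli-macos-x64" ;;
  esac
}

WHISPER_ASSET=$(whisper_asset_for_arch "$(uname -m)")
echo "would save to: ${BIN_DIR:-bin}/$WHISPER_ASSET"

# The download would then reuse the same variable for the output file, e.g.:
#   curl -L "https://github.com/ggerganov/whisper.cpp/releases/download/${WHISPER_VERSION}/${WHISPER_ASSET}" \
#     -o "$BIN_DIR/$WHISPER_ASSET"
#   chmod +x "$BIN_DIR/$WHISPER_ASSET"
```

Keeping the download name and the saved name in one variable means the two cannot drift apart again.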
electron/app/components/CaptionEditor.tsx
| const getFontName = (fontId: string): string => { | ||
| return FONT_NAMES[fontId as keyof typeof FONT_NAMES] || 'Montserrat Black' | ||
| } | ||
Unused variable getFontName.
Suggested change: remove these lines.
electron/app/app.tsx
Outdated
| SelectGroup, | ||
| SelectLabel, |
Unused imports SelectGroup, SelectLabel.
Suggested change: remove these lines.
electron/app/app.tsx
Outdated
| DropdownMenuLabel, | ||
| DropdownMenuSeparator, | ||
| DropdownMenuTrigger, | ||
| DropdownMenuSub, | ||
| DropdownMenuSubTrigger, | ||
| DropdownMenuSubContent, | ||
| DropdownMenuGroup, |
Unused imports DropdownMenuGroup, DropdownMenuLabel, DropdownMenuSeparator.
Suggested change:
| DropdownMenuTrigger, | |
| DropdownMenuSub, | |
| DropdownMenuSubTrigger, | |
| DropdownMenuSubContent, | |
scripts/download-binaries.sh
| # Download FFmpeg from gyan.dev (Windows builds with all codecs) | ||
| echo "Downloading FFmpeg..." | ||
| FFMPEG_URL="https://github.com/BtbN/FFmpeg-Builds/releases/download/latest/ffmpeg-master-latest-win64-gpl.zip" | ||
| curl -L "$FFMPEG_URL" -o /tmp/ffmpeg-win.zip | ||
| unzip -o /tmp/ffmpeg-win.zip -d /tmp/ffmpeg-win | ||
| | ||
| # Find and copy binaries | ||
| find /tmp/ffmpeg-win -name "ffmpeg.exe" -exec cp {} "$BIN_DIR/" \; | ||
| find /tmp/ffmpeg-win -name "ffprobe.exe" -exec cp {} "$BIN_DIR/" \; | ||
| | ||
| # Download whisper.cpp Windows build | ||
| echo "Downloading whisper-cli..." | ||
| curl -L "https://github.com/ggerganov/whisper.cpp/releases/download/${WHISPER_VERSION}/whisper-cli-win64.exe" \ | ||
| -o "$BIN_DIR/whisper-cli.exe" |
This script downloads and executes precompiled ffmpeg and whisper-cli binaries from third-party GitHub URLs (including the mutable latest FFmpeg builds) without any checksum, signature, or pinning to an immutable identifier. If the upstream project, tag, or delivery path is compromised (or a MITM/TLS termination point is abused), a malicious binary could be fetched and then executed in CI or shipped to users, leading to remote code execution. To mitigate this, pin these downloads to immutable artifacts and verify their integrity (e.g., using published checksums/signatures or vendoring known-good binaries) instead of relying on latest and unverified downloads.
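One way to act on this, sketched under assumptions: compare each downloaded file against a known SHA-256 digest before it is used. The helper names `sha256_of` and `verify_sha256` are hypothetical, and the expected digest would have to come from a trusted source such as checksums published by the upstream project; no digest values appear in the PR.

```shell
#!/bin/sh
# Hypothetical helper: print the SHA-256 hex digest of a file, using
# sha256sum where available and falling back to shasum (common on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Hypothetical helper: succeed only if the file's digest matches.
# $1 = path to the downloaded file, $2 = expected hex digest
verify_sha256() {
  actual=$(sha256_of "$1") || return 1
  [ "$actual" = "$2" ]
}

# Example use after a download (EXPECTED_FFMPEG_SHA256 is a placeholder
# that must be filled in from a trusted source):
#   curl -L "$FFMPEG_URL" -o /tmp/ffmpeg-win.zip
#   verify_sha256 /tmp/ffmpeg-win.zip "$EXPECTED_FFMPEG_SHA256" || exit 1
```

A fixed digest only helps if the URL also points at an immutable artifact; with a mutable `latest` build the check would fail on every upstream rebuild, which is a signal to pin a specific release instead.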
- scripts/download_fonts.sh: Fix invalid shell comment syntax (// -> #)
- scripts/download-binaries.sh: Fix architecture bug - save whisper binary with correct filename for both arm64 and x64
- rust/src/whisper.rs: Remove duplicate docstrings for get_project_ffmpeg_paths and get_project_ffprobe_paths
- electron/app/components/CaptionEditor.tsx: Remove unused FONT_NAMES and getFontName
- electron/app/app.tsx: Remove unused imports (SelectGroup, SelectLabel, DropdownMenuLabel, DropdownMenuSeparator, DropdownMenuGroup) and fix duplicate field assignments
Added features: local transcription, multi-language support, preview, customizable styles, editing and saving subs