Added features: local transcription, multi-language support, preview, customizable styles #7

Open
ollisulopuisto wants to merge 40 commits into riseandignite:main from ollisulopuisto:main

@ollisulopuisto

Added features: local transcription, multi-language support, preview, customizable styles, editing and saving subs

…ting and HDR video support, refactor concurrent task management."
…nd caption burning RPCs with a new CaptionEditor UI.
…nce Whisper word parsing by merging sub-word tokens.
…g, and enhance caption word wrapping with hyphen splitting and improved layout.
…ing new Rust types and RPC, and remove the save icon from the UI.
…d on length and add a corresponding test case.
…in preview, and display playback error messages for unsupported formats.
Copilot AI review requested due to automatic review settings January 30, 2026 13:47

Copilot AI left a comment

Pull request overview

This pull request adds significant new functionality to CapSlap, a video caption generation tool. The changes introduce local transcription via whisper.cpp (eliminating the cloud API dependency), a caption editor with live preview, multi-language support, an expanded font library (30+ fonts), and various UI/UX improvements.

Changes:

  • Local transcription using whisper.cpp with caching and multiple model support (including new "turbo" model)
  • Caption editor component with word-level editing, segment shifting, and live preview generation
  • Expanded font library organized into 5 categories with 30+ fonts, plus font download automation
  • Enhanced CI/CD with release workflow and platform-specific binary management
  • UI improvements including font size control, output resolution selection, crop strategies, and progress indicators

Reviewed changes

Copilot reviewed 29 out of 64 changed files in this pull request and generated 15 comments.

| File | Description |
| --- | --- |
| scripts/download_fonts.sh | New script to download Google Fonts - has syntax error |
| scripts/download-binaries.sh | CI script for downloading platform-specific FFmpeg and whisper binaries - has architecture bug |
| rust/src/whisper.rs | Core transcription logic with local whisper.cpp support, model fallback chain, caching, and word merging - has duplicate comments and unused variable |
| rust/src/types.rs | Extended type definitions for new features (preview, burn, save/load captions) |
| rust/src/rpc.rs | Added comprehensive unit tests for RPC layer |
| rust/src/audio.rs | Minor formatting improvements |
| rust/src/bin/debug_ffmpeg.rs | New debugging utility for FFmpeg detection |
| rust/src/bin/core.rs | Enhanced RPC handler with cancellation support and new methods (transcribe, burn, preview, save/load) |
| electron/package.json | Version bump to 1.0.6, added dependencies for testing and UI components |
| electron/vitest.config.ts | New test configuration with coverage setup |
| electron/test/setup.ts | Test environment setup with mocked Electron APIs |
| electron/lib/utils.test.ts | Comprehensive tests for className utility |
| electron/lib/caption-utils.ts | New utility functions for caption formatting and manipulation |
| electron/lib/caption-utils.test.ts | Tests for caption utilities |
| electron/lib/preload/* | Updated API to support request IDs for cancellation |
| electron/lib/main/sidecar.ts | Enhanced logging and cancellation support |
| electron/lib/main/app.ts | Extended resource protocol to support local files and more video formats |
| electron/eslint.config.mjs | New ESLint configuration |
| electron/electron-builder.yml | Added Linux binary support |
| electron/app/components/ui/* | New UI components (Textarea, Dropdown, updated Slider/Separator) |
| electron/app/components/ModelDownloader.tsx | Added "turbo" model support |
| electron/app/components/CaptionEditor.tsx | New caption editor with segment editing and word shifting - has unused variables |
| electron/app/app.tsx | Major UI overhaul with editor integration, preview generation, and expanded settings - has duplicate field assignments |
| .github/workflows/* | New CI and release workflows for automated builds |
| README.md | Updated documentation highlighting offline capability and new features |
| .gitignore | Expanded to exclude build artifacts and binaries |

# Cinzel
download_google_font "Cinzel" "Bold"
# Bodoni Moda
download_google_font "Bodoni Moda" "Bold" // Note: Variable font usually

Copilot AI Jan 30, 2026

Invalid comment syntax in shell script. Shell scripts use '#' for comments, not '//'. As written, '//' and the words that follow are not ignored; they are passed to download_google_font as extra arguments when the script runs.
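
To illustrate the difference, here is a minimal sketch. The `download_google_font` body below is a hypothetical stub that only records how many arguments it received (the real helper in scripts/download_fonts.sh is not shown in this review). Under that assumption, it suggests that the shell treats `//` and the trailing words as ordinary arguments, while `#` starts a comment.

```shell
# Hypothetical stub standing in for the real helper: it only records
# how many arguments it was called with.
download_google_font() {
  ARG_COUNT=$#
}

# '//' is not a comment marker, so every following word becomes an argument.
download_google_font "Bodoni Moda" "Bold" // Note: Variable font usually
slash_count=$ARG_COUNT

# '#' starts a comment, so only the two intended arguments are passed.
download_google_font "Bodoni Moda" "Bold" # Note: Variable font usually
hash_count=$ARG_COUNT

echo "with //: $slash_count arguments, with #: $hash_count arguments"
# prints: with //: 7 arguments, with #: 2 arguments
```

Whether the extra arguments cause a visible failure depends on how the real function uses its parameters, which is why switching to `#` is the safer fix.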

Copilot uses AI. Check for mistakes.
Comment on lines 533 to 534
/// Get possible project FFmpeg binary paths
/// Get possible project FFmpeg binary paths

Copilot AI Jan 30, 2026

Duplicate doc comment. Lines 533 and 534 both contain the same comment, "/// Get possible project FFmpeg binary paths". Remove the duplicate on line 533.

Comment on lines 596 to 597
/// Get possible project ffprobe binary paths
/// Get possible project ffprobe binary paths

Copilot AI Jan 30, 2026

Duplicate doc comment. Lines 596 and 597 both contain the same comment, "/// Get possible project ffprobe binary paths". Remove the duplicate on line 596.

Comment on lines +835 to +843
fontName: getFontName(videoSettings.selectedFont),
textColor: videoSettings.textColor,
highlightWordColor: videoSettings.highlightWordColor,
outlineColor: videoSettings.outlineColor,
glowEffect: videoSettings.glowEffect,
position: videoSettings.captionPosition,
outputSize: videoSettings.outputSize,
position: videoSettings.captionPosition,
outputSize: videoSettings.outputSize,

Copilot AI Jan 30, 2026

Duplicate field assignments. The position and outputSize fields are each assigned twice: first on lines 840-841 and again on lines 842-843. Remove the duplicate assignments on lines 842-843.

Suggested change
- fontName: getFontName(videoSettings.selectedFont),
- textColor: videoSettings.textColor,
- highlightWordColor: videoSettings.highlightWordColor,
- outlineColor: videoSettings.outlineColor,
- glowEffect: videoSettings.glowEffect,
- position: videoSettings.captionPosition,
- outputSize: videoSettings.outputSize,
- position: videoSettings.captionPosition,
- outputSize: videoSettings.outputSize,
+ fontName: getFontName(videoSettings.selectedFont),
+ textColor: videoSettings.textColor,
+ highlightWordColor: videoSettings.highlightWordColor,
+ outlineColor: videoSettings.outlineColor,
+ glowEffect: videoSettings.glowEffect,
+ position: videoSettings.captionPosition,
+ outputSize: videoSettings.outputSize,

glowEffect: videoSettings.glowEffect,
position: videoSettings.captionPosition,
outputSize: videoSettings.outputSize,
outputSize: videoSettings.outputSize,

Copilot AI Jan 30, 2026

Duplicate field assignment. The outputSize field is assigned twice on lines 929-930. Remove the duplicate assignment on line 929.

Suggested change
- outputSize: videoSettings.outputSize,

Comment on lines +40 to +49
ARCH=$(uname -m)
if [[ "$ARCH" == "arm64" ]]; then
WHISPER_ASSET="whisper-cli-macos-arm64"
else
WHISPER_ASSET="whisper-cli-macos-x64"
fi

curl -L "https://github.com/ggerganov/whisper.cpp/releases/download/${WHISPER_VERSION}/${WHISPER_ASSET}" \
-o "$BIN_DIR/whisper-cli-macos-arm64"
chmod +x "$BIN_DIR/whisper-cli-macos-arm64"

Copilot AI Jan 30, 2026

Incorrect output filename for the macOS whisper binary. In the download_macos function (lines 16-52), the binary is always saved as "whisper-cli-macos-arm64" regardless of the detected architecture: the -o path on line 48 and the chmod on line 49 are hard-coded, so the arm64 and x64 downloads end up under the same filename. The x64 binary should be saved as "whisper-cli-macos-x64" when ARCH is not arm64.
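
One possible shape for the fix is sketched below: reuse the selected asset name for the output path instead of a hard-coded filename. The helper name `whisper_asset_for_arch`, the `OUTPUT_PATH` variable, and the `./bin` default for `BIN_DIR` are assumptions made for illustration; the download and chmod steps from the PR are left commented out so the sketch runs offline.

```shell
# Map a `uname -m` value to the asset name used in the PR.
# (Helper name is hypothetical; the PR does this inline.)
whisper_asset_for_arch() {
  if [ "$1" = "arm64" ]; then
    echo "whisper-cli-macos-arm64"
  else
    echo "whisper-cli-macos-x64"
  fi
}

ARCH=$(uname -m)
WHISPER_ASSET=$(whisper_asset_for_arch "$ARCH")

# Derive the output path from the asset name so the two architectures
# can never be written to the same file. BIN_DIR default is an assumption.
OUTPUT_PATH="${BIN_DIR:-./bin}/${WHISPER_ASSET}"
echo "would save to: $OUTPUT_PATH"

# Download step as in the PR, now pointing at the derived path:
# curl -L "https://github.com/ggerganov/whisper.cpp/releases/download/${WHISPER_VERSION}/${WHISPER_ASSET}" \
#   -o "$OUTPUT_PATH"
# chmod +x "$OUTPUT_PATH"
```

Any code that later looks up the binary would need to use the same architecture-specific name.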

Comment on lines +47 to +50
const getFontName = (fontId: string): string => {
return FONT_NAMES[fontId as keyof typeof FONT_NAMES] || 'Montserrat Black'
}

Copilot AI Jan 30, 2026

Unused variable getFontName.

Suggested change
- const getFontName = (fontId: string): string => {
- return FONT_NAMES[fontId as keyof typeof FONT_NAMES] || 'Montserrat Black'
- }

Comment on lines +12 to +13
SelectGroup,
SelectLabel,

Copilot AI Jan 30, 2026

Unused imports SelectGroup, SelectLabel.

Suggested change
- SelectGroup,
- SelectLabel,

Comment on lines +33 to +39
DropdownMenuLabel,
DropdownMenuSeparator,
DropdownMenuTrigger,
DropdownMenuSub,
DropdownMenuSubTrigger,
DropdownMenuSubContent,
DropdownMenuGroup,

Copilot AI Jan 30, 2026

Unused imports DropdownMenuGroup, DropdownMenuLabel, DropdownMenuSeparator.

Suggested change
- DropdownMenuLabel,
- DropdownMenuSeparator,
- DropdownMenuTrigger,
- DropdownMenuSub,
- DropdownMenuSubTrigger,
- DropdownMenuSubContent,
- DropdownMenuGroup,
+ DropdownMenuTrigger,
+ DropdownMenuSub,
+ DropdownMenuSubTrigger,
+ DropdownMenuSubContent,

Comment on lines +66 to +79
# Download FFmpeg from gyan.dev (Windows builds with all codecs)
echo "Downloading FFmpeg..."
FFMPEG_URL="https://github.com/BtbN/FFmpeg-Builds/releases/download/latest/ffmpeg-master-latest-win64-gpl.zip"
curl -L "$FFMPEG_URL" -o /tmp/ffmpeg-win.zip
unzip -o /tmp/ffmpeg-win.zip -d /tmp/ffmpeg-win

# Find and copy binaries
find /tmp/ffmpeg-win -name "ffmpeg.exe" -exec cp {} "$BIN_DIR/" \;
find /tmp/ffmpeg-win -name "ffprobe.exe" -exec cp {} "$BIN_DIR/" \;

# Download whisper.cpp Windows build
echo "Downloading whisper-cli..."
curl -L "https://github.com/ggerganov/whisper.cpp/releases/download/${WHISPER_VERSION}/whisper-cli-win64.exe" \
-o "$BIN_DIR/whisper-cli.exe"

Copilot AI Jan 30, 2026

This script downloads and executes precompiled ffmpeg and whisper-cli binaries from third-party GitHub URLs (including the mutable latest FFmpeg builds) without any checksum, signature, or pinning to an immutable identifier. If the upstream project, tag, or delivery path is compromised (or a MITM/TLS termination point is abused), a malicious binary could be fetched and then executed in CI or shipped to users, leading to remote code execution. To mitigate this, pin these downloads to immutable artifacts and verify their integrity (e.g., using published checksums/signatures or vendoring known-good binaries) instead of relying on latest and unverified downloads.
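
The integrity check suggested above could look roughly like the following sketch. The helper names `sha256_of` and `verify_sha256` are hypothetical, and the demonstration uses a local stand-in file rather than a real download; in an actual script the pinned digest would be a checksum published for a versioned release, not a value computed from the downloaded file itself.

```shell
# Print the SHA-256 hex digest of a file, using whichever tool is installed
# (sha256sum on most Linux systems, shasum on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Succeed only when the file's digest equals the pinned value.
verify_sha256() {
  [ "$(sha256_of "$1")" = "$2" ]
}

# Offline demonstration with a stand-in file instead of a real download.
sample=$(mktemp)
printf 'known-good binary' > "$sample"

# Assumption: in a real script this is a published checksum committed to
# the repository, not a digest of the file that was just fetched.
pinned=$(sha256_of "$sample")

if verify_sha256 "$sample" "$pinned"; then verified="ok"; else verified="mismatch"; fi

# Simulate a modified artifact and check that verification now fails.
printf 'tampered' >> "$sample"
if verify_sha256 "$sample" "$pinned"; then tampered="ok"; else tampered="mismatch"; fi

echo "original: $verified, modified: $tampered"
# prints: original: ok, modified: mismatch
rm -f "$sample"
```

In the download script, a failed `verify_sha256` would abort before the binary is copied into `$BIN_DIR` or marked executable.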

- scripts/download_fonts.sh: Fix invalid shell comment syntax (// -> #)
- scripts/download-binaries.sh: Fix architecture bug - save whisper binary with correct filename for both arm64 and x64
- rust/src/whisper.rs: Remove duplicate docstrings for get_project_ffmpeg_paths and get_project_ffprobe_paths
- electron/app/components/CaptionEditor.tsx: Remove unused FONT_NAMES and getFontName
- electron/app/app.tsx: Remove unused imports (SelectGroup, SelectLabel, DropdownMenuLabel, DropdownMenuSeparator, DropdownMenuGroup) and fix duplicate field assignments