fabio0296/kara-it

Kara-It: Video to Romaji Karaoke Tool

A CLI tool that automatically generates Romaji karaoke subtitles for Japanese music videos.

Features

  • Auto-Transcription: Uses OpenAI Whisper (via stable-ts) to transcribe audio with word-level precision.
  • Auto-Romanization: Converts Japanese text to Romaji using cutlet (MeCab).
  • Karaoke Effects: Generates standard .ass karaoke tags ({\k}) synced to vocals.
  • Burn-in: Automatically burns the subtitles into the video using FFmpeg.
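As an illustration of the karaoke tags mentioned above, here is a minimal sketch of how word timings can be turned into `{\k}` text for an .ass dialogue line. The function name `karaoke_text` and the `(text, start, end)` tuple layout are assumptions made for this example, not the tool's actual internals. The one fixed point is that `\k` durations are given in centiseconds.

```python
def karaoke_text(words):
    """Build ASS karaoke text from (text, start_sec, end_sec) tuples.

    Hypothetical helper for illustration; gaps between words are ignored.
    """
    parts = []
    for text, start, end in words:
        # \k durations are expressed in centiseconds (1/100 s).
        centis = max(0, round((end - start) * 100))
        parts.append(f"{{\\k{centis}}}{text}")
    return "".join(parts)


if __name__ == "__main__":
    words = [("ka", 0.0, 0.5), ("ra", 0.5, 1.0), ("o", 1.0, 1.5), ("ke", 1.5, 2.0)]
    print(karaoke_text(words))
```

Each syllable is highlighted for the duration given in the tag that precedes it, so the durations on one line should add up to roughly the length of that line.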

Installation

1. Prerequisites

  • Python 3.10+
  • FFmpeg with libass support (Required for burning subtitles).
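To check whether an installed FFmpeg already has libass before rebuilding anything, a small script along these lines may help. It is a sketch that inspects the output of `ffmpeg -buildconf`; the helper names `has_libass` and `ffmpeg_buildconf` are made up for this example and are not part of Kara-It.

```python
import shutil
import subprocess


def has_libass(buildconf_text: str) -> bool:
    """Return True if the build configuration lists --enable-libass."""
    return "--enable-libass" in buildconf_text.split()


def ffmpeg_buildconf() -> str:
    """Return the output of `ffmpeg -buildconf`, or "" if ffmpeg is not on PATH."""
    if shutil.which("ffmpeg") is None:
        return ""
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-buildconf"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Different builds may print to stdout or stderr, so check both.
    return result.stdout + result.stderr


if __name__ == "__main__":
    print("libass available:", has_libass(ffmpeg_buildconf()))
```

If this reports `False`, the installed build cannot burn .ass subtitles and the steps in the next section apply.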

2. Install FFmpeg (with libass)

macOS (Apple Silicon):
Standard Homebrew FFmpeg does NOT support subtitle burning. You must compile it from source.

  1. Install Build Dependencies:

    brew install automake fdk-aac git lame libass libtool libvorbis libvpx \
    opus sdl shtool texi2html theora wget x264 x265 xvid nasm yasm pkg-config
  2. Clone FFmpeg:

    cd ~/Downloads
    git clone https://github.com/FFmpeg/FFmpeg.git
    cd FFmpeg
  3. Configure: Run this block in your terminal to configure the build with explicit Apple Silicon paths:

    export PKG_CONFIG_PATH="/opt/homebrew/lib/pkgconfig:$PKG_CONFIG_PATH"
    
    ./configure  --prefix=/usr/local \
                 --extra-cflags="-I/opt/homebrew/include" \
                 --extra-ldflags="-L/opt/homebrew/lib" \
                 --enable-gpl \
                 --enable-nonfree \
                 --enable-libass \
                 --enable-libfdk-aac \
                 --enable-libfreetype \
                 --enable-libmp3lame \
                 --enable-libopus \
                 --enable-libtheora \
                 --enable-libvorbis \
                 --enable-libvpx \
                 --enable-libx264 \
                 --enable-libx265
  4. Build & Install:

    make -j$(sysctl -n hw.ncpu)
    sudo make install

3. Install Project

  1. Clone the repository:

    git clone https://github.com/fabio0296/kara-it.git
    cd kara-it
  2. Install Poetry (if not already installed):

    curl -sSL https://install.python-poetry.org | python3 -
    
    # Add Poetry to PATH (add to your shell config file: ~/.zshrc or ~/.bashrc)
    export PATH="$HOME/.local/bin:$PATH"
  3. Install Dependencies:

    # Install all dependencies and create virtual environment
    poetry install
    
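    # Activate the virtual environment
    # (Poetry 2.x no longer bundles `poetry shell`; install the poetry-plugin-shell
    # plugin, or skip this step and prefix commands with `poetry run`)
    poetry shell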
  4. Install MeCab Dictionary:

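    # Note: unidic-lite bundles its dictionary, so this step may be unnecessary.
    # The full unidic package fetches its dictionary with `python -m unidic download`.
    python -m unidic_lite.download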

Note: Poetry is configured to create the virtual environment in the project directory (.venv/), which keeps the dependencies isolated and easy to manage.

Usage

Basic Usage (Auto Mode): Generates Romaji karaoke and burns it to a new video.

python -m src.main generate "path/to/song.mp4"

Options:

  • --karaoke / --no-karaoke: Enable/Disable {\k} tags (Default: Enabled).
  • --burn / --no-burn: Burn subtitles into video (Default: Enabled).
  • --format ass: Output format (Default: ass).
  • --model base: Whisper model size (tiny, base, small, medium, large).

Example:

python -m src.main generate my_song.mp4 --model medium --karaoke
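For reference, the burn step enabled by --burn corresponds roughly to an FFmpeg run with the `ass` video filter, which is why libass is required. The sketch below only builds such a command. The function name `burn_command` and the exact flags are assumptions for this example, and the tool's real invocation may differ.

```python
import subprocess


def burn_command(video: str, subs: str, output: str) -> list[str]:
    """Build an FFmpeg command that renders an .ass file onto the video.

    Hypothetical helper. Assumes the subtitle path contains no characters
    that are special in FFmpeg filter syntax (such as ':' , ',' or quotes).
    """
    return [
        "ffmpeg", "-y",        # overwrite the output file if it exists
        "-i", video,
        "-vf", f"ass={subs}",  # the ass filter needs FFmpeg built with libass
        "-c:a", "copy",        # keep the audio stream unchanged
        output,
    ]


if __name__ == "__main__":
    cmd = burn_command("my_song.mp4", "subs.ass", "my_song_karaoke.mp4")
    print(" ".join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

The video stream is re-encoded because the subtitles become part of the picture. The audio stream can be copied as-is.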

Handling Live Videos (Spoken Interludes)

By default, the tool transcribes everything, including spoken words between songs. To remove these:

  1. Transcribe first:
    poetry run python -m src.main transcribe live_video.mp4 --output transcript.json
  2. Edit the JSON: Open transcript.json and manually remove the segments corresponding to the spoken parts.
  3. Continue the pipeline:
    poetry run python -m src.main romanize transcript.json --output romaji.json
    poetry run python -m src.main format romaji.json --output subs.ass
    poetry run python -m src.main burn live_video.mp4 subs.ass
