This repository was archived by the owner on Jan 19, 2026. It is now read-only.

Commit aacd886

beveradb and claude committed
docs: mark project as deprecated, redirect to karaoke-gen
This project has been consolidated into karaoke-gen.
See https://github.com/nomadkaraoke/karaoke-gen

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 83715b7 commit aacd886

1 file changed: +18 -258 lines changed

README.md

Lines changed: 18 additions & 258 deletions
@@ -1,270 +1,30 @@
1-
# Lyrics Transcriber 🎶
1+
# python-lyrics-transcriber (DEPRECATED)
2 2

3-
![PyPI - Version](https://img.shields.io/pypi/v/lyrics-transcriber)
4-
![Python Version](https://img.shields.io/badge/python-3.10%E2%80%933.13-blue)
5-
[![Tests](https://github.com/nomadkaraoke/python-lyrics-transcriber/actions/workflows/test-and-publish.yml/badge.svg)](https://github.com/nomadkaraoke/python-lyrics-transcriber/actions/workflows/test-and-publish.yml)
6-
[![Coverage](https://codecov.io/gh/nomadkaraoke/python-lyrics-transcriber/graph/badge.svg?token=SMW2TVPVNT)](https://codecov.io/gh/nomadkaraoke/python-lyrics-transcriber)
3+
> **This project has been deprecated and archived.**
7 4
8-
Create synchronized karaoke assets from an audio file with word‑level timing: fetch lyrics, transcribe audio, auto‑correct against references, review in a web UI, and export ASS, LRC, CDG, and video.
5+
The lyrics transcription functionality has been consolidated into [karaoke-gen](https://github.com/nomadkaraoke/karaoke-gen).
9 6

10-
### What this project is now
11-
- **Modular pipeline** orchestrated by `LyricsTranscriber` with clear configs
12-
- **Transcription** via AudioShake (preferred) and Whisper on RunPod (fallback)
13-
- **Lyrics providers**: Genius, Spotify, Musixmatch, or a local file
14-
- **Rule‑based correction** with optional **LLM‑assisted** gap fixes
15-
- **Human review** server + frontend for iterative corrections and previews
16-
- **Outputs**: original/corrected text, corrections JSON, LRC, ASS, CDG(+MP3/ZIP), and video
7+
## Migration
17 8

18-
## Features
19-
- **Multi-transcriber orchestration** with caching per audio hash
20-
- AudioShake API (priority 1)
21-
- Whisper via RunPod + Dropbox upload (priority 2)
22-
- **Lyrics fetching** with caching per artist/title
23-
- Genius (token or RapidAPI) • Spotify (cookie or RapidAPI) • Musixmatch (RapidAPI) • Local file
24-
- **Correction engine**
25-
- Anchor/gap detection, multiple rule handlers (word count, syllables, relaxed, punctuation, extend‑anchor)
26-
- Optional LLM handlers (Ollama local, or OpenRouter with `OPENROUTER_API_KEY`)
27-
- **Review UI** (FastAPI) at `http://localhost:8000`
28-
- Edit corrections, toggle handlers, add lyrics sources, generate preview video
29-
- **Countdown intro for karaoke** (enabled by default)
30-
- Automatically adds 3-second intro with "3... 2... 1..." for songs that start within 3 seconds
31-
- Pads audio with silence and shifts all timestamps accordingly
32-
- Helps karaoke singers prepare before vocals begin
33-
- Disable with `--skip_countdown`
34-
- **Rich outputs**
35-
- Plain text (original/corrected), corrections `JSON`, `*.lrc` (MidiCo), `*.ass` (karaoke), `*.cdg` with `*.mp3` and ZIP, and MP4/MKV video
36-
- Subtitle offset, line wrapping, styles via JSON
9+
For karaoke video generation with synchronized lyrics:
10+
- Use [karaoke-gen](https://github.com/nomadkaraoke/karaoke-gen) - the complete karaoke generation solution
11+
- Web app: https://gen.nomadkaraoke.com
37 12

38-
## Install
39-
```
40-
pip install lyrics-transcriber
41-
```
13+
## Historical Context
42 14

43-
### System requirements
44-
- Python 3.10–3.13
45-
- FFmpeg (required for audio probe and video rendering)
46-
- spaCy English model (phrase analyzer used by correction):
47-
```
48-
python -m spacy download en_core_web_sm
49-
```
15+
This library was originally developed as a standalone tool for creating synchronized lyrics files with word-level timestamps. It combined audio transcription (via AudioShake or Whisper), lyrics fetching from multiple sources (Genius, Spotify, Musixmatch, LRCLib), and intelligent correction algorithms to produce professional karaoke assets.
50 16

51-
## Quick start (CLI)
52-
Minimal run (transcribe + LRC/ASS, no video/CDG):
53-
```bash
54-
lyrics-transcriber /path/to/song.mp3 --skip_video --skip_cdg
55-
```
17+
The functionality has now been integrated into the karaoke-gen platform, which provides a complete end-to-end solution for karaoke video generation including:
18+
- Audio separation (vocals/instrumentals)
19+
- Lyrics transcription and correction
20+
- Human review interface
21+
- Video rendering with title screens
22+
- Distribution to YouTube, Dropbox, Google Drive
56 23

57-
Use AudioShake and auto‑fetch lyrics (Genius + artist/title):
58-
```bash
59-
export AUDIOSHAKE_API_TOKEN=... # or pass --audioshake_api_token
60-
export GENIUS_API_TOKEN=...
61-
lyrics-transcriber /path/to/song.mp3 --artist "Artist" --title "Song"
62-
```
24+
## Final Version
63 25

64-
Use Whisper on RunPod (fallback or standalone):
65-
```bash
66-
export RUNPOD_API_KEY=...
67-
export WHISPER_RUNPOD_ID=... # your RunPod endpoint ID
68-
lyrics-transcriber /path/to/song.mp3 --skip_cdg --skip_video
69-
```
70-
71-
Provide a local lyrics file instead of fetching:
72-
```bash
73-
lyrics-transcriber /path/to/song.mp3 --lyrics_file /path/to/lyrics.txt
74-
```
75-
76-
Render video/CDG (requires a styles JSON file):
77-
```bash
78-
lyrics-transcriber /path/to/song.mp3 \
79-
--output_styles_json /path/to/styles.json \
80-
--video_resolution 1080p
81-
```
82-
83-
### Common flags
84-
- **Song identification**: `--artist`, `--title`, `--lyrics_file`
85-
- **APIs**: `--audioshake_api_token`, `--genius_api_token`, `--spotify_cookie`, `--runpod_api_key`, `--whisper_runpod_id`
86-
- **Output**: `--output_dir`, `--cache_dir`, `--output_styles_json`, `--subtitle_offset`
87-
- **Feature toggles**: `--skip_lyrics_fetch`, `--skip_transcription`, `--skip_correction`, `--skip_plain_text`, `--skip_lrc`, `--skip_cdg`, `--skip_video`, `--skip_countdown`, `--video_resolution {4k,1080p,720p,360p}`
88-
89-
Run `lyrics-transcriber --help` for full usage.
90-
91-
## Environment variables
92-
These are read automatically (CLI flags override):
93-
- `AUDIOSHAKE_API_TOKEN`
94-
- `GENIUS_API_TOKEN`, `RAPIDAPI_KEY`
95-
- `SPOTIFY_COOKIE_SP_DC`
96-
- `RUNPOD_API_KEY`, `WHISPER_RUNPOD_ID`
97-
- `WHISPER_DROPBOX_APP_KEY`, `WHISPER_DROPBOX_APP_SECRET`, `WHISPER_DROPBOX_REFRESH_TOKEN`
98-
- `OPENROUTER_API_KEY` (optional LLM handler)
99-
- `LYRICS_TRANSCRIBER_CACHE_DIR` (default `~/lyrics-transcriber-cache`)
100-
101-
## Outputs
102-
Generated files are written to `--output_dir` (default: CWD):
103-
- `... (Lyrics Corrections).json` — full correction data and audit trail
104-
- `... (Karaoke).ass` — styled karaoke subtitles (ASS)
105-
- `... .lrc` — MidiCo compatible LRC
106-
- `... (original).txt` and `... (corrected).txt` — plain text exports
107-
- `... .cdg`, `... .mp3`, `... .zip` — CDG package (when enabled)
108-
- `... (With Vocals).mkv` — video with lyrics overlay (when enabled)
109-
110-
Notes
111-
- If no `--output_styles_json` is provided, CDG and video are disabled automatically.
112-
- `--subtitle_offset` shifts all word timings (ms) for late/early subtitles.
113-
114-
## Review server (human‑in‑the‑loop)
115-
If review is enabled (default), a local server starts during processing and opens the UI at `http://localhost:8000`:
116-
- Inspect and adjust corrections
117-
- Toggle correction handlers (rule‑based/LLM)
118-
- Add another lyrics source (paste plain text)
119-
- Generate a low‑res preview video on demand
120-
121-
Frontend assets are bundled when installed from PyPI. For local dev, build the frontend once if needed:
122-
```
123-
./scripts/build_frontend.sh
124-
```
125-
126-
## Styles JSON (for CDG/Video)
127-
Provide a JSON with at least a `karaoke` section (for video/ASS) and, if generating CDG, a `cdg` section. Example (minimal):
128-
```json
129-
{
130-
"karaoke": {
131-
"ass_name": "Karaoke",
132-
"font": "Oswald SemiBold",
133-
"font_path": "lyrics_transcriber/output/fonts/Oswald-SemiBold.ttf",
134-
"font_size": 120,
135-
"primary_color": "255,165,0",
136-
"secondary_color": "255,255,255",
137-
"outline_color": "0,0,0",
138-
"back_color": "0,0,0",
139-
"bold": true,
140-
"italic": false,
141-
"underline": false,
142-
"strike_out": false,
143-
"scale_x": 100,
144-
"scale_y": 100,
145-
"spacing": 0,
146-
"angle": 0,
147-
"border_style": 1,
148-
"outline": 3,
149-
"shadow": 0,
150-
"margin_l": 0,
151-
"margin_r": 0,
152-
"margin_v": 100,
153-
"encoding": 1,
154-
"background_color": "black",
155-
"max_line_length": 36,
156-
"top_padding": 180
157-
},
158-
"cdg": {
159-
"font": "Oswald SemiBold",
160-
"font_path": "lyrics_transcriber/output/fonts/Oswald-SemiBold.ttf"
161-
}
162-
}
163-
```
164-
165-
## Using as a library
166-
```python
167-
from lyrics_transcriber import LyricsTranscriber
168-
from lyrics_transcriber.core.controller import TranscriberConfig, LyricsConfig, OutputConfig
169-
170-
transcriber = LyricsTranscriber(
171-
audio_filepath="/path/to/song.mp3",
172-
artist="Artist", # optional
173-
title="Title", # optional
174-
transcriber_config=TranscriberConfig(
175-
audioshake_api_token="...", # or env
176-
runpod_api_key="...", whisper_runpod_id="..."
177-
),
178-
lyrics_config=LyricsConfig(
179-
genius_api_token="...", spotify_cookie="...", rapidapi_key="...",
180-
lyrics_file=None
181-
),
182-
output_config=OutputConfig(
183-
output_dir="./out", cache_dir="~/lyrics-transcriber-cache",
184-
output_styles_json="/path/to/styles.json", # required for CDG/video
185-
video_resolution="1080p", subtitle_offset_ms=0,
186-
add_countdown=True # enable countdown for songs starting within 3s (default: True)
187-
),
188-
)
189-
190-
result = transcriber.process()
191-
print(result.ass_filepath, result.lrc_filepath, result.video_filepath)
192-
193-
# Check if countdown padding was added (useful for syncing other audio files)
194-
if result.countdown_padding_added:
195-
print(f"Countdown padding added: {result.countdown_padding_seconds}s")
196-
print(f"Padded audio filepath: {result.padded_audio_filepath}")
197-
# You can use this info to apply the same padding to instrumental tracks
198-
```
199-
200-
## Docker
201-
Build and run locally (includes FFmpeg and spaCy model):
202-
```bash
203-
docker build -t lyrics-transcriber:local .
204-
docker run --rm -v "$PWD/input":/input -v "$PWD/output":/output \
205-
-e AUDIOSHAKE_API_TOKEN -e GENIUS_API_TOKEN -e RUNPOD_API_KEY -e WHISPER_RUNPOD_ID \
206-
lyrics-transcriber:local \
207-
--output_dir /output --skip_cdg --video_resolution 360p /input/song.mp3
208-
```
209-
210-
## Development
211-
- Python 3.10–3.13, Poetry
212-
- Install deps: `poetry install`
213-
- Run tests: `poetry run pytest`
214-
- Build frontend (if editing UI): `./scripts/build_frontend.sh`
215-
216-
## Agentic AI (Experimental)
217-
218-
Uses **LangChain + LangGraph** for AI-powered lyrics correction with automatic **Langfuse** observability.
219-
220-
### Enabling
221-
- CLI flags: `--use-agentic-ai` and `--ai-model provider/model`
222-
- Or env: `USE_AGENTIC_AI=1`, `AGENTIC_AI_MODEL=ollama/gpt-oss:latest`
223-
224-
### Model Format
225-
Models use `provider/model` format for LangChain:
226-
- **Ollama** (local): `ollama/gpt-oss:latest`, `ollama/llama3.2:latest`
227-
- **OpenAI**: `openai/gpt-4`, `openai/gpt-4-turbo`
228-
- **Anthropic**: `anthropic/claude-3-sonnet-20240229`, `anthropic/claude-3-opus-20240229`
229-
230-
### Provider Configuration
231-
- **API Keys**: Set provider-specific keys:
232-
- OpenAI: `OPENAI_API_KEY`
233-
- Anthropic: `ANTHROPIC_API_KEY`
234-
- **Local/Privacy Mode**: `PRIVACY_MODE=1` (uses Ollama only)
235-
- **Timeouts/Retries**: `AGENTIC_TIMEOUT_SECONDS=30`, `AGENTIC_MAX_RETRIES=2`
236-
- **Circuit Breaker**: `AGENTIC_CIRCUIT_THRESHOLD=3`, `AGENTIC_CIRCUIT_OPEN_SECONDS=60`
237-
238-
### Observability (Langfuse)
239-
Automatic tracing via LangChain callbacks - just set:
240-
```bash
241-
export LANGFUSE_PUBLIC_KEY="pk-lf-..."
242-
export LANGFUSE_SECRET_KEY="sk-lf-..."
243-
export LANGFUSE_HOST="https://us.cloud.langfuse.com" # or https://cloud.langfuse.com for EU
244-
```
245-
246-
Traces include:
247-
- Full prompts and responses
248-
- Token counts and latency
249-
- Cost estimates (for paid APIs)
250-
- Model performance metrics
251-
252-
View metrics: `GET /api/v1/metrics`
253-
254-
### Feedback Store
255-
- SQLite DB persisted in cache dir (sessions, feedback)
256-
- 3-year retention policy with automatic cleanup
257-
258-
### Architecture
259-
See `LANGCHAIN_MIGRATION.md` for details on the LangChain/LangGraph implementation.
26+
The last standalone version was **0.81.0** (December 2025). No further releases will be made to PyPI.
260 27

261 28
## License
262-
MIT. See `LICENSE`.
263-
264-
## Credits
265-
- Audio transcription by AudioShake and Whisper (RunPod)
266-
- Lyrics via Genius, Spotify, Musixmatch; layout via `karaoke-lyrics-processor`
267-
- UI/API: FastAPI, Vite/React frontend
268 29

269-
## Support
270-
Please open issues or PRs on the repo, or contact @beveradb.
30+
MIT License - see [LICENSE](LICENSE) for details.
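
The countdown intro described in the removed README (a 3-second intro for songs whose vocals start within 3 seconds, with all timestamps shifted to match the padded audio) can be illustrated with a short sketch. This is not the library's implementation: the `Word` type, the `add_countdown` function, and the strict "first word starts before the threshold" test are assumptions made here for illustration only.

```python
from dataclasses import dataclass, replace

# Assumed value, taken from the README's "3-second intro" description.
COUNTDOWN_SECONDS = 3.0


@dataclass(frozen=True)
class Word:
    """Hypothetical word-level timing record (seconds)."""
    text: str
    start: float
    end: float


def add_countdown(words, threshold=COUNTDOWN_SECONDS, padding=COUNTDOWN_SECONDS):
    """Shift all word timings by `padding` if the song starts too early.

    Returns (words, padding_applied). When no word starts before
    `threshold`, the timings are returned unchanged with 0.0 padding.
    """
    if not words or min(w.start for w in words) >= threshold:
        return list(words), 0.0
    shifted = [
        replace(w, start=w.start + padding, end=w.end + padding)
        for w in words
    ]
    return shifted, padding


if __name__ == "__main__":
    early = [Word("Hello", 1.5, 2.0), Word("world", 2.1, 2.6)]
    shifted, applied = add_countdown(early)
    print(applied, shifted[0])
```

The returned padding value plays the same role as the `countdown_padding_seconds` field mentioned in the removed library example: a caller could apply the same offset to an instrumental track so it stays in sync with the padded vocals.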
