Commit bee8ddf

feat(subtitle): implement faster-whisper integration for improved subtitles
- Add the faster-whisper dependency (>=1.0.0 in pyproject.toml, unpinned in requirements.txt)
- Replace FFmpeg whisper filter with faster-whisper library for better performance and reliability
- Implement comprehensive subtitle generation with support for multiple formats (SRT, VTT, TXT, LRC)
- Add helper functions for audio extraction, timestamp formatting, and segment conversion
- Implement file validation checks including audio stream detection, disk space verification, and output file conflict resolution
- Add progress tracking and user-friendly error handling with detailed console feedback
- Support multiple whisper model sizes from tiny to large-v3 for flexible quality/speed tradeoffs
- Improve robustness with temporary file handling and graceful error recovery
1 parent e182ab8 commit bee8ddf
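
The commit message mentions helpers for timestamp formatting and segment conversion, but their source is not shown on this page. The following is a minimal sketch of what such helpers could look like, not the commit's actual code. The names `format_timestamp` and `segments_to_srt`, and the `(start, end, text)` tuple shape, are hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical segment shape for this sketch: (start seconds, end seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float, style: str = "srt") -> str:
    """Format a time offset in seconds for SRT, VTT, or LRC output."""
    total_ms = max(0, round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    if style == "srt":
        # SubRip separates milliseconds with a comma.
        return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"
    if style == "vtt":
        # WebVTT separates milliseconds with a period.
        return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"
    if style == "lrc":
        # LRC tags are [mm:ss.xx] with hundredths; hours fold into minutes.
        return f"[{hours * 60 + minutes:02d}:{secs:02d}.{ms // 10:02d}]"
    raise ValueError(f"unknown timestamp style: {style!r}")


def segments_to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = []
    for index, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{index}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(cues)
```

Working in integer milliseconds avoids floating-point drift when splitting the offset into hours, minutes, and seconds. Segments returned by faster-whisper expose `start`, `end`, and `text` attributes, so they would need to be unpacked into tuples before being passed to this sketch.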

7 files changed, 563 additions and 91 deletions

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -2,6 +2,7 @@
 /dist/
 /build/
 *.spec
+*.egg-info/

 # Python cache
 __pycache__/

README.md

Lines changed: 118 additions & 25 deletions
@@ -1,4 +1,4 @@
-# 🎬 ffmPEG-this
+<h1 align="center">FFm<u><i>PEG</i></u>-this</h1>

 <p align="center">
 <a href="https://pypi.org/project/peg-this/">
@@ -15,34 +15,108 @@
 </a>
 </p>

-> Your Video editor within CLI 🚀
+<p align="center"><b>Your Editor within CLI</b></p>

 A powerful and user-friendly Python CLI tool for converting, manipulating, and inspecting media files using the power of FFmpeg. This tool provides a simple command-line menu to perform common audio and video tasks without needing to memorize complex FFmpeg commands.

-
 <p align="center">
 <img src="/assets/peg.gif" width="720">
 </p>

+## Features at a Glance
+
+| Category | Feature | Description |
+|----------|---------|-------------|
+| **Inspect** | Media Properties | View detailed codec, resolution, frame rate, bitrate, and stream information |
+| **Convert** | Video Formats | Convert to MP4, MKV, MOV, AVI, WebM with quality presets (CRF 18/23/28) |
+| | Audio Formats | Convert to MP3 (128k-320k bitrate), FLAC, WAV |
+| | GIF Creation | Convert video clips to animated GIFs with optimized palette |
+| | Image Formats | Convert between JPG, PNG, WebP, BMP, TIFF with quality control |
+| **Subtitles** | AI Transcription | Generate subtitles using Whisper AI (7 model sizes available) |
+| | Sidecar Export | Save as `.srt`, `.vtt`, `.txt`, or `.lrc` files |
+| | Soft Subtitles | Embed toggleable subtitle track into video |
+| | Hard Subtitles | Burn permanent subtitles directly into video |
+| | Multi-language | Support for 99+ languages with auto-detection |
+| **Edit** | Trim/Cut | Extract video segments by start/end time (lossless, no re-encoding) |
+| | Visual Crop | Interactive GUI to select crop area on video/image |
+| | Join/Concatenate | Merge multiple videos with automatic resolution matching |
+| **Audio** | Extract Audio | Rip audio track to MP3, FLAC, or WAV |
+| | Remove Audio | Create silent version of video (keeps video intact) |
+| **Image** | Resize | Scale images with aspect ratio preservation |
+| | Rotate | Rotate 90°, 180°, or 270° |
+| | Flip | Flip horizontally or vertically |
+| | Crop | Visual cropping with click-and-drag selection |
+| **Batch** | Batch Convert | Convert all media files in directory at once |
+
+## Detailed Feature Breakdown
+
+### Video Operations
+
+| Operation | Input | Output | Method | Re-encoding |
+|-----------|-------|--------|--------|-------------|
+| **Convert** | Any video | MP4, MKV, MOV, AVI, WebM | FFmpeg transcode | Yes (CRF quality) |
+| **Trim** | Any video | Same format | Stream copy | No (lossless) |
+| **Crop** | Any video | Same format | Visual selection + crop filter | Yes |
+| **Join** | Multiple videos | Single MP4 | Concat filter + normalize | Yes |
+| **To GIF** | Any video | Animated GIF | 2-pass palette optimization | Yes |
+
+### Audio Operations
+
+| Operation | Input | Output | Notes |
+|-----------|-------|--------|-------|
+| **Extract** | Video with audio | MP3, FLAC, WAV | Preserves original quality for FLAC/WAV |
+| **Remove** | Video with audio | Silent video | Stream copy (fast, no re-encoding) |
+| **Convert** | Audio file | MP3, FLAC, WAV | Bitrate selection for MP3 |
+
+### Subtitle Generation
+
+| Model | Size | Speed | Accuracy | Languages |
+|-------|------|-------|----------|-----------|
+| `tiny.en` | ~75 MB | Fastest | Good | English only |
+| `base.en` | ~150 MB | Fast | Better | English only |
+| `small.en` | ~500 MB | Balanced | Great | English only |
+| `medium.en` | ~1.5 GB | Slower | Excellent | English only |
+| `small` | ~500 MB | Balanced | Great | 99+ languages |
+| `medium` | ~1.5 GB | Slower | Excellent | 99+ languages |
+| `large-v3` | ~3 GB | Slowest | Best | 99+ languages |
+
+**Output Options:**
+| Type | File Extension | Description |
+|------|----------------|-------------|
+| Sidecar | `.srt` | SubRip - most compatible format |
+| Sidecar | `.vtt` | WebVTT - for web/HTML5 players |
+| Sidecar | `.txt` | Plain text transcript |
+| Sidecar | `.lrc` | Lyrics format with timestamps |
+| Soft Subs | `.mp4/.mkv` | Embedded, toggleable in players |
+| Hard Subs | `.mp4/.mkv` | Burned in, always visible |
+
+### Image Operations
+
+| Operation | Options | Notes |
+|-----------|---------|-------|
+| **Convert** | JPG, PNG, WebP, BMP, TIFF | Quality presets (95%, 80%, 60%) |
+| **Resize** | Custom width/height | Use `-1` to preserve aspect ratio |
+| **Rotate** | 90° CW, 90° CCW, 180° | Lossless rotation |
+| **Flip** | Horizontal, Vertical | Mirror image |
+| **Crop** | Visual selection | Interactive GUI with preview |
+
+### Supported Formats
+
+| Type | Supported Formats |
+|------|-------------------|
+| **Video Input** | `.mp4`, `.mkv`, `.avi`, `.mov`, `.webm`, `.flv`, `.wmv`, `.gif` |
+| **Video Output** | `.mp4`, `.mkv`, `.mov`, `.avi`, `.webm`, `.gif` |
+| **Audio Input** | `.mp3`, `.flac`, `.wav`, `.ogg`, `.aac`, `.m4a` |
+| **Audio Output** | `.mp3`, `.flac`, `.wav` |
+| **Image Input** | `.jpg`, `.jpeg`, `.png`, `.webp`, `.bmp`, `.tiff` |
+| **Image Output** | `.jpg`, `.png`, `.webp`, `.bmp`, `.tiff` |
+| **Subtitle Output** | `.srt`, `.vtt`, `.txt`, `.lrc` |
+
+## Usage

-## ✨ Features
-
-- **Inspect Media Properties**: View detailed information about video and audio streams, including codecs, resolution, frame rate, bitrates, and more.
-- **Convert & Transcode**: Convert videos and audio to a wide range of popular formats (MP4, MKV, WebM, MP3, FLAC, WAV, GIF) with simple quality presets.
-- **Join Videos (Concatenate)**: Combine two or more videos into a single file. The tool automatically handles differences in resolution and audio sample rates for a seamless join.
-- **Trim (Cut) Videos**: Easily cut a video to a specific start and end time without re-encoding for fast, lossless clips.
-- **Visually Crop Videos**: An interactive tool that shows you a frame of the video, allowing you to click and drag to select the exact area you want to crop.
-- **Extract Audio**: Rip the audio track from any video file into MP3, FLAC, or WAV.
-- **Remove Audio**: Create a silent version of your video by stripping out all audio streams.
-- **Image Manipulation**: Perform basic operations on images such as format conversion, resizing, rotating, and flipping.
-- **Batch Conversion**: Convert all media files in the current directory to a specified format in one go.
-- **CLI Interface**: A user-friendly command-line interface that makes it easy to perform common tasks and navigate the tool's features.
-
-
-## 🚀 Usage
 ### Prerequisite: Install FFmpeg

-> [NOTE]
+> [!NOTE]
 > `peg_this` uses a library called `ffmpeg-python` which acts as a controller for the main FFmpeg program. It does not include FFmpeg itself. Therefore, you must have FFmpeg installed on your system and available in your terminal's PATH.

 For **macOS** users, the easiest way to install it is with [Homebrew](https://brew.sh/):
@@ -100,15 +174,34 @@ If you want to run the tool directly from the source code:
 python -m src.peg_this.peg_this
 ```

-## 📈 Star History
+## Subtitle Generation
+
+The subtitle feature uses [faster-whisper](https://github.com/SYSTRAN/faster-whisper), a fast and accurate speech-to-text engine powered by OpenAI's Whisper model.
+
+### How it works
+
+1. Select a video file
+2. Choose "Generate Subtitles (Whisper)"
+3. Pick a model size (tiny to large-v3)
+4. Select processing mode (Fast or Accurate)
+5. Choose output type:
+- **Sidecar file**: Export as `.srt`, `.vtt`, `.txt`, or `.lrc`
+- **Soft subtitles**: Embed into video (can be toggled on/off in players)
+- **Hard subtitles**: Burn into video (permanent, always visible)
+
+### Supported Languages
+
+Using multilingual models (`small`, `medium`, `large-v3`), you can transcribe audio in 99+ languages including English, Spanish, French, German, Chinese, Japanese, Korean, Hindi, Arabic, and many more.
+
+## Star History

 <p align="center">
 <a href="https://star-history.com/#hariharen9/ffmpeg-this&Date">
 <img src="https://api.star-history.com/svg?repos=hariharen9/ffmpeg-this&type=Date" alt="Star History Chart">
 </a>
 </p>

-## Sponsor
+## Sponsor

 <p align="center">
 <a href="https://github.com/sponsors/hariharen9">
@@ -119,20 +212,20 @@ If you want to run the tool directly from the source code:
 </a>
 </p>

-## 👥 Contributors
+## Contributors

 <a href="https://github.com/hariharen9/ffmpeg-this/graphs/contributors">
 <img src="https://contrib.rocks/image?repo=hariharen9/ffmpeg-this" />
 </a>

-## 🤝 Contributing
+## Contributing

 Contributions are welcome! Please see the [Contributing Guidelines](CONTRIBUTING.md) for more information.

-## 📄 License
+## License

 This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.

 <p align="center">
-<h2>Made with ❤️ by <a href="https://hariharen.site">Hariharen</a></h2>
+Made with ❤️ by <a href="https://hariharen.site">Hariharen</a>
 </p>
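
The README changes above distinguish soft subtitles (an embedded, toggleable track) from hard subtitles (burned into the picture). As a rough illustration of how the two differ at the FFmpeg level, the sketch below builds the argument lists for each case. The function names are hypothetical, and this is not necessarily how the commit invokes FFmpeg (the project uses `ffmpeg-python` rather than raw argument lists).

```python
from pathlib import Path
from typing import List


def soft_subtitle_cmd(video: str, subs: str, output: str) -> List[str]:
    """Mux a subtitle file into the container as a separate, toggleable track."""
    # MP4 stores text subtitles as mov_text; Matroska can carry SRT directly.
    sub_codec = "mov_text" if Path(output).suffix.lower() == ".mp4" else "srt"
    return [
        "ffmpeg", "-i", video, "-i", subs,
        "-map", "0", "-map", "1",      # keep all streams from both inputs
        "-c", "copy",                  # no re-encoding of audio or video
        "-c:s", sub_codec,             # only the subtitle stream is converted
        output,
    ]


def hard_subtitle_cmd(video: str, subs: str, output: str) -> List[str]:
    """Render subtitles into the video frames; the video must be re-encoded."""
    # Assumption: the subtitle path has no characters that need escaping in
    # an FFmpeg filter string (colons, quotes, backslashes).
    return [
        "ffmpeg", "-i", video,
        "-vf", f"subtitles={subs}",    # requires an FFmpeg build with libass
        "-c:a", "copy",                # audio is untouched
        output,
    ]
```

Either list can be passed to `subprocess.run(cmd, check=True)`. The soft variant is fast because it only copies streams, while the hard variant re-encodes the video.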

assets/banner.png

5.03 MB

assets/ffmpegthis.png

2.78 MB

pyproject.toml

Lines changed: 2 additions & 1 deletion
@@ -22,7 +22,8 @@ dependencies = [
 "ffmpeg-python==0.2.0",
 "questionary>=2.0.0",
 "rich>=13.0.0",
-"Pillow>=9.0.0"
+"Pillow>=9.0.0",
+"faster-whisper>=1.0.0"
 ]

 [project.urls]

requirements.txt

Lines changed: 1 addition & 0 deletions
@@ -3,3 +3,4 @@ rich
 pyinstaller
 Pillow
 ffmpeg-python
+faster-whisper
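
For context on the dependency added here, the following sketch shows one way the faster-whisper library can be called to transcribe an audio file. It is an illustration under stated assumptions, not the commit's implementation. The `transcribe` wrapper, the `beam_size_for` helper, and the mapping of the Fast and Accurate modes to beam sizes 1 and 5 are all hypothetical, and the model list is copied from the README table in this commit.

```python
from typing import List, Tuple

# Model names as listed in the README table of this commit.
MODEL_SIZES = (
    "tiny.en", "base.en", "small.en", "medium.en",
    "small", "medium", "large-v3",
)


def beam_size_for(mode: str) -> int:
    """Assumed mapping: 'fast' uses greedy decoding, 'accurate' a wider beam."""
    sizes = {"fast": 1, "accurate": 5}
    if mode not in sizes:
        raise ValueError(f"unknown mode: {mode!r}")
    return sizes[mode]


def transcribe(
    audio_path: str, model_size: str = "small", mode: str = "fast"
) -> Tuple[str, List[Tuple[float, float, str]]]:
    """Return the detected language and a list of (start, end, text) tuples."""
    if model_size not in MODEL_SIZES:
        raise ValueError(f"unknown model size: {model_size!r}")
    beam_size = beam_size_for(mode)

    # Imported lazily so the helpers above work without the package installed.
    from faster_whisper import WhisperModel

    # CPU with int8 quantization is a conservative default for this sketch.
    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    segments, info = model.transcribe(audio_path, beam_size=beam_size)

    # `segments` is a generator; transcription runs as it is consumed.
    return info.language, [(s.start, s.end, s.text) for s in segments]
```

The first call for a given model size downloads the model weights, so it can take noticeably longer than later calls.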
