@@ -17,186 +17,240 @@ Video/Audio: MP4, WebM, Ogg, 3GP, FLV, MOV, Matroska, MPEG TS, WAV, MP3, AAC, FL
1717
1818:information_source: <small style =" line-height : 1.2 ;" >Subaligner relies on file extensions as default hints to process a wide range of audiovisual or subtitle formats. It is recommended to use extensions widely accepted by the community to ensure compatibility.</small >
1919
20- ## Dependencies
21- Required by basic: [FFmpeg](https://www.ffmpeg.org/)
22- ```
23- $ apt-get install ffmpeg
24- ```
25- or
26- ```
20+ ## Dependant package
21+ Required by the basic installation: [FFmpeg](https://www.ffmpeg.org/)
22+ <details >
23+ <summary >Install FFmpeg</summary >
24+ <pre ><code >$ apt-get install ffmpeg
2725$ brew install ffmpeg
28- ```
26+ </code ></pre >
27+ </details >
2928
3029## Basic Installation
31- ```
32- $ pip install -U pip && pip install -U setuptools wheel
30+ <details >
31+ <summary >Install from PyPI</summary >
32+ <pre ><code >$ pip install -U pip && pip install -U setuptools wheel
3333$ pip install subaligner
34- ```
35- or install from source:
36- ```
37- $ git clone git@github.com:baxtree/subaligner.git && cd subaligner
34+ </code ></pre >
35+ </details >
36+ <details >
37+ <summary >Install from source</summary >
38+ <pre ><code >$ git clone git@github.com:baxtree/subaligner.git && cd subaligner
3839$ pip install -U pip && pip install -U setuptools
39- $ python setup.py install
40- ```
40+ $ pip install .
41+ </code ></pre >
42+ </details >
4143:information_source: <small style =" line-height : 1.2 ;" >It is highly recommended to create a virtual environment prior to installation.</small >
4244
4345## Installation with Optional Packages Supporting Additional Features
44- ```
45- # Install dependencies for enabling translation and transcription
46-
47- $ pip install 'subaligner[llm]'
48- ```
49- ```
50- # Install dependencies for enabling forced alignment
51-
52- $ pip install 'setuptools<65.0.0'
46+ <details >
47+ <summary >Install dependencies for enabling translation and transcription</summary >
48+ <pre ><code >$ pip install 'subaligner[llm]'
49+ </code ></pre >
50+ </details >
51+
52+ <details >
53+ <summary >Install dependencies for enabling forced alignment</summary >
54+ <pre ><code >$ pip install 'setuptools<65.0.0'
5355$ pip install 'subaligner[stretch]'
54- ```
55- ```
56- # Install dependencies for setting up the development environment
56+ </code ></pre >
57+ </details >
5758
58- $ pip install 'setuptools<65.0.0'
59+ <details >
60+ <summary >Install dependencies for setting up the development environment</summary >
61+ <pre ><code >$ pip install 'setuptools<65.0.0'
5962$ pip install 'subaligner[dev]'
60- ```
61- Note that both ` subaligner[stretch] ` and ` subaligner[dev] ` require additional dependencies to be pre-installed:
62- ```
63- $ apt-get install espeak libespeak1 libespeak-dev espeak-data
64- ```
65- or
66- ```
67- $ brew install espeak
68- ```
69- To install all supported features:
70- ```
71- $ pip install 'setuptools<65.0.0'
63+ </code ></pre >
64+ </details >
65+
66+
67+ <details >
68+ <summary >Install all extra dependencies</summary >
69+ <pre ><code >$ pip install 'setuptools<65.0.0'
7270$ pip install 'subaligner[harmony]'
73- ```
71+ </code ></pre >
72+ </details >
73+
74+ Note that `subaligner[stretch]`, `subaligner[dev]` and `subaligner[harmony]` require [eSpeak](https://espeak.sourceforge.net/) to be pre-installed:
75+ <details >
76+ <summary >Install eSpeak</summary >
77+ <pre ><code >$ apt-get install espeak libespeak1 libespeak-dev espeak-data
78+ $ brew install espeak
79+ </code ></pre >
80+ </details >
7481
7582## Container Support
76- If you prefer using a containerised environment over installing everything locally, run:
83+ If you prefer using a containerised environment over installing everything locally:
84+ <details >
85+ <summary >Run subaligner with a container</summary >
86+ <pre ><code >$ docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner bash
87+ </code ></pre >
88+ </details >
7789
78- ```
79- $ docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner bash
80- ```
8190For Windows users, you can use Windows Subsystem for Linux ([WSL](https://learn.microsoft.com/en-us/windows/wsl/install)) to install Subaligner.
8291Alternatively, you can use [Docker Desktop](https://docs.docker.com/docker-for-windows/install/) to pull and run the image.
83- Assuming your media assets are stored under ` d:\media ` , open built-in command prompt, PowerShell, or Windows Terminal and run:
84- ```
85- docker pull baxtree/subaligner
92+ Assuming your media assets are stored under `d:\media`, open the built-in Command Prompt, PowerShell, or Windows Terminal:
93+ <details >
94+ <summary >Run the subaligner container on Windows</summary >
95+ <pre ><code >docker pull baxtree/subaligner
8696docker run -v "/d/media":/media -w "/media" -it baxtree/subaligner bash
87- ```
97+ </code ></pre >
98+ </details >
8899
89100## Usage
90- ```
91- # Single-stage alignment (high-level shift with lower latency)
92-
93- $ subaligner -m single -v video.mp4 -s subtitle.srt
101+ <details >
102+ <summary >Single-stage alignment (high-level shift with lower latency)</summary >
103+ <pre ><code >$ subaligner -m single -v video.mp4 -s subtitle.srt
94104$ subaligner -m single -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
95- ```
96- ```
97- # Dual-stage alignment (low-level shift with higher latency)
105+ </code ></pre >
106+ </details >
98107
99- $ subaligner -m dual -v video.mp4 -s subtitle.srt
108+ <details >
109+ <summary >Dual-stage alignment (low-level shift with higher latency)</summary >
110+ <pre ><code >$ subaligner -m dual -v video.mp4 -s subtitle.srt
100111$ subaligner -m dual -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
101- ```
102- ```
103- # Generate subtitles by transcribing audiovisual files
104- $ subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf small -o subtitle_aligned.srt
105- $ subaligner -m transcribe -v video.mp4 -ml zho -mr whisper -mf medium -o subtitle_aligned.srt
106- $ subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf turbo -ip "your initial prompt" -o subtitle_aligned.srt
107- $ subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf turbo -ip "your initial prompt" --word_time_codes -o raw_subtitle.json
108- $ subaligner -m transcribe -v video.mp4 -s subtitle.srt -ml eng -mr whisper -mf turbo -o subtitle_aligned.srt
109- $ subaligner -m transcribe -v video.mp4 -s subtitle.srt --use_prior_prompting -ml eng -mr whisper -mf turbo -o subtitle_aligned.srt
112+ </code ></pre >
113+ </details >
110114
111- ```
112- ```
113- # Alignment on segmented plain texts (double newlines as the delimiter)
115+ <details >
116+ <summary >Generate subtitles by transcribing audiovisual files</summary >
117+ <pre ><code >$ subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf small -o subtitle_aligned.srt
118+ $ subaligner -m transcribe -v video.mp4 -ml zho -mr whisper -mf medium -o subtitle_aligned.srt
119+ </code ></pre >
120+ </details >
121+
122+ <details >
123+ <summary >Pass in a global prompt for the entire audio transcription</summary >
124+ <pre ><code >$ subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf turbo -ip "your initial prompt" -o subtitle_aligned.srt
125+ </code ></pre >
126+ </details >
127+
128+ <details >
129+ <summary >Use the full subtitle content as a prompt</summary >
130+ <pre ><code >$ subaligner -m transcribe -v video.mp4 -s subtitle.srt -ml eng -mr whisper -mf turbo -o subtitle_aligned.srt
131+ </code ></pre >
132+ </details >
133+
134+ <details >
135+ <summary >Use the previous subtitle segment as the prompt when transcribing the following segment</summary >
136+ <pre ><code >$ subaligner -m transcribe -v video.mp4 -s subtitle.srt --use_prior_prompting -ml eng -mr whisper -mf turbo -o subtitle_aligned.srt
137+ </code ></pre >
138+ </details >
139+
140+ (For details on crafting prompts for transcription, please refer to the [Whisper prompting guide](https://cookbook.openai.com/examples/whisper_prompting_guide).)
141+
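
The prior-prompting option above pairs each segment with the text of the segment before it. The sketch below only illustrates that pairing on a list of subtitle texts. The function name `prior_prompts` is hypothetical, and this is not Subaligner's internal code, which may handle segments differently.

```python
from typing import List, Optional


def prior_prompts(segment_texts: List[str]) -> List[Optional[str]]:
    """Return, for each segment, the text of the preceding segment.

    The first segment has no predecessor, so its prompt is None.
    This is a conceptual illustration, not Subaligner's implementation.
    """
    if not segment_texts:
        return []
    return [None] + segment_texts[:-1]


# Hypothetical subtitle texts, in playback order.
texts = ["Welcome back.", "Today we cover alignment.", "Let's begin."]
for text, prompt in zip(texts, prior_prompts(texts)):
    print(f"segment={text!r} prompt={prompt!r}")
```

Each segment after the first is paired with its predecessor, which is the idea that `--use_prior_prompting` describes.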
142+ <details >
143+ <summary >Alignment on segmented plain texts (double newlines as the delimiter)</summary >
144+ <pre ><code >$ subaligner -m script -v video.mp4 -s subtitle.txt -o subtitle_aligned.srt
145+ $ subaligner -m script -v https://example.com/video.mp4 -s https://example.com/subtitle.txt -o subtitle_aligned.srt
146+ </code ></pre >
147+ </details >
114148
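
Script mode takes plain text whose segments are separated by double newlines. As a rough illustration of that input format, the sketch below splits such a file's content into segments. The helper `split_segments` is hypothetical and is not Subaligner's own parser; it can serve as a quick sanity check on a script before alignment.

```python
from typing import List


def split_segments(script_text: str) -> List[str]:
    """Split plain text into segments delimited by double newlines.

    Windows line endings are normalised first, and empty segments
    (for example from trailing blank lines) are dropped.
    """
    normalised = script_text.replace("\r\n", "\n")
    return [block.strip() for block in normalised.split("\n\n") if block.strip()]


# Hypothetical script content: three segments, the second spanning two lines.
sample = "Hello there.\n\nHow are you\ntoday?\n\nGoodbye."
for number, segment in enumerate(split_segments(sample), start=1):
    print(number, repr(segment))
```

A single newline stays inside a segment; only a blank line starts a new one.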
115- $ subaligner -m script -v video.mp4 -s subtitle.txt -o subtitle_aligned.srt
149+ <details >
150+ <summary >Generate JSON raw subtitle with per-word timings</summary >
151+ <pre ><code >$ subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf turbo -ip "your initial prompt" --word_time_codes -o raw_subtitle.json
116152$ subaligner -m script -v video.mp4 -s subtitle.txt --word_time_codes -o raw_subtitle.json
117- $ subaligner -m script -v https://example.com/video.mp4 -s https://example.com/subtitle.txt -o subtitle_aligned.srt
118- ```
119- ```
120- # Alignment on multiple subtitles against the single media file
153+ </code ></pre >
154+ </details >
155+
121156
122- $ subaligner -m script -v video.mp4 -s subtitle_lang_1.txt -s subtitle_lang_2.txt
157+ <details >
158+ <summary >Alignment on multiple subtitles against the single media file</summary >
159+ <pre ><code >$ subaligner -m script -v video.mp4 -s subtitle_lang_1.txt -s subtitle_lang_2.txt
123160$ subaligner -m script -v video.mp4 -s subtitle_lang_1.txt subtitle_lang_2.txt
124- ```
125- ```
126- # Alignment on embedded subtitles
161+ </code ></pre >
162+ </details >
127163
128- $ subaligner -m single -v video.mkv -s embedded:stream_index=0 -o subtitle_aligned.srt
164+ <details >
165+ <summary >Alignment on embedded subtitles</summary >
166+ <pre ><code >$ subaligner -m single -v video.mkv -s embedded:stream_index=0 -o subtitle_aligned.srt
129167$ subaligner -m dual -v video.mkv -s embedded:stream_index=0 -o subtitle_aligned.srt
130- ```
131- ```
132- # Translative alignment with the ISO 639-3 language code pair (src,tgt)
168+ </code ></pre >
169+ </details >
133170
134- $ subaligner --languages
171+ <details >
172+ <summary >Translative alignment with the ISO 639-3 language code pair (src,tgt)</summary >
173+ <pre ><code >$ subaligner --languages
135174$ subaligner -m single -v video.mp4 -s subtitle.srt -t src,tgt
136175$ subaligner -m dual -v video.mp4 -s subtitle.srt -t src,tgt
137176$ subaligner -m script -v video.mp4 -s subtitle.txt -o subtitle_aligned.srt -t src,tgt
138177$ subaligner -m dual -v video.mp4 -s subtitle.srt -tr helsinki-nlp -o subtitle_aligned.srt -t src,tgt
139178$ subaligner -m dual -v video.mp4 -s subtitle.srt -tr facebook-mbart -tf large -o subtitle_aligned.srt -t src,tgt
140179$ subaligner -m dual -v video.mp4 -s subtitle.srt -tr facebook-m2m100 -tf small -o subtitle_aligned.srt -t src,tgt
141180$ subaligner -m dual -v video.mp4 -s subtitle.srt -tr whisper -tf small -o subtitle_aligned.srt -t src,eng
142- ```
143- ```
144- # Transcribe audiovisual files and generate translated subtitles
181+ </code ></pre >
182+ </details >
183+
184+ <details >
185+ <summary >Transcribe audiovisual files and generate translated subtitles</summary >
186+ <pre ><code >$ subaligner -m transcribe -v video.mp4 -ml src -mr whisper -mf small -tr helsinki-nlp -o subtitle_aligned.srt -t src,tgt
187+ </code ></pre >
188+ </details >
145189
146- $ subaligner -m transcribe -v video.mp4 -ml src -mr whisper -mf small -tr helsinki-nlp -o subtitle_aligned.srt -t src,tgt
147- ```
148- ```
149- # Shift subtitle manually by offset in seconds
150190
151- $ subaligner -m shift --subtitle_path subtitle.srt -os 5.5
191+ <details >
192+ <summary >Shift subtitle manually by offset in seconds</summary >
193+ <pre ><code >$ subaligner -m shift --subtitle_path subtitle.srt -os 5.5
152194$ subaligner -m shift --subtitle_path subtitle.srt -os -5.5 -o subtitle_shifted.srt
153- ```
154- ```
155- # Run batch alignment against directories
195+ </code ></pre >
196+ </details >
156197
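
To make the effect of `-os` concrete, here is a small worked example of what shifting a single SRT timestamp by an offset in seconds amounts to. It is an independent sketch: `shift_timestamp` is a hypothetical helper, and clamping negative results to zero is an assumption of this example rather than documented Subaligner behaviour.

```python
import re

# SRT timestamps look like HH:MM:SS,mmm
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(stamp: str, offset_seconds: float) -> str:
    """Shift one SRT timestamp by offset_seconds (may be negative)."""
    match = TIMESTAMP.fullmatch(stamp)
    if match is None:
        raise ValueError(f"not an SRT timestamp: {stamp!r}")
    hours, minutes, seconds, millis = (int(group) for group in match.groups())
    total_ms = ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis
    # Assumption of this sketch: never go below the start of the media.
    total_ms = max(0, total_ms + round(offset_seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    seconds, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


print(shift_timestamp("00:00:01,000", 5.5))   # 00:00:06,500
print(shift_timestamp("00:01:00,250", -5.5))  # 00:00:54,750
```

A positive offset delays the subtitle and a negative offset makes it appear earlier, matching the two commands above.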
157- $ subaligner_batch -m single -vd videos/ -sd subtitles/ -od aligned_subtitles/
198+ <details >
199+ <summary >Run batch alignment against directories</summary >
200+ <pre ><code >$ subaligner_batch -m single -vd videos/ -sd subtitles/ -od aligned_subtitles/
158201$ subaligner_batch -m dual -vd videos/ -sd subtitles/ -od aligned_subtitles/
159202$ subaligner_batch -m dual -vd videos/ -sd subtitles/ -od aligned_subtitles/ -of ttml
160- ```
161- ```
162- # Run alignments with pipx
203+ </code ></pre >
204+ </details >
163205
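
If you need explicit control over which subtitle goes with which video, one option is to build the per-file `subaligner` commands yourself. The sketch below is a hand-rolled alternative, not a description of how `subaligner_batch` pairs files. It assumes `.mp4` videos and `.srt` subtitles that share a file stem, and the `_aligned` output suffix is likewise an assumption. Only the `-m`, `-v`, `-s` and `-o` flags shown earlier are used.

```python
import subprocess
from pathlib import Path
from typing import List


def build_commands(video_dir, subtitle_dir, output_dir, mode: str = "dual") -> List[List[str]]:
    """Build one subaligner command per video that has a same-named subtitle."""
    commands = []
    for video in sorted(Path(video_dir).glob("*.mp4")):
        subtitle = Path(subtitle_dir) / f"{video.stem}.srt"
        if not subtitle.exists():
            continue  # skip videos without a matching subtitle
        output = Path(output_dir) / f"{video.stem}_aligned.srt"
        commands.append([
            "subaligner", "-m", mode,
            "-v", str(video), "-s", str(subtitle), "-o", str(output),
        ])
    return commands


def run_commands(commands: List[List[str]]) -> None:
    """Run each command in turn and stop at the first failure."""
    for command in commands:
        subprocess.run(command, check=True)
```

Calling `run_commands(build_commands("videos/", "subtitles/", "aligned_subtitles/"))` would then align each matched pair in turn, provided the output directory already exists.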
164- $ pipx run subaligner -m single -v video.mp4 -s subtitle.srt
206+ <details >
207+ <summary >Run alignments with pipx</summary >
208+ <pre ><code >$ pipx run subaligner -m single -v video.mp4 -s subtitle.srt
165209$ pipx run subaligner -m dual -v video.mp4 -s subtitle.srt
166- ```
167- ```
168- # Run the module as a script
169- $ python -m subaligner -m single -v video.mp4 -s subtitle.srt
210+ </code ></pre >
211+ </details >
212+
213+ <details >
214+ <summary >Run the module as a script</summary >
215+ <pre ><code >$ python -m subaligner -m single -v video.mp4 -s subtitle.srt
170216$ python -m subaligner -m dual -v video.mp4 -s subtitle.srt
171- ```
172- ```
173- # Run alignments with the docker image
217+ </code ></pre >
218+ </details >
174219
175- $ docker pull baxtree/subaligner
220+ <details >
221+ <summary >Run alignments with the docker image</summary >
222+ <pre ><code >$ docker pull baxtree/subaligner
176223$ docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner subaligner -m single -v video.mp4 -s subtitle.srt
177224$ docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner subaligner -m dual -v video.mp4 -s subtitle.srt
178225$ docker run -it baxtree/subaligner subaligner -m single -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
179226$ docker run -it baxtree/subaligner subaligner -m dual -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
180- ```
227+ </code ></pre >
228+ </details >
229+
230+ ![](figures/screencast.gif)
231+
181232The aligned subtitle will be saved at `subtitle_aligned.srt`. To obtain the subtitle in raw JSON format for downstream
182233processing, replace the output file extension with `.json`. For details on the CLIs, run `subaligner -h` or `subaligner_batch -h`;
183234for the additional utilities, run `subaligner_convert -h`, `subaligner_train -h` and `subaligner_tune -h`. `subaligner_1pass` and `subaligner_2pass` are shortcuts for running `subaligner` with the `-m single` and `-m dual` options, respectively.
184235
185- ![](figures/screencast.gif)
186-
187236## Advanced Usage
188- You can train a new model with your own audiovisual files and subtitle files:
189- ```
190- $ subaligner_train -vd VIDEO_DIRECTORY -sd SUBTITLE_DIRECTORY -tod TRAINING_OUTPUT_DIRECTORY
191- ```
237+ You can train a new model with your own audiovisual files and subtitle files:
238+ <details >
239+ <summary >Train a custom model</summary >
240+ <pre ><code >$ subaligner_train -vd VIDEO_DIRECTORY -sd SUBTITLE_DIRECTORY -tod TRAINING_OUTPUT_DIRECTORY
241+ </code ></pre >
242+ </details >
243+
192244Then you can apply it to your subtitle synchronisation with the aforementioned commands. For more details on how to train and tune your own model, please refer to [Subaligner Docs](https://subaligner.readthedocs.io/en/latest/advanced_usage.html).
193245
194- For larger media files taking longer to process, you can reconfigure various timeouts using the following options:
195- ```
196- -mpt [Maximum waiting time in seconds when processing media files]
246+ For larger media files taking longer to process, you can reconfigure various timeouts using the following:
247+ <details >
248+ <summary >Options for tuning timeouts</summary >
249+ <pre ><code >-mpt [Maximum waiting time in seconds when processing media files]
197250-sat [Maximum waiting time in seconds when aligning each segment]
198251-fet [Maximum waiting time in seconds when embedding features for training]
199- ```
252+ </code ></pre >
253+ </details >
200254
201255## Anatomy
202256Subtitles can be out of sync with their companion audiovisual media files for a variety of causes including latency introduced by Speech-To-Text on live streams or calibration and rectification involving human intervention during post-production.