Commit ea681ca

wip
1 parent dbd2247 commit ea681ca

4 files changed: +113 -7 lines changed

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+# References
+
+- [openai/whisper](https://github.com/openai/whisper)

Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
+import whisper
+
+model = whisper.load_model("turbo")
+
+# load audio and pad/trim it to fit 30 seconds
+audio = whisper.load_audio("apps/16_whisper_transcription/sample_audio.wav")
+audio = whisper.pad_or_trim(audio)
+
+# make log-Mel spectrogram and move to the same device as the model
+mel = whisper.log_mel_spectrogram(audio).to(model.device)
+
+# detect the spoken language
+_, probs = model.detect_language(mel)
+print(f"Detected language: {max(probs, key=probs.get)}")
+
+# decode the audio
+options = whisper.DecodingOptions()
+result = whisper.decode(model, mel, options)
+
+# print the recognized text
+print(result.text)
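
The added script follows the low-level usage example from the openai/whisper README. As a hedged sketch (not part of this commit), the same steps can be wrapped in a function with two adjustments that are assumptions on my part: the `turbo` checkpoint may expect a different number of mel bins than the `log_mel_spectrogram` default, so the sketch passes `n_mels=model.dims.n_mels`, and half precision is requested only when the model sits on a CUDA device. The name `transcribe_first_30s` and its return shape are hypothetical.

```python
def transcribe_first_30s(audio_path: str, model_name: str = "turbo") -> tuple[str, str]:
    """Return (detected language code, recognized text) for the first 30 s of audio.

    Hypothetical helper: a sketch of the committed script, not the author's code.
    """
    # Imported lazily so that defining this function does not require the package.
    import whisper

    model = whisper.load_model(model_name)

    # Load the audio and pad/trim it to the 30-second window the model expects.
    audio = whisper.pad_or_trim(whisper.load_audio(audio_path))

    # Assumption: match the spectrogram's mel-bin count to the loaded checkpoint
    # instead of relying on the function's default.
    mel = whisper.log_mel_spectrogram(audio, n_mels=model.dims.n_mels).to(model.device)

    # detect_language returns (language tokens, {language code: probability}).
    _, probs = model.detect_language(mel)
    language = max(probs, key=probs.get)

    # Assumption: request fp16 only on CUDA; fall back to fp32 elsewhere.
    options = whisper.DecodingOptions(
        language=language,
        fp16=(model.device.type == "cuda"),
    )
    result = whisper.decode(model, mel, options)
    return language, result.text
```

Calling `transcribe_first_30s("apps/16_whisper_transcription/sample_audio.wav")` would download the model weights on first use. As in the committed script, only the first 30 seconds of the file are decoded.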

poetry.lock

Lines changed: 88 additions & 7 deletions (generated file; diff not rendered)

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ lxml = "^5.3.0"
 nest-asyncio = "^1.6.0"
 typer = "^0.12.5"
 azure-cognitiveservices-speech = "^1.40.0"
+openai-whisper = "^20240930"
 
 [tool.poetry.group.dev.dependencies]
 pre-commit = "^4.0.0"
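
The new dependency uses a caret constraint on a date-stamped, single-component version. As a rough illustration, the sketch below computes the range such a constraint would allow, assuming Poetry's documented caret rule (updates may not change the left-most non-zero component). The helper `caret_bounds` is hypothetical, handles plain numeric versions only, and is not Poetry's own resolver.

```python
def caret_bounds(version: str) -> tuple[str, str]:
    """Return (inclusive lower bound, exclusive upper bound) for ^version.

    Hypothetical helper; supports dot-separated integers only.
    """
    parts = [int(p) for p in version.split(".")]
    upper = parts
    for i, p in enumerate(parts):
        # Bump the left-most non-zero component (or the last one if all are
        # zero) and reset everything to its right.
        if p != 0 or i == len(parts) - 1:
            upper = parts[:i] + [p + 1] + [0] * (len(parts) - i - 1)
            break
    return ".".join(map(str, parts)), ".".join(map(str, upper))


if __name__ == "__main__":
    for spec in ("20240930", "0.12.5", "1.40.0"):
        low, high = caret_bounds(spec)
        print(f"^{spec}: >={low},<{high}")
```

Under that reading, `^20240930` allows only `>=20240930,<20240931`, so it behaves like an exact pin and a later date-stamped release would not match. If that is not the intent, a `>=` constraint may be worth considering.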
