
Commit d685846

feat: πŸ’¬ Podcast audio transcript in Python (#108)
* feat: 💬 Podcast audio transcript in Python
* doc: 📝 add README
* clean: 🗑️ remove empty lines
1 parent a43f713 commit d685846

5 files changed: +53 −0 lines changed


β€Žai/ai-endpoints/README.mdβ€Ž

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
```diff
@@ -25,6 +25,7 @@ Don't hesitate to use the source code and give us feedback.
 
 ### 🐍 Python 🐍
 
+- [Podcast audio transcript](./podcast-transcript-whisper/python/)
 - [Chatbot with LangChain](./python-langchain-chatbot/): blocking mode, streaming mode, RAG mode.
 - [Streaming chatbot](./python-langchain-chatbot/) with LangChain
 - [Audio Summarizer Assistant](./audio-summarizer-assistant/) by connecting Speech-To-Text and LLM
```
Lines changed: 37 additions & 0 deletions

```python
import os
import json
from openai import OpenAI

# 🛠️ OpenAI client initialisation
client = OpenAI(base_url=os.environ.get('OVH_AI_ENDPOINTS_WHISPER_URL'),
                api_key=os.environ.get('OVH_AI_ENDPOINTS_ACCESS_TOKEN'))

# 🎼 Audio file loading
with open("../resources/TdT20-trimed-2.mp3", "rb") as audio_file:
    # 📝 Call the Whisper transcription API
    transcript = client.audio.transcriptions.create(
        model=os.environ.get('OVH_AI_ENDPOINTS_WHISPER_MODEL'),
        file=audio_file,
        temperature=0.0,
        response_format="verbose_json",
        extra_body={"diarize": True},
    )

# 🔀 Merge consecutive dialog segments spoken by the same speaker
diarizedTranscript = ''
speakers = ["Aurélie", "Guillaume", "Stéphane"]
previousSpeaker = -1
jsonTranscript = json.loads(transcript.model_dump_json())

# 💬 Only the diarization field is useful
for dialog in jsonTranscript["diarization"]:
    speaker = dialog.get("speaker")
    text = dialog.get("text")
    if previousSpeaker == speaker:
        diarizedTranscript += f" {text}"
    else:
        diarizedTranscript += f"\n\n{speakers[speaker]}: {text}"
    previousSpeaker = speaker

print(f"\n📝 Diarized Transcript 📝:\n{diarizedTranscript}")
```
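The merge loop above only depends on the `diarization` field being a list of entries exposing an integer `speaker` index and a `text` string. Below is a minimal, self-contained sketch of that same merging logic run on a hand-built payload; the field names come from the script, while the payload values and the printed result are purely illustrative:

```python
# Hypothetical payload mimicking only the fields the script reads from the
# verbose_json response; a real response contains more (timestamps, language, ...).
mock_response = {
    "diarization": [
        {"speaker": 0, "text": "Welcome to the show."},
        {"speaker": 0, "text": "Today we talk about AI Endpoints."},
        {"speaker": 1, "text": "Thanks for having me!"},
    ]
}

speakers = ["Aurélie", "Guillaume", "Stéphane"]
diarizedTranscript = ''
previousSpeaker = -1

for dialog in mock_response["diarization"]:
    speaker = dialog.get("speaker")
    text = dialog.get("text")
    if previousSpeaker == speaker:
        # Same speaker as the previous segment: extend the current turn
        diarizedTranscript += f" {text}"
    else:
        # New speaker: start a new paragraph with the speaker's name
        diarizedTranscript += f"\n\n{speakers[speaker]}: {text}"
    previousSpeaker = speaker

print(diarizedTranscript)
# Aurélie: Welcome to the show. Today we talk about AI Endpoints.
#
# Guillaume: Thanks for having me!
```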
Lines changed: 14 additions & 0 deletions

# 🛠️ Setup environment 🛠️
- create the following environment variables:
```bash
OVH_AI_ENDPOINTS_WHISPER_URL=<whisper model URL>
OVH_AI_ENDPOINTS_ACCESS_TOKEN=<your_access_token>
OVH_AI_ENDPOINTS_WHISPER_MODEL=whisper-large-v3
```
- install required dependencies: `pip install -r requirements.txt`

# 🚀 Run the application 🚀

```bash
$ python PodcastTranscriptWithWhisper.py
```
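The setup section of this README lists the variables without showing how to set them. One possible way is to export them for the current shell session, keeping the README's own placeholders (a sketch only; replace the values with your AI Endpoints settings):

```bash
export OVH_AI_ENDPOINTS_WHISPER_URL="<whisper model URL>"
export OVH_AI_ENDPOINTS_ACCESS_TOKEN="<your_access_token>"
export OVH_AI_ENDPOINTS_WHISPER_MODEL="whisper-large-v3"
```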
Lines changed: 1 addition & 0 deletions

```
openai==1.97.0
```
Binary file (1.72 MB) not shown.

0 commit comments
