72 changes: 36 additions & 36 deletions whisper_youtube.ipynb
@@ -2,6 +2,9 @@
 "cells": [
 {
 "cell_type": "markdown",
+"metadata": {
+"id": "96kvih9mXkNN"
+},
 "source": [
 "# **Youtube Videos Transcription with OpenAI's Whisper**\n",
 "\n",
@@ -14,13 +17,16 @@
 "Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multi-task model that can perform multilingual speech recognition as well as speech translation and language identification.\n",
 "\n",
 "This Notebook will guide you through the transcription of a Youtube video using Whisper. You'll be able to explore most inference parameters or use the Notebook as-is to store the transcript and video audio in your Google Drive."
-],
-"metadata": {
-"id": "96kvih9mXkNN"
-}
+]
 },
 {
 "cell_type": "code",
+"execution_count": null,
+"metadata": {
+"cellView": "form",
+"id": "QshUbLqpX7L4"
+},
+"outputs": [],
 "source": [
 "#@markdown # **Check GPU type** 🕵️\n",
 "\n",
@@ -41,20 +47,14 @@
 "!nvidia-smi -L\n",
 "\n",
 "!nvidia-smi"
-],
-"metadata": {
-"cellView": "form",
-"id": "QshUbLqpX7L4"
-},
-"execution_count": null,
-"outputs": []
+]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"id": "IfG0E_WbRFI0",
-"cellView": "form"
+"cellView": "form",
+"id": "IfG0E_WbRFI0"
 },
 "outputs": [],
 "source": [
@@ -85,8 +85,8 @@
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"id": "1zwGAsr4sIgd",
-"cellView": "form"
+"cellView": "form",
+"id": "1zwGAsr4sIgd"
 },
 "outputs": [],
 "source": [
@@ -109,6 +109,12 @@
 },
 {
 "cell_type": "code",
+"execution_count": null,
+"metadata": {
+"cellView": "form",
+"id": "TMhrSq_GZ6kA"
+},
+"outputs": [],
 "source": [
 "#@markdown # **Model selection** 🧠\n",
 "\n",
@@ -137,16 +143,16 @@
 " display(Markdown(\n",
 " f\"**{Model} model is no longer available.**<br /> Please select one of the following:<br /> - {'<br /> - '.join(whisper.available_models())}\"\n",
 " ))"
-],
-"metadata": {
-"cellView": "form",
-"id": "TMhrSq_GZ6kA"
-},
-"execution_count": null,
-"outputs": []
+]
 },
 {
 "cell_type": "code",
+"execution_count": null,
+"metadata": {
+"cellView": "form",
+"id": "xYLPZQX9S7tU"
+},
+"outputs": [],
 "source": [
 "#@markdown # **Video selection** 📺\n",
 "\n",
@@ -182,7 +188,7 @@
 " list_video_info = [ydl.extract_info(URL, download=False)]\n",
 "\n",
 " for video_info in list_video_info:\n",
-" video_path_local_list.append(Path(f\"{video_info['id']}.wav\"))\n",
+" video_path_local_list.append(Path(f\"{video_info['id']}.wav\"))\n",
 "\n",
 "elif Type == \"Google Drive\":\n",
 " # video_path_drive = drive_mount_path / Path(video_path.lstrip(\"/\"))\n",
@@ -213,21 +219,15 @@
 " if video_path_local.suffix == \".mp4\":\n",
 " video_path_local = video_path_local.with_suffix(\".wav\")\n",
 " result = subprocess.run([\"ffmpeg\", \"-i\", str(video_path_local.with_suffix(\".mp4\")), \"-vn\", \"-acodec\", \"pcm_s16le\", \"-ar\", \"16000\", \"-ac\", \"1\", str(video_path_local)])\n"
-],
-"metadata": {
-"id": "xYLPZQX9S7tU",
-"cellView": "form"
-},
-"execution_count": null,
-"outputs": []
+]
 },
 {
 "cell_type": "code",
 "execution_count": null,
 "metadata": {
-"id": "-X0qB9JAzMLY",
 "cellView": "form",
-"collapsed": true
+"collapsed": true,
+"id": "-X0qB9JAzMLY"
 },
 "outputs": [],
 "source": [
@@ -374,12 +374,12 @@
 },
 {
 "cell_type": "code",
-"source": [],
+"execution_count": null,
 "metadata": {
 "id": "Ad6n1m4deAHp"
 },
-"execution_count": null,
-"outputs": []
+"outputs": [],
+"source": []
 }
 ],
 "metadata": {
@@ -397,4 +397,4 @@
 },
 "nbformat": 4,
 "nbformat_minor": 0
-}
+}
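
Note on the change as a whole: every hunk above moves keys without altering values. Cell keys end up in the order `cell_type`, `execution_count`, `metadata`, `outputs`, `source`, and metadata keys end up as `cellView`, `collapsed`, `id`, which is alphabetical in both cases. That is consistent with the notebook having been re-serialized with sorted keys, though the diff does not say which tool produced it. A minimal sketch of such a normalization follows; `normalize_notebook` is a hypothetical helper that is not part of this PR, and the one-space indent and trailing newline are assumptions.

```python
import json


def normalize_notebook(text: str, indent: int = 1) -> str:
    """Re-serialize notebook JSON with every object's keys sorted alphabetically.

    Hypothetical helper for illustration; the indent width and the trailing
    newline are assumptions, not something the diff confirms.
    """
    notebook = json.loads(text)
    # sort_keys=True sorts keys at every nesting level; ensure_ascii=False
    # keeps characters such as emoji in markdown cells unescaped.
    return json.dumps(notebook, sort_keys=True, indent=indent, ensure_ascii=False) + "\n"


# A cell using the pre-change key order of the last (empty) code cell.
sample = (
    '{"cells": [{"cell_type": "code", "source": [], '
    '"metadata": {"id": "Ad6n1m4deAHp"}, '
    '"execution_count": null, "outputs": []}], '
    '"nbformat": 4, "nbformat_minor": 0}'
)

normalized = normalize_notebook(sample)
cell_keys = list(json.loads(normalized)["cells"][0].keys())
print(cell_keys)
```

Running the helper a second time on its own output returns the same text, so applying it before each commit would keep later diffs limited to real content changes.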