The purpose of this project is to transcribe audio files using the Whisper API.
The project consists of the following components:
- api.py: the main file. It implements the FastAPI server that handles requests to transcribe audio files using the Whisper API.
- whisper.html: the HTML page that contains the form for uploading an audio file and displays the transcription results.
- script.js: the JavaScript file that handles the UI logic of the whisper.html page, including the file upload and form submission.
- WhisperAPI: a class for interacting with the Whisper API.
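To illustrate what a class like WhisperAPI has to do, here is a minimal sketch of a request to OpenAI's audio transcription endpoint. It is not the project's implementation: the helper name `build_transcription_request` is hypothetical, and it uses only the standard library rather than aiohttp so that it stays self-contained. The function builds the request without sending it.

```python
import io
import urllib.request
import uuid

# OpenAI's public endpoint for audio transcription.
TRANSCRIPTION_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_transcription_request(api_key, filename, audio_bytes, model="whisper-1"):
    """Build (but do not send) a multipart/form-data POST request.

    Hypothetical helper for illustration; not the project's WhisperAPI class.
    """
    boundary = uuid.uuid4().hex
    body = io.BytesIO()

    # Plain form field: the model name.
    body.write(f"--{boundary}\r\n".encode())
    body.write(b'Content-Disposition: form-data; name="model"\r\n\r\n')
    body.write(model.encode() + b"\r\n")

    # File field: the raw audio bytes.
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'.encode()
    )
    body.write(b"Content-Type: application/octet-stream\r\n\r\n")
    body.write(audio_bytes + b"\r\n")

    # Closing boundary marks the end of the form.
    body.write(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        TRANSCRIPTION_URL,
        data=body.getvalue(),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; with the default response format, the API replies with JSON whose `text` field holds the transcription.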
The following dependencies are required to run this project:
- Python 3.8 or higher
- FastAPI
- Jinja2
- Uvicorn
- aiohttp
- pydub
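A quick way to confirm that the environment meets these requirements is a small check script. The sketch below is not part of the repository; the function names are hypothetical, and the module list assumes each package is imported under its usual lowercase name.

```python
import importlib.util
import sys

# Import names assumed to match the packages listed above.
REQUIRED_MODULES = ["fastapi", "jinja2", "uvicorn", "aiohttp", "pydub"]


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(minimum=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    if not python_is_supported():
        print("Python 3.8 or higher is required.")
    for name in missing_modules(REQUIRED_MODULES):
        print(f"Missing dependency: {name}")
```

If the script prints nothing, the interpreter version is sufficient and all five packages are importable.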
To run the project, follow these steps:
- Obtain an OpenAI API key at https://beta.openai.com/signup/ and save it to the .env file as API_KEY=YOUR_KEY
- Clone the repository:
    git clone https://github.com/kurkuruzo/whisper.git
    cd whisper-speech-to-text
- Create a virtual environment:
    python -m venv venv
    source venv/bin/activate
- Install the dependencies:
    pip install -r requirements.txt
- Start the server:
    python api.py
- Open your web browser and go to http://localhost:8000/ to access the web app.
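The first setup step stores the key in a .env file as API_KEY=YOUR_KEY. How api.py actually reads that file is not shown here, so the following is only a sketch of one possible approach using the standard library; `parse_env_text`, `load_env_file`, and `get_api_key` are hypothetical names.

```python
import os


def parse_env_text(text):
    """Parse KEY=VALUE lines into a dict.

    Blank lines, lines starting with '#', and lines without '=' are skipped.
    Surrounding quotes around a value are removed.
    """
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env_file(path=".env"):
    """Read a .env file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_env_text(handle.read())


def get_api_key(path=".env"):
    """Return API_KEY from the environment, falling back to the .env file."""
    key = os.environ.get("API_KEY")
    if key:
        return key
    if os.path.exists(path):
        return load_env_file(path).get("API_KEY")
    return None
```

Checking the process environment first means a key exported in the shell takes precedence over the file, which is convenient when deploying without a .env file.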
To transcribe an audio file:
- Open the web app in your web browser.
- Click the "Choose File" button and select an audio file to upload.
- Click the "Upload" button to submit the form.
- Wait for the server to transcribe the audio file. The transcription will appear on the page once it is complete.
This project was created by Kurkuruzo and uses the OpenAI Whisper API for speech-to-text transcription.