Commit aeb864c

chore: Streamline transcription services and remove Anthropic API.

- Refactored application to use OpenAI Whisper API or AssemblyAI for audio file transcription
- Removed functionality for generating summaries using GPT-4 and Claude models
- Updated API key configuration process for OpenAI and AssemblyAI
- Docker commands no longer require Anthropic API key
- Updated documentation to reflect removal of Anthropic model details and configuration references

1 parent 96e8c22 commit aeb864c

1 file changed: +8 -35 lines changed

README.md

Lines changed: 8 additions & 35 deletions
@@ -4,7 +4,7 @@
 
 [![Build](https://github.com/echohello-dev/transcribe-me/actions/workflows/build.yaml/badge.svg)](https://github.com/echohello-dev/transcribe-me/actions/workflows/build.yaml)
 
-Transcribe Me is a CLI-driven Python application that transcribes audio files using either the OpenAI Whisper API or AssemblyAI, and generates summaries of the transcriptions using OpenAI's GPT-4 and Anthropic's Claude models.
+Transcribe Me is a CLI-driven Python application that transcribes audio files using either the OpenAI Whisper API or AssemblyAI.
 
 ```mermaid
 graph TD
@@ -14,11 +14,9 @@ graph TD
 D --Yes--> E[Transcribe with AssemblyAI]
 D --No--> F[Transcribe with OpenAI]
 E --> G[Generate Additional Outputs]
-F --> H[Generate Summaries]
-G --> I[Save Transcription and Outputs]
-H --> J[Save Transcription and Summaries]
+F --> I[Save Transcription]
+G --> I
 I --> K[Clean Up Temporary Files]
-J --> K
 K --> B
 C --No--> L[Print Warning]
 L --> B
@@ -27,9 +25,7 @@ graph TD
 ## :key: Key Features
 
 - **Audio Transcription**: Transcribes audio files using either the OpenAI Whisper API or AssemblyAI. It supports both MP3 and M4A formats.
-- **Summary Generation**: Generates summaries of the transcriptions using both OpenAI's GPT-4 and Anthropic's Claude models when using OpenAI for transcription.
 - **AssemblyAI Features**: When using AssemblyAI, provides additional outputs including Speaker Diarization, Summary, Sentiment Analysis, Key Phrases, and Topic Detection.
-- **Configurable Models**: Supports multiple models for OpenAI and Anthropic, with configurable temperature, max_tokens, and system prompts.
 - **Supports Audio Files**: Supports audio files in `.m4a` and `.mp3` formats.
 - **Supports Docker**: Can be run in a Docker container for easy deployment and reproducibility.
 
@@ -70,11 +66,10 @@ This has been tested with macOS, your mileage may vary on other operating system
 transcribe-me install
 ```
 
-This command will prompt you to enter your API keys for OpenAI, Anthropic, and AssemblyAI if they are not already provided in environment variables. You can also set the API keys in environment variables:
+This command will prompt you to enter your API keys for OpenAI and AssemblyAI if they are not already provided in environment variables. You can also set the API keys in environment variables:
 
 ```bash
 export OPENAI_API_KEY=your_api_key
-export ANTHROPIC_API_KEY=your_api_key
 export ASSEMBLYAI_API_KEY=your_api_key
 ```
 
@@ -85,20 +80,14 @@ This has been tested with macOS, your mileage may vary on other operating system
 transcribe-me
 ```
 
-The application will transcribe each audio file in the input directory and save the transcriptions to the output directory. It will also generate summaries of the transcriptions using the configured models and save them to the output directory.
+The application will transcribe each audio file in the input directory and save the transcriptions to the output directory.
 
 4. (Optional) You can archive the input directory to keep track of the processed audio files:
 
 ```bash
 transcribe-me archive
 ```
 
-5. (Optional) You can also transcribe only the audio files that have not been transcribed yet:
-
-```bash
-transcribe-me only
-```
-
 ### Docker
 
 You can also run the application using Docker:
@@ -115,14 +104,12 @@ You can also run the application using Docker:
 ghcr.io/echohello-dev/transcribe-me:latest install
 ```
 
-
 3. Run the following command to run the application in Docker:
 
 ```bash
 docker run \
 --rm \
 -e OPENAI_API_KEY \
--e ANTHROPIC_API_KEY \
 -e ASSEMBLYAI_API_KEY \
 -v $(pwd)/archive:/app/archive \
 -v $(pwd)/input:/app/input \
@@ -142,7 +129,6 @@ You can also run the application using Docker:
 image: ghcr.io/echohello-dev/transcribe-me:latest
 environment:
 - OPENAI_API_KEY
-- ANTHROPIC_API_KEY
 - ASSEMBLYAI_API_KEY
 volumes:
 - ./input:/app/input
@@ -159,13 +145,13 @@ You can also run the application using Docker:
 
 This command mounts the `input`, `output`, `archive`, and `.transcribe.yaml` configuration file into the Docker container. See [`compose.example.yaml`](./compose.example.yaml) for an example configuration.
 
-Make sure to replace `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, and `ASSEMBLYAI_API_KEY` with your actual API keys. Also make sure to create the `.transcribe.yaml` configuration file in the same directory as the `docker-compose.yml` file.
+Make sure to replace `OPENAI_API_KEY` and `ASSEMBLYAI_API_KEY` with your actual API keys. Also make sure to create the `.transcribe.yaml` configuration file in the same directory as the `docker-compose.yml` file.
 
 ## :rocket: How it Works
 
 The Transcribe Me application follows a straightforward workflow:
 
-1. **Load Configuration**: The application loads the configuration from the `.transcribe.yaml` file, which includes settings for input/output directories, models, and their configurations.
+1. **Load Configuration**: The application loads the configuration from the `.transcribe.yaml` file, which includes settings for input/output directories and transcription service.
 2. **Get Audio Files**: The application gets a list of audio files from the input directory specified in the configuration.
 3. **Check Existing Transcriptions**: For each audio file, the application checks if there is an existing transcription file. If a transcription file exists, it skips to the next audio file.
 4. **Transcribe Audio File**: If no transcription file exists, the application transcribes the audio file using either the OpenAI Whisper API or AssemblyAI, based on the configuration.
@@ -185,19 +171,6 @@ Here is an example configuration file:
 ```yaml
 use_assemblyai: false # Set to true to use AssemblyAI instead of OpenAI for transcription
 
-openai:
-  models:
-    - temperature: 0.1
-      max_tokens: 2048
-      model: gpt-4
-      system_prompt: Generate a summary with key points in bold and a Next Steps section, use Markdown, be a concise tech expert but kind to non-technical readers.
-
-anthropic:
-  models:
-    - temperature: 0.8
-      model: claude-3-sonnet-20240229
-      system_prompt: Generate something creative and interesting, use Markdown, be a concise tech expert but kind to non-technical readers.
-
 input_folder: input
 output_folder: output
 ```
@@ -236,7 +209,7 @@ output_folder: output
 make install
 ```
 
-3. Run the `transcribe-me install` command to create the `.transcribe.yaml` configuration file and provide your API keys for OpenAI and AssemblyAI... 
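
Note on the resulting workflow: after this change the README describes a transcription-only loop (load configuration, list audio files, skip files that already have a transcription, transcribe with the configured service). The sketch below illustrates that loop under stated assumptions and is not the project's actual code. The names `pending_audio_files`, `run`, and the `backends` mapping are hypothetical, the `<stem>.txt` output naming is assumed, and the config is taken as an already-parsed dictionary with the keys shown in the example `.transcribe.yaml` (`use_assemblyai`, `input_folder`, `output_folder`).

```python
from pathlib import Path
from typing import Callable, Dict, List

# Formats the README lists as supported.
AUDIO_SUFFIXES = {".mp3", ".m4a"}


def pending_audio_files(input_dir: Path, output_dir: Path) -> List[Path]:
    """Return audio files in input_dir that have no transcription in output_dir."""
    pending = []
    for audio in sorted(input_dir.iterdir()):
        if audio.suffix.lower() not in AUDIO_SUFFIXES:
            continue
        # Assumption: a "<stem>.txt" file marks the audio file as transcribed.
        if (output_dir / f"{audio.stem}.txt").exists():
            continue
        pending.append(audio)
    return pending


def run(config: dict, backends: Dict[str, Callable[[Path], str]]) -> List[Path]:
    """Transcribe every pending file with the service chosen in the config.

    `backends` maps a service name ("openai" or "assemblyai") to a callable
    that takes an audio path and returns the transcription text. The real
    API calls are deliberately left out of this sketch.
    """
    input_dir = Path(config.get("input_folder", "input"))
    output_dir = Path(config.get("output_folder", "output"))
    output_dir.mkdir(parents=True, exist_ok=True)

    # The single switch that remains in the example config after this commit.
    service = "assemblyai" if config.get("use_assemblyai", False) else "openai"
    transcribe = backends[service]

    written = []
    for audio in pending_audio_files(input_dir, output_dir):
        target = output_dir / f"{audio.stem}.txt"
        target.write_text(transcribe(audio), encoding="utf-8")
        written.append(target)
    return written
```

Passing the backends in as callables keeps the sketch independent of either vendor SDK; a second call to `run` with the same directories writes nothing, which mirrors the "skip existing transcriptions" step.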
