chore: Streamline transcription services and remove Anthropic API.
- Refactored application to use OpenAI Whisper API or AssemblyAI for audio file transcription
- Removed functionality for generating summaries using GPT-4 and Claude models
- Updated API key configuration process for OpenAI and AssemblyAI
- Docker commands no longer require Anthropic API key
- Updated documentation to reflect removal of Anthropic model details and configuration references
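The simplified flow this commit leaves behind is a single branch on the configured transcription service. A minimal sketch, not the project's actual code; the function name and return values are hypothetical:

```python
def pick_transcriber(config: dict) -> str:
    # `use_assemblyai` mirrors the documented .transcribe.yaml key;
    # with Anthropic removed, there is no separate summarization path.
    if config.get("use_assemblyai", False):
        return "assemblyai"
    return "openai-whisper"
```

Either branch now ends by saving the transcription; no summary-generation step follows.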
-Transcribe Me is a CLI-driven Python application that transcribes audio files using either the OpenAI Whisper API or AssemblyAI, and generates summaries of the transcriptions using OpenAI's GPT-4 and Anthropic's Claude models.
+Transcribe Me is a CLI-driven Python application that transcribes audio files using either the OpenAI Whisper API or AssemblyAI.
 
 ```mermaid
 graph TD
@@ -14,11 +14,9 @@ graph TD
     D --Yes--> E[Transcribe with AssemblyAI]
     D --No--> F[Transcribe with OpenAI]
     E --> G[Generate Additional Outputs]
-    F --> H[Generate Summaries]
-    G --> I[Save Transcription and Outputs]
-    H --> J[Save Transcription and Summaries]
+    F --> I[Save Transcription]
+    G --> I
     I --> K[Clean Up Temporary Files]
-    J --> K
     K --> B
     C --No--> L[Print Warning]
     L --> B
@@ -27,9 +25,7 @@ graph TD
 ## :key: Key Features
 
 - **Audio Transcription**: Transcribes audio files using either the OpenAI Whisper API or AssemblyAI. It supports both MP3 and M4A formats.
-- **Summary Generation**: Generates summaries of the transcriptions using both OpenAI's GPT-4 and Anthropic's Claude models when using OpenAI for transcription.
 - **AssemblyAI Features**: When using AssemblyAI, provides additional outputs including Speaker Diarization, Summary, Sentiment Analysis, Key Phrases, and Topic Detection.
-- **Configurable Models**: Supports multiple models for OpenAI and Anthropic, with configurable temperature, max_tokens, and system prompts.
 - **Supports Audio Files**: Supports audio files in `.m4a` and `.mp3` formats.
 - **Supports Docker**: Can be run in a Docker container for easy deployment and reproducibility.
 
@@ -70,11 +66,10 @@ This has been tested with macOS, your mileage may vary on other operating system
 transcribe-me install
 ```
 
-This command will prompt you to enter your API keys for OpenAI, Anthropic, and AssemblyAI if they are not already provided in environment variables. You can also set the API keys in environment variables:
+This command will prompt you to enter your API keys for OpenAI and AssemblyAI if they are not already provided in environment variables. You can also set the API keys in environment variables:
 
 ```bash
 export OPENAI_API_KEY=your_api_key
-export ANTHROPIC_API_KEY=your_api_key
 export ASSEMBLYAI_API_KEY=your_api_key
 ```
 
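The environment-first lookup described above can be sketched as follows. This is a minimal illustration, not the CLI's real implementation; the helper name and the `prompt` parameter (which stands in for interactive input) are assumptions:

```python
import os


def get_api_key(name: str, prompt=input) -> str:
    # Environment variable wins; otherwise ask the user,
    # mirroring the documented `transcribe-me install` behaviour (sketch).
    value = os.environ.get(name)
    if value:
        return value
    return prompt(f"Enter {name}: ")
```

Passing `prompt` as a parameter keeps the sketch testable without a real terminal.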
@@ -85,20 +80,14 @@ This has been tested with macOS, your mileage may vary on other operating system
 transcribe-me
 ```
 
-The application will transcribe each audio file in the input directory and save the transcriptions to the output directory. It will also generate summaries of the transcriptions using the configured models and save them to the output directory.
+The application will transcribe each audio file in the input directory and save the transcriptions to the output directory.
 
 4. (Optional) You can archive the input directory to keep track of the processed audio files:
 
 ```bash
 transcribe-me archive
 ```
 
-5. (Optional) You can also transcribe only the audio files that have not been transcribed yet:
-
-```bash
-transcribe-me only
-```
-
 ### Docker
 
 You can also run the application using Docker:
@@ -115,14 +104,12 @@ You can also run the application using Docker:
 3. Run the following command to run the application in Docker:
 
 ```bash
 docker run \
   --rm \
   -e OPENAI_API_KEY \
-  -e ANTHROPIC_API_KEY \
   -e ASSEMBLYAI_API_KEY \
   -v $(pwd)/archive:/app/archive \
   -v $(pwd)/input:/app/input \
@@ -142,7 +129,6 @@ You can also run the application using Docker:
     image: ghcr.io/echohello-dev/transcribe-me:latest
     environment:
       - OPENAI_API_KEY
-      - ANTHROPIC_API_KEY
       - ASSEMBLYAI_API_KEY
     volumes:
       - ./input:/app/input
@@ -159,13 +145,13 @@ You can also run the application using Docker:
 
 This command mounts the `input`, `output`, `archive`, and `.transcribe.yaml` configuration file into the Docker container. See [`compose.example.yaml`](./compose.example.yaml) for an example configuration.
 
-Make sure to replace `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, and `ASSEMBLYAI_API_KEY` with your actual API keys. Also make sure to create the `.transcribe.yaml` configuration file in the same directory as the `docker-compose.yml` file.
+Make sure to replace `OPENAI_API_KEY` and `ASSEMBLYAI_API_KEY` with your actual API keys. Also make sure to create the `.transcribe.yaml` configuration file in the same directory as the `docker-compose.yml` file.
 
 ## :rocket: How it Works
 
 The Transcribe Me application follows a straightforward workflow:
 
-1. **Load Configuration**: The application loads the configuration from the `.transcribe.yaml` file, which includes settings for input/output directories, models, and their configurations.
+1. **Load Configuration**: The application loads the configuration from the `.transcribe.yaml` file, which includes settings for input/output directories and transcription service.
 2. **Get Audio Files**: The application gets a list of audio files from the input directory specified in the configuration.
 3. **Check Existing Transcriptions**: For each audio file, the application checks if there is an existing transcription file. If a transcription file exists, it skips to the next audio file.
 4. **Transcribe Audio File**: If no transcription file exists, the application transcribes the audio file using either the OpenAI Whisper API or AssemblyAI, based on the configuration.
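The skip-existing check in step 3 of that workflow might look something like this sketch; the function name and the `.md` output filename pattern are assumptions for illustration, not the project's actual code:

```python
from pathlib import Path


def needs_transcription(audio_path: Path, output_dir: Path) -> bool:
    # Step 3: skip audio files that already have a transcription file
    # next to them in the output directory (filename pattern assumed).
    return not (output_dir / (audio_path.stem + ".md")).exists()
```

A run over the input directory would then transcribe only the files for which this returns `True`.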
@@ -185,19 +171,6 @@ Here is an example configuration file:
 ```yaml
 use_assemblyai: false # Set to true to use AssemblyAI instead of OpenAI for transcription
 
-openai:
-  models:
-    - temperature: 0.1
-      max_tokens: 2048
-      model: gpt-4
-      system_prompt: Generate a summary with key points in bold and a Next Steps section, use Markdown, be a concise tech expert but kind to non-technical readers.
-
-anthropic:
-  models:
-    - temperature: 0.8
-      model: claude-3-sonnet-20240229
-      system_prompt: Generate something creative and interesting, use Markdown, be a concise tech expert but kind to non-technical readers.
-
 input_folder: input
 output_folder: output
 ```
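Reading the simplified configuration above and filling defaults could be sketched like this. The key names come from the example file; the default values and the function itself are assumptions, and YAML parsing is omitted to keep the sketch self-contained:

```python
def load_config(raw: dict) -> dict:
    # Fill defaults for the keys shown in the example .transcribe.yaml
    # (default values are assumptions for illustration).
    return {
        "use_assemblyai": bool(raw.get("use_assemblyai", False)),
        "input_folder": raw.get("input_folder", "input"),
        "output_folder": raw.get("output_folder", "output"),
    }
```

In practice `raw` would be the dict produced by a YAML parser such as `yaml.safe_load`.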
@@ -236,7 +209,7 @@ output_folder: output
 make install
 ```
 
-3. Run the `transcribe-me install` command to create the `.transcribe.yaml` configuration file and provide your API keys for OpenAI, Anthropic, and AssemblyAI:
+3. Run the `transcribe-me install` command to create the `.transcribe.yaml` configuration file and provide your API keys for OpenAI and AssemblyAI: