Craig Whisper Transcription Script

PowerShell script that uses OpenAI's Whisper model locally to transcribe, clean, and combine the multi-track audio output of the Craig recording bot for Discord.

Overview

transcribe.ps1 is a PowerShell script that automates the transcription, cleaning, and combination of multi-track audio files, specifically those generated by the Craig recording bot for Discord. It uses OpenAI's Whisper model to perform speech-to-text transcription locally on a Windows machine.
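
As a rough illustration of the per-file transcription step, the Python sketch below builds one call to the `whisper` command-line tool. It is not taken from transcribe.ps1: the function name and the example file and folder names are hypothetical, and the choice of TSV output is inferred from the output formats this README lists. The `--model`, `--output_format`, and `--output_dir` flags belong to the openai-whisper CLI.

```python
def build_whisper_command(audio_file, output_dir, model="large-v2"):
    """Return the argument list for one `whisper` CLI call that writes a TSV file.

    Hypothetical helper for illustration; transcribe.ps1 may invoke Whisper differently.
    """
    return [
        "whisper",
        str(audio_file),
        "--model", model,           # model name as listed in this README
        "--output_format", "tsv",   # one TSV per input file
        "--output_dir", str(output_dir),
    ]


if __name__ == "__main__":
    # Example file and folder names are made up.
    command = build_whisper_command("1-alice_0.flac", "transcriptions")
    print(" ".join(command))
    # To actually run it: subprocess.run(command, check=True)
```

Building the command as a list rather than a single string avoids quoting problems when file names contain spaces.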

This script is ideal for users who need to process multi-track audio recordings, such as podcasts, interviews, or collaborative discussions, and generate consolidated, easy-to-read transcripts in multiple formats.

Features

- Automated Transcription: Uses OpenAI's Whisper model (large-v2) locally for accurate transcription of audio files.
- Multi-Track Support: Processes multiple audio files with a naming convention (n-playername_m) to identify speakers.
- Error Handling: Logs errors and tracks transcription progress to allow resumption of interrupted processes.
- Output Formats:
  - Individual TSV files for each transcription.
  - Combined TSV file with all transcripts sorted by timestamp.
  - Plain text transcript (final_transcript.txt).
  - Markdown-formatted transcript (transcript.md) with timestamps and speaker names (ideal for building a custom GPT).
- Post-Processing:
  - Consolidates consecutive identical lines from the same speaker.
  - Filters out noisy words (e.g., "you") if they appear alone.
- Statistics: Collects and displays processing statistics, including total files processed, success/failure counts, and average processing time.
- Cleanup Option: Removes temporary files after processing if specified.
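
The speaker identification, timestamp sorting, and post-processing rules above can be sketched in Python roughly as follows. This is an illustration under assumptions, not the logic of transcribe.ps1: all function names are hypothetical, n and m in the naming convention are assumed to be numbers (the README does not spell this out), the noise list contains only the one word given as an example, and filtering before consolidating is an arbitrary ordering choice.

```python
import re
import string

NOISE_WORDS = {"you"}  # assumption: the real script's list may differ


def speaker_from_filename(stem):
    """Extract 'playername' from a file stem shaped like n-playername_m."""
    match = re.match(r"^\d+-(.+?)(?:_\d+)?$", stem)
    return match.group(1) if match else stem


def merge_tracks(tracks):
    """Flatten {speaker: [(start_ms, text), ...]} into rows sorted by start time."""
    rows = [
        (start_ms, speaker, text)
        for speaker, segments in tracks.items()
        for start_ms, text in segments
    ]
    return sorted(rows, key=lambda row: row[0])


def is_noise(text):
    """True when the line is nothing but a single noise word."""
    bare = text.strip().strip(string.punctuation).strip().lower()
    return bare in NOISE_WORDS


def clean(rows):
    """Drop lone noise words, then collapse consecutive repeats per speaker."""
    cleaned = []
    for start_ms, speaker, text in rows:
        if is_noise(text):
            continue
        if cleaned and cleaned[-1][1] == speaker and cleaned[-1][2] == text:
            continue  # same speaker repeated the identical line
        cleaned.append((start_ms, speaker, text))
    return cleaned


if __name__ == "__main__":
    # Made-up file stems and segments for demonstration.
    tracks = {
        speaker_from_filename("1-alice_0"): [
            (0, "Hello."), (4000, "you"), (5000, "Ready?"), (6000, "Ready?"),
        ],
        speaker_from_filename("2-bob_1"): [(2000, "Hi Alice.")],
    }
    for start_ms, speaker, text in clean(merge_tracks(tracks)):
        print(f"{start_ms / 1000:.1f}s {speaker}: {text}")
```

A line such as "Thank you" survives the filter because the noise check only matches when the word stands alone.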

Requirements

To use the script, ensure the following dependencies are installed and available in your system's PATH:

PowerShell 7.5.0
Python (with pip installed)
OpenAI Whisper (pip install -U openai-whisper)
FFmpeg
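
Before running the script, it can help to confirm that each dependency is reachable from PATH. The Python sketch below is one way to do that. The executable names are assumptions: `pwsh` is the usual command for PowerShell 7, and `whisper` is the command-line entry point installed by the openai-whisper package. The function name is hypothetical, and the check does not verify versions.

```python
import shutil

# Assumed executable names; adjust if your installation uses different ones.
REQUIRED_TOOLS = ("pwsh", "python", "whisper", "ffmpeg")


def find_missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    print("Missing from PATH:", ", ".join(missing) if missing else "none")
```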

The audio files must be in a format supported by FFmpeg (e.g., MP3, WAV, M4A, FLAC, OGG, AAC, MP4, WMA).
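
One possible way to pick out candidate audio files from an input folder is to filter by extension, as in the Python sketch below. The extension set simply mirrors the examples listed above, the function name is hypothetical, and looking only at the top level of the folder is an assumption; how transcribe.ps1 actually discovers files is not documented here.

```python
from pathlib import Path

# Extensions taken from the examples in this README; FFmpeg supports many more.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg", ".aac", ".mp4", ".wma"}


def find_audio_files(folder):
    """Return audio files directly inside `folder`, sorted by path."""
    return sorted(
        path for path in Path(folder).iterdir()
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )
```

The comparison lowercases the suffix so that names such as `track.FLAC` are matched as well.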

Usage

Run the script from a PowerShell terminal with the following parameters:

Parameters

-InputFolder (Required): Path to the folder containing audio files to transcribe.
-OutputFolder (Optional): Path to the folder where transcription outputs will be saved. Defaults to <InputFolder>\..\transcriptions.
-Force (Optional): Forces re-transcription of already processed files.
-PostProcessOnly (Optional): Skips transcription and only performs post-processing on existing transcripts.
-Cleanup (Optional): Removes temporary files after processing.
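
To make the -OutputFolder default concrete, the Python sketch below resolves `<InputFolder>\..\transcriptions` lexically, which places the output in a `transcriptions` folder next to the input folder. The function name is hypothetical, and the sketch only manipulates the path text; it does not reproduce how PowerShell itself resolves the path.

```python
from pathlib import PureWindowsPath


def default_output_folder(input_folder):
    """Return the 'transcriptions' folder that sits next to the input folder."""
    return PureWindowsPath(input_folder).parent / "transcriptions"


if __name__ == "__main__":
    print(default_output_folder(r"C:\path\to\audio\files"))
    # prints C:\path\to\audio\transcriptions
```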

Examples

Transcribe all audio files in a folder:

.\transcribe.ps1 -InputFolder "C:\path\to\audio\files"
