vcon-dev
diff --git a/.env.example b/.env.example
Lines changed: 3 additions & 1 deletion
diff --git a/poetry.lock b/poetry.lock
Lines changed: 38 additions & 18 deletions
diff --git a/pyproject.toml b/pyproject.toml
Lines changed: 1 addition & 0 deletions
diff --git a/server/links/groq_whisper/README.md b/server/links/groq_whisper/README.md
Lines changed: 104 additions & 0 deletions
@@ -1,4 +1,3 @@
-
 REDIS_URL=redis://redis
 
 # Leave this blank to disable API security
@@ -9,3 +8,6 @@ CONSERVER_API_TOKEN=
 # modify the values in config.yml as needed
 # and set CONSERVER_CONFIG_FILE to ./config.yml below
 CONSERVER_CONFIG_FILE= 
+
+# Groq API key for Whisper transcription
+GROQ_API_KEY=your_groq_api_key_here
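Review note: the new `GROQ_API_KEY` entry is what the README's configuration table names as the fallback for the `API_KEY` option. A minimal sketch of that fallback is shown here, assuming options are passed as a plain dict; `resolve_options` and `DEFAULT_OPTIONS` are hypothetical names used for illustration, not identifiers from this PR.

```python
import os

# Default taken from the README's configuration table.
DEFAULT_OPTIONS = {"minimum_duration": 30}


def resolve_options(opts=None, environ=None):
    """Merge caller options over defaults (hypothetical helper).

    API_KEY falls back to the GROQ_API_KEY environment variable
    when the caller does not supply one.
    """
    environ = os.environ if environ is None else environ
    merged = {
        **DEFAULT_OPTIONS,
        "API_KEY": environ.get("GROQ_API_KEY"),
        **(opts or {}),  # explicit options win over defaults and environment
    }
    if not merged["API_KEY"]:
        # Fail early instead of sending an unauthenticated request.
        raise ValueError("Set GROQ_API_KEY or pass API_KEY in opts")
    return merged
```

Failing early on a missing key keeps a blank `.env` value from surfacing later as an opaque authentication error.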
@@ -25,6 +25,7 @@ slack-sdk = "^3.27.1"
 boto3 = "^1.34.52"
 deepgram-sdk = "^3.1.5"
 openai = ">=1.54.3"
+groq = "^0.4.0"
 psycopg2-binary = "^2.9.9"
 pymongo = "^4.6.2"
 elasticsearch = "^8.13.1"
 
@@ -0,0 +1,104 @@
+# Groq Whisper Link
+
+A vCon-server link that provides automatic transcription of audio content using Groq's implementation of Whisper ASR.
+
+## Overview
+
+This link processes vCon objects containing audio recordings and transcribes them using Groq's Whisper API. The transcription results are added back to the vCon as analysis entries.
+
+## Requirements
+
+- Python 3.12+
+- A valid Groq API key
+- The `groq` Python package
+
+## Installation
+
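+1. Install the required dependencies. The `groq` package is already declared in this repository's `pyproject.toml`, so `poetry install` is enough here; run the command below only in a project that does not list it yet: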
+
+```bash
+poetry add groq
+```
+
+2. Set your Groq API key in the environment:
+
+```bash
+export GROQ_API_KEY=your_groq_api_key_here
+```
+
+Alternatively, you can add the API key to your `.env` file:
+
+```
+GROQ_API_KEY=your_groq_api_key_here
+```
+
+## Configuration
+
+The link accepts the following configuration options:
+
+| Option | Description | Default |
+|--------|-------------|---------|
+| `API_KEY` | Groq API key used for authentication | Value of the `GROQ_API_KEY` environment variable |
+| `minimum_duration` | Minimum recording length, in seconds; shorter dialogs are skipped | 30 |
+
+## Usage
+
+To use this link in a vCon processing chain:
+
+```python
+from server.links.groq_whisper import run
+
+result = run(
+    vcon_uuid="your-vcon-uuid",
+    link_name="groq_whisper",
+    opts={
+        "minimum_duration": 60  # Optional override
+    }
+)
+```
+
+## How It Works
+
+1. The link retrieves the vCon object from Redis
+2. For each recording dialog in the vCon:
+   - Skips dialogs shorter than the minimum duration
+   - Skips dialogs that already have a transcript
+   - Extracts audio content (from inline base64 or external URL)
+   - Sends the audio to Groq's Whisper API for transcription
+   - Adds transcription results as a new analysis entry
+3. Stores the updated vCon back to Redis
+
+## Testing
+
+To run the tests:
+
+```bash
+# Set a dummy API key for testing
+export GROQ_API_KEY=test_api_key_for_testing
+
+# Run the tests
+pytest server/links/groq_whisper/test_groq_whisper.py -v
+```
+
+## Response Format
+
+The Groq Whisper API returns transcription results in the following format:
+
+```json
+{
+  "text": "The complete transcription text.",
+  "chunks": [
+    {
+      "text": "Chunk of transcription",
+      "timestamp": [0.0, 5.0]
+    },
+    {
+      "text": "Another chunk",
+      "timestamp": [5.1, 10.0]
+    }
+  ],
+  "language": "en"
+}
+```
+
+This response is stored in the vCon's analysis section as a transcript entry.
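Review note: the per-dialog rules listed under "How It Works" can be sketched without Redis or the Groq client. The following is a minimal, hypothetical illustration, not the link's actual code. It assumes the vCon is a plain dict, that `transcribe` is a stand-in callable for the Whisper request, and that the field names (`dialog`, `type`, `duration`, `analysis`, `vendor`, `body`) match the vCon layout used by the server.

```python
from typing import Callable


def has_transcript(vcon: dict, dialog_index: int) -> bool:
    """Return True if a transcript analysis entry already targets this dialog."""
    return any(
        entry.get("type") == "transcript" and entry.get("dialog") == dialog_index
        for entry in vcon.get("analysis", [])
    )


def transcribe_recordings(
    vcon: dict,
    transcribe: Callable[[dict], dict],
    minimum_duration: float = 30,
) -> dict:
    """Apply the skip rules, then attach one analysis entry per transcribed dialog."""
    for index, dialog in enumerate(vcon.get("dialog", [])):
        if dialog.get("type") != "recording":
            continue  # only recording dialogs carry audio
        if dialog.get("duration", 0) < minimum_duration:
            continue  # too short to be worth transcribing
        if has_transcript(vcon, index):
            continue  # already transcribed, so the call is not repeated
        result = transcribe(dialog)  # stand-in for the Groq Whisper request
        vcon.setdefault("analysis", []).append(
            {
                "type": "transcript",
                "dialog": index,
                "vendor": "groq_whisper",  # assumed label, not confirmed by the PR
                "body": result,
            }
        )
    return vcon
```

Because the transcript check runs before the API call, running the function twice over the same vCon adds no duplicate entries, which matters if the chain is retried.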