How to have speaker wise transcript #129

amitkayal · 2023-04-24T11:01:58Z

Hello All,

I am calling the Deepgram API with dg.transcription.sync_prerecorded(source, options), using the options shown below. The whole transcription comes back in the JSON element results.channels.alternatives.transcript, while the speaker id is attached to each entry in results.channels.alternatives.words. Is there any way to get a speaker-wise transcription directly? Otherwise, I need to write a custom script that walks through each word and splits the transcript up by speaker.

I was looking for something like the following format:

speaker 0:"xxxxxxxxx"
speaker 1:"yyyyyyyyyyy"
speaker 0:"mmmmmmmmm"
.....

options = {
   "punctuate": True,
   "model": 'general',
   "tier": 'enhanced',
   "diarize": True,
   "endpointing": 'true'
}

Thanks

Answered by scottstephenson

May 1, 2023

rilhia · 2023-04-24T16:49:16Z

Hi @amitkayal, I am not on the Deepgram team but have been playing around with it for a while. I recommend using the Smart Format feature, which is explained here:

https://developers.deepgram.com/documentation/features/smart-format/

It should give you what you require. I am using the following set of features (shown in the URL I am using below)...

"https://api.deepgram.com/v1/listen?language=en&model=nova&diarize=true&smart_format=true"

You can see highlights of the output I get in this article I recently put together on LinkedIn:

https://www.linkedin.com/pulse/audio-transcription-made-super-easy-richard-hall

I hope it helps!

3 replies

SandraRodgers Apr 27, 2023
Collaborator

This is so great @rilhia , thanks for sharing! I'm glad you're enjoying Deepgram!

rilhia Apr 27, 2023

Thanks @SandraRodgers. I'm gutted that I was a bit late to the game here. I applied for the Developer Relations role after playing around with this, but guess I missed the boat by a few weeks. Oh well, still happy I stumbled across this.

My thinking on @amitkayal's question was mainly about the features he selected. He didn't appear to have Paragraphs selected, and I know that smart_format has that built in. It has also been the best all-round feature for covering most of the bases I have needed to cover.
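
As a rough illustration of the Paragraphs idea, here is a minimal sketch that reads speaker-labelled paragraphs out of a response. It assumes the request was made with both diarize and smart_format (or paragraphs) enabled, and that each alternative then carries a paragraphs object whose inner paragraphs list holds a speaker id and a list of sentences. That layout, the helper name speaker_paragraphs, and the sample data are assumptions for illustration, so check them against a real response before relying on them.

```python
def speaker_paragraphs(response):
    """Return (speaker, text) pairs, one per paragraph of the first channel."""
    alternative = response["results"]["channels"][0]["alternatives"][0]
    turns = []
    # Assumed layout: alternative["paragraphs"]["paragraphs"] is a list in which
    # each paragraph has a "speaker" id and a list of "sentences" with "text".
    for paragraph in alternative["paragraphs"]["paragraphs"]:
        text = " ".join(sentence["text"] for sentence in paragraph["sentences"])
        turns.append((paragraph.get("speaker"), text))
    return turns


# Hand-written sample in the assumed shape, not real API output.
sample_response = {
    "results": {"channels": [{"alternatives": [{"paragraphs": {"paragraphs": [
        {"speaker": 0, "sentences": [{"text": "Hello there."},
                                     {"text": "My name is John."}]},
        {"speaker": 1, "sentences": [{"text": "Hi John."}]},
    ]}}]}]}
}

for speaker, text in speaker_paragraphs(sample_response):
    print(f'speaker {speaker}: "{text}"')
```

Working from paragraphs rather than individual words means the punctuation and sentence boundaries are already in place, so no word-by-word stitching is needed.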

I raised a potential bug a while back about the timings specified for words. I was trying out my silly example of extracting words and building them up into sentences or songs, but found that the timings specified didn't always hit the mark. It does seem to be marginally better with Nova though.

SandraRodgers May 1, 2023
Collaborator

Thanks @rilhia , I think you might be right about the timings issue. That's being worked on at the moment.

Definitely check back for open Dev Rel positions. We don't have one open now but things are always in flux!

SandraRodgers · 2023-04-27T19:17:28Z

Collaborator

Hi @amitkayal ,

You will need to write a script to do this. I would suggest building it off the utterances feature, since that gives you a full utterance instead of just each individual word of the speaker.

Here's some information from a previous post that might help you https://github.com/orgs/deepgram/discussions/106#discussioncomment-5445821
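
A script built on utterances could look roughly like the sketch below. It assumes the request was made with utterances=true and diarize=true, and that the response then contains a results.utterances list whose items carry speaker and transcript fields. The helper name utterances_to_turns and the sample data are made up for illustration.

```python
def utterances_to_turns(response):
    """Merge consecutive utterances from the same speaker into (speaker, text) turns."""
    turns = []
    for utterance in response["results"]["utterances"]:
        speaker = utterance.get("speaker")
        if turns and turns[-1][0] == speaker:
            # Same speaker as the previous utterance: extend the current turn
            turns[-1][1] += " " + utterance["transcript"]
        else:
            # Speaker changed (or first utterance): start a new turn
            turns.append([speaker, utterance["transcript"]])
    return [tuple(turn) for turn in turns]


# Hand-written sample in the assumed shape, not real API output.
sample_response = {
    "results": {
        "utterances": [
            {"speaker": 0, "transcript": "Hello there,"},
            {"speaker": 0, "transcript": "my name is John."},
            {"speaker": 1, "transcript": "Hi John, my name is Sarah."},
            {"speaker": 0, "transcript": "Nice to meet you, Sarah."},
        ]
    }
}

for speaker, text in utterances_to_turns(sample_response):
    print(f'speaker {speaker}: "{text}"')
```

Merging adjacent utterances is optional. Dropping the first branch of the if statement keeps one output line per utterance instead.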

Hope this is helpful!

Sandra

0 replies

scottstephenson · 2023-05-01T11:04:54Z

Maintainer

Does this Python script help you out?

import requests

def get_speaker_wise_transcript(response):
    """Join all words spoken by each speaker into one string per speaker."""
    speaker_transcripts = {}

    for channel in response['results']['channels']:
        for word in channel['alternatives'][0]['words']:
            speaker_id = word['speaker']
            # Use the punctuated form when the response includes it
            text = word.get('punctuated_word', word['word'])

            if speaker_id not in speaker_transcripts:
                speaker_transcripts[speaker_id] = text
            else:
                speaker_transcripts[speaker_id] += ' ' + text

    return speaker_transcripts

# Replace this with your Deepgram API key
API_KEY = "your-deepgram-api-key"

# Replace this with the path to your audio file
AUDIO_FILE_PATH = "path-to-your-audio-file.wav"

# Define the headers for the request
headers = {
    "Authorization": f"Token {API_KEY}",
    # Adjust the content type to match your audio format
    "Content-Type": "audio/wav",
}

# Define the options for the request. Values are lowercase strings so the
# query string reads diarize=true rather than diarize=True.
options = {
    "punctuate": "true",
    "model": "general",
    "tier": "enhanced",
    "diarize": "true",
    "endpointing": "true"
}

# Open the audio file
with open(AUDIO_FILE_PATH, 'rb') as audio_file:
    # Make the request to the Deepgram API
    response = requests.post(
        "https://api.deepgram.com/v1/listen",
        headers=headers,
        params=options,
        data=audio_file
    )

# Stop here if the API returned an error status
response.raise_for_status()

# Parse the JSON response
response_json = response.json()

# Get the speaker-wise transcripts
speaker_transcripts = get_speaker_wise_transcript(response_json)

# Print out the transcripts
for speaker, transcript in speaker_transcripts.items():
    print(f'speaker {speaker}: "{transcript}"')

Sample output for this one would look like this:

speaker 0: "Hello there, my name is John. Nice to meet you too, Sarah."
speaker 1: "Hi John, my name is Sarah. Nice to meet you."

If you want to capture the sequential nature of speakers switching back and forth, you can do this instead:

import requests

def get_speaker_wise_transcript(response):
    """Return (speaker_id, text) pairs in order, one per uninterrupted turn."""
    speaker_transcripts = []
    last_speaker_id = None

    for channel in response['results']['channels']:
        for word in channel['alternatives'][0]['words']:
            speaker_id = word['speaker']
            # Use the punctuated form when the response includes it
            text = word.get('punctuated_word', word['word'])

            if speaker_id != last_speaker_id:
                # Speaker changed: start a new turn
                speaker_transcripts.append((speaker_id, text))
            else:
                # Same speaker: extend the current turn
                speaker_transcripts[-1] = (speaker_id, speaker_transcripts[-1][1] + ' ' + text)

            last_speaker_id = speaker_id

    return speaker_transcripts

# Replace this with your Deepgram API key
API_KEY = "your-deepgram-api-key"

# Replace this with the path to your audio file
AUDIO_FILE_PATH = "path-to-your-audio-file.wav"

# Define the headers for the request
headers = {
    "Authorization": f"Token {API_KEY}",
    # Adjust the content type to match your audio format
    "Content-Type": "audio/wav",
}

# Define the options for the request. Values are lowercase strings so the
# query string reads diarize=true rather than diarize=True.
options = {
    "punctuate": "true",
    "model": "general",
    "tier": "enhanced",
    "diarize": "true",
    "endpointing": "true"
}

# Open the audio file
with open(AUDIO_FILE_PATH, 'rb') as audio_file:
    # Make the request to the Deepgram API
    response = requests.post(
        "https://api.deepgram.com/v1/listen",
        headers=headers,
        params=options,
        data=audio_file
    )

# Stop here if the API returned an error status
response.raise_for_status()

# Parse the JSON response
response_json = response.json()

# Get the speaker-wise transcripts
speaker_transcripts = get_speaker_wise_transcript(response_json)

# Print out the transcripts
for speaker, transcript in speaker_transcripts:
    print(f'speaker {speaker}: "{transcript}"')

Sample output would look something like this:

speaker 0: "Hello there, my name is John."
speaker 1: "Hi John, my name is Sarah. Nice to meet you."
speaker 0: "Nice to meet you too, Sarah."
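
To try the turn-by-turn grouping without calling the API, the same idea can be sketched with itertools.groupby on a hand-built word list. The helper name turns_from_words and the sample words are made up for illustration, and the sketch assumes each word may carry a punctuated_word field when punctuation is enabled, falling back to word otherwise.

```python
from itertools import groupby


def turns_from_words(words):
    """Group consecutive words by speaker into (speaker, text) turns."""
    turns = []
    # groupby only merges adjacent items, so a returning speaker starts a new turn
    for speaker, group in groupby(words, key=lambda w: w["speaker"]):
        text = " ".join(w.get("punctuated_word", w["word"]) for w in group)
        turns.append((speaker, text))
    return turns


# Hand-written sample words, not real API output.
sample_words = [
    {"word": "hello", "punctuated_word": "Hello", "speaker": 0},
    {"word": "there", "punctuated_word": "there.", "speaker": 0},
    {"word": "hi", "punctuated_word": "Hi", "speaker": 1},
    {"word": "john", "punctuated_word": "John.", "speaker": 1},
    {"word": "nice", "punctuated_word": "Nice", "speaker": 0},
    {"word": "to", "speaker": 0},
    {"word": "meet", "speaker": 0},
    {"word": "you", "punctuated_word": "you.", "speaker": 0},
]

for speaker, text in turns_from_words(sample_words):
    print(f'speaker {speaker}: "{text}"')
```

With a real response, the list to pass in would be channel['alternatives'][0]['words'] for the channel of interest.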

I hope that helps!

0 replies
