How to output JSON file? #152

IkeDoku · 2022-09-27T14:15:55Z

IkeDoku
Sep 27, 2022

I am following https://jaimeleal.github.io/how-to-speech-synthesis to create a voice synthesis dataset. The tutorial uses Amazon Transcribe, which outputs JSON files that are then used to create a metadata.csv and filelists. My question: How can I output a JSON file directly from Whisper?


jongwook · 2022-09-27T18:31:43Z

jongwook
Sep 27, 2022
Maintainer

If you are using the command line, you can edit transcribe.py to add something like:

import json

...

        # (after line 303)
        # save JSON
        with open(os.path.join(output_dir, audio_basename + ".json"), "w", encoding="utf-8") as f:
            json.dump(result, f)
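
If you are calling Whisper from Python rather than the command line, the same idea works without editing transcribe.py. Below is a minimal sketch, not the project's own implementation: `save_result_json` is a hypothetical helper name, and `ensure_ascii=False` and `indent=2` are optional choices that keep non-ASCII text readable in the file. It assumes `result` is the dictionary returned by `model.transcribe(...)`.

```python
import json
import os


def save_result_json(result, audio_path, output_dir="."):
    """Write a Whisper result dict to <output_dir>/<audio file name>.json."""
    os.makedirs(output_dir, exist_ok=True)
    # Same naming as the snippet above: the audio file name plus ".json".
    audio_basename = os.path.basename(audio_path)
    json_path = os.path.join(output_dir, audio_basename + ".json")
    with open(json_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False writes accented characters as-is (optional).
        json.dump(result, f, ensure_ascii=False, indent=2)
    return json_path


# Usage with the Whisper Python API (requires the openai-whisper package):
#   import whisper
#   model = whisper.load_model("base")
#   result = model.transcribe("audio.mp3")
#   save_result_json(result, "audio.mp3", "transcripts")
```

The usage lines are left as comments so the helper can be tried on any dictionary without loading a model.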

3 replies

besimali Sep 27, 2022

Alternatively, you can try our Whisper ASR webservice, which returns the JSON file. You can spin it up as a Docker image and use the REST API for inference.

IkeDoku Sep 28, 2022
Author

Great :-) I will try it out soon.

IkeDoku Sep 28, 2022
Author

I used @jongwook's suggestion. Works fine! :-)

IkeDoku · 2022-09-28T13:17:16Z

IkeDoku
Sep 28, 2022
Author

How should one proceed with the .txt, .srt and .vtt files that have already been output? Ideally, you would not have to transcribe all the audio files again (a couple of thousand) to generate JSON files.

There are suggestions at https://stackoverflow.com/questions/11265575/converting-text-to-json/11265677 using Gelatin, Java or Go.

3 replies

jongwook Sep 28, 2022
Maintainer

There is some information in the result object that is not saved in txt/srt/vtt, such as the average log probability and the temperature applied for decoding. So if you need those, you'll need to re-run the script :'(
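
If only the text and timestamps are needed, the existing .srt files can be converted to JSON without re-transcribing. The following is a rough sketch under that assumption: `srt_to_result` and `to_seconds` are hypothetical names, the output only imitates the `text` and `segments` layout, and fields such as the average log probability or temperature cannot be recovered this way.

```python
import json
import re

# Matches an SRT timing line such as "00:00:01,500 --> 00:00:04,000".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to seconds as a float."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def srt_to_result(srt_text):
    """Parse SRT text into a dict with 'text' and 'segments' keys."""
    segments = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                fields = match.groups()
                # Everything after the timing line is the cue text.
                text = " ".join(part.strip() for part in lines[i + 1:])
                segments.append({
                    "start": to_seconds(*fields[:4]),
                    "end": to_seconds(*fields[4:]),
                    "text": text,
                })
                break
    return {"text": " ".join(s["text"] for s in segments), "segments": segments}


# Usage (the file names are placeholders):
#   with open("0001.srt", encoding="utf-8") as f:
#       result = srt_to_result(f.read())
#   with open("0001.json", "w", encoding="utf-8") as f:
#       json.dump(result, f, ensure_ascii=False)
```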

IkeDoku Sep 29, 2022
Author

Ok, I guess I'll take the safe road. Thanks again :-)

IkeDoku Oct 1, 2022
Author

As I mentioned before, I am following @JaimeLeal's blog tutorial (https://jaimeleal.github.io/how-to-speech-synthesis) and am now stuck at extracting the text from the transcriptions (Step 4: metadata.csv and filelists). The tutorial uses Amazon Transcribe. Unfortunately, I wasn't able to reach the author, which is why I am trying here. The code for the extraction looks like this:
```python
import os
import json
import pandas as pd
from sklearn.model_selection import train_test_split
import numpy as np

path = '/dataset/transcripts/'
files = os.listdir(path)

# Extract the transcript text from the output of Amazon Transcribe
rows = []
for index in range(1, len(files) + 1):
    name = '{:04d}'.format(index)
    with open(os.path.join(path, "{}.json".format(name))) as json_file:
        data = json.load(json_file)
        transcript = data["results"]["transcripts"][0]["transcript"]
        rows.append([name, transcript])

data = pd.DataFrame(rows, columns = ["name", "transcript"])
```

Right after the line data = json.load(json_file), it fails with:
in <module>
    transcript = data["results"]["transcripts"][0]["transcript"]
KeyError: 'results'

As far as I understand, it tries to fetch the data from a URL, which I don't have, of course. How can I make it work in my case?
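
The `KeyError` most likely comes from the JSON layout rather than from a missing URL: `data["results"]["transcripts"][0]["transcript"]` is the structure of Amazon Transcribe output, whereas the result dictionary that Whisper returns keeps the full transcript under the `text` key (next to `segments` and `language`). A minimal sketch of the same extraction for Whisper JSON files follows. `load_whisper_transcripts` and `AUDIO_EXTENSIONS` are hypothetical names, and the handling of file names like `0001.wav.json` is an assumption about how the files were saved.

```python
import json
import os

# Assumed audio extensions that may be left in names like "0001.wav.json".
AUDIO_EXTENSIONS = (".wav", ".mp3", ".flac", ".m4a")


def load_whisper_transcripts(path):
    """Return [name, transcript] rows from Whisper JSON files in a directory."""
    rows = []
    for filename in sorted(os.listdir(path)):
        if not filename.endswith(".json"):
            continue  # skip .txt, .srt, .vtt and anything else
        name = filename[: -len(".json")]
        base, ext = os.path.splitext(name)
        if ext.lower() in AUDIO_EXTENSIONS:
            name = base  # "0001.wav" -> "0001"
        with open(os.path.join(path, filename), encoding="utf-8") as json_file:
            data = json.load(json_file)
        # Whisper stores the whole transcript under "text".
        rows.append([name, data["text"].strip()])
    return rows
```

The returned rows have the same shape as in the tutorial code, so they can be passed to `pd.DataFrame(rows, columns=["name", "transcript"])` for the remaining steps.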
