Commit bcbd35d: Update whisper.md

1 parent eb1c98c

1 file changed: content/learning-paths/servers-and-cloud-computing/whisper/whisper.md (5 additions, 10 deletions)
````diff
@@ -10,7 +10,7 @@ layout: "learningpathall"
 
 ## Before you begin
 
-This Learning Path demonstrates how to run the whisper-large-v3-turbo model as an application that takes the audio input and computes the text transcript of it. The instructions in this Learning Path have been designed for Arm servers running Ubuntu 24.04 LTS. You need an Arm server instance with 32 cores, atleast 8GB of RAM and 32GB disk to run this example. The instructions have been tested on a AWS c8g.8xlarge instance.
+This Learning Path demonstrates how to run the [whisper-large-v3-turbo model](https://huggingface.co/openai/whisper-large-v3-turbo) as an application that takes an audio input and computes the text transcript of it. The instructions in this Learning Path are designed for Arm servers running Ubuntu 24.04 LTS. You need an Arm server instance with 32 cores, at least 8GB of RAM, and a 32GB disk to run this example. The instructions have been tested on an AWS Graviton4 `c8g.8xlarge` instance.
 
 ## Overview
 
````
````diff
@@ -60,13 +60,8 @@ wget https://www.voiptroubleshooter.com/open_speech/american/OSR_us_000_0010_8k.
 
 You will use the Hugging Face `transformers` framework to help process the audio. It contains classes that configure the model and prepare it for inference. `pipeline` is an end-to-end function for NLP tasks; in the code below, it is configured to run pre- and post-processing of the sample, as well as the actual inference.
 
-Create a python file:
+Using a file editor of your choice, create a Python file named `whisper-application.py` with the content shown below:
 
-```bash
-vim whisper-application.py
-```
-
-Write the following code in the `whisper-application.py` file:
 ```python { file_name="whisper-application.py" }
 import torch
 from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
````
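The diff shows only the first two lines of `whisper-application.py`; the body of the script sits between the hunks and is unchanged by this commit. As a rough sketch of how such a script is typically assembled with the `transformers` pipeline API — the model id, audio file name, and timing code here are assumptions for illustration, not the file's actual contents:

```python
import time

import torch
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline

# Assumed model id; the Learning Path links to openai/whisper-large-v3-turbo.
model_id = "openai/whisper-large-v3-turbo"

# Load the model weights and the pre/post-processing components.
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id, torch_dtype=torch.float32)
processor = AutoProcessor.from_pretrained(model_id)

# `pipeline` wires the feature extractor, model, and tokenizer into a
# single end-to-end automatic-speech-recognition function.
pipe = pipeline(
    "automatic-speech-recognition",
    model=model,
    tokenizer=processor.tokenizer,
    feature_extractor=processor.feature_extractor,
)

# Time the transcription; the file name assumes the sample clip
# downloaded with wget earlier in the Learning Path.
start = time.time()
result = pipe("OSR_us_000_0010_8k.wav")
seconds = time.time() - start

print(result["text"])
print(f"\nInferencing elapsed time: {seconds:4.2f} seconds\n")
```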
````diff
@@ -123,16 +118,16 @@ msg = f'\nInferencing elapsed time: {seconds:4.2f} seconds\n'
 print(msg)
 ```
 
-Enable verbose mode for the output and run the script.
+Enable verbose mode for the output and run the script:
 
 ```bash
 export DNNL_VERBOSE=1
 python3 whisper-application.py
 ```
 
-You should see output similar to the image below with the log since we enabled verbose, transcript of the audio and the `Inference elapsed time`.
+You should see output similar to the image below, with the verbose log output, the transcript of the audio, and the `Inferencing elapsed time`.
 
 ![frontend](whisper_output_no_flags.png)
 
 
-You've now run a benchmark on the Whisper model. Continue to the next section to configure flags to learn about performance-enhancing features.
+You've now run the Whisper model successfully on your Arm-based CPU. Continue to the next section to configure flags that can increase the performance of your running model.
````
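Because `DNNL_VERBOSE=1` makes oneDNN print a record for every primitive it executes, the log can swamp the transcript. One way to inspect just those records (each starts with the documented `onednn_verbose` prefix) is to filter the combined output, for example:

```bash
# Run the script with oneDNN verbose logging enabled and show only the
# first 20 verbose records; run without the filter to see the transcript.
DNNL_VERBOSE=1 python3 whisper-application.py 2>&1 | grep onednn_verbose | head -n 20
```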
