content/learning-paths/servers-and-cloud-computing/funASR/1_asr.md (15 additions, 10 deletions)

@@ -1,5 +1,5 @@
---
-title: Introduction to Automatic Speech Recognition (ASR)
+title: Introduction to Automatic Speech Recognition
weight: 2

### FIXED, DO NOT MODIFY
@@ -19,19 +19,24 @@ At its core, ASR transforms spoken language into written text. Despite seeming s
ASR is used in myriad applications across various domains:

* Virtual Assistants and Chatbots - ASR enables natural language interactions in chatbots and virtual assistants, enhancing customer support, information retrieval, and even entertainment.
-Use case: a developer can use ASR to create a voice-controlled chatbot that helps users troubleshoot technical issues or navigate complex software applications.

-* Data Analytics and Business Intelligence - Leveraging ASR to analyze customer interactions, such as calls and surveys, to uncover trends and enhance services.
-Use case: an app that transcribes customer service calls and applies sentiment analysis to pinpoint areas for improving customer satisfaction.
+{{% notice Use case %}}A developer can use ASR to create a voice-controlled chatbot that helps users troubleshoot technical issues or navigate complex software applications.{{% /notice %}}

-* ASR-powered Accessibility Tools - Developing ASR-powered tools to enhance accessibility for users with disabilities. Applications include voice-controlled interfaces, real-time captioning, and text-to-speech conversion.
-Use case: a developer can create a tool that uses ASR to generate real-time captions for online meetings or presentations, making them accessible to deaf and hard-of-hearing individuals.
+* Data Analytics and Business Intelligence - leveraging ASR to analyze customer interactions, such as calls and surveys, to uncover trends and enhance services.

-* Integrating ASR into Software Development Tools - Enhancing efficiency and productivity by incorporating ASR into development environments. This can include voice commands for code editing, debugging, or version control.
-Use case: a developer can build an IDE plugin that enables voice-controlled coding, file navigation, and test execution, streamlining the workflow and reducing reliance on manual input.
+{{% notice Use case %}}An app that transcribes customer service calls and applies sentiment analysis to pinpoint areas for improving customer satisfaction.{{% /notice %}}

-* Smart Homes and IoT with ASR - Enhancing convenience with voice control of smart home devices and appliances.
-Use case: a voice-activated home automation system that lets users control lighting, temperature, and entertainment systems with natural language commands.
+* ASR-powered Accessibility Tools - developing ASR-powered tools to enhance accessibility for users with disabilities. Applications include voice-controlled interfaces, real-time captioning, and text-to-speech conversion.
+
+{{% notice Use case %}}A developer can create a tool that uses ASR to generate real-time captions for online meetings or presentations, making them accessible to deaf and hard-of-hearing individuals.{{% /notice %}}
+
+* Integrating ASR into Software Development Tools - enhancing efficiency and productivity by incorporating ASR into development environments. This can include voice commands for code editing, debugging, or version control.
+
+{{% notice Use case %}}A developer can build an IDE plugin that enables voice-controlled coding, file navigation, and test execution, streamlining the workflow and reducing reliance on manual input.{{% /notice %}}
+
+* Smart Homes and IoT with ASR - enhancing convenience with voice control of smart home devices and appliances.
+
+{{% notice Use case %}}A voice-activated home automation system that lets users control lighting, temperature, and entertainment systems with natural language commands.{{% /notice %}}
content/learning-paths/servers-and-cloud-computing/funASR/2_modelscope.md (5 additions, 5 deletions)

@@ -1,5 +1,5 @@
---
-title: ModelScope - Open-Source Pre-trained AI models hub
+title: ModelScope - an OpenSource Pre-trained AI Models Hub
weight: 3

### FIXED, DO NOT MODIFY

@@ -8,18 +8,18 @@ layout: learningpathall

## Before you begin

-To follow the instructions for this Learning Path, you will need an Arm-based server running Ubuntu 22.04 LTS or later, with at least 8 cores, 16GB of RAM, and 30GB of disk storage.
+Before you begin following the instructions for this Learning Path, make sure you have an Arm-based server running Ubuntu 22.04 LTS or later, with at least 8 cores, 16GB of RAM, and 30GB of disk storage.

## What is ModelScope?
[ModelScope](https://github.com/modelscope/modelscope/) is an open-source platform designed to simplify the integration of AI models into applications. It offers a wide variety of pre-trained models for tasks such as image recognition, natural language processing, and audio analysis. With ModelScope, you can seamlessly integrate these models into your projects using just a few lines of code.

Key benefits of ModelScope:

-* Model Diversity - Access a wide range of models for various tasks, including Automatic Speech Recognition (ASR), natural language processing (NLP), and computer vision.
+* Model Diversity - access a wide range of models for various tasks, including Automatic Speech Recognition (ASR), natural language processing (NLP), and computer vision.

* Ease of Use - ModelScope provides a user-friendly interface and APIs that enable seamless model integration.

-* Community Support - Benefit from a vibrant community of developers and researchers who actively contribute to and support ModelScope.
+* Community Support - benefit from a vibrant community of developers and researchers who actively contribute to and support ModelScope.

## Arm CPU Acceleration

@@ -94,7 +94,7 @@ result = word_segmentation(text)
print(result)
```

-This piece of code specifies a model and provides a Chinese sentence for the model to segment.
+This piece of code specifies a model and provides a Chinese sentence for the model to segment:
"A New Year’s greeting message to share with everyone."
content/learning-paths/servers-and-cloud-computing/funASR/3_funasr.md (12 additions, 14 deletions)

@@ -147,7 +147,7 @@ Result:

The output shows "欢迎大家来到达摩社区进行体验" which means "Welcome everyone to the Dharma community to explore and experience!" as expected.

-You can also observe that the spacing between the third and sixth characters is very short. This is because they are combined with other characters, as discussed in the previous section.
+You can also see that the spacing between the third and sixth characters is short. This is because they are combined with other characters, as discussed in the previous section.

You can now build a speech processing pipeline. The output of the speech recognition module serves as the input for the semantic segmentation model, enabling you to validate the accuracy of the recognized results. Copy the code shown below in a file named `funasr_test3.py`:
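The body of `funasr_test3.py` is unchanged by this patch, so it does not appear in the hunk above. Purely as an illustration of the chained pipeline it describes (speech recognition output fed into word segmentation), a sketch could look like the following; the model IDs, audio file name, and result handling are assumptions, not the Learning Path's actual code:

```python
# Illustrative sketch of chaining ASR output into word segmentation with ModelScope.
from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Both model IDs are assumed examples of ModelScope models for these tasks.
asr = pipeline(
    task=Tasks.auto_speech_recognition,
    model='damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-pytorch')
seg = pipeline(
    task=Tasks.word_segmentation,
    model='damo/nlp_structbert_word-segmentation_chinese-base')

asr_result = asr('example_zh.wav')  # transcribe a local audio clip (assumed file name)
# The return structure differs across ModelScope versions, so extract the text defensively.
text = asr_result[0]['text'] if isinstance(asr_result, list) else asr_result['text']
print(seg(text))  # segment the recognized sentence
```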
@@ -223,9 +223,7 @@ Good, the result is exactly what you are looking for.

## Paraformer: Fast and Accurate Parallel Transformer for Non-autoregressive End-to-End Speech Recognition

-Now you can look at a more advanced speech recognition model, [Paraformer](https://aclanthology.org/2020.wnut-1.18/).
-
-Paraformer is a novel architecture for automatic speech recognition (ASR) designed for both speed and accuracy. Unlike traditional models, it leverages a parallel transformer architecture, enabling simultaneous processing of multiple parts of the input speech. This parallel processing capability leads to significantly faster inference, making Paraformer well-suited for real-time ASR applications where responsiveness is crucial.
+[Paraformer](https://aclanthology.org/2020.wnut-1.18/) is a novel architecture for automatic speech recognition (ASR) designed for both speed and accuracy. Unlike traditional models, it leverages a parallel transformer architecture, enabling simultaneous processing of multiple parts of the input speech. This parallel processing capability leads to significantly faster inference, making Paraformer well-suited for real-time ASR applications where responsiveness is crucial.

Furthermore, Paraformer has demonstrated state-of-the-art accuracy on several benchmark datasets, showcasing its effectiveness in accurately transcribing speech. This combination of speed and accuracy makes Paraformer a promising advancement in the field of ASR, opening up new possibilities for high-performance speech recognition systems.
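The Paraformer scripts themselves are not reproduced in this diff. A rough sketch of how a Paraformer model can be run through FunASR's `AutoModel` interface is shown below; the model name and audio path are assumptions:

```python
# Illustrative sketch: running a Chinese Paraformer model via FunASR's AutoModel.
from funasr import AutoModel

model = AutoModel(model="paraformer-zh")      # assumed model alias; downloads on first use
res = model.generate(input="example_zh.wav")  # assumed local audio file
print(res[0]["text"])                         # recognized transcript
```

Because Paraformer decodes non-autoregressively, a single `generate` call emits the whole transcript in one parallel pass rather than token by token, which is where the speed advantage described above comes from.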
@@ -504,10 +502,10 @@ python --version
```

{{% notice Note %}}
-The update-alternatives command is a Debian-based Linux utility (used in Ubuntu, Debian, etc.) for managing symbolic links to different versions of software alternatives. It allows you to easily switch between multiple installed versions of the same program.
+The `update-alternatives` command is a utility in Debian-based Linux distributions, such as Ubuntu and Debian, that manages symbolic links for different software versions. It simplifies the process of switching between multiple installed versions of the same program.
-Please reference this [link](https://learn.arm.com/learning-paths/servers-and-cloud-computing/pytorch-llama/pytorch-llama/)to learn more the detail description of those instructions.
+See this Learning Path [Run a Large Language Model Chatbot on Arm servers](https://learn.arm.com/learning-paths/servers-and-cloud-computing/pytorch-llama/pytorch-llama/) for further information.
{{% /notice %}}

-Once you have installed the optimized PyTorch, we can enabled bfloat16 fast math kernels by setting DNNL_DEFAULT_FPMATH_MODE.
+Once you have installed the optimized PyTorch, enable bfloat16 fast math kernels by setting DNNL_DEFAULT_FPMATH_MODE.

-On AWS Graviton3 as example, this enables GEMM kernels that use bfloat16 MMLA instructions available in the hardware.
+Using AWS Graviton3 as an example, this enables GEMM kernels that use bfloat16 MMLA instructions available in the hardware.

### Update the FunASR application with benchmark function

-Now we can test FunASR model again.
+Now you can test the FunASR model again.

-By re-use the previouslyparaformer-2.pyand add benchmark function, please copy the updated code shown below in a file named `paraformer-3.py`:
+Reuse the previous `paraformer-2.py` file and add a benchmark function: copy the updated code shown below into a file named `paraformer-3.py`:

```python
import os
@@ -650,7 +648,7 @@ The model took 0.816 seconds to complete execution.

### Enable bfloat16 Fast Math Kernels

-Now we enable bfloat16 and run Python script again:
+Now enable bfloat16 and run the Python script again:

```bash
export DNNL_DEFAULT_FPMATH_MODE=BF16

@@ -665,7 +663,7 @@ rtf_avg: 0.010: 100%|
Execution Time: 738.04 ms (Mean), 747.57 ms (P99)
```
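The Mean and P99 figures in the output above suggest that the benchmark added to `paraformer-3.py` times repeated inference calls and aggregates the latencies. That code is not shown in this diff; a minimal sketch of how such numbers might be computed (the function and argument names are assumptions) is:

```python
# Illustrative sketch of a latency benchmark reporting Mean and P99 execution time.
import time
import numpy as np

def run_benchmark(model, audio_path, iterations=20):
    latencies_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        model.generate(input=audio_path)  # one end-to-end inference pass (assumed API)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    mean_ms = float(np.mean(latencies_ms))
    p99_ms = float(np.percentile(latencies_ms, 99))
    print(f"Execution Time: {mean_ms:.2f} ms (Mean), {p99_ms:.2f} ms (P99)")
    return mean_ms, p99_ms
```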
-You can notice that the execution time is now 0.7 seconds, reflecting an improvement compared to earlier results.
+Here you can see that the execution time is now 0.7 seconds, reflecting an improvement compared to earlier results.

## Conclusion
-Arm CPUs and an optimized software ecosystem enable developers to build innovative, efficient ASR solutions. Explore the capabilities of ModelScope and FunASR, and unlock the potential of Arm technology for your next Chinese ASR project.
+Arm CPUs paired with an optimized software ecosystem enable developers to build innovative, efficient ASR solutions. Discover the potential of ModelScope and FunASR, and harness Arm technology for your next Chinese ASR project.
content/learning-paths/servers-and-cloud-computing/funASR/_index.md (5 additions, 5 deletions)

@@ -1,20 +1,20 @@
---
-title: Deploy ModelScope FunASR Chinese Speech Recognition Model on Arm Servers
+title: Deploy ModelScope FunASR Model on Arm Servers
draft: true
cascade:
  draft: true

minutes_to_complete: 60

-who_is_this_for: This is an introductory topic for software developers and AI engineers interested in learning how to run Chinese Automatic Speech Recognition (ASR) applications on Arm servers.
+who_is_this_for: This is an introductory topic for developers interested in learning how to deploy the ModelScope FunASR Chinese Automatic Speech Recognition (ASR) model on Arm-based servers.

learning_objectives:
- Leverage open-source large language models and tools to build Chinese ASR applications.
-- Deploy real-time Chinese speech recognition, punctuation restoration, and sentiment analysis with FunASR.
-- Describe how to accelerate ModelScope models on Arm-based servers for performance and efficiency.
+- Deploy real-time Chinese speech recognition, punctuation restoration, and sentiment analysis using FunASR.
+- Describe how to accelerate ModelScope models on Arm-based servers for enhanced performance and efficiency.

prerequisites:
-- An [Arm-based instance](/learning-paths/servers-and-cloud-computing/csp/) from a cloud service provider, or a local Arm Linux computer with at least 8 CPUs and 16GB RAM.
+- An [Arm-based instance](/learning-paths/servers-and-cloud-computing/csp/) from a cloud service provider, or a local Arm Linux computer with at least 8 CPUs and 16GB of RAM.
0 commit comments