Add MedASR medical speech recognition notebook by padatta · Pull Request #3416 · openvinotoolkit/openvino_notebooks

padatta · 2026-04-15T09:38:33Z

Summary

This PR adds a comprehensive notebook demonstrating Google's MedASR (Medical Automatic Speech Recognition) model optimization with OpenVINO and NNCF quantization.

What's New

Demonstrates medical speech recognition with INT8 quantization
Complete workflow from PyTorch to optimized OpenVINO models

Features

Model Conversion & Optimization

Convert MedASR PyTorch model to OpenVINO FP16 using torch.export
Apply INT8 quantization with NNCF ModelType.TRANSFORMER preset
Use real audio calibration data for accurate quantization
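
The conversion and quantization steps above could look roughly like the sketch below. This is not the notebook's actual code: the function names, the `example_input` tensor, the output directory, and the file names are assumptions for illustration. A real HuggingFace model may also need a thin wrapper so that `forward` takes positional tensors. The library calls themselves (`torch.export.export`, `ov.convert_model`, `ov.save_model`, `nncf.Dataset`, `nncf.quantize` with `nncf.ModelType.TRANSFORMER`) are public APIs. Imports are deferred into the functions so the definitions load even where those packages are not installed.

```python
from pathlib import Path


def convert_to_openvino_fp16(pt_model, example_input, out_dir="medasr_ov"):
    """Export a PyTorch module and save it as an FP16-compressed OpenVINO IR.

    `example_input` is an assumed tensor with the static shape the model expects.
    """
    import torch
    import openvino as ov

    pt_model.eval()
    with torch.no_grad():
        # Capture the model graph for the fixed example input.
        exported = torch.export.export(pt_model, args=(example_input,))
    ov_model = ov.convert_model(exported)

    out_path = Path(out_dir) / "model_fp16.xml"  # hypothetical file name
    out_path.parent.mkdir(parents=True, exist_ok=True)
    ov.save_model(ov_model, str(out_path), compress_to_fp16=True)
    return ov_model, out_path


def quantize_int8(ov_model, calibration_inputs, out_dir="medasr_ov"):
    """Post-training INT8 quantization using real inputs as calibration data.

    `calibration_inputs` is an assumed non-empty iterable of model-ready inputs.
    """
    import nncf
    import openvino as ov

    calibration_inputs = list(calibration_inputs)
    dataset = nncf.Dataset(calibration_inputs)
    int8_model = nncf.quantize(
        ov_model,
        dataset,
        model_type=nncf.ModelType.TRANSFORMER,
        subset_size=len(calibration_inputs),
    )

    out_path = Path(out_dir) / "model_int8.xml"  # hypothetical file name
    out_path.parent.mkdir(parents=True, exist_ok=True)
    ov.save_model(int8_model, str(out_path))
    return int8_model, out_path
```

Keeping the two stages separate makes it easy to compare the FP16 and INT8 models against the PyTorch baseline before deciding which one to deploy.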

Key Results

Model Compression: 3.9x size reduction (402 MB → 102 MB)
Accuracy: 97.98% token match accuracy (INT8 vs PyTorch)
Medical Terminology: Optimized for clinical documentation
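
For readers who want to reproduce these figures, the sketch below shows the arithmetic. The PR does not define "token match accuracy", so the position-wise comparison here is only one plausible definition, and the token IDs are toy values. Only the 402 MB and 102 MB sizes come from the description above.

```python
def compression_ratio(original_mb, compressed_mb):
    """Size reduction factor between two model files."""
    return original_mb / compressed_mb


def token_match_accuracy(reference_ids, candidate_ids):
    """Fraction of positions where both sequences hold the same token ID.

    Assumed definition: positions beyond the shorter sequence count as
    mismatches, and two empty sequences count as a perfect match.
    """
    longest = max(len(reference_ids), len(candidate_ids))
    if longest == 0:
        return 1.0
    matches = sum(r == c for r, c in zip(reference_ids, candidate_ids))
    return matches / longest


# Model sizes from the PR description.
print(f"{compression_ratio(402, 102):.1f}x")  # prints 3.9x

# Toy token IDs: three of four positions agree.
print(f"{token_match_accuracy([5, 9, 9, 2], [5, 9, 7, 2]):.2%}")  # prints 75.00%
```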

Notebook Contents

Installation & Setup
Load MedASR model from HuggingFace
Prepare audio data (10s optimal for GPU)
PyTorch inference baseline
OpenVINO FP16 conversion
INT8 quantization with real audio calibration
Accuracy comparison (PyTorch/FP16/INT8)
Performance benchmarking (CPU/GPU)
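
The benchmarking step can be approximated with a small timing helper like the one below. It is a generic sketch, not the notebook's benchmark: `benchmark`, `warmup`, and `runs` are hypothetical names, and `infer` stands for any zero-argument callable, for example a lambda wrapping a compiled OpenVINO model call.

```python
import statistics
import time


def benchmark(infer, warmup=3, runs=20):
    """Time a zero-argument callable and return latency statistics in ms."""
    # Warm-up iterations are excluded so one-time setup costs do not skew results.
    for _ in range(warmup):
        infer()

    latencies_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        infer()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    return {
        "mean_ms": statistics.mean(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "min_ms": min(latencies_ms),
        "max_ms": max(latencies_ms),
    }


# Stand-in workload; replace with the model call under test.
stats = benchmark(lambda: sum(range(10_000)), warmup=1, runs=5)
print(sorted(stats))
```

Reporting the median alongside the mean helps when a few slow runs distort the average, which is common on shared CPU/GPU systems.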

Use Case

Medical speech recognition for:

Clinical documentation
Medical transcription services
Healthcare voice assistants
Edge deployment in medical devices

review-notebook-app · 2026-04-15T09:38:38Z

Check out this pull request on ReviewNB.

See visual diffs & provide feedback on Jupyter Notebooks.

brmarkus · 2026-04-15T09:50:35Z

Have a look into e.g. "https://github.com/padatta/openvino_notebooks/blob/6736abef648d34b96f51b7ebcf5b4801ae24fe35/notebooks/llm-chatbot/llm-chatbot.ipynb" - could you add a note regarding HuggingFace access token? This could help improve usability.

Note: to run the model with the demo, you will need to accept the license agreement. You must be a registered user on the 🤗 Hugging Face Hub. Please visit the HuggingFace model card, carefully read the terms of usage, and click the accept button. You will need an access token for the code below to run. For more information on access tokens, refer to this section of the documentation. You can log in to the Hugging Face Hub in a notebook environment using the following code:
# Log in to the Hugging Face Hub to get access to the pretrained model

from huggingface_hub import notebook_login, whoami

try:
    whoami()
    print('Authorization token already provided')
except OSError:
    notebook_login()

This notebook demonstrates converting Google's MedASR model to OpenVINO with FP16 and INT8 quantization for efficient medical speech recognition.

Features:

- HuggingFace authentication with notebook_login for gated model access
- Model conversion using torch.export and ov.convert_model
- INT8 quantization with NNCF using real audio calibration data
- Comprehensive accuracy validation (97.98% token-level accuracy)
- Performance benchmarking on CPU and GPU
- Model compression: 402 MB -> 102 MB (3.9x reduction)

The notebook includes a complete workflow from model loading to deployment, with support for 10-second audio chunks (static shape [1, 998, 128]).
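
Because the model is exported with a static shape, longer recordings have to be cut into fixed 10-second pieces before feature extraction. The sketch below illustrates that splitting step only. The 16 kHz sampling rate is an assumption (the PR does not state it), `split_into_chunks` is a hypothetical helper, and plain Python lists stand in for real audio arrays.

```python
SAMPLE_RATE = 16_000  # assumed; the PR does not state the sampling rate
CHUNK_SECONDS = 10    # chunk length mentioned in the PR


def split_into_chunks(samples, sample_rate=SAMPLE_RATE, seconds=CHUNK_SECONDS):
    """Split a waveform into fixed-length chunks, zero-padding the last one."""
    chunk_len = sample_rate * seconds
    chunks = []
    for start in range(0, len(samples), chunk_len):
        chunk = list(samples[start:start + chunk_len])
        # Pad the final chunk with silence so every chunk has the same length.
        chunk.extend([0.0] * (chunk_len - len(chunk)))
        chunks.append(chunk)
    return chunks


# A synthetic 25-second signal yields three 10-second chunks.
audio = [0.1] * (25 * SAMPLE_RATE)
chunks = split_into_chunks(audio)
print(len(chunks), len(chunks[-1]))  # prints 3 160000
```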

padatta · 2026-04-15T12:14:01Z

I have pushed the changes. The notebook now includes a HuggingFace login section for gated model access, updated instructions, and improved usability.

brmarkus · 2026-04-15T12:54:32Z

The Intel GNA (Gaussian & Neural Accelerator) was declared deprecated, and the GNA plugin was removed several major releases ago (as of v2024).

Could the NPU device be used for this task, similar to the GNA in the past, to offload the CPU/GPU/SoC?
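
One way to experiment with this is to let the runtime report which devices it sees and prefer NPU when it is present. The sketch below is an illustration, not something from the PR: `pick_device` and `compile_on_best_device` are hypothetical helper names, and whether the MedASR graph actually compiles and runs on NPU is not verified here. `ov.Core().available_devices` and `Core.compile_model` are existing OpenVINO calls; the import is deferred so the selection logic can be tried without OpenVINO installed.

```python
def pick_device(available, preferred=("NPU", "GPU", "CPU")):
    """Return the first available device matching the preference order."""
    for wanted in preferred:
        for name in available:
            # Device names may carry an index suffix such as "GPU.0".
            if name == wanted or name.startswith(wanted + "."):
                return name
    raise RuntimeError(f"None of {preferred} found in {list(available)}")


def compile_on_best_device(model_path):
    """Compile a saved OpenVINO model on the most preferred available device."""
    import openvino as ov

    core = ov.Core()
    device = pick_device(core.available_devices)
    return core.compile_model(model_path, device), device


print(pick_device(["CPU", "GPU.0", "NPU"]))  # prints NPU
print(pick_device(["CPU", "GPU.0", "GPU.1"]))  # prints GPU.0
```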

padatta force-pushed the add-medasr-medical-asr-notebook branch from 6736abe to ed695b7 · April 15, 2026 12:02

padatta force-pushed the add-medasr-medical-asr-notebook branch from ed695b7 to 93c9058 · April 15, 2026 12:06

Merge branch 'latest' into add-medasr-medical-asr-notebook (35641d4)

padatta wants to merge 2 commits into openvinotoolkit:latest from padatta:add-medasr-medical-asr-notebook

3 participants
