
Word-Level Error Metrics for Automatic Speech Recognition (ASR) and Brain-to-Text BCI (BTT)

This repository contains research code for word-level error analysis, developed as part of the following publication:

Jingya Huang, Aashish N. Patel, Sowmya Manojna Narasimha, Gal Mishne, and Vikash Gilja (2025). "Word-Level Error Analysis in Decoding Systems: From Speech Recognition to Brain-Computer Interfaces." Proc. Interspeech 2025.

Project Summary

This package implements word-level error metrics to provide fine-grained evaluation of sequence-to-sequence decoding models, specifically for Automatic Speech Recognition (ASR) and Brain-to-Text Brain-Computer Interfaces (BTT). Standard sentence-level metrics often do not capture nuanced error patterns at the word level, particularly for infrequent or semantically critical words.

To address this, we introduce a refined alignment algorithm that attributes edit operations to specific words. The framework supports multiple word-level metrics that quantify both literal correctness and semantic similarity between decoded and reference words. These metrics enable detailed analysis of generalization gaps associated with word frequency, which are particularly relevant for assessing model performance on out-of-vocabulary (OOV) and low-frequency words.
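The core idea of attributing edit operations to specific words can be sketched with a standard word-level Levenshtein alignment plus backtracking. This is an illustrative, self-contained example, not the repository's actual API; the function name `align_words` and the operation labels are assumptions.

```python
def align_words(ref, hyp):
    """Align two word sequences and attribute each edit operation
    (correct, substitution, insertion, deletion) to a word position."""
    n, m = len(ref), len(hyp)
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # match / substitution
    # Backtrack to recover the operation attributed to each word.
    ops, i, j = [], n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and d[i][j] == d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            ops.append(("correct" if ref[i - 1] == hyp[j - 1]
                        else "substitution", ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and d[i][j] == d[i - 1][j] + 1:
            ops.append(("deletion", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("insertion", None, hyp[j - 1]))
            j -= 1
    return ops[::-1]

ref = "to straighten out your teeth".split()
hyp = "to straighten out you teeth".split()
for op in align_words(ref, hyp):
    print(op)  # e.g. ('substitution', 'your', 'you') for the mismatch
```

Each attributed operation can then be aggregated per reference word, which is what enables the frequency-stratified analysis described above.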

Example Model (ASR-Wav2Vec2) Performance

While our primary experiments focus on character-level decoding outputs, the framework is generalizable to other output units (e.g., phonemes or subword tokens), provided that word delimiter symbols are available.

Motivation Example

Consider the example sentence:

"but you will still have to have an orthodontist to straighten out your teeth"

A system may correctly decode the majority of words while consistently misrecognizing infrequent but semantically important words such as "orthodontist," "straighten," and "teeth." Although sentence-level WER may remain low, the semantic fidelity of the transcription is substantially degraded. The metrics provided in this package are designed to expose such discrepancies by attributing errors at the word level and capturing both exact correctness and semantic distance.
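The arithmetic behind this observation is easy to verify. The following illustrative snippet (not the package API) assumes a hypothesis of equal length so that all errors are substitutions; the misrecognized hypothesis words are invented for the example.

```python
ref = ("but you will still have to have an orthodontist "
       "to straighten out your teeth").split()
# Hypothetical decoder output: 3 rare content words misrecognized.
hyp = ("but you will still have to have an orthodentist "
       "to straight out your teelth").split()

# With equal-length sequences, WER reduces to substitutions / ref length.
errors = sum(r != h for r, h in zip(ref, hyp))
wer = errors / len(ref)
print(f"WER = {wer:.2f}")  # WER = 0.21

missed = [r for r, h in zip(ref, hyp) if r != h]
print("misrecognized:", missed)  # ['orthodontist', 'straighten', 'teeth']
```

A WER of 0.21 looks acceptable at the sentence level, yet every semantically critical word is lost, which is exactly the discrepancy the word-level metrics are designed to surface.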

Repository Contents

  • alignment/ — Core refined alignment algorithm for edit attribution
  • metrics/ — Implementations of word-level correctness and semantic similarity metrics
  • examples/ — Usage examples and demonstration scripts
  • requirements.txt — Package dependencies

Installation

Clone the repository:

git clone https://github.com/TNEL-UCSD/word-metrics.git
cd word-metrics

Install the package and dependencies:

pip install .
pip install -r requirements.txt

Note: If the repository is updated, please reinstall to ensure compatibility.

Dependencies

  • Python >= 3.9
  • NumPy
  • SciPy
  • scikit-learn
  • datasets
  • transformers
  • soundfile
  • librosa
  • flair
  • spacy
  • seaborn
  • SpeechBrain
  • PyTorch (CUDA-compatible version)

Full dependency versions are specified in requirements.txt.

Usage

Example scripts are provided in the examples/ directory to demonstrate:

  • Alignment and edit attribution
  • Word-level error metric computation
  • Evaluation on ASR model outputs (e.g., SpeechBrain wav2vec2 models)

The code is designed to operate on decoded text sequences where word boundaries are explicitly marked.
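As a minimal sketch of what "explicitly marked word boundaries" means in practice: character-level CTC decoders such as wav2vec2 conventionally emit a "|" word-delimiter token, which can be split into the word sequences the metrics operate on. The helper name below is illustrative, not the package's API.

```python
def chars_to_words(decoded, delimiter="|"):
    """Split a character-level decoder output into words using an
    explicit word-boundary symbol, dropping empty segments."""
    return [w for w in decoded.split(delimiter) if w]

print(chars_to_words("have|an|orthodontist"))  # ['have', 'an', 'orthodontist']
```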

Citation

If you use this code in your work, please cite:

@inproceedings{huang2025word,
  title={Word-Level Error Analysis in Decoding Systems: From Speech Recognition to Brain-Computer Interfaces},
  author={Huang, Jingya and Patel, Aashish N. and Narasimha, Sowmya Manojna and Mishne, Gal and Gilja, Vikash},
  booktitle={Interspeech},
  year={2025}
}

License

This repository is released under the MIT License.

Contact

For questions, bug reports, or feature requests, please open an issue on this repository.
