Conversation


arushidNV (Collaborator) commented Dec 15, 2025

What does this PR do?

Adds FeatureBuffer support to the Cache-Aware RNNT streaming pipeline, enabling both raw audio (Frame-based) and pre-computed feature (FeatureBuffer-based) streaming inference.

Collection: ASR

Changelog

  • Implemented the transcribe_step_for_feature_buffers() method in CacheAwareRNNTPipeline and CacheAwareCTCPipeline to handle pre-computed features
  • Updated type hints in run_greedy_decoder() and cache_aware_transcribe_step() to support both Frame and FeatureBuffer types

Usage

This enables flexible deployment scenarios where features can be pre-computed on different hardware (e.g., TensorRT encoder) before streaming to the RNNT decoder:

from nemo.collections.asr.inference.factory import CacheAwarePipelineBuilder

# Initialize pipeline with FeatureBuffer support
cfg.streaming.request_type = "feature_buffer"  # or "frame" for raw audio
pipeline = CacheAwarePipelineBuilder.build(cfg)

# Use with pre-computed features
feature_buffers = [FeatureBuffer(
    features=computed_features,  # [n_feat, T]
    stream_id=stream_id,
    is_last=is_final_chunk
)]

# Process features
pipeline.transcribe_step(feature_buffers)
results = pipeline.get_results()
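
For reference, here is a minimal sketch of one way computed_features could be produced offline before being wrapped in a FeatureBuffer. It assumes the standard NeMo ASR preprocessor interface (model.preprocessor(input_signal=..., length=...)) and uses random audio as a stand-in; it is illustrative only and not part of this PR:

import torch
import nemo.collections.asr as nemo_asr

# Any model whose preprocessor matches the streaming model's feature config
asr_model = nemo_asr.models.ASRModel.from_pretrained("nvidia/nemotron-speech-streaming-en-0.6b")

audio = torch.randn(1, 16000)               # [B, samples]; stand-in for 1 s of 16 kHz audio
audio_len = torch.tensor([audio.shape[1]])

# Mel-spectrogram features with shape [B, n_feat, T]
features, feature_len = asr_model.preprocessor(input_signal=audio, length=audio_len)
computed_features = features[0]             # [n_feat, T], matching the FeatureBuffer example above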

WER Comparison

WER comparison between Frame-based and FeatureBuffer-based streaming on the open-asr-leaderboard datasets:
Model: https://huggingface.co/nvidia/nemotron-speech-streaming-en-0.6b
Attention Context Size: [70,3]
EOU Threshold: 800
Residue Tokens: 0

| Dataset | Frame WER | Feature Buffer WER |
|---|---|---|
| ami-test | 31.2132 | 31.221 |
| common_voice-test | 15.9703 | 15.9742 |
| earnings22-test | 21.0542 | 21.1941 |
| gigaspeech-test | 19.6682 | 19.6876 |
| librispeech-test.clean | 5.4333 | 5.4596 |
| librispeech-test.other | 9.322 | 9.3484 |
| spgispeech-test | 7.8701 | 7.8541 |
| tedlium-test | 13.0503 | 12.998 |
| voxpopuli-test | 10.5082 | 10.5126 |
| Average | 14.8989 | 14.9166 |

GitHub Actions CI

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
ASR contributors and maintainers can review this PR.

Additional Information

github-actions bot added the ASR label on Dec 15, 2025
arushidNV changed the title from "Add FeatureBuffer support to Cache-Aware RNNT streaming pipeline" to "Add FeatureBuffer support to Cache-Aware streaming pipeline" on Dec 15, 2025
nithinraok requested a review from naymaraq on December 15, 2025 at 17:52
Args:
state: (CacheAwareCTCStreamingState) The state of the stream
frame: (Frame) The current frame
frame: (Frame | FeatureBuffer) The current frame or feature buffer

Please use Request type instead of Frame | FeatureBuffer
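A sketch of what this suggestion could look like, assuming a simple alias (if the inference module already defines a Request type, it should be imported instead; import paths for Frame and FeatureBuffer are omitted here):

from typing import Union

# Assumed alias so signatures and docstrings can name a single request type
Request = Union[Frame, FeatureBuffer]

# Docstrings and type hints would then read, for example:
#     frame: (Request) The current frame or feature buffer
#     frames: (list[Request]) List of requests to transcribe.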

Decode the log probabilities and update the state
Args:
frames: (list[Frame]) List of frames to transcribe.
frames: (list[Frame | FeatureBuffer]) List of frames or feature buffers to transcribe.

Please use Request type instead of Frame | FeatureBuffer

Args:
frames: (list[Frame]) List of frames to transcribe.
frames: (list[Frame | FeatureBuffer]) List of frames or feature buffers to transcribe.

Please use Request type instead of Frame | FeatureBuffer

Args:
state: (CacheAwareRNNTStreamingState) The state of the stream
frame: (Frame) The current frame
frame: (Frame | FeatureBuffer) The current frame or feature buffer

Please use Request type instead of Frame | FeatureBuffer

"""
Cache Aware Transcribe Step
It receives a list of frames and features and do the following:
It receives a list of frames (Frame or FeatureBuffer) and features and do the following:

Please use Request type instead of Frame | FeatureBuffer

8. Update the ready states to indicate that the state is ready for text post-processing
Args:
frames: (list[Frame]) List of frames to transcribe.
frames: (list[Frame | FeatureBuffer]) List of frames or feature buffers to transcribe.

Please use Request type instead of Frame | FeatureBuffer

# update the previous hypothesis and reset the previous hypothesis for the streams that has ended
for state, hyp, eos in zip(states, best_hyp, eos_flags):
for i, (state, hyp, eos) in enumerate(zip(states, best_hyp, eos_flags)):
hyp_len = len(hyp.y_sequence) if hyp is not None and hasattr(hyp, 'y_sequence') else 0

Seems hyp_len is not used, if so, no need to enumerate
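If hyp_len really is unused, the loop can simply revert to its previous form (a sketch; the loop body is elided):

# No index needed once hyp_len is dropped
for state, hyp, eos in zip(states, best_hyp, eos_flags):
    ...  # update/reset the previous hypothesis as before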

arushidNV force-pushed the cache-aware-feature-support branch from aa477a3 to 25530b6 on December 18, 2025 at 04:23
batch_size: 64 # Number of audio frames per batch
word_boundary_tolerance: 4 # Tolerance for word boundaries
att_context_size: [70,13] # Attention context size: [70,13],[70,6],[70,1],[70,0]
att_context_size: [70,3] # Attention context size: [70,13],[70,6],[70,1],[70,0]

Let's keep [70, 13] for default


It would be better to change the default model to nvidia/nemotron-speech-streaming-en-0.6b.

for fbuffer in fbuffers:
feature = fbuffer.features
# Trim to expected feature buffer length (safeguard for external feature buffer inputs)
feature = drop_trailing_features(feature.unsqueeze(0), self.expected_feature_buffer_len).squeeze(0)

Just a suggestion: I would suggest to drop trailing features in the preprocess method if possible
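A rough sketch of that suggestion; drop_trailing_features, expected_feature_buffer_len, and the FeatureBuffer fields come from the diff above, while the preprocessing hook itself (name and call site) is hypothetical:

def _preprocess_feature_buffer(self, fbuffer):
    # Trim once during preprocessing so later steps can assume well-sized feature buffers
    fbuffer.features = drop_trailing_features(
        fbuffer.features.unsqueeze(0), self.expected_feature_buffer_len
    ).squeeze(0)
    return fbuffer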

Signed-off-by: arushid <[email protected]>