Model Only Recognizes a Single Word from Audio Input

When running speech recognition with ReazonSpeech, the model outputs only a single word, regardless of the length or content of the input audio. This happens even with clear audio files containing multiple words or full sentences.

[audio (3).zip](https://github.com/user-attachments/files/20382943/audio.3.zip)

Code:

```
import librosa
import soundfile as sf
import io
import tempfile
import numpy as np

# from reazonspeech.nemo.asr import load_model, transcribe, audio_from_path
from reazonspeech.k2.asr import load_model, transcribe, audio_from_path

# === Load ReazonSpeech model from Hugging Face ===
# model = load_model("reazon-research/reazonspeech-k2-v2-ja-en")
model = load_model(device="cpu", precision="fp32", language="ja") # or language="ja-en" for bilingual model

# === Step 1: Load and resample audio to 16,000 Hz ===
audio_path = r'D:\Image_Based_searchengine\product_images\audio (3).wav'
y, sr = librosa.load(audio_path, sr=16000, mono=True)

# === Step 2: Amplify the audio by 1.5x and clip to [-1.0, 1.0] to keep samples in range ===
amplified_y = np.clip(y * 1.5, -1.0, 1.0)

# === Step 3: Write amplified audio to an in-memory buffer ===
buffer = io.BytesIO()
sf.write(buffer, amplified_y, 16000, format='WAV', subtype='PCM_16')
buffer.seek(0)

# === Step 4: Save buffer to a temp WAV file for ASR model ===
with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp:
    tmp.write(buffer.read())
    temp_wav_path = tmp.name

# === Step 5: Transcribe ===
audio = audio_from_path(temp_wav_path)
print("audio.samplerate:", audio.samplerate)

ret = transcribe(model, audio)
print("Transcribed Text:", ret.text)
```
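To rule out the preprocessing steps as the cause, it may help to confirm that the temporary WAV file really contains the full recording before it reaches `transcribe`. The following is a minimal sketch using only the Python standard library; `describe_wav` is a hypothetical helper name (not part of ReazonSpeech), and it assumes the file is plain 16-bit PCM, which is what `subtype='PCM_16'` in Step 3 should produce.

```python
import array
import sys
import wave


def describe_wav(path):
    """Return basic facts about a 16-bit PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wf:
        channels = wf.getnchannels()
        sample_width = wf.getsampwidth()
        samplerate = wf.getframerate()
        n_frames = wf.getnframes()
        raw = wf.readframes(n_frames)

    if sample_width != 2:
        raise ValueError("this sketch only handles 16-bit PCM")

    # Interpret the raw bytes as signed 16-bit integers.
    samples = array.array("h")
    samples.frombytes(raw)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian

    # Peak amplitude as a fraction of full scale (1.0 means clipping level).
    peak = max((abs(s) for s in samples), default=0) / 32768.0

    return {
        "channels": channels,
        "samplerate": samplerate,
        "duration_sec": n_frames / samplerate,
        "peak": peak,
    }


# Example use with the script above (assumes temp_wav_path is defined):
# print(describe_wav(temp_wav_path))
```

If `duration_sec` is much shorter than the original recording, the problem is likely in the load or write steps rather than in the model. A `peak` at or very near 1.0 suggests that the 1.5x gain is pushing samples into the clip limit.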

Model Only Recognizes a Single Word from Audio Input #57
