Can we have this as part of the pipe to address further hallucination issues #2521
saman-rahbar started this conversation in General
Just an example of token-level confidence filtering plus a secondary verification loop using Whisper to address the hallucination issues above. It seems to work fine with the current architecture:
```python
import torch
import whisper


def transcribe_with_confidence_filter(
    model_name: str,
    audio_file_path: str,
    confidence_threshold: float = 0.2,
    secondary_pass: bool = True,
):
    # Function body is not shown in this post.
    ...


if __name__ == "__main__":
    # Example usage
    audio_path = "sample_audio.wav"
    transcription = transcribe_with_confidence_filter(
        model_name="medium",
        audio_file_path=audio_path,
        confidence_threshold=0.25,
        secondary_pass=True,
    )
    print("Final transcription:", transcription)
```
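The function body is not shown above, so here is a minimal sketch of one way it could work. It is not the original implementation. It assumes the word-level `probability` values that `model.transcribe(..., word_timestamps=True)` returns are an acceptable stand-in for token-level confidence, and that re-decoding a doubtful segment on its own (with `condition_on_previous_text=False` and `temperature=0.0`) is a reasonable secondary check. The helpers `keep_confident_words` and `has_low_confidence` are hypothetical names introduced only for this illustration.

```python
from typing import Any, Dict, List

SAMPLE_RATE = 16_000  # whisper.load_audio resamples audio to 16 kHz


def keep_confident_words(segment: Dict[str, Any], threshold: float) -> List[str]:
    """Return the words in one segment whose probability meets the threshold."""
    return [
        w["word"].strip()
        for w in segment.get("words", [])
        if w["probability"] >= threshold
    ]


def has_low_confidence(segment: Dict[str, Any], threshold: float) -> bool:
    """True if at least one word in the segment falls below the threshold."""
    return any(w["probability"] < threshold for w in segment.get("words", []))


def transcribe_with_confidence_filter(
    model_name: str,
    audio_file_path: str,
    confidence_threshold: float = 0.2,
    secondary_pass: bool = True,
) -> str:
    import whisper  # imported lazily so the helpers above work without it

    model = whisper.load_model(model_name)
    audio = whisper.load_audio(audio_file_path)
    # word_timestamps=True adds a "words" list (with "probability") per segment
    result = model.transcribe(audio, word_timestamps=True)

    pieces: List[str] = []
    for seg in result["segments"]:
        start = int(seg["start"] * SAMPLE_RATE)
        end = int(seg["end"] * SAMPLE_RATE)
        needs_retry = (
            secondary_pass
            and end > start
            and has_low_confidence(seg, confidence_threshold)
        )
        if needs_retry:
            # Assumption: re-decode only this segment, without earlier text as
            # context, and keep whatever passes the threshold the second time.
            retry = model.transcribe(
                audio[start:end],
                word_timestamps=True,
                condition_on_previous_text=False,
                temperature=0.0,
            )
            for retry_seg in retry["segments"]:
                pieces.extend(keep_confident_words(retry_seg, confidence_threshold))
        else:
            pieces.extend(keep_confident_words(seg, confidence_threshold))

    # Joining with spaces suits space-delimited languages such as English
    return " ".join(p for p in pieces if p)
```

Dropping words outright is the bluntest option. Depending on the use case, it may be better to mark low-confidence spans than to remove them, so that a downstream reviewer can still see where the model was unsure.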
Advantages