[BUG] Transcribe with a reference text cannot auto-detect the language #281

@zhzLuke96

Description

Symptom

When transcribing with language set to auto and a reference script supplied, the following error is raised:

Transcribe:  22%|████████████                                            | 1.3/6.04 [00:01<00:04,  1.15sec/s]
Traceback (most recent call last):
  File "/[hidden]/venv/lib/python3.10/site-packages/gradio/queueing.py", line 536, in process_events
    response = await route_utils.call_process_api(
  File "/[hidden]/venv/lib/python3.10/site-packages/gradio/route_utils.py", line 322, in call_process_api
    output = await app.get_blocks().process_api(
  File "/[hidden]/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1935, in process_api
    result = await self.call_function(
  File "/[hidden]/venv/lib/python3.10/site-packages/gradio/blocks.py", line 1518, in call_function
    prediction = await fn(*processed_input)
  File "/[hidden]/venv/lib/python3.10/site-packages/gradio/utils.py", line 793, in async_wrapper
    response = await f(*args, **kwargs)
  File "/[hidden]/venv/lib/python3.10/site-packages/gradio/utils.py", line 793, in async_wrapper
    response = await f(*args, **kwargs)
  File "/[hidden]/modules/webui/asr_tabs/transcribe_tab.py", line 163, in submit
    audio = await handler.enqueue()
  File "/[hidden]/modules/core/handler/STTHandler.py", line 35, in enqueue
    result = self.chunker.transcribe(audio=self.input_audio, config=self.stt_config)
  File "/[hidden]/modules/core/models/stt/STTChunker.py", line 285, in transcribe
    result = self.transcribe_to_result(audio, config)
  File "/[hidden]/modules/core/models/stt/STTChunker.py", line 216, in transcribe_to_result
    result = self.model.transcribe_to_result(chunk.audio, config)
  File "/[hidden]/modules/core/models/stt/Whisper.py", line 249, in transcribe_to_result
    result = self.force_align(audio=audio, config=config)
  File "/[hidden]/modules/core/models/stt/Whisper.py", line 339, in force_align
    aligned_result = model.align(
  File "/[hidden]/venv/lib/python3.10/site-packages/stable_whisper/alignment.py", line 184, in align
    tokenizer, supported_languages = get_alignment_tokenizer(model, is_faster_model, text, language, tokenizer)
  File "/[hidden]/venv/lib/python3.10/site-packages/stable_whisper/alignment.py", line 377, in get_alignment_tokenizer
    raise TypeError('expected argument for language')
TypeError: expected argument for language

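If language is set to anything other than auto, the error does not occur, so the language is probably not being auto-detected before the forced-alignment step.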

Expected

Automatic language detection should also work when a reference text is supplied.
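
One possible direction, offered as a rough sketch rather than the project's actual fix: resolve "auto" to a concrete language code before force_align calls model.align, since the traceback shows align raising TypeError when no language is passed. The helper name resolve_language and the detect callback are hypothetical and do not come from the codebase; how detection itself is done (for example, a plain transcription pass first) is left to the caller.

```python
from typing import Callable, Optional


def resolve_language(requested: Optional[str], detect: Callable[[], str]) -> str:
    """Return a concrete language code for the alignment step.

    `requested` is whatever the user selected (hypothetically "auto", None,
    or a code such as "en"). `detect` is a caller-supplied function that
    performs language detection; it is only invoked when needed.
    """
    if requested is None or requested.strip().lower() in ("", "auto"):
        detected = detect()
        if not detected:
            # Fail with a clear message instead of passing None downstream.
            raise ValueError("language detection returned no result")
        return detected
    return requested


# Hypothetical usage inside force_align (names are assumptions, not verified):
#   language = resolve_language(config.language, lambda: detect_language(audio))
#   aligned_result = model.align(audio, text, language=language)
```

Taking detection as a callback keeps the helper independent of any particular Whisper wrapper and avoids running detection at all when the user has already chosen a language.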
