Regarding the usage of io-bound #3669

mxiaonian · 2024-09-03T09:24:28Z

mxiaonian
Sep 3, 2024

Question

Recently, I have been trying to use Nicegui to create a speech recognition project based on Whisper, using a modular approach to building pages. However, I have encountered some errors while executing speech extraction and recognition functions, which is confusing.
My page structure is:
main.py

@ui.page('/')
def index():
    with ui.header(bordered=True, elevated=True):
......
    with ui.tab_panels(tabs, value='视频识别').classes('w-full'):
        with ui.tab_panel(tab_video):
            video_page.video_page()

video_page.py

@ui.page('/')
def video_page():
......
 with ui.card().classes('w-full'):
        ui.button('开始识别', on_click=lambda: start_button_click()).classes('w-full')
......

async def start_button_click():
    audio_path = await run.io_bound(extract_audio)
    if audio_path:
        task_transcribe = await run.io_bound(start_transcribe, audio_path)
        print(f'识别结果：{task_transcribe}')


async def extract_audio():
    current_path = os.getcwd()
    ffmpeg_path = os.path.join(current_path, 'ffmpeg\\bin\\ffmpeg.exe')
    output_audio = os.path.join(current_path, 'audio')
    video_name = os.path.splitext(os.path.basename(upload_video_path))[0]  
    output_audio_path = os.path.join(output_audio, f"{video_name}.mp3")  
    command = [
        ffmpeg_path,
        '-i', upload_video_path,  
        '-q:a', '0', 
        '-map', 'a',  
        output_audio_path  
    ]

    try:
        process = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True, encoding='utf-8',
                                 check=True)
        for line in process.stdout.splitlines():
            print(line)
        current_path = os.getcwd()
        toml_file = os.path.join(current_path, 'config/upload_config.toml')
        upload_config = toml.load(toml_file)
        section = 'upload'
        upload_config[section]['audio_save_path'] = output_audio_path
        with open(toml_file, 'w', encoding='utf-8') as f:
            toml.dump(upload_config, f)
        return output_audio_path
    except subprocess.CalledProcessError as e:
        return None


async def start_transcribe(audio_path):
    model = faster_whisper.WhisperModel(model_size_or_path=whisper_local_model_path,
                                        device=whisper_local_model_gpu,
                                        local_files_only=True)
    segments, info = model.transcribe(audio=audio_path,
                                      language=whisper_local_model_language,
                                      task=whisper_local_model_task,
                                      vad_filter=whisper_local_model_vad,
                                      vad_parameters=dict(min_silence_duration_ms=whisper_local_model_min_vad),
                                      initial_prompt=whisper_local_model_prompt,
                                      chunk_length=whisper_local_model_chunk_length,
                                      temperature=whisper_local_model_temp,
                                      without_timestamps=whisper_local_model_without_timestamps,
                                      word_timestamps=whisper_local_model_word_timestamps,
                                      beam_size=whisper_local_model_beam_size)

    for i, segment in enumerate(segments):
        start_time, end_time, text = segment['start'], segment['end'], segment['text']
        start_srt = f"{int(start_time // 3600):02}:{int((start_time % 3600) // 60):02}:{int(start_time % 60):02},{int((start_time * 1000) % 1000):03}"
        end_srt = f"{int(end_time // 3600):02}:{int((end_time % 3600) // 60):02}:{int(end_time % 60):02},{int((end_time * 1000) % 1000):03}"
        print(f"{i + 1}\n{start_srt} --> {end_srt}\n{text}\n\n")

The error message encountered is as follows:

 RuntimeWarning: coroutine 'extract_audio' was never awaited
  handle = None  # Needed to break cycles when an exception occurs.
RuntimeWarning: Enable tracemalloc to get the object allocation traceback

After reading the documentation of nickeGUI, I still don't understand how to use io-bound.
Please help me, big shots. Thank you very much

Answered by mxiaonian

Sep 5, 2024

Summarize the knowledge learned.
Starting multiple logics that require a long computation time can be written as follows: the starting entry needs to be asynchronous, while the computation logic does not require it.
for example:

from nicegui import run, ui
async def start():
    await run.io_bound(long_term_calculation_1)
    await run.io_bound(long_term_calculation_2)
    await run.io_bound(long_term_calculation_3)
......

def long_term_calculation_1():
    '''your_code'''
    ......

def long_term_calculation_2():
    '''your_code'''
    ......

def long_term_calculation_3():
    '''your_code'''
    ......

When executing the start function, the program waits for each step to be complete…

View full answer

falkoschindler · 2024-09-03T09:39:44Z

falkoschindler
Sep 3, 2024
Maintainer

Hi @mxiaonian,

Instead of

await run.io_bound(extract_audio)

you should write

await run.io_bound(extract_audio())

10 replies

rodja Sep 4, 2024
Maintainer

Have you read https://github.com/zauberzeug/nicegui/wiki/FAQs#why-is-my-long-running-function-blocking-ui-updates? If you have cpu intensive tasks, you should use run.cpu_bound.

mxiaonian Sep 4, 2024
Author

yes,I try it,But the same problem will arise,But the same problem will arise,I have checked the code many times without any errors or problems, and I am very confused,and whisper is runing on GPU,use cuda

rodja Sep 4, 2024
Maintainer

Dos the app stay responsive if you replace the code in extract_audio and start_transcribe with ~~ui.sleep(10)~~ time.sleep(10)?

falkoschindler Sep 4, 2024
Maintainer

@rodja I guess you mean time.sleep(10).

mxiaonian Sep 4, 2024
Author

hi~
I tried commenting out this part of the code line by line and running it, but ultimately found that the code that caused NICEGUI to be unresponsive was

 def start_transcribe(audio_path):
......
     segments, info = model.transcribe(
     ......
                                       chunk_length=whisper_local_model_chunk_length,
                                       temperature=whisper_local_model_temp,）

These are two parameters passed to the Whisper model, whether using variables or directly using numerical values. As long as these two parameters are passed, nickegui will disconnect the link after the model runs completely，I completely don't understand what principle this is, it's too strange.I have consulted Whisper's documentation, and the purpose of these two parameters is to：

        temperature: Union[float, List[float], Tuple[float, ...]] = [
            0.0,
            0.2,
            0.4,
            0.6,
            0.8,
            1.0,
        ],
                chunk_length: Optional[int] = None,
                
                   temperature: Temperature for sampling. It can be a tuple of temperatures,
            which will be successively used upon failures according to either
            `compression_ratio_threshold` or `log_prob_threshold`.
                      chunk_length: The length of audio segments. If it is not None, it will overwrite the
            default chunk_length of the FeatureExtractor.

If either of these two parameters is passed, it will cause the GUI to be unresponsive，That's a strange question

mxiaonian · 2024-09-04T06:45:53Z

mxiaonian
Sep 4, 2024
Author

If you are interested in reproducing this problem, you can use the following minimal code

# -*- coding: utf-8 -*-

import os
import faster_whisper
from nicegui import run, ui

os.environ['KMP_DUPLICATE_LIB_OK'] = 'True'  # 修复OMP


def start_transcribe():
    audio_path = r'your_audio_path'
    model = faster_whisper.WhisperModel(model_size_or_path=whisper_local_model_path,
                                        device="cuda",
                                        local_files_only=True)
    segments, transcription_info = model.transcribe(audio=audio_path,
                                                    language='en',
                                                    task="transcribe",
                                                    vad_filter=True,
                                                    vad_parameters=dict(
                                                        min_silence_duration_ms=500),
                                                    initial_prompt="Prompt",
                                                    temperature=0.8,   # problem code
                                                    chunk_length=5,    # problem code
                                                    without_timestamps=True,
                                                    word_timestamps=False,
                                                    beam_size=5)
    for segment in segments:
        print(f"Return information：{segment}")
        
async def start_button_click():
    await run.io_bound(start_transcribe)


ui.button("start", on_click=start_button_click)
ui.run()

0 replies

mxiaonian · 2024-09-05T02:07:07Z

mxiaonian
Sep 5, 2024
Author

Summarize the knowledge learned.
Starting multiple logics that require a long computation time can be written as follows: the starting entry needs to be asynchronous, while the computation logic does not require it.
for example:

from nicegui import run, ui
async def start():
    await run.io_bound(long_term_calculation_1)
    await run.io_bound(long_term_calculation_2)
    await run.io_bound(long_term_calculation_3)
......

def long_term_calculation_1():
    '''your_code'''
    ......

def long_term_calculation_2():
    '''your_code'''
    ......

def long_term_calculation_3():
    '''your_code'''
    ......

When executing the start function, the program waits for each step to be completed before proceeding to the next step

0 replies

Uh oh!

Regarding the usage of io-bound #3669

Uh oh!

Uh oh!

mxiaonian Sep 3, 2024

Question

Replies: 3 comments · 10 replies

Uh oh!

falkoschindler Sep 3, 2024 Maintainer

Uh oh!

rodja Sep 4, 2024 Maintainer

Uh oh!

mxiaonian Sep 4, 2024 Author

Uh oh!

Uh oh!

rodja Sep 4, 2024 Maintainer

Uh oh!

falkoschindler Sep 4, 2024 Maintainer

Uh oh!

mxiaonian Sep 4, 2024 Author

Uh oh!

Uh oh!

mxiaonian Sep 4, 2024 Author

Uh oh!

Uh oh!

mxiaonian Sep 5, 2024 Author

mxiaonian
Sep 3, 2024

Replies: 3 comments 10 replies

falkoschindler
Sep 3, 2024
Maintainer

rodja Sep 4, 2024
Maintainer

mxiaonian Sep 4, 2024
Author

rodja Sep 4, 2024
Maintainer

falkoschindler Sep 4, 2024
Maintainer

mxiaonian Sep 4, 2024
Author

mxiaonian
Sep 4, 2024
Author

mxiaonian
Sep 5, 2024
Author