[Bug]: Fish Speech S2 Pro: FileNotFoundError: [Errno 2] No such file or directory: '/tmp/fish_ref_vwbsa8u2.npy' ONLY when server is launched on multiple GPUs. #2602

@ukemamaster

Your current environment

The output of python collect_env.py
vllm docker image 0.18.0

Your code version

The commit id or version of vllm

The commit id or version of vllm-omni

🐛 Describe the bug

The following error occurs ONLY when the server is launched on multiple GPUs and I pass a ref_audio for voice cloning. On 1 GPU, inference works.
Also, when I don't pass any ref_audio (using the default voice), inference works for any number of GPUs.

(APIServer pid=1) INFO:     172.26.161.142:34996 - "POST /v1/audio/speech HTTP/1.1" 200 OK
(APIServer pid=1) INFO 04-08 18:02:44 [orchestrator.py:584] [Orchestrator] _handle_add_request: stage=0 req=speech-954d5cc527259e76 prompt_type=OmniEngineCoreRequest original_prompt_type=dict final_stage=1 num_sampling_params=2
(APIServer pid=1) INFO 04-08 18:02:44 [stage_engine_core_client.py:113] [StageEngineCoreClient] Stage-0 adding request: speech-954d5cc527259e76
(APIServer pid=1) INFO 04-08 18:02:44 [stage_engine_core_client.py:113] [StageEngineCoreClient] Stage-1 adding request: speech-954d5cc527259e76
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932] WorkerProc hit an exception.
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932] Traceback (most recent call last):
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 927, in worker_busy_loop
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = func(*args, **kwargs)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/worker_base.py", line 332, in execute_model
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return self.worker.execute_model(scheduler_output)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 822, in execute_model
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = self.model_runner.execute_model(
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_ar_model_runner.py", line 267, in execute_model
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ) = self._preprocess(scheduler_output, num_tokens_padded, intermediate_tensors)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_model_runner.py", line 1225, in _preprocess
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     req_input_ids, req_embeds, update_dict = self.model.preprocess(
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                                              ^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 368, in preprocess
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     prompt_embeds = self._build_structured_voice_clone_prefill_embeds(info_dict)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 530, in _build_structured_voice_clone_prefill_embeds
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ref_audio_wav = np.load(ref_audio_path)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/numpy/lib/npyio.py", line 427, in load
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     fid = stack.enter_context(open(os_fspath(file), "rb"))
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932] FileNotFoundError: [Errno 2] No such file or directory: '/tmp/fish_ref_vwbsa8u2.npy'
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932] Traceback (most recent call last):
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 927, in worker_busy_loop
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = func(*args, **kwargs)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/worker_base.py", line 332, in execute_model
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return self.worker.execute_model(scheduler_output)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 822, in execute_model
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = self.model_runner.execute_model(
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_ar_model_runner.py", line 267, in execute_model
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ) = self._preprocess(scheduler_output, num_tokens_padded, intermediate_tensors)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_model_runner.py", line 1225, in _preprocess
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     req_input_ids, req_embeds, update_dict = self.model.preprocess(
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                                              ^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 368, in preprocess
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     prompt_embeds = self._build_structured_voice_clone_prefill_embeds(info_dict)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 530, in _build_structured_voice_clone_prefill_embeds
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ref_audio_wav = np.load(ref_audio_path)
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/numpy/lib/npyio.py", line 427, in load
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     fid = stack.enter_context(open(os_fspath(file), "rb"))
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932] FileNotFoundError: [Errno 2] No such file or directory: '/tmp/fish_ref_vwbsa8u2.npy'
(Worker_TP3 pid=830) ERROR 04-08 18:02:45 [multiproc_executor.py:932] 
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932] WorkerProc hit an exception.
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932] Traceback (most recent call last):
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 927, in worker_busy_loop
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = func(*args, **kwargs)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/worker_base.py", line 332, in execute_model
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return self.worker.execute_model(scheduler_output)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 822, in execute_model
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = self.model_runner.execute_model(
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_ar_model_runner.py", line 267, in execute_model
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ) = self._preprocess(scheduler_output, num_tokens_padded, intermediate_tensors)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_model_runner.py", line 1225, in _preprocess
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     req_input_ids, req_embeds, update_dict = self.model.preprocess(
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                                              ^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 368, in preprocess
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     prompt_embeds = self._build_structured_voice_clone_prefill_embeds(info_dict)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 530, in _build_structured_voice_clone_prefill_embeds
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ref_audio_wav = np.load(ref_audio_path)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/numpy/lib/npyio.py", line 427, in load
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     fid = stack.enter_context(open(os_fspath(file), "rb"))
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932] FileNotFoundError: [Errno 2] No such file or directory: '/tmp/fish_ref_vwbsa8u2.npy'
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932] Traceback (most recent call last):
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 927, in worker_busy_loop
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = func(*args, **kwargs)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/worker_base.py", line 332, in execute_model
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return self.worker.execute_model(scheduler_output)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 822, in execute_model
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = self.model_runner.execute_model(
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_ar_model_runner.py", line 267, in execute_model
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ) = self._preprocess(scheduler_output, num_tokens_padded, intermediate_tensors)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_model_runner.py", line 1225, in _preprocess
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     req_input_ids, req_embeds, update_dict = self.model.preprocess(
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                                              ^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 368, in preprocess
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     prompt_embeds = self._build_structured_voice_clone_prefill_embeds(info_dict)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 530, in _build_structured_voice_clone_prefill_embeds
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ref_audio_wav = np.load(ref_audio_path)
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/numpy/lib/npyio.py", line 427, in load
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     fid = stack.enter_context(open(os_fspath(file), "rb"))
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932] FileNotFoundError: [Errno 2] No such file or directory: '/tmp/fish_ref_vwbsa8u2.npy'
(Worker_TP2 pid=829) ERROR 04-08 18:02:45 [multiproc_executor.py:932] 
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932] WorkerProc hit an exception.
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932] Traceback (most recent call last):
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 927, in worker_busy_loop
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = func(*args, **kwargs)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/worker_base.py", line 332, in execute_model
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return self.worker.execute_model(scheduler_output)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 822, in execute_model
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = self.model_runner.execute_model(
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_ar_model_runner.py", line 267, in execute_model
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ) = self._preprocess(scheduler_output, num_tokens_padded, intermediate_tensors)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_model_runner.py", line 1225, in _preprocess
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     req_input_ids, req_embeds, update_dict = self.model.preprocess(
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                                              ^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 368, in preprocess
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     prompt_embeds = self._build_structured_voice_clone_prefill_embeds(info_dict)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 530, in _build_structured_voice_clone_prefill_embeds
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ref_audio_wav = np.load(ref_audio_path)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/numpy/lib/npyio.py", line 427, in load
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     fid = stack.enter_context(open(os_fspath(file), "rb"))
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932] FileNotFoundError: [Errno 2] No such file or directory: '/tmp/fish_ref_vwbsa8u2.npy'
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932] Traceback (most recent call last):
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/executor/multiproc_executor.py", line 927, in worker_busy_loop
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = func(*args, **kwargs)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/worker_base.py", line 332, in execute_model
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return self.worker.execute_model(scheduler_output)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 822, in execute_model
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     output = self.model_runner.execute_model(
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/torch/utils/_contextlib.py", line 124, in decorate_context
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     return func(*args, **kwargs)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]            ^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_ar_model_runner.py", line 267, in execute_model
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ) = self._preprocess(scheduler_output, num_tokens_padded, intermediate_tensors)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/worker/gpu_model_runner.py", line 1225, in _preprocess
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     req_input_ids, req_embeds, update_dict = self.model.preprocess(
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                                              ^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 368, in preprocess
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     prompt_embeds = self._build_structured_voice_clone_prefill_embeds(info_dict)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/vllm_omni/model_executor/models/fish_speech/fish_speech_slow_ar.py", line 530, in _build_structured_voice_clone_prefill_embeds
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     ref_audio_wav = np.load(ref_audio_path)
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                     ^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]   File "/usr/local/lib/python3.12/dist-packages/numpy/lib/npyio.py", line 427, in load
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]     fid = stack.enter_context(open(os_fspath(file), "rb"))
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932]                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932] FileNotFoundError: [Errno 2] No such file or directory: '/tmp/fish_ref_vwbsa8u2.npy'
(Worker_TP1 pid=828) ERROR 04-08 18:02:45 [multiproc_executor.py:932] 
(Worker_TP0 pid=827) INFO 04-08 18:02:45 [dac_encoder.py:145] Encoded reference audio: 257533 samples @ 24000Hz -> 126 semantic tokens
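
In the log above, Worker_TP0 encodes the reference audio while Worker_TP1 to Worker_TP3 all fail in `np.load(ref_audio_path)` on the same `/tmp/fish_ref_*.npy` path. One pattern that would produce exactly this split is a temp file that is consumed and removed by the first reader, so every later rank finds nothing. The sketch below only simulates that pattern with plain NumPy. It is not vllm-omni code, all helper names are hypothetical, and the log does not establish that this is the actual cause.

```python
import io
import os
import tempfile

import numpy as np


def save_ref_to_temp(wav: np.ndarray) -> str:
    """Hypothetical helper: write the waveform to a temp .npy file, return its path."""
    fd, path = tempfile.mkstemp(prefix="fish_ref_", suffix=".npy")
    with os.fdopen(fd, "wb") as f:
        np.save(f, wav)
    return path


def load_and_delete(path: str) -> np.ndarray:
    """Hypothetical consumer that removes the file right after reading it."""
    wav = np.load(path)
    os.remove(path)
    return wav


def simulate_ranks_with_path(wav: np.ndarray, num_ranks: int) -> list[int]:
    """Simulate num_ranks workers reading one shared path; return the ranks that failed."""
    path = save_ref_to_temp(wav)
    failed = []
    for rank in range(num_ranks):
        try:
            load_and_delete(path)
        except FileNotFoundError:
            failed.append(rank)
    return failed


def to_bytes(wav: np.ndarray) -> bytes:
    """Serialize the waveform so it can travel inside the request instead of as a path."""
    buf = io.BytesIO()
    np.save(buf, wav)
    return buf.getvalue()


def from_bytes(payload: bytes) -> np.ndarray:
    """Rebuild the waveform from bytes; safe to call once per rank."""
    return np.load(io.BytesIO(payload))


if __name__ == "__main__":
    wav = np.zeros(16, dtype=np.float32)
    print(simulate_ranks_with_path(wav, 4))  # ranks that hit FileNotFoundError
    print(simulate_ranks_with_path(wav, 1))  # single reader: no failures
    payload = to_bytes(wav)
    print(all(np.array_equal(from_bytes(payload), wav) for _ in range(4)))
```

With one simulated rank nothing fails, which matches the single-GPU behaviour reported here. With four, only the first reader succeeds. If the real cause is similar, either carrying the audio as bytes in the per-request data or deferring file cleanup until every rank has read it would avoid the dependency on a shared path.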



Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.

Labels: bug
