The following error occasionally occurs when running inference on CUDA.
GPU: NVIDIA L20
2025-12-17 02:44:08,499 - __main__ - ERROR - [worker] Error while handling request: CUDA error: operation not permitted
Search for `cudaErrorNotPermitted' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Traceback (most recent call last):
File "/opt/voxcpm/main.py", line 148, in worker
for chunk in tts.generate_streaming(**kwargs):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/voxcpm/src/voxcpm/core.py", line 269, in _generate
for wav, _, _ in generate_result:
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 38, in generator_context
response = gen.send(None)
^^^^^^^^^^^^^^
File "/opt/voxcpm/src/voxcpm/model/voxcpm.py", line 670, in _generate_with_prompt_cache
for latent_pred, pred_audio_feat in inference_result:
^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 38, in generator_context
response = gen.send(None)
^^^^^^^^^^^^^^
File "/opt/voxcpm/src/voxcpm/model/voxcpm.py", line 742, in _inference
feat_embed = self.feat_encoder(feat) # [b, t, h_feat]
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 414, in __call__
return super().__call__(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 832, in compile_wrapper
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1775, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1786, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/voxcpm/src/voxcpm/modules/locenc/local_encoder.py", line 17, in forward
def forward(self, x):
File "/usr/local/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 1044, in _fn
return fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_functorch/aot_autograd.py", line 1130, in forward
return compiled_fn(full_args)
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 353, in runtime_wrapper
all_outs = call_func_at_runtime_with_args(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_functorch/_aot_autograd/utils.py", line 129, in call_func_at_runtime_with_args
out = normalize_as_list(f(args))
^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 724, in inner_fn
outs = compiled_fn(args)
^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_functorch/_aot_autograd/runtime_wrappers.py", line 526, in wrapper
return compiled_fn(runtime_args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_inductor/output_code.py", line 613, in __call__
return self.current_callable(inputs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/tmp/torchinductor_root/u7/cu7iuglfv4fiy7yukmpkycnjjtrlt6ttrgfzj3ssjb5o5l4av6w4.py", line 2190, in call
(buf182,) = self.partitions[0](partition0_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_inductor/compile_fx.py", line 1772, in run
return compiled_fn(new_inputs) # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_inductor/cudagraph_trees.py", line 388, in deferred_cudagraphify
return fn(inputs)
^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_inductor/utils.py", line 3017, in run
out = model(new_inputs)
^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_inductor/cudagraph_trees.py", line 2012, in run
out = self._run(new_inputs, function_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_inductor/cudagraph_trees.py", line 2182, in _run
return self.record_function(new_inputs, function_id)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_inductor/cudagraph_trees.py", line 2219, in record_function
node = CUDAGraphNode(
^^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_inductor/cudagraph_trees.py", line 1037, in __init__
self.recording_outputs: Optional[OutputType] = self._record(
^^^^^^^^^^^^^
File "/usr/local/lib/python3.12/site-packages/torch/_inductor/cudagraph_trees.py", line 1268, in _record
torch.cuda.graph(
File "/usr/local/lib/python3.12/site-packages/torch/cuda/graphs.py", line 265, in __exit__
self.cuda_graph.capture_end()
File "/usr/local/lib/python3.12/site-packages/torch/cuda/graphs.py", line 128, in capture_end
super().capture_end()
torch.AcceleratorError: CUDA error: operation not permitted
Search for `cudaErrorNotPermitted' in https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__TYPES.html for more information.
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
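The traceback shows the error surfacing inside Inductor's CUDA Graph machinery (`cudagraph_trees.py` → `torch.cuda.graph(...)` → `capture_end()`), which is consistent with `cudaErrorNotPermitted`: certain operations (for example, synchronization or work issued on a non-capturing stream) are not permitted while a graph capture is in progress. A minimal triage sketch, assuming the worker process can set the environment before CUDA is initialized; the commented `torch._inductor` toggle is a common way to check whether graph capture is the trigger, not a confirmed fix for this issue:

```python
import os

# Must be set before the first CUDA call in the process: kernel launches
# become synchronous, so the failing launch is reported at its actual call
# site instead of at a later, unrelated API call.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

# If the error only appears in torch.compile'd modules, disabling CUDA Graph
# capture in Inductor isolates whether graph capture is what fails:
#   import torch
#   torch._inductor.config.triton.cudagraphs = False
```

If the error disappears with cudagraphs disabled, that points at something in the serving loop (e.g. another thread or stream touching the device) interfering with capture rather than at the model code itself.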