CUDA error: no kernel image is available for execution on the device #16
Description
python3 launch.py --config configs/$method.yaml --train --gpu 0 name="imagedream-sd21-shading" tag="astronaut" system.prompt_processor.prompt="an astronaut riding a horse" system.prompt_processor.image_path="${image_file}" system.guidance.ckpt_path="${ckpt_file}" system.guidance.config_path="${cfg_file}"
['0']
Seed set to 0
[INFO] Loading Multiview Diffusion ...
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is None and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 1280, context_dim is 1024 and using 20 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is None and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 640, context_dim is 1024 and using 10 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is None and using 5 heads.
Setting up MemoryEfficientCrossAttention. Query dim is 320, context_dim is 1024 and using 5 heads.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
making attention of type 'vanilla-xformers' with 512 in_channels
building MemoryEfficientAttnBlock with 512 in_channels...
[INFO] Loaded Multiview Diffusion!
[INFO] Using prompt [an astronaut riding a horse] and negative prompt [ugly, bad anatomy, blurry, pixelated obscure, unnatural colors, poor lighting, dull, and unclear, cropped, lowres, low quality, artifacts, duplicate, morbid, mutilated, poorly drawn face, deformed, dehydrated, bad proportions]
[INFO] Using view-dependent prompts [side]:[an astronaut riding a horse, side view] [front]:[an astronaut riding a horse, front view] [back]:[an astronaut riding a horse, back view] [overhead]:[an astronaut riding a horse, overhead view]
[INFO] Using 16bit Automatic Mixed Precision (AMP)
[INFO] GPU available: True (cuda), used: True
[INFO] TPU available: False, using: 0 TPU cores
[INFO] HPU available: False, using: 0 HPUs
[INFO] You are using a CUDA device ('NVIDIA A800-SXM4-80GB') that has Tensor Cores. To properly utilize them, you should set torch.set_float32_matmul_precision('medium' | 'high') which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
[INFO] LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
fatal: detected dubious ownership in repository at '/root/paddlejob/workspace/env_run'
To add an exception for this directory, call:
git config --global --add safe.directory /root/paddlejob/workspace/env_run
/root/paddlejob/workspace/env_run/threestudio/utils/callbacks.py:92: UserWarning: Code snapshot is not saved. Please make sure you have git installed and are in a git repository.
rank_zero_warn(
[INFO]
| Name | Type | Params | Mode
0 | geometry | ImplicitVolume | 12.6 M | train
1 | material | DiffuseWithPointLightMaterial | 0 | train
2 | background | NeuralEnvironmentMapBackground | 448 | train
3 | renderer | NeRFVolumeRenderer | 0 | train
4 | guidance | MultiviewDiffusionGuidance | 2.0 B | train
12.6 M Trainable params
2.0 B Non-trainable params
2.0 B Total params
8,096.164 Total estimated model params size (MB)
[INFO] Validation results will be saved to outputs/imagedream-sd21-shading/astronaut@20240805-231756/save
/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:424: The 'train_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument to `num_workers=127` in the `DataLoader` to improve performance.
/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:424: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument to `num_workers=127` in the `DataLoader` to improve performance.
Epoch 0: | | 0/? [00:00<?, ?it/s]Traceback (most recent call last):
File "/root/paddlejob/workspace/env_run/launch.py", line 238, in <module>
main(args, extras)
File "/root/paddlejob/workspace/env_run/launch.py", line 181, in main
trainer.fit(system, datamodule=dm, ckpt_path=cfg.resume)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 543, in fit
call._call_and_handle_interrupt(
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 44, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 579, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 986, in _run
results = self._run_stage()
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/trainer/trainer.py", line 1030, in _run_stage
self.fit_loop.run()
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 205, in run
self.advance()
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/loops/fit_loop.py", line 363, in advance
self.epoch_loop.run(self._data_fetcher)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 140, in run
self.advance(data_fetcher)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/loops/training_epoch_loop.py", line 250, in advance
batch_output = self.automatic_optimization.run(trainer.optimizers[0], batch_idx, kwargs)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 190, in run
self._optimizer_step(batch_idx, closure)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 268, in _optimizer_step
call._call_lightning_module_hook(
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 159, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/core/module.py", line 1308, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/core/optimizer.py", line 153, in step
step_output = self._strategy.optimizer_step(self._optimizer, closure, **kwargs)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 238, in optimizer_step
return self.precision_plugin.optimizer_step(optimizer, model=model, closure=closure, **kwargs)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/plugins/precision/amp.py", line 77, in optimizer_step
closure_result = closure()
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 144, in __call__
self._result = self.closure(*args, **kwargs)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 129, in closure
step_output = self._step_fn()
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/loops/optimization/automatic.py", line 317, in _training_step
training_step_output = call._call_strategy_hook(trainer, "training_step", *kwargs.values())
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/trainer/call.py", line 311, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/pytorch_lightning/strategies/strategy.py", line 390, in training_step
return self.lightning_module.training_step(*args, **kwargs)
File "/root/paddlejob/workspace/env_run/threestudio/systems/imagedream.py", line 51, in training_step
out = self(batch)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/paddlejob/workspace/env_run/threestudio/systems/imagedream.py", line 48, in forward
return self.renderer(**batch)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/root/paddlejob/workspace/env_run/threestudio/models/renderers/nerf_volume_renderer.py", line 96, in forward
ray_indices, t_starts, t_ends = self.estimator.sampling(
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/nerfacc/estimators/occ_grid.py", line 164, in sampling
intervals, samples = traverse_grids(
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/nerfacc/grid.py", line 135, in traverse_grids
intervals, samples = _C.traverse_grids(
File "/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages/nerfacc/cuda/__init__.py", line 13, in call_cuda
return getattr(_C, name)(*args, **kwargs)
RuntimeError: CUDA error: no kernel image is available for execution on the device
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:33:58_PDT_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0
pip list |grep torch
WARNING: Ignoring invalid distribution -formers (/root/anaconda3/envs/ImageDream/lib/python3.10/site-packages)
open-clip-torch 2.7.0
pytorch-lightning 2.3.3
tinycudann 1.7 /root/paddlejob/workspace/env_run/copy_file/tiny-cuda-nn/bindings/torch
torch 2.0.1+cu118
torchmetrics 1.4.0.post0
torchvision 0.15.2+cu118
How can I solve this CUDA issue?
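For anyone trying to narrow this down: the message "no kernel image is available" generally means the compiled extension contains no code for the GPU's compute capability. The traceback above ends in nerfacc's `_C.traverse_grids`, so the nerfacc CUDA extension is the likely suspect rather than PyTorch itself. Below is a minimal diagnostic sketch, not a confirmed fix. The helper names (`parse_arch`, `has_kernel_for`, `arch_list_env_value`, `report`) are hypothetical and written only for illustration. The only PyTorch calls it uses are `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list`, and the latter describes how PyTorch was built, not how nerfacc was built.

```python
"""Diagnostic sketch: does a list of compiled CUDA architectures cover a GPU?

All helper names here are hypothetical. Architecture-specific suffixes
such as the "a" in "sm_90a" are ignored for simplicity.
"""


def parse_arch(tag):
    """Turn a tag like 'sm_80' or 'compute_80' into a (major, minor) tuple."""
    digits = "".join(ch for ch in tag.split("_", 1)[1] if ch.isdigit())
    return int(digits[:-1]), int(digits[-1])


def has_kernel_for(capability, arch_list):
    """Return True if some entry in arch_list can run on `capability`."""
    major, minor = capability
    for tag in arch_list:
        a_major, a_minor = parse_arch(tag)
        if tag.startswith("sm_"):
            # Binary code runs on the same major version with an equal
            # or higher minor version.
            if a_major == major and a_minor <= minor:
                return True
        elif tag.startswith("compute_"):
            # PTX can be JIT-compiled for an equal or newer capability.
            if (a_major, a_minor) <= (major, minor):
                return True
    return False


def arch_list_env_value(capability):
    """Format (8, 0) as '8.0', the form TORCH_CUDA_ARCH_LIST accepts."""
    return f"{capability[0]}.{capability[1]}"


def report():
    """Print what PyTorch sees; does nothing harmful without torch or a GPU."""
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    if not torch.cuda.is_available():
        print("no CUDA device is visible")
        return
    cap = torch.cuda.get_device_capability(0)
    archs = torch.cuda.get_arch_list()  # PyTorch's own build, not nerfacc's
    print("device capability:", cap)
    print("torch arch list:", archs)
    print("torch covers device:", has_kernel_for(cap, archs))
    print("value to try for TORCH_CUDA_ARCH_LIST:", arch_list_env_value(cap))


if __name__ == "__main__":
    report()
```

If PyTorch itself covers the device, one thing to try (an assumption, not verified against this setup) is reinstalling nerfacc from source with `TORCH_CUDA_ARCH_LIST` set to the printed value, so that its kernels are compiled for this GPU.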