RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc) #10

@zwl8979

When I reproduced SoftPatch, the noise discriminator raised an error with the nearest weight method selected. The first five categories (bottle, cable, capsule, carpet, grid) ran normally, but on the sixth category (hazelnut, according to the log below) the error appeared while the subsampling progress was being printed (batch size 16, no overlap, without noise augmentation). The same error occurred with the batch size set to 4. My graphics card is an RTX 4090 (24 GB memory):

INFO:__main__:Evaluating dataset [mvtec_hazelnut] (6/15)...
/home/zwl/anaconda3/envs/softpatch/lib/python3.8/site-packages/torchvision/models/_utils.py:208: UserWarning: The parameter 'pretrained' is deprecated since 0.13 and may be removed in the future, please use 'weights' instead.
warnings.warn(
/home/zwl/anaconda3/envs/softpatch/lib/python3.8/site-packages/torchvision/models/_utils.py:223: UserWarning: Arguments other than a weight enum or None for 'weights' are deprecated since 0.13 and may be removed in the future. The current behavior is equivalent to passing weights=Wide_ResNet50_2_Weights.IMAGENET1K_V1. You can also use weights=Wide_ResNet50_2_Weights.DEFAULT to get the most up-to-date weights.
warnings.warn(msg)
INFO:__main__:Training models (1/1)
Computing support features...: 100%|██████████| 27/27 [00:15<00:00, 1.80it/s]
Traceback (most recent call last):
  File "/home/zwl/Zwl/SoftPatch-main/main.py", line 555, in <module>
    run(args)
  File "/home/zwl/Zwl/SoftPatch-main/main.py", line 369, in run
    coreset.fit(dataloaders["training"])
  File "/home/zwl/Zwl/SoftPatch-main/src/softpatch.py", line 225, in fit
    self._fill_memory_bank(training_data)
  File "/home/zwl/Zwl/SoftPatch-main/src/softpatch.py", line 272, in _fill_memory_bank
    sample_features, sample_indices = self.featuresampler.run(features)
  File "/home/zwl/Zwl/SoftPatch-main/src/sampler.py", line 111, in run
    reduced_features = self._reduce_features(features)
  File "/home/zwl/Zwl/SoftPatch-main/src/sampler.py", line 90, in _reduce_features
    return mapper(features)
  File "/home/zwl/anaconda3/envs/softpatch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/zwl/anaconda3/envs/softpatch/lib/python3.8/site-packages/torch/nn/modules/linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)
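
A possible next diagnostic step, not yet tried here: CUDA errors are reported asynchronously, so the frame shown in the traceback is not necessarily the call that actually failed. Setting the `CUDA_LAUNCH_BLOCKING=1` environment variable makes kernel launches synchronous, which usually gives a more trustworthy traceback. The sketch below only prepares such a re-run. The helper names `blocking_env` and `rerun` are hypothetical, the argument list is shortened to two flags for illustration, and restricting the run to hazelnut assumes the failure reproduces on that category alone.

```python
import os
import subprocess
import sys


def blocking_env(base=None):
    """Return a copy of the environment with synchronous CUDA launches enabled."""
    env = dict(os.environ if base is None else base)
    # CUDA_LAUNCH_BLOCKING is a standard CUDA variable; "1" makes launches synchronous.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def rerun(argv, dry_run=True):
    """Re-run argv under the debugging environment (hypothetical helper).

    With dry_run=True nothing is executed; the command and environment are returned.
    """
    env = blocking_env()
    if dry_run:
        return argv, env
    return subprocess.run(argv, env=env, check=False).returncode


# Shortened, illustrative argument list; the remaining flags are omitted here.
cmd, env = rerun([sys.executable, "main.py", "--gpu", "0", "--subdatasets", "hazelnut"])
print(env["CUDA_LAUNCH_BLOCKING"], cmd[1:])
```

Calling `rerun(..., dry_run=False)` would launch the process for real. The run will be slower with blocking launches, so it is meant for debugging only.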

This is the parameter configuration for my main function (no overlap, without augmentation):
--gpu 0
--seed 0
--results_path result
--log_project MVTecAD-wideresnet50-noise0.1
--log_group No-overlap-SW-NA-nearest_0
--save_segmentation_images
--sampler_name approx_greedy_coreset
--sampling_ratio 0.1
--faiss_on_gpu
--faiss_num_workers 8
--weight_method nearest
--threshold 0.15
--lof_k 6
--dataset mvtec
--data_path ./MVTecAD
--subdatasets bottle cable capsule carpet grid hazelnut leather metal_nut pill screw tile toothbrush transistor wood zipper
--batch_size 16
--resize 256
--imagesize 224
--noise 0.1
--fold 0
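
For anyone who wants to reproduce this from a shell, the listing above can be turned into a single command line. This is a minimal sketch using only an excerpt of the flags. The `python main.py` prefix is an assumption based on the script path in the traceback, and `to_argv` is a hypothetical helper.

```python
import shlex

# Excerpt of the flags listed above; the full listing works the same way.
listing = """
--gpu 0
--sampler_name approx_greedy_coreset
--sampling_ratio 0.1
--faiss_on_gpu
--weight_method nearest
--subdatasets bottle cable
--batch_size 16
"""


def to_argv(text):
    """Split the listing on any whitespace, so line breaks and blank lines do not matter."""
    return text.split()


# Assumed entry point: main.py, as seen in the traceback.
argv = ["python", "main.py"] + to_argv(listing)
command = shlex.join(argv)  # shlex.join requires Python 3.8 or newer
print(command)
```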

Have you ever encountered this problem? How did you solve it? @jam-cc
