ROCm PyTorch unit tests status

Summary of unit tests:

Legend: N: Unit test group name, T: Total tests, F: Failed, E: Errors, S: Skipped (ROCm only), SG: Skipped on GPUs, EF: Expected Failures, P: Passed, PR: Pass rate [P*100/(T-EF-SG)], CM: Comments/Modifications

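| N | T | F | E | S | SG | EF | P | PR | CM |
|---|---|---|---|---|---|---|---|---|---|
| test_autograd | 849 | 0 | 0 | 9 | 0 | 0 | 840 | 99% | |
| test_c10d | | | | | | | | | Not enabled yet (ready for testing?) |
| test_cpp_extensions | | | | | | | | | Not enabled yet |
| test_cuda | 2036 | 0 | 0 | 447 | 0 | 0 | 1589 | 78% | |
| test_dataloader | 44 | 0 | 0 | 0 | 2 | 0 | 42 | 100% | |
| test_distributed | | | | | | | | | Not enabled yet |
| test_distributions | 176 | 0 | 0 | 3 | 0 | 0 | 173 | 98% | |
| test_indexing | 46 | 0 | 0 | 0 | 0 | 0 | 46 | 100% | |
| test_jit | 1198 | 0 | 0 | 36 | 14 | 3 | 1145 | 97% | |
| test_legacy_nn | 416 | 0 | 0 | 13 | 0 | 0 | 403 | 97% | |
| test_multiprocessing | | | | | | | | | Not enabled yet |
| test_nccl | | | | | | | | | Not enabled yet |
| test_nn | 1221 | 0 | 0 | 102 | 114 | 2 | 1003 | 90% | |
| test_optim | 34 | 0 | 0 | 2 | 0 | 0 | 32 | 94% | |
| test_sparse | 594 | 0 | 0 | 175 | 18 | 0 | 401 | 70% | |
| test_torch | 384 | 0 | 0 | 31 | 0 | 0 | 353 | 90% | |
| test_utils | | | | | | | | | Not enabled yet |
| TOTAL | 6998 | | | | 136 | 5 | 6027 | 88% | |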
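
As a rough illustration of the PR column, the legend's formula can be written as a small helper. This is only a sketch: the function name and the rounding to a whole percent are assumptions, not part of the test tooling. Note that ROCm-only skips (S) stay in the denominator, so they lower the pass rate.

```python
def pass_rate(passed, total, expected_failures=0, skipped_on_gpus=0):
    """Pass rate in percent, following the legend: P*100/(T-EF-SG)."""
    denominator = total - expected_failures - skipped_on_gpus
    if denominator <= 0:
        raise ValueError("no tests left after excluding EF and SG")
    return passed * 100 / denominator


# Counts copied from the test_autograd and test_jit rows of the summary table.
autograd = pass_rate(passed=840, total=849)
jit = pass_rate(passed=1145, total=1198, expected_failures=3, skipped_on_gpus=14)

print(f"test_autograd: {autograd:.1f}%")
print(f"test_jit: {jit:.1f}%")
```

Rounded to whole percentages, these two values match the 99% and 97% shown in the table.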

Details of failing unit tests:

test_autograd

Skip due to seg fault:
test_pin_memory at aten/src/ATen/RegisterCUDA.cpp:30 (JMD: works for me)
test_set_requires_grad_only_for_floats_cuda

Skip due to undefined symbol hiprngMakeMTGP32Constants:
test_rnn_backward_to_input_but_not_parameters_cuda
test_requires_grad_factory (failed in CI)

Skip due to 'Memory access fault' (Failed in CI):
test_inputbuffer_add_multigpu
test_type_conversions
test_unused_output_gpu
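
For readers unfamiliar with how such skips are typically wired up, the following is a minimal sketch using only the standard `unittest` module. The helper name `skip_if_rocm`, the example test class, and the use of the `PYTORCH_TEST_WITH_ROCM` environment variable as the ROCm flag are assumptions made for illustration; the actual decorator used in the test suite may differ.

```python
import os
import unittest

# Assumption: ROCm test runs are flagged through this environment variable.
TEST_WITH_ROCM = os.getenv("PYTORCH_TEST_WITH_ROCM", "0") == "1"


def skip_if_rocm(reason):
    """Hypothetical helper: skip the decorated test on ROCm runs only."""
    return unittest.skipIf(TEST_WITH_ROCM, "skipped on ROCm: " + reason)


class ExampleAutogradTest(unittest.TestCase):
    @skip_if_rocm("seg fault")
    def test_set_requires_grad_only_for_floats_cuda(self):
        self.assertTrue(True)  # placeholder body, not the real test

    def test_not_skipped(self):
        self.assertEqual(1 + 1, 2)


# Run the two example tests and report how many were skipped.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleAutogradTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("tests run:", result.testsRun, "skipped:", len(result.skipped))
```

Keeping the reason string next to the decorator makes it easier to keep a list like this one in sync with the code.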

test_dataloader

Skip due to hang:
test_manager_unclean_exit (possibly due to leaked semaphores; JMD: according to comments, this seems to be a Python 2.7 issue)

test_jit

Skip due to "RuntimeError: cannot compile a CUDA fusion group, CUDA is not enabled":
test_cpp
test_exp
test_fusion_distribute
test_lstm_fusion_concat
test_lstm_fusion_cuda
test_relu
test_tensor_number_math_cuda
test_comparison_ge_le
test_comparison_gt_lt
test_concat_fusion
test_ge_cuda
test_traced_module
JMD: this will require us to enable CUDAFusionFunction, which seems to call nvcc explicitly

test_optim

Skip due to hang:
test_adamax (JMD: works for me but fails on CI)
test_rprop (hangs in a Thrust kernel)

test_torch

Skip due to memory access page fault:
test_topk_noncontiguous_gpu (it seems a null pointer is being passed to bitonicSortKVInPlace; JMD: fixed through gather changes, in branch)

Skip due to seg fault:
test_half_tensor_cuda (due to build/aten/src/ATen/CUDAHalfType.cpp:2263)
test_print (due to build/aten/src/ATen/CUDAHalfType.cpp:151 fill)

Skip due to AssertionError:
test_norm_cuda (failing with "dim reduction failed for 0-norm")

Skip due to hang:
test_empty_full

Skip due to cublas runtime error:
test_blas_alpha_beta_empty
test_blas_empty

Skip due to RuntimeError:
test_pairwise_distance_empty (failing with "RuntimeError: cuda runtime error (1011) : hipErrorInvalidValue")
test_tensor_factories_empty (failing with "RuntimeError: cuda runtime error (1011) : hipErrorInvalidValue")
test_tensor_shape_empty (failing with "RuntimeError: cuda runtime error (1011) : hipErrorInvalidValue")

test_cuda

Skip due to assertion error:

8 test_\*Tensor_nonzero (Thrust issue; gives correct result for <=960 threads)  
16 test_\*Tensor_prod\*dim + 16 test_\*Tensor_sum\*dim + 4 test_\*Tensor_norm_3\*dim (issue with kernelReduceContigDim and kernelReduceNoncontigDim_shared)  
40 test_\*Tensor_sort\* + 24 test_\*Tensor_topk\* (Memory access fault due to bitonicSortKVInPlace (alternately fails with assertion error when not access faulting))  
2 test_DoubleTensor_mean\*dim (Native elementwise_kernel with div_constant_impl<double>)  
8 test_\*Tensor_mvlgamma\* (Native elementwise_kernel with div_add_impl<>)  
12 test_\*Tensor_renorm\* (THCTensor_kernel_renorm ?)
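
The wildcard entries above can be read as shell-style patterns. As a sketch of how a test name could be checked against them, the snippet below uses `fnmatch` from the Python standard library. The pattern table, the helper `skip_reason`, and the sample test names are illustrative assumptions, not part of the test suite.

```python
from fnmatch import fnmatchcase

# Patterns copied from the list above, with the markdown escapes removed.
SKIP_PATTERNS = {
    "test_*Tensor_nonzero": "Thrust issue",
    "test_*Tensor_sort*": "bitonicSortKVInPlace memory access fault",
    "test_*Tensor_topk*": "bitonicSortKVInPlace memory access fault",
    "test_*Tensor_renorm*": "THCTensor_kernel_renorm (?)",
}


def skip_reason(test_name):
    """Return the reason for the first matching pattern, or None."""
    for pattern, reason in SKIP_PATTERNS.items():
        if fnmatchcase(test_name, pattern):
            return reason
    return None


# Illustrative names only; they are not taken from an actual test run.
for name in ["test_FloatTensor_sort", "test_DoubleTensor_topk_dim", "test_FloatTensor_abs"]:
    print(name, "->", skip_reason(name))
```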

Skip due to runtime error:

test_fft_ifft_rfft_irfft (due to undefined symbol: hipfftCreate)  
test_from_sequence + test_randperm_cuda (due to undefined symbol: \_ZN12_GLOBAL__N_112__float2halfEf)  
test_DoubleTensor_inverse + test_FloatTensor_inverse + test_btrifact + test_btrisolve (due to forced(?) rocblas internal error)  
test_events + test_caching_pinned_memory + test_record_stream (due to 'NoneType' object has no attribute 'cudaEventCreateWithFlags')  
test_streams (due to 'NoneType' object has no attribute 'cudaStreamQuery')  
test_nvtx (due to "undefined symbol: nvtxMarkA")  
test_bincount_cuda (due to hipErrorInvalidValue)  
test_trtrs + test_symeig + test_pinverse + test_matrix_rank + 2 test_gesv\* + test_det_logdet_slogdet + 12 test_(Float|Double)Tensor_svd\* + 8 test_(Float|Double)Tensor_qr\* + 2 test_(Float|Double)Tensor_geqrf + 2 test_(Float|Double)Tensor_eig_with_eigvec (due to no MAGMA library detected)  
12 test_HalfTensor_<addbmm* | addmm* | addr* | baddbmm*> (due to cublas runtime error in THCBlas.cu)

Skip due to hang:

2 test_FloatTensor_mean\*dim (Native elementwise_kernel with div_constant_impl<float>)  
4 test_\*Tensor_add + 4 test_\*Tensor_add_ + 4 test_\*Tensor_sub + 4 test_\*Tensor_sub_ (Native elementwise_kernel with add_kernel_impl; float, double, int and long tensor tests pass for these)  
10 test_\*Tensor_div\* (Native elementwise_kernel with div_constant_impl; double, int and long tensor tests pass for these)  
8 test_\*Tensor_mul\* (Native elementwise_kernel with mul_kernel_impl; float, double, int and long tensor tests pass for these)  
8 test_\*Tensor_put_ + test_broadcast (TensorPutOp bug)  
8 test_\*Tensor_take + 3 test_advancedindex\* + test_index + test_multinomial (TensorTakeOp bug)

Skip due to undefined symbol (float2half and half2float):

3 test_HalfTensor_addbmm*
3 test_HalfTensor_addmm*
6 test_HalfTensor_addmv*
3 test_HalfTensor_addr
3 test_HalfTensor_baddbmm
4 test_HalfTensor_cum<prod|sum>
3 test_HalfTensor_dist*
10 test_HalfTensor_pow*
1 test_HalfTensor_max
1 test_HalfTensor_min
4 test_HalfTensor_norm*
6 test_HalfTensor_renorm*
1 test_tiny_half_norm_
