Hi, I was running test_all_reduce, which fails because the tensors only agree to about 7 significant digits. Is this expected behavior, or do we want the values to match exactly?
torch.set_printoptions(precision=10)
ground_truth[torch_rank]
tensor([ 2.1518430710, -3.1110968590, -3.0482885838, -0.1535055935,
         0.5451227427,  0.6111478806, -0.0351422131, -0.0474908650,
        -1.0834826231,  0.4075058699], device='cuda:0')
data_list[torch_rank]
tensor([ 2.1518430710, -3.1110968590, -3.0482888222, -0.1535055935,
         0.5451227427,  0.6111478806, -0.0351421833, -0.0474908650,
        -1.0834826231,  0.4075058401], device='cuda:0')
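
For reference, here is a rough sketch of how the two outputs compare under an exact check versus a tolerance-based one. It uses plain Python floats copied from the printed values above rather than the CUDA tensors, and the tolerances `REL_TOL` and `ABS_TOL` are my own assumption, not values taken from the test suite.

```python
import math

# Values copied from the printed tensors above (10-digit printout).
ground_truth = [2.1518430710, -3.1110968590, -3.0482885838, -0.1535055935,
                0.5451227427, 0.6111478806, -0.0351422131, -0.0474908650,
                -1.0834826231, 0.4075058699]
data = [2.1518430710, -3.1110968590, -3.0482888222, -0.1535055935,
        0.5451227427, 0.6111478806, -0.0351421833, -0.0474908650,
        -1.0834826231, 0.4075058401]

# Exact comparison: passes only if every element is identical.
exact_match = ground_truth == data

# Which positions differ, and by how much at most.
mismatched = [i for i, (a, b) in enumerate(zip(ground_truth, data)) if a != b]
max_abs_diff = max(abs(a - b) for a, b in zip(ground_truth, data))

# Tolerance-based comparison. These tolerances are assumptions chosen
# for illustration; the test may need different ones.
REL_TOL = 1e-5
ABS_TOL = 1e-8
close_match = all(
    math.isclose(a, b, rel_tol=REL_TOL, abs_tol=ABS_TOL)
    for a, b in zip(ground_truth, data)
)

print("exact match:", exact_match)
print("mismatched indices:", mismatched)
print("max abs diff:", max_abs_diff)
print("match within tolerance:", close_match)
```

On the actual tensors, the analogous check would be something like `torch.allclose(ground_truth[torch_rank], data_list[torch_rank], rtol=1e-5, atol=1e-8)`, although its closeness formula differs slightly from `math.isclose`.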