I think the correct way to implement the constrained gradient in batch is to apply gradient rejection to each sample's gradient before reducing them into a single gradient.
I don't think we can take the mean of the gradients, do rejection on the mean, and get the same result: if rejection is only applied when a gradient conflicts with the constraint, the operation is nonlinear, so it doesn't commute with averaging.
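To make the difference concrete, here's a minimal numpy sketch. It assumes PCGrad-style conditional rejection (project out the constraint component only when the gradient conflicts with it, i.e. the dot product is negative); the constraint direction `c` and the two per-sample gradients are made-up values chosen so one conflicts and one doesn't:

```python
import numpy as np

def reject(g, c):
    # Conditional rejection: only project out the component along c
    # when g conflicts with the constraint direction (g . c < 0).
    d = g @ c
    if d < 0:
        g = g - (d / (c @ c)) * c
    return g

c = np.array([1.0, 0.0])                  # hypothetical constraint direction
g1 = np.array([-2.0, 1.0])                # conflicts: rejected to [0, 1]
g2 = np.array([3.0, 1.0])                 # no conflict: left unchanged

per_sample_then_mean = (reject(g1, c) + reject(g2, c)) / 2
mean_then_reject = reject((g1 + g2) / 2, c)

print(per_sample_then_mean)               # [1.5 1. ]
print(mean_then_reject)                   # [0.5 1. ]  -- mean had no conflict
```

Here the conflicting sample's component along `c` is removed in the per-sample version, but survives when you average first, because the mean gradient happens not to conflict. Note that if the rejection were unconditional (always project onto the constraint hyperplane), the operation would be linear and the two orders would agree; it's the conditional that breaks the equivalence.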
Any ideas?