In the original paper, the concat operation comes after the activation. In your implementation, however, the order is reversed: the concat comes first and the activation is applied afterwards. Is there a reason for changing the order, or does the order have little effect on the results? Thank you for your attention.
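
To make the question concrete, here is a minimal sketch of the two orderings. It assumes the activation is an element-wise ReLU and uses plain NumPy arrays as stand-ins for two feature maps concatenated along the channel axis. The function names, shapes, and the choice of ReLU are my own assumptions for illustration, not taken from the paper or from this repository.

```python
import numpy as np


def relu(x):
    # Element-wise ReLU: each output value depends only on the matching input value.
    return np.maximum(x, 0.0)


def activate_then_concat(a, b):
    # Ordering as I read it in the paper: activate each branch, then concatenate.
    return np.concatenate([relu(a), relu(b)], axis=1)


def concat_then_activate(a, b):
    # Ordering as I read it in the implementation: concatenate, then activate once.
    return relu(np.concatenate([a, b], axis=1))


# Hypothetical feature maps in (batch, channels, height, width) layout.
rng = np.random.default_rng(0)
a = rng.standard_normal((2, 3, 4, 4))
b = rng.standard_normal((2, 5, 4, 4))

out_paper = activate_then_concat(a, b)
out_impl = concat_then_activate(a, b)

# For a purely element-wise activation the two orderings give the same tensor.
same = np.array_equal(out_paper, out_impl)
print(out_paper.shape, same)
```

If the activation really is purely element-wise, as in this sketch, the two orderings are mathematically identical, so I would expect no difference in results. The order would only matter if something that mixes values across channels sits between the two steps, for example a normalization layer or a softmax over the channel axis.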