It seems that the released code does not implement stochastic depth in the CoAtNet module, although the paper mentions it in Appendix A.2.
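For reference, here is a minimal, framework-free sketch of what I mean by stochastic depth: during training the residual branch of a block is dropped per sample with some probability, and kept branches are rescaled so the expected output matches inference. The names `drop_path` and `residual_block`, the `drop_prob` parameter, and the list-of-lists batch layout are my own assumptions for illustration, not anything taken from the released code or the paper.

```python
import random


def drop_path(branch, drop_prob, training, rng=random):
    """Apply stochastic depth to a residual branch output.

    branch: batch of per-sample feature vectors (list of lists of floats).
    drop_prob: probability of dropping the branch for a given sample.
    """
    if not training or drop_prob == 0.0:
        # At inference time the branch is always kept, unscaled.
        return [list(sample) for sample in branch]
    keep_prob = 1.0 - drop_prob
    out = []
    for sample in branch:
        if rng.random() < keep_prob:
            # Keep the branch and rescale so its expectation is unchanged.
            out.append([v / keep_prob for v in sample])
        else:
            # Drop the branch for this sample: only the skip path remains.
            out.append([0.0 for _ in sample])
    return out


def residual_block(x, fn, drop_prob, training, rng=random):
    """Compute x + drop_path(fn(x)) for a batch x (hypothetical block wrapper)."""
    branch = drop_path(fn(x), drop_prob, training, rng)
    return [[a + b for a, b in zip(xs, bs)] for xs, bs in zip(x, branch)]
```

In a real model the same idea would wrap the attention or MBConv branch of each block before the residual addition; how the drop rate is set per block is something I would take from the paper rather than from this sketch.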