Dear all,
You previously told me (#2638) that atten_layer=0 is also useful and fast for multi-component models, but I would like to understand how atten_layer actually behaves. It does not seem to have a significant impact on the accuracy of the model, does it?

Replies: 1 comment

- The attention layers are critical for the generalizability of the model. Please see https://arxiv.org/abs/2208.08236 and check the ablation study in the updated version of the DPA-1 manuscript, available after Mon, 18 Sep 2023 00:00:00 GMT.
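For reference, a minimal sketch of where this setting lives in a DeePMD-kit input.json, written as a Python dict so the fields can be annotated. It assumes the se_atten (DPA-1) descriptor; the parameter is spelled `attn_layer` in the documented schema (what this thread calls atten_layer), and the numeric values are illustrative, so verify them against the DeePMD-kit documentation for your version:

```python
# Descriptor section of a DeePMD-kit input.json for the DPA-1 (se_atten) descriptor.
# Values below are illustrative placeholders, not recommendations.
descriptor = {
    "type": "se_atten",
    "rcut": 6.0,               # cutoff radius
    "rcut_smth": 0.5,          # where smoothing of the descriptor starts
    "sel": 120,                # maximum number of neighbors considered
    "neuron": [25, 50, 100],   # embedding network sizes
    "axis_neuron": 16,         # size of the submatrix of the embedding matrix
    "attn": 128,               # hidden dimension of the attention layers
    "attn_layer": 2,           # number of attention layers; 0 disables attention entirely
    "attn_dotr": True,         # dot the relative coordinates into the attention weights
    "attn_mask": False,        # whether to mask the diagonal of the attention weights
}
```

Setting `attn_layer` to 0 skips the self-attention stack, which is the fast variant mentioned in #2638; a nonzero value keeps the attention updates that the ablation study in the manuscript credits for the model's generalizability.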