Dear all,
You previously told me (#2638) that atten_layer=0 is also useful and fast for multi-component models, but I would like to understand how atten_layer actually behaves. It does not seem to have a significant impact on the accuracy of the model, does it?

Replies: 1 comment

- The attention layers are critical for the generalizability of the model. Please see https://arxiv.org/abs/2208.08236 and check the ablation study in the updated version of the DPA-1 manuscript, available after Mon, 18 Sep 2023 00:00:00 GMT.
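For reference, a minimal sketch of where this setting lives in a DeePMD-kit input.json, written as a Python dict so the fields can be annotated. It assumes the se_atten (DPA-1) descriptor; the parameter is spelled `attn_layer` in the documented schema (what this thread calls atten_layer), and the numeric values are illustrative, so verify them against the DeePMD-kit documentation for your version:

```python
# Descriptor section of a DeePMD-kit input.json for the DPA-1 (se_atten) descriptor.
# Values below are illustrative placeholders, not recommendations.
descriptor = {
    "type": "se_atten",
    "rcut": 6.0,               # cutoff radius
    "rcut_smth": 0.5,          # where smoothing of the descriptor starts
    "sel": 120,                # maximum number of neighbors considered
    "neuron": [25, 50, 100],   # embedding network sizes
    "axis_neuron": 16,         # size of the submatrix of the embedding matrix
    "attn": 128,               # hidden dimension of the attention layers
    "attn_layer": 2,           # number of attention layers; 0 disables attention entirely
    "attn_dotr": True,         # dot the relative coordinates into the attention weights
    "attn_mask": False,        # whether to mask the diagonal of the attention weights
}
```

Setting `attn_layer` to 0 skips the self-attention stack, which is the fast variant mentioned in #2638; a nonzero value keeps the attention updates that the ablation study in the manuscript credits for the model's generalizability.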