This repository was archived by the owner on Dec 24, 2025. It is now read-only.
Implementation of Mutual Self Attention #31
Hello, I have been reading your paper and using it as a baseline to develop editing algorithms for specific tasks. Fantastic paper, by the way.
I have noticed that in #22 the inconsistency between the description and the implementation of mutual self-attention, specifically with regard to the value vectors, was raised.
Lines 246 to 251 in eaac91e:

```python
qu = torch.cat([qu[:num_heads], qu[:num_heads], qu[:num_heads]])
qc = torch.cat([qc[:num_heads], qc[:num_heads], qc[:num_heads]])
ku = torch.cat([ku[:num_heads], ku[:num_heads], ku[:num_heads]])
kc = torch.cat([kc[:num_heads], kc[:num_heads], kc[:num_heads]])
vu = torch.cat([vu[:num_heads * 2], vu[:num_heads]])
vc = torch.cat([vc[:num_heads * 2], vc[:num_heads]])
```
However, unless I'm mistaken, it has not been addressed or updated.

Assuming the attention tensors are split into `[src, tgt, layout]` chunks, wouldn't a more correct implementation of Algorithm 3 during the self-edit step be:

```python
qu = torch.cat([qu[:num_heads], qu[:num_heads], qu[:num_heads]])
qc = torch.cat([qc[:num_heads], qc[:num_heads], qc[:num_heads]])
ku = torch.cat([ku[:num_heads], ku[:num_heads], ku[:num_heads]])
kc = torch.cat([kc[:num_heads], kc[:num_heads], kc[:num_heads]])
vu = torch.cat([vu[:num_heads], vu[:num_heads], vc[:num_heads]])
vc = torch.cat([vc[:num_heads], vc[:num_heads], vc[:num_heads]])
```
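For concreteness, here is a minimal runnable sketch of the difference, using hypothetical toy tensors in place of the real UNet attention values (the shapes and names are illustrative assumptions, not the actual code). Both concatenations yield the same output shape; they differ only in which values the third (layout) group of heads attends over:

```python
import torch

num_heads = 2  # toy value; the real model uses the UNet's head count
seq, head_dim = 3, 4

# Stand-ins for the stacked per-branch value tensors, laid out as
# [src, tgt, layout] chunks of num_heads each along dim 0.
vu = torch.arange(3 * num_heads * seq * head_dim, dtype=torch.float32)
vu = vu.reshape(3 * num_heads, seq, head_dim)
vc = vu + 100.0  # made distinct so the two variants are distinguishable

# Repository version (eaac91e): keep the first two chunks of the branch's
# own values, then repeat the first chunk for the layout slot.
vu_repo = torch.cat([vu[:num_heads * 2], vu[:num_heads]])

# Proposed version: src and tgt slots reuse the branch's own first chunk,
# while the layout slot takes values from the conditional branch vc.
vu_prop = torch.cat([vu[:num_heads], vu[:num_heads], vc[:num_heads]])

print(vu_repo.shape, vu_prop.shape)  # identical shapes
print(torch.equal(vu_repo[2 * num_heads:], vu_prop[2 * num_heads:]))  # layout chunks differ
```

So the disagreement is invisible at the shape level and only shows up in which branch supplies the values for the layout heads.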
Can you please clarify my confusion?
Thank you