Possible error in the implementation of mistral_attention_prefill_query #2

In ape_mistral.py, line 170, inside mistral_attention_prefill_query:

https://github.com/Infini-AI-Lab/APE/blob/52513bcb0c3145aedec26636eb59f7c3deabb856/ape/ape_mistral.py#L170-L173

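Are key_states_context and key_states_other reversed (and likewise value_states_context and value_states_other)? As written, the *_context tensors hold the prefix plus everything after the context span, while the *_other tensors hold the context span itself: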

	key_states_context = torch.cat([key_states[:, :, :self.len_prefix], key_states[:, :, self.len_prefix+self.len_context:]], dim=-2)
	key_states_other = key_states[:, :, self.len_prefix:self.len_prefix+self.len_context]
	value_states_context = torch.cat([value_states[:, :, :self.len_prefix], value_states[:, :, self.len_prefix+self.len_context:]], dim=-2)
	value_states_other = value_states[:, :, self.len_prefix:self.len_prefix+self.len_context]
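To make the question concrete, here is a minimal sketch of which sequence positions each name ends up with. It is not the repository's code: it uses plain Python lists in place of the sequence axis (dim=-2) of the tensors, the helper split_positions and the example sizes are hypothetical, and it assumes the sequence is laid out as prefix, then context, then the remaining tokens, as the slice bounds suggest.

```python
def split_positions(seq_len, len_prefix, len_context):
    """Mirror the slicing from the snippet above on a list of token positions.

    Hypothetical helper for illustration only; the real code slices
    key_states / value_states along the sequence dimension.
    """
    positions = list(range(seq_len))
    context_end = len_prefix + len_context

    # What the snippet assigns to key_states_context / value_states_context:
    # the prefix slice concatenated with everything after the context span.
    named_context = positions[:len_prefix] + positions[context_end:]

    # What the snippet assigns to key_states_other / value_states_other:
    # the context span itself.
    named_other = positions[len_prefix:context_end]

    return named_context, named_other


# Example sizes are made up: 2 prefix tokens, 5 context tokens, 3 trailing tokens.
named_context, named_other = split_positions(seq_len=10, len_prefix=2, len_context=5)
print(named_context)  # [0, 1, 7, 8, 9]
print(named_other)    # [2, 3, 4, 5, 6]
```

With these sizes, the variable named "context" contains no position from the context span (positions 2 to 6), which is what prompts the question of whether the two names are swapped.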
