ape_mistral.py, line 170, in mistral_attention_prefill_query:
key_states_context = torch.cat([key_states[:, :, :self.len_prefix], key_states[:, :, self.len_prefix+self.len_context:]], dim=-2)
key_states_other = key_states[:, :, self.len_prefix:self.len_prefix+self.len_context]
value_states_context = torch.cat([value_states[:, :, :self.len_prefix], value_states[:, :, self.len_prefix+self.len_context:]], dim=-2)
value_states_other = value_states[:, :, self.len_prefix:self.len_prefix+self.len_context]
Are key_states_context and key_states_other reversed? The middle slice, key_states[:, :, self.len_prefix:self.len_prefix+self.len_context], looks like it selects exactly the context tokens, yet it is assigned to key_states_other, while key_states_context is built by concatenating everything outside that range (the prefix and the suffix).
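To make the question concrete, here is a minimal plain-Python sketch of which sequence positions each slice selects along dim=-2. The lengths (2 prefix, 3 context, 3 suffix tokens) are hypothetical, chosen only for illustration; the indexing mirrors the expressions in the snippet above.

```python
# Hypothetical lengths: 2 prefix + 3 context + 3 suffix = 8 positions total.
len_prefix, len_context, total = 2, 3, 8
positions = list(range(total))

# Mirrors torch.cat([key_states[:, :, :len_prefix],
#                    key_states[:, :, len_prefix + len_context:]], dim=-2):
# everything EXCEPT the middle slice, i.e. prefix + suffix positions.
named_context = positions[:len_prefix] + positions[len_prefix + len_context:]

# Mirrors key_states[:, :, len_prefix:len_prefix + len_context]:
# the middle slice, i.e. the actual context positions.
named_other = positions[len_prefix:len_prefix + len_context]

print(named_context)  # [0, 1, 5, 6, 7] -> prefix and suffix positions
print(named_other)    # [2, 3, 4]       -> the context positions
```

So the variable suffixed `_other` receives the context-token positions, and the variable suffixed `_context` receives everything else, which is the mismatch the question is asking about.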