Commit 2f4641f

Update on "add eval for attention sink"
This PR adds a function to evaluate the model's perplexity when AttentionSink is enabled. It is mostly copied from https://github.com/mit-han-lab/streaming-llm/blob/main/examples/eval_long_ppl.py, the script the AttentionSink paper uses for its own perplexity evaluation.

Differential Revision: [D66474732](https://our.internmc.facebook.com/intern/diff/D66474732/)

Perplexity measured for the llama 3.2 1B and 1B_Instruct models up to 40k tokens with AttentionSink enabled:

<img width="966" alt="Screenshot 2024-11-25 at 2 46 04 PM" src="https://github.com/user-attachments/assets/ba7118f9-b5d7-4de8-b1fa-7d2ba0646515">

[ghstack-poisoned]
2 parents 38d9e1c + 493607e commit 2f4641f
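For context, the streaming-llm script referenced in the commit message measures long-context perplexity by feeding tokens one at a time and averaging the per-token negative log-likelihood. A minimal sketch of that computation follows; the single-step `model(token, past)` API and all names here are hypothetical illustrations, not ExecuTorch's actual interfaces:

```python
import math

def perplexity(nlls):
    """Perplexity is exp of the mean per-token negative log-likelihood."""
    return math.exp(sum(nlls) / len(nlls))

def eval_long_ppl(model, token_ids):
    """Feed tokens one at a time, reusing the (sink-augmented) KV cache,
    and accumulate the NLL of each next token under the model.

    `model` is a hypothetical step function: (token, past) -> (logits, past).
    """
    nlls = []
    past = None
    for i in range(len(token_ids) - 1):
        logits, past = model(token_ids[i], past)
        # log-softmax over the vocabulary, then take the target token's NLL
        m = max(logits)
        logsumexp = m + math.log(sum(math.exp(x - m) for x in logits))
        nlls.append(logsumexp - logits[token_ids[i + 1]])
    return perplexity(nlls)
```

For example, a model that always emits uniform logits over a vocabulary of size V yields a perplexity of exactly V, which is a quick sanity check on the loop.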

File tree

1 file changed (+1, −3 lines)

examples/models/llama/eval_llama_lib.py

Lines changed: 1 addition & 3 deletions
```diff
@@ -318,9 +318,7 @@ def eval_llama(
         print(f"{task}: {res}")


-def eval_llama_with_attention_sink(
-    model_name: str, args: argparse.ArgumentParser
-):
+def eval_llama_with_attention_sink(model_name: str, args: argparse.ArgumentParser):
     """
     Evaluate the model's perplexity when AttentionSink is enabled.
```
