Commit f5645ac

typo fix
1 parent 34e43a5 commit f5645ac

1 file changed: +1 -1 lines changed

recipes/experimental/long-context/H2O/README.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ python run_summarization.py \
 
 ##### **Results**
 
-Expected results on XSUM (Rouge-2 score, ther higher the better) from the above scripts on Llama-2/3 models. The sequence length of inputs are ~2k. Here we constrains the size of KV cache, allowing only n KVs to be write/read after the prefilling stage. n ranges from **64** to **full** where we maintain all the KV pairs. With 128 KVs, the performance can be matched as the full baseline (~2k KVs) while performance degradation is observed with 64 KVs. Also, maintaining a smaller KV cache reduces the I/O cost of KVs, thus we can achieve better throughput.
+Expected results on XSUM (Rouge-2 score, the higher the better) from the above scripts on Llama-2/3 models. The sequence length of inputs are ~2k. Here we constrains the size of KV cache, allowing only n KVs to be write/read after the prefilling stage. n ranges from **64** to **full** where we maintain all the KV pairs. With 128 KVs, the performance can be matched as the full baseline (~2k KVs) while performance degradation is observed with 64 KVs. Also, maintaining a smaller KV cache reduces the I/O cost of KVs, thus we can achieve better throughput.
 
 | KV Cache Size | 64 | 128 | 256 | 512 | 1024 | Full |
 | ------------- | ------ | ------ | ------ | ------ | ------ | ------ |
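
Note on the changed paragraph: it describes capping the cache at n KV pairs after prefilling, but the hunk does not show how the retained entries are chosen. The sketch below is one plausible reading, assuming a heavy-hitter style policy that keeps the tokens with the largest accumulated attention scores plus a window of the most recent tokens. The function `evict_to_budget` and its parameters `acc_scores`, `budget`, and `recent` are hypothetical names for illustration and are not taken from the repository.

```python
def evict_to_budget(acc_scores, budget, recent):
    """Return the sorted indices of cached tokens to keep.

    acc_scores: accumulated attention score per cached token (assumed input).
    budget:     maximum number of KV pairs to retain (the "n" in the README text).
    recent:     number of newest tokens that are always retained (assumption).
    """
    n = len(acc_scores)
    if n <= budget:
        # Cache already fits in the budget: keep everything.
        return list(range(n))

    recent = min(recent, budget)
    # The newest `recent` tokens are always kept.
    recent_idx = list(range(n - recent, n))

    # Fill the remaining slots with the highest-scoring older tokens.
    older = range(n - recent)
    heavy = sorted(older, key=lambda i: acc_scores[i], reverse=True)[: budget - recent]

    return sorted(heavy + recent_idx)


# Six cached tokens, budget of four, two of which are reserved for recent tokens.
kept = evict_to_budget([0.9, 0.1, 0.5, 0.2, 0.3, 0.4], budget=4, recent=2)
print(kept)
```

With a fixed budget, the number of KV pairs read and written per decoding step stays constant however long the input is, which is consistent with the I/O and throughput claim in the paragraph.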
