typo fix

HamidShojanazeri · web-flow · commit f5645ace4f4e · 2024-06-17T16:59:47.000-07:00
diff --git a/recipes/experimental/long-context/H2O/README.md b/recipes/experimental/long-context/H2O/README.md
@@ -26,7 +26,7 @@ python run_summarization.py \
 
 ##### **Results**
 
-Expected results on XSUM (Rouge-2 score, ther higher the better) from the above scripts on Llama-2/3 models. The sequence length of inputs are ~2k. Here we constrains the size of KV cache, allowing only n KVs to be write/read after the prefilling stage. n ranges from **64** to **full** where we maintain all the KV pairs. With 128 KVs, the performance can be matched as the full baseline (~2k KVs) while performance degradation is observed with 64 KVs. Also, maintaining a smaller KV cache reduces the I/O cost of KVs, thus we can achieve better throughput.
+Expected results on XSUM (Rouge-2 score, the higher the better) from the above scripts on Llama-2/3 models. The sequence length of inputs are ~2k. Here we constrains the size of KV cache, allowing only n KVs to be write/read after the prefilling stage. n ranges from **64** to **full** where we maintain all the KV pairs. With 128 KVs, the performance can be matched as the full baseline (~2k KVs) while performance degradation is observed with 64 KVs. Also, maintaining a smaller KV cache reduces the I/O cost of KVs, thus we can achieve better throughput.
 
 | KV Cache Size | 64     | 128    | 256    | 512    | 1024   | Full   |
 | ------------- | ------ | ------ | ------ | ------ | ------ | ------ |