Commit c9a92d7

MTBench: Added example figure for weather dataset
1 parent 0c9d44d commit c9a92d7

2 files changed

+5
-1
lines changed

745 KB

app/projects/mtbench/page.mdx

Lines changed: 5 additions & 1 deletion
@@ -51,6 +51,10 @@ The news-driven QA task includes two sub-tasks: correlation prediction and multi
 
 ![Figure 4. An Example of Multi-choice QA and Correlation Prediction on Finance Dataset |scale=0.8](./assets/QA_sample.png)
 
+Figure 5 illustrates examples of the aforementioned tasks using the weather dataset.
+
+![Figure 5. An Example of Technical Indicator Prediction, Trend Prediction and Multi-Choice QA on Weather Dataset |scale=0.7](./assets/weather_example.png)
+
 Various state-of-the-art large language models (LLMs) were evaluated on MTBench to measure their ability to link news with time-series trends (see **Leaderboard**). The results reveal key challenges—models struggle with long-term pattern recognition, cause-and-effect relationships, and seamlessly combining insights from text and numbers.
 
 ## Leaderboard
## Leaderboard
@@ -63,7 +67,7 @@ Various state-of-the-art large language models (LLMs) were evaluated on MTBench
 </details>
 
 <details>
-<summary>Leaderboard Trend Prediction</summary>
+<summary>Leaderboard for Trend Prediction</summary>
 <Table2/>
 </details>
 
