|
1 | | -<h1 align="center">SciEval ToolKit</h1> |
| 1 | +<h1 align="center"><img src="assets/icon/opencompass.png" alt="OpenCompass" height="50" style="vertical-align:middle;" /> SciEval ToolKit</h1> |
2 | 2 |
|
3 | 3 | <p align="center"><strong> |
4 | 4 | A unified evaluation toolkit and leaderboard for rigorously assessing the scientific intelligence of large language and vision–language models across the full research workflow. |
5 | 5 | </strong></p> |
6 | 6 |
|
7 | 7 | <hr style="width:100%;margin:16px 0;border:0;border-top:0.1px solid #d0d7de;" /> |
8 | 8 |
|
9 | | -<p align="center"> |
10 | | - <span style="display:inline-block;vertical-align:middle;"> |
11 | | - <a href="https://opencompass.org.cn/Intern-Discovery-Eval" style="text-decoration:none;border-bottom:0;"> |
12 | | - <img src="https://img.shields.io/badge/Website-SciEval-b8dcff?style=for-the-badge&logo=google-chrome&logoColor=white" style="display:block;" /> |
13 | | - </a> |
14 | | - </span> |
15 | | - <span style="display:inline-block;vertical-align:middle;"> |
16 | | - <a href="https://huggingface.co/spaces/InternScience/SciEval-Leaderboard" style="text-decoration:none;border-bottom:0;"> |
17 | | - <img src="https://img.shields.io/badge/LEADERBOARD-Scieval-f6e58d?style=for-the-badge&logo=huggingface" style="display:block;" /> |
18 | | - </a> |
19 | | - </span> |
20 | | - <span style="display:inline-block;vertical-align:middle;"> |
21 | | - <a href="https://github.com/InternScience/SciEvalKit/blob/main/docs/SciEvalKit.pdf" style="text-decoration:none;border-bottom:0;"> |
22 | | - <img src="https://img.shields.io/badge/REPORT-Technical-f4c2d7?style=for-the-badge" style="display:block;" /> |
23 | | - </a> |
24 | | - </span> |
25 | | - <span style="display:inline-block;vertical-align:middle;"> |
26 | | - <a href="https://github.com/InternScience/SciEvalKit" style="text-decoration:none;border-bottom:0;"> |
27 | | - <img src="https://img.shields.io/badge/GitHub-Repository-c7b9e2?style=for-the-badge&logo=github&logoColor=white" style="display:block;" /> |
28 | | - </a> |
29 | | - </span> |
30 | | -</p> |
| 9 | +<div align="center"> |
31 | 10 |
|
32 | | -<p align="center"> |
33 | | - <img src="assets/icon/welcome.png" alt="welcome" height="24" style="vertical-align:middle;" /> |
34 | | - Welcome to the official repository of <strong>SciEval</strong>! |
35 | | -</p> |
| 11 | +[](https://opencompass.org.cn/Intern-Discovery-Eval)  |
| 12 | +[](https://huggingface.co/spaces/InternScience/SciEval-Leaderboard)  |
| 13 | +[](https://github.com/InternScience/SciEvalKit/blob/main/docs/SciEvalKit.pdf)  |
| 14 | +[](https://github.com/InternScience/SciEvalKit) |
| 15 | + |
| 16 | +<img src="assets/icon/welcome.png" alt="welcome" height="24" style="vertical-align:middle;" /> |
| 17 | + Welcome to the official repository of <strong>SciEval</strong>! |
36 | 18 |
|
| 19 | +</div> |
37 | 20 |
|
38 | 21 | ## <img src="assets/icon/why.png" alt="why" height="28" style="vertical-align:middle;" /> Why SciEval? |
39 | 22 |
|
@@ -151,7 +134,7 @@ python run.py \ |
151 | 134 | ## <img src="assets/icon/update.png" alt="update" height="28" style="vertical-align:middle;" /> Codebase Updates |
152 | 135 |
|
153 | 136 | * **Execution‑based Scoring** |
154 | | - Code‑generation tasks (SciCode, AstroVisBench) are now graded via sandboxed unit tests. |
| 137 | + • Code‑generation tasks (SciCode, AstroVisBench) are now graded via sandboxed unit tests. |
155 | 138 |
|
156 | 139 |
|
157 | 140 |
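The "Execution‑based Scoring" entry above says that code‑generation tasks are graded via sandboxed unit tests. As a rough illustration of that idea only, and not SciEvalKit's actual implementation, the sketch below runs a candidate solution together with its tests in a separate Python process with a timeout. The names `run_unit_tests` and `pass_rate` are hypothetical, and a child process in a temporary directory is only a weak stand-in for a real sandbox (container, restricted user, no network).

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_unit_tests(candidate_code: str, test_code: str, timeout_s: float = 10.0) -> bool:
    """Return True if the tests pass against the candidate code.

    Hypothetical helper (not part of SciEvalKit): writes candidate + tests to
    one script and runs it in a fresh interpreter inside a temporary directory.
    This is process isolation only, not a hardened sandbox.
    """
    with tempfile.TemporaryDirectory() as workdir:
        script = Path(workdir) / "candidate_with_tests.py"
        script.write_text(candidate_code + "\n\n" + test_code, encoding="utf-8")
        try:
            result = subprocess.run(
                [sys.executable, "-I", str(script)],  # -I: isolated mode
                cwd=workdir,
                capture_output=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # Non-terminating code counts as a failure.
            return False
        # A failed assertion or any uncaught exception gives a non-zero exit code.
        return result.returncode == 0


def pass_rate(samples: list[tuple[str, str]]) -> float:
    """Fraction of (candidate_code, test_code) pairs whose tests pass."""
    if not samples:
        return 0.0
    passed = sum(run_unit_tests(code, tests) for code, tests in samples)
    return passed / len(samples)
```

The design point is that the score depends on what the generated code does when executed rather than on how closely its text matches a reference answer.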
|
|