public/article-2025/04/grok3 (1 file changed, +9 -9)
@@ -29,15 +29,15 @@ Grok3 has strong reasoning capabilities and broad knowledge. Main…
 - **Benchmark performance**: Recorded an Elo score of 1402 on Chatbot Arena ([xAI News](https://x.ai/news/grok-3)). Key benchmark results follow:

 | Benchmark | Grok3 Beta | Grok3 mini Beta |
-| ------------ | ---------- | --------------- |
-| AIME’24 | 52.2% | 39.7% |
-| GPQA | 75.4% | 66.2% |
-| LCB | 57.0% | 41.5% |
-| MMLU-pro | 79.9% | 78.9% |
-| LOFT (128k) | 83.3% | 83.1% |
-| SimpleQA | 43.6% | 21.7% |
-| MMMU | 73.2% | 69.4% |
-| EgoSchema | 74.5% | 74.3% |
+| ------------ | ---------: | --------------: |
+| AIME’24 | 52.2% | 39.7% |
+| GPQA | 75.4% | 66.2% |
+| LCB | 57.0% | 41.5% |
+| MMLU-pro | 79.9% | 78.9% |
+| LOFT (128k) | 83.3% | 83.1% |
+| SimpleQA | 43.6% | 21.7% |
+| MMMU | 73.2% | 69.4% |
+| EgoSchema | 74.5% | 74.3% |

 - **Test-time compute**: Grok3 Think (cons@64) scores 93.3% on AIME’25, 84.6% on GPQA, and 79.4% on LiveCodeBench.
 - **Context window**: 1 million tokens (8 times that of previous models).
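
Note on the cons@64 figures in the hunk above: the article does not define the term, so the following is only a minimal sketch under the usual reading, namely that 64 answers are sampled per problem and the most frequent one is graded. The function names `consensus_answer` and `cons_at_k_accuracy` are hypothetical and are not taken from xAI or from the article.

```python
from collections import Counter


def consensus_answer(samples):
    """Return the most frequent final answer among the sampled completions."""
    if not samples:
        raise ValueError("need at least one sampled answer")
    # most_common(1) yields [(answer, count)]; ties go to the answer seen first.
    return Counter(samples).most_common(1)[0][0]


def cons_at_k_accuracy(problems):
    """Score a benchmark given (sampled_answers, reference_answer) pairs.

    A problem counts as solved when the consensus answer equals the reference.
    """
    solved = sum(consensus_answer(samples) == ref for samples, ref in problems)
    return solved / len(problems)


if __name__ == "__main__":
    # Toy data with 3 samples per problem instead of 64.
    toy = [
        (["204", "204", "197"], "204"),  # majority is correct
        (["17", "23", "23"], "17"),      # majority is wrong
    ]
    print(cons_at_k_accuracy(toy))
```

With k = 64, each `sampled_answers` list would hold 64 entries. Majority voting can score higher than single-sample accuracy because occasional wrong samples are outvoted, which is why the cons@64 numbers are not directly comparable to the single-attempt table.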