Commit 9320df4

authored
Create benchmarks-release-notes.md
1 parent 6abf054 commit 9320df4

1 file changed

+81
-0
lines changed

@@ -0,0 +1,81 @@
1+
---
2+
title: Benchmarks in Azure AI Studio Release Notes
3+
titleSuffix: Azure AI Studio
4+
description: Release notes for the Benchmarks experience in Azure AI Studio
5+
author: jesscioffi
6+
ms.service: azure-ai-studio
7+
ms.custom:
8+
9+
ms.author: jcioffi
10+
ms.date: 02/01/2024
11+
ms.topic: reference
12+
---
13+
14+
# Benchmarks in Azure AI Studio release notes
15+
16+
This article lists updates to the Benchmarks experience in Azure AI Studio. For more information about the Benchmarks experience, see [this page](./model-catalog.md).
17+
18+
Because the AI landscape is evolving rapidly, we intend to add more models, datasets, tasks, and metrics to the Benchmarks experience over time.
19+
20+
To access the Benchmarks experience, go to the `Explore` tab in Azure AI Studio, and then select `Benchmarks` under the `Models` section of the left menu.
21+
22+
## January 31, 2024
23+
24+
Added models:
25+
- `microsoft-phi-2`
26+
- `mistralai-mistral-7b-instruct-v01`
27+
- `mistralai-mistral-7b-v01`
28+
- `codellama-13b-hf`
29+
- `codellama-13b-instruct-hf`
30+
- `codellama-13b-python-hf`
31+
- `codellama-34b-hf`
32+
- `codellama-34b-instruct-hf`
33+
- `codellama-34b-python-hf`
34+
- `codellama-7b-hf`
35+
- `codellama-7b-instruct-hf`
36+
- `codellama-7b-python-hf`
37+
38+
Added datasets:
39+
- `truthfulqa_generation`
40+
- `truthfulqa_mc1`
41+
42+
Added metrics:
43+
- `Coherence`
44+
- `Fluency`
45+
- `GPTSimilarity`
46+
47+
## November 15, 2023
48+
49+
**Public Preview of Benchmarks in Azure AI Studio**
50+
51+
Added models:
52+
- `gpt-35-turbo-0301`
53+
- `gpt-4-0314`
54+
- `gpt-4-32k-0314`
55+
- `llama-2-13b-chat`
56+
- `llama-2-13b`
57+
- `llama-2-70b-chat`
58+
- `llama-2-70b`
59+
- `llama-2-7b-chat`
60+
- `llama-2-7b`
61+
62+
Added datasets:
63+
- `boolq`
64+
- `gsm8k`
65+
- `hellaswag`
66+
- `human_eval`
67+
- `mmlu_humanities`
68+
- `mmlu_other`
69+
- `mmlu_social_sciences`
70+
- `mmlu_stem`
71+
- `openbookqa`
72+
- `piqa`
73+
- `social_iqa`
74+
- `winogrande`
75+
76+
Added tasks:
77+
- `Question Answering`
78+
- `Text Generation`
79+
80+
Added metrics:
81+
- `Accuracy`
