Create benchmarks-release-notes.md

jesscioffi · web-flow · commit 9320df4f3ad9 · 2024-02-01T17:32:31.000-05:00
diff --git a/articles/ai-studio/how-to/benchmarks-release-notes.md b/articles/ai-studio/how-to/benchmarks-release-notes.md
@@ -0,0 +1,81 @@
+---
+title: Benchmarks in Azure AI Studio Release Notes
+titleSuffix: Azure AI Studio
+description: Release notes for the Benchmarks experience in Azure AI Studio
+author: jesscioffi
+ms.service: azure-ai-studio
+ms.custom:
+
+ms.author: jcioffi
+ms.date: 02/01/2024
+ms.topic: reference
+---
+
+# Benchmarks in Azure AI Studio release notes
+
+In this article, learn about updates to the Benchmarks experience in Azure AI Studio. For more information about the Benchmarks experience, see the [model catalog article](./model-catalog.md).
+
+Because the AI landscape evolves rapidly, we intend to add more models, datasets, tasks, and metrics to the Benchmarks experience over time.
+
+To access the Benchmarks experience, go to the `Explore` tab in Azure AI Studio. Then, in the left menu, select `Benchmarks` under the `Models` section.
+
+## January 31, 2024
+
+Added models:
+- `microsoft-phi-2`
+- `mistralai-mistral-7b-instruct-v01`
+- `mistralai-mistral-7b-v01`
+- `codellama-13b-hf`
+- `codellama-13b-instruct-hf`
+- `codellama-13b-python-hf`
+- `codellama-34b-hf`
+- `codellama-34b-instruct-hf`
+- `codellama-34b-python-hf`
+- `codellama-7b-hf`
+- `codellama-7b-instruct-hf`
+- `codellama-7b-python-hf`
+
+Added datasets:
+- `truthfulqa_generation`
+- `truthfulqa_mc1`
+
+Added metrics:
+- `Coherence`
+- `Fluency`
+- `GPTSimilarity`
+
+## November 15, 2023
+
+**Public Preview of Benchmarks in Azure AI Studio**
+
+Added models:
+- `gpt-35-turbo-0301`
+- `gpt-4-0314`
+- `gpt-4-32k-0314`
+- `llama-2-13b-chat`
+- `llama-2-13b`
+- `llama-2-70b-chat`
+- `llama-2-70b`
+- `llama-2-7b-chat`
+- `llama-2-7b`
+
+Added datasets:
+- `boolq`
+- `gsm8k`
+- `hellaswag`
+- `human_eval`
+- `mmlu_humanities`
+- `mmlu_other`
+- `mmlu_social_sciences`
+- `mmlu_stem`
+- `openbookqa`
+- `piqa`
+- `social_iqa`
+- `winogrande`
+
+Added tasks:
+- `Question Answering`
+- `Text Generation`
+
+Added metrics:
+- `Accuracy`