Commit 1cf6129

committed
formatting
1 parent de2b3ef commit 1cf6129

1 file changed, +4 -6 lines changed

articles/ai-studio/concepts/model-benchmarks.md

Lines changed: 4 additions & 6 deletions
@@ -34,17 +34,15 @@ Model benchmarks assess LLMs and SLMs across the following categories: quality,
 
 ### Quality
 
-Azure AI assesses the quality of LLMs and SLMs across various metrics that are grouped into two main categories: accuracy, and prompt assisted metrics:
+Azure AI assesses the quality of LLMs and SLMs across various metrics that are grouped into two main categories: accuracy, and prompt-assisted metrics:
 
-
-- Accuracy
+For accuracy metric:
 
 | Metric | Description |
 |--------|-------------|
 | Accuracy | Accuracy scores are available at the dataset and the model levels. At the dataset level, the score is the average value of an accuracy metric computed over all examples in the dataset. The accuracy metric used is `exact-match` in all cases, except for the _HumanEval_ dataset that uses a `pass@1` metric. Exact match compares model generated text with the correct answer according to the dataset, reporting one if the generated text matches the answer exactly and zero otherwise. The `pass@1` metric measures the proportion of model solutions that pass a set of unit tests in a code generation task. At the model level, the accuracy score is the average of the dataset-level accuracies for each model. |
 
-
-- Prompt assisted metrics
+For prompt-assisted metrics:
 
 | Metric | Description |
 |--------|-------------|
@@ -58,7 +56,7 @@ Azure AI also displays the quality index as follows:
 
 | Index | Description |
 |-------|-------------|
-| Quality Index | GPTSimilarity scaled down from zero to one, averaged with our accuracy metrics. A higher quality index value is better. |
+| Quality index | GPTSimilarity scaled down from zero to one, averaged with our accuracy metrics. A higher quality index value is better. |
 
 Azure AI assesses the quality index by using both the measurement of accuracy and GPTSimilarity as the prompt assisted metric. The stability of the GPTSimilarity metric averaging with the accuracy of the model provides an indicator of the overall quality of the model.
 
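
The scoring described in the Accuracy and Quality index rows of the changed file can be sketched roughly as follows. This is a minimal illustration, not Azure AI's implementation: the function names are hypothetical, the 1-to-5 GPTSimilarity range and the min-max scaling are assumptions (the text only says the score is "scaled down from zero to one"), and the quality index is assumed to be a plain mean of the two values. The `pass@1` metric used for HumanEval is not covered.

```python
from statistics import mean


def exact_match(generated: str, answer: str) -> int:
    """Return 1 if the generated text matches the answer exactly, else 0."""
    return 1 if generated == answer else 0


def dataset_accuracy(pairs):
    """Average exact-match score over all (generated, answer) examples."""
    return mean(exact_match(g, a) for g, a in pairs)


def model_accuracy(dataset_scores):
    """Average of the dataset-level accuracies for one model."""
    return mean(dataset_scores)


def quality_index(accuracy, gpt_similarity, low=1.0, high=5.0):
    """Mean of accuracy and GPTSimilarity rescaled to [0, 1].

    Assumption: GPTSimilarity lies in [low, high] and is min-max scaled;
    the source does not state the range or the scaling method.
    """
    scaled = (gpt_similarity - low) / (high - low)
    return mean([accuracy, scaled])


# Hypothetical example data: three of four answers match exactly.
examples = [("4", "4"), ("Paris", "paris"), ("x", "x"), ("y", "y")]
acc_a = dataset_accuracy(examples)
overall = model_accuracy([acc_a, 0.25])
index = quality_index(overall, gpt_similarity=3.0)
```

Note that exact match is case-sensitive here, so "paris" does not match "Paris"; whether the real benchmark normalizes text before comparing is not stated in the source.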
