articles/ai-studio/concepts/model-benchmarks.md: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ In Azure AI Studio, you can compare benchmarks across models and datasets availa
Azure AI supports model benchmarking for select models that are popular and most frequently used. Supported models have a _benchmarks_ icon that looks like a histogram. You can find these models in the model catalog by using the **Collections** filter and selecting **Benchmark results**. You can then use the search functionality to find specific models.
- :::image type="content" source="../media/how-to/model-benchmarks/access-model-catalog-benchmark.png" alt-text="Image showing how to filter for benchmark models in the model catalog homepage." lightbox="../media/how-to/model-benchmarks/access-model-catalog-benchmark.png":::
+ :::image type="content" source="../media/how-to/model-benchmarks/access-model-catalog-benchmark.png" alt-text="Screenshot showing how to filter for benchmark models in the model catalog homepage." lightbox="../media/how-to/model-benchmarks/access-model-catalog-benchmark.png":::
Model benchmarks help you make informed decisions about the suitability of models and datasets before you initiate any job. The benchmarks are a curated list of the best-performing models for a task, based on a comprehensive comparison of benchmarking metrics. Azure AI Studio provides the following benchmarks for models, based on model catalog collections:
articles/ai-studio/how-to/benchmark-model-in-catalog.md: 7 additions & 7 deletions
@@ -38,17 +38,17 @@ Azure AI supports model benchmarking for select models that are popular and most
1. Go to the **Benchmarks** tab to check the benchmark results for the model.
- :::image type="content" source="../media/how-to/model-benchmarks/gpt4o-benchmark-tab.png" alt-text="Image showing the benchmarks tab for gpt-4o." lightbox="../media/how-to/model-benchmarks/gpt4o-benchmark-tab.png":::
+ :::image type="content" source="../media/how-to/model-benchmarks/gpt4o-benchmark-tab.png" alt-text="Screenshot showing the benchmarks tab for gpt-4o." lightbox="../media/how-to/model-benchmarks/gpt4o-benchmark-tab.png":::
1. Return to the homepage of the model catalog.
1. Select **Compare models** on the model catalog's homepage to explore models with benchmark support, view their metrics, and analyze the trade-offs among different models. This analysis can inform your selection of the model that best fits your requirements.
- :::image type="content" source="../media/how-to/model-benchmarks/compare-models-model-catalog.png" alt-text="Image showing the model comparison button on the model catalog main page." lightbox="../media/how-to/model-benchmarks/compare-models-model-catalog.png":::
+ :::image type="content" source="../media/how-to/model-benchmarks/compare-models-model-catalog.png" alt-text="Screenshot showing the model comparison button on the model catalog main page." lightbox="../media/how-to/model-benchmarks/compare-models-model-catalog.png":::
1. Select your desired tasks and specify the dimensions of interest, such as _AI Quality_ versus _Cost_, to evaluate the trade-offs among different models.
1. You can switch to the **List view** to access more detailed results for each model.
- :::image type="content" source="../media/how-to/model-benchmarks/compare-view.png" alt-text="Image showing an example of benchmark comparison view." lightbox="../media/how-to/model-benchmarks/compare-view.png":::
+ :::image type="content" source="../media/how-to/model-benchmarks/compare-view.png" alt-text="Screenshot showing an example of benchmark comparison view." lightbox="../media/how-to/model-benchmarks/compare-view.png":::
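The _AI Quality_ versus _Cost_ trade-off described in the steps above can be sketched offline once you have copied the numbers out of the comparison view. The following is a minimal sketch, not an Azure AI Studio API: the `ModelPoint` type, the `pareto_front` helper, the model names, and all values are hypothetical and exist only for illustration. It keeps the models that no other model beats on both quality and cost at once.

```python
from typing import NamedTuple


class ModelPoint(NamedTuple):
    """Hypothetical record for one model in a comparison (not an Azure type)."""
    name: str
    quality: float  # higher is better
    cost: float     # lower is better


def pareto_front(models):
    """Return the models that are not dominated on (quality, cost).

    A model is dominated if another model is at least as good on both
    dimensions and strictly better on at least one of them.
    """
    front = []
    for m in models:
        dominated = any(
            o.quality >= m.quality
            and o.cost <= m.cost
            and (o.quality > m.quality or o.cost < m.cost)
            for o in models
        )
        if not dominated:
            front.append(m)
    return front


# Illustrative values only; these are not real benchmark results.
models = [
    ModelPoint("model-a", quality=0.90, cost=10.0),
    ModelPoint("model-b", quality=0.85, cost=4.0),
    ModelPoint("model-c", quality=0.80, cost=6.0),
]

for point in pareto_front(models):
    print(point.name, point.quality, point.cost)
```

In this made-up data, `model-c` drops out because `model-b` is both cheaper and higher quality; the remaining models represent genuine trade-offs that you would weigh against your own requirements.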
## Analyze benchmark results
@@ -58,19 +58,19 @@ When you're in the "Benchmarks" tab for a specific model, you can gather extensi
- **Comparative charts**: These charts display the model's relative position compared to related models.
- **Metric comparison table**: This table presents detailed results for each metric.
- :::image type="content" source="../media/how-to/model-benchmarks/gpt4o-benchmark-tab-expand.png" alt-text="Image showing benchmarks tab for gpt-4o." lightbox="../media/how-to/model-benchmarks/gpt4o-benchmark-tab-expand.png":::
+ :::image type="content" source="../media/how-to/model-benchmarks/gpt4o-benchmark-tab-expand.png" alt-text="Screenshot showing benchmarks tab for gpt-4o." lightbox="../media/how-to/model-benchmarks/gpt4o-benchmark-tab-expand.png":::
By default, AI Studio displays an average index across various metrics and datasets to provide a high-level overview of model performance.
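As a rough illustration of what an average index across metrics and datasets could look like, the sketch below takes an unweighted mean of per-metric, per-dataset scores. This is an assumption made for illustration: the source does not say how AI Studio computes its index (it might weight or normalize scores differently), and the `scores` mapping, the `average_index` and `filter_scores` helpers, and every value shown are hypothetical.

```python
from statistics import mean

# Hypothetical scores on a 0-1 scale, keyed by (metric, dataset).
# Illustrative values only; these are not real benchmark results.
scores = {
    ("accuracy", "dataset_a"): 0.80,
    ("accuracy", "dataset_b"): 0.70,
    ("coherence", "dataset_a"): 0.90,
}


def filter_scores(scores, metric=None, dataset=None):
    """Keep only the entries matching the given metric and/or dataset."""
    return {
        (m, d): value
        for (m, d), value in scores.items()
        if (metric is None or m == metric) and (dataset is None or d == dataset)
    }


def average_index(scores):
    """Unweighted mean of all scores (an assumed aggregation, see lead-in)."""
    if not scores:
        raise ValueError("no scores to average")
    return mean(scores.values())


# High-level overview across everything, then a drill-down to one metric.
print(round(average_index(scores), 2))
print(round(average_index(filter_scores(scores, metric="accuracy")), 2))
```

The drill-down call mirrors the idea of moving from the default overview to results for one specific metric or dataset.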
To access benchmark results for a specific metric and dataset:
1. Select the expand button on the chart. The pop-up comparison chart reveals detailed information and offers greater flexibility for comparison.
- :::image type="content" source="../media/how-to/model-benchmarks/expand-to-detailed-metric.png" alt-text="Image showing the expand button to select for a detailed comparison chart." lightbox="../media/how-to/model-benchmarks/expand-to-detailed-metric.png":::
+ :::image type="content" source="../media/how-to/model-benchmarks/expand-to-detailed-metric.png" alt-text="Screenshot showing the expand button to select for a detailed comparison chart." lightbox="../media/how-to/model-benchmarks/expand-to-detailed-metric.png":::
1. Select the metric of interest and choose different datasets, based on your specific scenario. For more detailed definitions of the metrics and descriptions of the public datasets used to calculate results, select **Read more**.
- :::image type="content" source="../media/how-to/model-benchmarks/comparison-chart-per-metric-data.png" alt-text="Image showing the comparison chart with a specific metric and dataset." lightbox="../media/how-to/model-benchmarks/comparison-chart-per-metric-data.png":::
+ :::image type="content" source="../media/how-to/model-benchmarks/comparison-chart-per-metric-data.png" alt-text="Screenshot showing the comparison chart with a specific metric and dataset." lightbox="../media/how-to/model-benchmarks/comparison-chart-per-metric-data.png":::
## Evaluate benchmark results with your data
@@ -80,7 +80,7 @@ The previous sections showed the benchmark results calculated by Microsoft, usin
1. Return to the **Benchmarks** tab in the model card.
1. Select **Try with your own data** to evaluate the model with your data. Evaluation on your data helps you see how the model performs in your particular scenarios.
- :::image type="content" source="../media/how-to/model-benchmarks/try-with-your-own-data.png" alt-text="Image showing the button to select for evaluating with your own data." lightbox="../media/how-to/model-benchmarks/try-with-your-own-data.png":::
+ :::image type="content" source="../media/how-to/model-benchmarks/try-with-your-own-data.png" alt-text="Screenshot showing the button to select for evaluating with your own data." lightbox="../media/how-to/model-benchmarks/try-with-your-own-data.png":::