Commit b062f35

Merge pull request #265011 from jesscioffi/main
Update model-catalog.md
2 parents 5b431a4 + eb406aa commit b062f35

1 file changed, +3 -0 lines changed

articles/ai-studio/how-to/model-catalog.md

Lines changed: 3 additions & 0 deletions
@@ -47,6 +47,9 @@ The model benchmarks help you make informed decisions about the suitability of m
| Metric | Description |
|--------------|-------|
| Accuracy | Accuracy scores are available at the dataset and the model levels. At the dataset level, the score is the average value of an accuracy metric computed over all examples in the dataset. The accuracy metric used is exact match in all cases except for the *HumanEval* dataset, which uses a `pass@1` metric. Exact match compares the model-generated text with the correct answer according to the dataset, reporting 1 if the generated text matches the answer exactly and 0 otherwise. `Pass@1` measures the proportion of model solutions that pass a set of unit tests in a code generation task. At the model level, the accuracy score is the average of the dataset-level accuracies for each model. |
+| Coherence | Coherence evaluates how well the language model can produce output that flows smoothly, reads naturally, and resembles human-like language. |
+| Fluency | Fluency evaluates the language proficiency of a generative AI's predicted answer. It assesses how well the generated text adheres to grammatical rules, syntactic structures, and appropriate usage of vocabulary, resulting in linguistically correct and natural-sounding responses. |
+| GPTSimilarity | GPTSimilarity quantifies the similarity between a ground truth sentence (or document) and the prediction sentence generated by an AI model. It is calculated by first computing sentence-level embeddings using the embeddings API for both the ground truth and the model's prediction. These embeddings are high-dimensional vector representations of the sentences that capture their semantic meaning and context. |

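The GPTSimilarity row describes only the embedding step and does not say how the two embeddings are compared. The sketch below assumes cosine similarity, which is a common choice for comparing embedding vectors but is not confirmed by this commit. The function name `cosine_similarity` and the toy vectors are illustrative; in practice the vectors would come from an embeddings API call, which is omitted here.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors.

    Assumption: the similarity score is a cosine similarity; the
    documented change does not state the comparison function.
    """
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("embeddings must be non-zero vectors")
    return dot / (norm_a * norm_b)


# Toy stand-ins for the ground-truth and prediction embeddings.
ground_truth_embedding = [1.0, 2.0, 3.0]
prediction_embedding = [2.0, 4.0, 6.0]
score = cosine_similarity(ground_truth_embedding, prediction_embedding)
```

Under this assumption, vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, regardless of vector length.
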
The benchmarks are updated regularly as new metrics and datasets are added to existing models, and as new models are added to the model catalog.
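The Accuracy row in the table above can be illustrated with a minimal sketch. It assumes a strict string comparison with no normalization of case or whitespace, one generated solution per task for `pass@1`, and an unweighted mean at the model level. All function names and sample values are hypothetical and are not taken from the benchmarking pipeline.

```python
def exact_match(prediction: str, reference: str) -> int:
    """Return 1 if the generated text equals the reference exactly, else 0."""
    return int(prediction == reference)


def dataset_accuracy(predictions, references):
    """Dataset-level score: mean exact-match value over all examples."""
    scores = [exact_match(p, r) for p, r in zip(predictions, references)]
    return sum(scores) / len(scores)


def pass_at_1(passed_flags):
    """Proportion of generated solutions that pass their unit tests.

    Assumes one solution per task; each flag records whether that
    solution passed the task's tests.
    """
    return sum(1 for ok in passed_flags if ok) / len(passed_flags)


def model_accuracy(dataset_scores):
    """Model-level score: mean of the dataset-level accuracies."""
    return sum(dataset_scores.values()) / len(dataset_scores)


# Hypothetical example data.
qa_score = dataset_accuracy(["a", "b", "c", "d"], ["a", "b", "x", "d"])
code_score = pass_at_1([True, False, False, False])
overall = model_accuracy({"qa_dataset": qa_score, "HumanEval": code_score})
```

Because the model-level score averages dataset-level scores rather than individual examples, a small dataset carries the same weight as a large one in this sketch.
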
