models prompt flow evaluator documentation

prompt flow evaluator

Models in this category

Coherence-Evaluator

| | | | -- | -- | | Score range | Integer [1-5]: where 1 is bad and 5 is good | | What is this metric? | Measures how well the language model can produce output that flows smoothly, reads naturally, and resembles human-like language. | | How does it work? | The coherence measure assesses the abi...
F1Score-Evaluator

| | | | -- | -- | | Score range | Float [0-1] | | What is this metric? | Measures the ratio of the number of shared words between the model generation and the ground truth answers. | | How does it work? | The F1-score computes the ratio of the number of shared words between the model generation ...
Fluency-Evaluator

| | | | -- | -- | | Score range | Integer [1-5]: where 1 is bad and 5 is good | | What is this metric? | Measures the grammatical proficiency of a generative AI's predicted answer. | | How does it work? | The fluency measure assesses the extent to which the generated text conforms to grammatical...
Groundedness-Evaluator

| | | | -- | -- | | Score range | Integer [1-5]: where 1 is bad and 5 is good | | What is this metric? | Measures how well the model's generated answers align with information from the source data (user-defined context). | | How does it work? | The groundedness measure assesses the correspondenc...
Hate-and-Unfairness-Evaluator

Definition

Hateful and unfair content refers to any language pertaining to hate toward or unfair representations of individuals and social groups along factors including but not limited to race, ethnicity, nationality, gender, sexual orientation, religion, immigration status, ability, persona...

Relevance-Evaluator

| | | | -- | -- | | Score range | Integer [1-5]: where 1 is bad and 5 is good | | What is this metric? | Measures the extent to which the model's generated responses are pertinent and directly related to the given questions. | | How does it work? | The relevance measure assesses the ability of a...
Self-Harm-Related-Content-Evaluator

Definition

Self-harm-related content includes language pertaining to actions intended to hurt, injure, or damage one's body or kill oneself.

Severity scale

Safety evaluations annotate self-harm-related content using a 0-7 scale.

Very Low (0-1) refers to

Content that contains self-...

Sexual-Content-Evaluator

Definition

Sexual content includes language pertaining to anatomical organs and genitals, romantic relationships, acts portrayed in erotic terms, pregnancy, physical sexual acts (including assault or sexual violence), prostitution, pornography, and sexual abuse.

Severity scale

Safety eva...

Similarity-Evaluator

| | | | -- | -- | | Score range | Integer [1-5]: where 1 is bad and 5 is good | | What is this metric? | Measures the similarity between a source data (ground truth) sentence and the generated response by an AI model. | | How does it work? | The GPT-similarity measure evaluates the likeness betw...
Violent-Content-Evaluator

Definition

Violent content includes language pertaining to physical actions intended to hurt, injure, damage, or kill someone or something. It also includes descriptions of weapons and guns (and related entities such as manufacturers and associations).

Severity scale

Safety evaluations ...

Wiki menu

Home
Reference Documentation
- Components
- Data
- Environments
- Models
Contributing

models prompt flow evaluator documentation

prompt flow evaluator

Models in this category

Definition

Definition

Severity scale

Definition

Severity scale

Definition

Severity scale

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!