Commit ae6ff78

aprilk-ms and Copilot authored

Bump version for builtin evaluators after azure_ai_project removal (#4897)

Bump versions for all 11 evaluators that had azure_ai_project removed:

- code_vulnerability: 2 -> 3
- groundedness_pro: 6 -> 7
- hate_unfairness: 2 -> 3
- indirect_attack: 2 -> 3
- prohibited_actions: 4 -> 5
- protected_material: 2 -> 3
- self_harm: 2 -> 3
- sensitive_data_leakage: 4 -> 5
- sexual: 2 -> 3
- ungrounded_attributes: 2 -> 3
- violence: 2 -> 3

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

1 parent 052a46a commit ae6ff78
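The commit applies one mechanical change per file: incrementing the integer `version:` field in each evaluator's `spec.yaml`. A minimal sketch of how such a bump could be scripted (the regex approach, function names, and repo-root layout are assumptions for illustration, not the repository's actual tooling):

```python
import re
from pathlib import Path

# Evaluators whose spec.yaml version is bumped in this commit.
EVALUATORS = [
    "code_vulnerability", "groundedness_pro", "hate_unfairness",
    "indirect_attack", "prohibited_actions", "protected_material",
    "self_harm", "sensitive_data_leakage", "sexual",
    "ungrounded_attributes", "violence",
]

def bump_version(spec_text: str) -> str:
    """Increment the integer `version:` field in a spec.yaml body."""
    def repl(m: re.Match) -> str:
        return f"{m.group(1)}{int(m.group(2)) + 1}"
    # [ \t]*$ (not \s*$) so the trailing newline is preserved.
    return re.sub(r"^(version:\s*)(\d+)[ \t]*$", repl, spec_text,
                  count=1, flags=re.M)

def bump_all(root: Path) -> None:
    # Hypothetical layout matching the paths shown in this diff.
    for name in EVALUATORS:
        spec = root / "assets" / "evaluators" / "builtin" / name / "spec.yaml"
        spec.write_text(bump_version(spec.read_text()))
```

Keeping the edit to a single regex substitution leaves the rest of each YAML file byte-for-byte unchanged, which is why every hunk below is exactly one addition and one deletion.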

File tree

11 files changed (+11 / -11 lines)

assets/evaluators/builtin/code_vulnerability/spec.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.code_vulnerability"
-version: 2
+version: 3
 displayName: "Code-Vulnerability-Evaluator"
 description: "Assesses whether generated code contains potential security flaws. Lower scores indicate safer, more secure code. Use this metric in code generation, security auditing, and software quality assurance."
 evaluatorType: "builtin"

assets/evaluators/builtin/groundedness_pro/spec.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.groundedness_pro"
-version: 6
+version: 7
 displayName: "Service-Groundedness-Evaluator"
 description: "Checks if the response is properly grounded in provided sources. A true result means it aligns with reliable context. It’s best used for retrieval-augmented generation question and answering (RAG QA). Use the groundedness pro metric when you need to verify that AI-generated responses align with and are validated by the provided context."
 evaluatorType: "builtin"

assets/evaluators/builtin/hate_unfairness/spec.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.hate_unfairness"
-version: 2
+version: 3
 displayName: "Hate-and-Unfairness-Evaluator"
 description: "Detects hateful, biased, or unfair language in the response. Lower scores indicate more neutral and inclusive content."
 evaluatorType: "builtin"

assets/evaluators/builtin/indirect_attack/spec.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.indirect_attack"
-version: 2
+version: 3
 displayName: "Indirect-Attack-Evaluator"
 description: "Identifies subtle or indirect attacks toward individuals or groups. Lower scores indicate more respectful and non‑aggressive tone."
 evaluatorType: "builtin"

assets/evaluators/builtin/prohibited_actions/spec.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.prohibited_actions"
-version: 4
+version: 5
 displayName: "Prohibited-Actions-Evaluator"
 description: "Flags content that suggests or enables actions violating safety guidelines. Helps prevent risky or disallowed behaviors."
 evaluatorType: "builtin"

assets/evaluators/builtin/protected_material/spec.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.protected_material"
-version: 2
+version: 3
 displayName: "Protected-Material-Evaluator"
 description: "Detects use of copyrighted or proprietary content in responses. Ensures generated text stays within acceptable use boundaries."
 evaluatorType: "builtin"

assets/evaluators/builtin/self_harm/spec.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.self_harm"
-version: 2
+version: 3
 displayName: "Self-Harm-Related-Content-Evaluator"
 description: "Detects language that promotes or discusses self‑harm. Lower scores indicate safer, more supportive content."
 evaluatorType: "builtin"

assets/evaluators/builtin/sensitive_data_leakage/spec.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.sensitive_data_leakage"
-version: 4
+version: 5
 displayName: "Sensitive-Data-Leakage-Evaluator"
 description: "Tests whether an AI system leaks sensitive or private data (e.g., financial, medical, or PII) when exposed to direct or obfuscated adversarial queries. Use it to detect and classify leakage risk levels—ranging from benign direct queries to high-severity outputs containing realistic sensitive information."
 evaluatorType: "builtin"

assets/evaluators/builtin/sexual/spec.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.sexual"
-version: 2
+version: 3
 displayName: "Sexual-Content-Evaluator"
 description: "Detects sexual or explicit content in responses. Lower scores indicate safer and more appropriate language."
 evaluatorType: "builtin"

assets/evaluators/builtin/ungrounded_attributes/spec.yaml

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 type: "evaluator"
 name: "builtin.ungrounded_attributes"
-version: 2
+version: 3
 displayName: "Ungrounded-Attributes-Evaluator"
 description: "Identifies details added by the model that are not supported by provided data. Helps catch hallucinated or made‑up information. This evaluator is useful for evaluating summarization, reporting, and generative ai systems where factual grounding is critical."
 evaluatorType: "builtin"
