
Conversation


dependabot[bot] commented on behalf of GitHub on Aug 4, 2025

Bumps azure-ai-evaluation from 1.8.0 to 1.10.0.

Release notes

Sourced from azure-ai-evaluation's releases.

azure-ai-evaluation_1.10.0

1.10.0 (2025-07-31)

Breaking Changes

  • Added evaluate_query parameter to all RAI service evaluators that can be passed as a keyword argument. This parameter controls whether queries are included in evaluation data when evaluating query-response pairs. Previously, queries were always included in evaluations. When set to True, both query and response will be evaluated; when set to False (default), only the response will be evaluated. This parameter is available across all RAI service evaluators including ContentSafetyEvaluator, ViolenceEvaluator, SexualEvaluator, SelfHarmEvaluator, HateUnfairnessEvaluator, ProtectedMaterialEvaluator, IndirectAttackEvaluator, CodeVulnerabilityEvaluator, UngroundedAttributesEvaluator, GroundednessProEvaluator, and EciEvaluator. Existing code that relies on queries being evaluated will need to explicitly set evaluate_query=True to maintain the previous behavior.
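
For illustration, here is a minimal sketch of restoring the previous behavior on one of the affected evaluators. The project values are placeholders, ViolenceEvaluator stands in for any of the listed RAI evaluators, and the assumption is that evaluate_query is supplied at construction time:

```python
# Sketch: opting back into query evaluation after the 1.10.0 default change.
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation import ViolenceEvaluator

azure_ai_project = {
    "subscription_id": "<subscription-id>",
    "resource_group_name": "<resource-group>",
    "project_name": "<project-name>",
}

# evaluate_query=True restores the pre-1.10.0 behavior of scoring the query
# together with the response; the new default (False) scores the response only.
violence_eval = ViolenceEvaluator(
    credential=DefaultAzureCredential(),
    azure_ai_project=azure_ai_project,
    evaluate_query=True,
)

result = violence_eval(
    query="Describe how weather affects outdoor events.",
    response="Rain can force organizers to move events indoors.",
)
print(result)
```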

Features Added

  • Added support for Azure OpenAI Python grader via the AzureOpenAIPythonGrader class, which serves as a wrapper around Azure OpenAI Python grader configurations. This new grader object can be supplied to the main evaluate method as if it were a normal callable evaluator.
  • Added attack_success_thresholds parameter to the RedTeam class for configuring custom thresholds that determine attack success. This allows users to set a specific threshold value for each risk category; scores greater than the threshold are considered successful attacks (i.e., a higher threshold means a higher tolerance for harmful responses). See the sketch after this list.
  • Enhanced threshold reporting in RedTeam results to include default threshold values when custom thresholds aren't specified, providing better transparency about the evaluation criteria used.
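
As a rough sketch of the new attack_success_thresholds parameter mentioned above: the mapping from RiskCategory values to integer severity thresholds shown here is an assumption based on the release note, so confirm the exact shape against the SDK reference.

```python
# Sketch: per-risk-category attack success thresholds for RedTeam (new in 1.10.0).
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation.red_team import RedTeam, RiskCategory

azure_ai_project = {
    "subscription_id": "<subscription-id>",
    "resource_group_name": "<resource-group>",
    "project_name": "<project-name>",
}

red_team = RedTeam(
    azure_ai_project=azure_ai_project,
    credential=DefaultAzureCredential(),
    risk_categories=[RiskCategory.Violence, RiskCategory.HateUnfairness],
    # Assumed shape: scores above a category's threshold count as successful
    # attacks, so a higher threshold means a higher tolerance for harmful responses.
    attack_success_thresholds={
        RiskCategory.Violence: 3,
        RiskCategory.HateUnfairness: 4,
    },
)
```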

Bugs Fixed

  • Fixed red team scan output_path issue where individual evaluation results were overwriting each other instead of being preserved as separate files. Individual evaluations now create unique files while the user's output_path is reserved for final aggregated results.
  • Significant improvements to the TaskAdherence evaluator. The new version has less variance, is much faster, and consumes fewer tokens.
  • Significant improvements to the Relevance evaluator. The new version has more concrete rubrics, less variance, is much faster, and consumes fewer tokens.

Other Changes

  • The default engine for evaluation was changed from promptflow (PFClient) to an in-SDK batch client (RunSubmitterClient).
    • Note: We've temporarily kept an escape hatch to fall back to the legacy promptflow implementation by setting _use_pf_client=True when invoking evaluate(). This is due to be removed in a future release.
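
For reference, the escape hatch looks roughly like this; the data path, model configuration, and choice of evaluator are placeholders:

```python
# Sketch: forcing the legacy promptflow (PFClient) engine for one evaluate() run.
from azure.ai.evaluation import evaluate, RelevanceEvaluator

model_config = {
    "azure_endpoint": "<aoai-endpoint>",
    "azure_deployment": "<deployment-name>",
    "api_key": "<api-key>",
}

result = evaluate(
    data="data.jsonl",  # placeholder: JSONL file with query/response columns
    evaluators={"relevance": RelevanceEvaluator(model_config)},
    _use_pf_client=True,  # temporary fallback; due to be removed in a future release
)
```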

azure-ai-evaluation_1.9.0

1.9.0 (2025-07-02)

Features Added

  • Added support for Azure OpenAI evaluation via the AzureOpenAIScoreModelGrader class, which serves as a wrapper around Azure OpenAI score model configurations. This new grader object can be supplied to the main evaluate method as if it were a normal callable evaluator.
  • Added new experimental risk categories ProtectedMaterial and CodeVulnerability for the red team agent scan.
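
A short sketch of selecting the new experimental categories in a red team scan; the target callback is a trivial stand-in, and treating scan() as an awaitable that accepts a target and a scan name is an assumption based on recent red-teaming samples:

```python
# Sketch: red team scan over the experimental ProtectedMaterial and
# CodeVulnerability risk categories added in 1.9.0.
import asyncio

from azure.identity import DefaultAzureCredential
from azure.ai.evaluation.red_team import RedTeam, RiskCategory

azure_ai_project = {
    "subscription_id": "<subscription-id>",
    "resource_group_name": "<resource-group>",
    "project_name": "<project-name>",
}

red_team = RedTeam(
    azure_ai_project=azure_ai_project,
    credential=DefaultAzureCredential(),
    risk_categories=[RiskCategory.ProtectedMaterial, RiskCategory.CodeVulnerability],
)

def simple_target(query: str) -> str:
    # Stand-in for the application under test.
    return "I can't help with that request."

result = asyncio.run(red_team.scan(target=simple_target, scan_name="experimental-categories"))
```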

Bugs Fixed

  • Significant improvements to the IntentResolution evaluator. The new version has less variance, is nearly 2x faster, and consumes fewer tokens.
  • Fixed MeteorScoreEvaluator and other threshold-based evaluators returning incorrect binary results due to integer conversion of decimal scores. Previously, decimal scores like 0.9375 were incorrectly converted to integers (0) before threshold comparison, causing them to fail even when above the threshold (see the sketch below this list). #41415
  • Added a new enum ADVERSARIAL_QA_DOCUMENTS, which moves all the "file_content"-type prompts out of ADVERSARIAL_QA and into the new enum.
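
To illustrate the threshold fix, a minimal sketch with MeteorScoreEvaluator (the exact output keys may differ):

```python
# Sketch: threshold-based pass/fail with MeteorScoreEvaluator after the fix.
from azure.ai.evaluation import MeteorScoreEvaluator

meteor = MeteorScoreEvaluator(threshold=0.5)
result = meteor(
    response="Paris is the capital of France.",
    ground_truth="The capital of France is Paris.",
)
# Before the fix, a decimal score such as 0.9375 was truncated to 0 prior to the
# threshold comparison, so it was reported as failing even though 0.9375 >= 0.5.
print(result)
```
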
Commits

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [azure-ai-evaluation](https://github.com/Azure/azure-sdk-for-python) from 1.8.0 to 1.10.0.
- [Release notes](https://github.com/Azure/azure-sdk-for-python/releases)
- [Changelog](https://github.com/Azure/azure-sdk-for-python/blob/main/doc/esrp_release.md)
- [Commits](Azure/azure-sdk-for-python@azure-ai-evaluation_1.8.0...azure-ai-evaluation_1.10.0)

---
updated-dependencies:
- dependency-name: azure-ai-evaluation
  dependency-version: 1.10.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
dependabot[bot] added the dependencies and python labels on Aug 4, 2025

dependabot[bot] commented on behalf of GitHub on Sep 8, 2025

Superseded by #145.

dependabot[bot] closed this on Sep 8, 2025
dependabot[bot] deleted the dependabot/pip/azure-ai-evaluation-1.10.0 branch on September 8, 2025 at 08:16
