Commit 14a41de

needuv, Neehar Duvvuri, YoYoJa, singankit, and catalinaperalta authored
Structured Eval Results + OTel Logging (#43398)
* skeleton code * add function that logs eval results to app insights * format * log red team data * get agent info from app insights config * add eval result converter (#43233) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * rename function * fix a thing * Jessli/convert (#43342) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * Groundedness Evaluator to not add tool result to tool call message (#43290) * Groundededness Evalautor to not add tool result to tool call message * Fixing reformatting issues * Add ledger certificate package (#43278) * add ledger certificate package * regen * update changelog --------- Co-authored-by: catalinaperalta <[email protected]> * [Identity] Update test-resources bicep (#43304) The vmSize for the AKS resource was updated to an SKU that is available in our subscription/location. Explicit PrincipalType fields were removed from role assignments that could potentially be user principals. Azure can automatically determine the type. 
Signed-off-by: Paul Van Eck <[email protected]> * [Communication Shared] Adding the mypy fixes (#42925) * Adding the mypy fixes * addressing the comments * addressing comments * Make docs happy * Updated docstring references --------- Co-authored-by: antisch <[email protected]> * add error msg and error code * Surface evaluator error msg --------- Signed-off-by: Paul Van Eck <[email protected]> Co-authored-by: Ankit Singhal <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: Paul Van Eck <[email protected]> Co-authored-by: Vinothini Dharmaraj <[email protected]> Co-authored-by: antisch <[email protected]> * Fix usage (#43355) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * Groundedness Evaluator to not add tool result to tool call message (#43290) * Groundededness Evalautor to not add tool result to tool call message * Fixing reformatting issues * Add ledger certificate package (#43278) * add ledger certificate package * regen * update changelog --------- Co-authored-by: catalinaperalta <[email protected]> * [Identity] Update test-resources bicep (#43304) The vmSize for the AKS resource was updated to an SKU that is available in our subscription/location. Explicit PrincipalType fields were removed from role assignments that could potentially be user principals. Azure can automatically determine the type. 
Signed-off-by: Paul Van Eck <[email protected]> * [Communication Shared] Adding the mypy fixes (#42925) * Adding the mypy fixes * addressing the comments * addressing comments * Make docs happy * Updated docstring references --------- Co-authored-by: antisch <[email protected]> * add error msg and error code * Surface evaluator error msg * update UT * fix usage --------- Signed-off-by: Paul Van Eck <[email protected]> Co-authored-by: Ankit Singhal <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: Paul Van Eck <[email protected]> Co-authored-by: Vinothini Dharmaraj <[email protected]> Co-authored-by: antisch <[email protected]> * save * add _type to evals/aoai graders * Jessli/convert make eval_meta_data optional (#43376) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * Groundedness Evaluator to not add tool result to tool call message (#43290) * Groundededness Evalautor to not add tool result to tool call message * Fixing reformatting issues * Add ledger certificate package (#43278) * add ledger certificate package * regen * update changelog --------- Co-authored-by: catalinaperalta <[email protected]> * [Identity] Update test-resources bicep (#43304) The vmSize for the AKS resource was updated to an SKU that is available in our subscription/location. Explicit PrincipalType fields were removed from role assignments that could potentially be user principals. Azure can automatically determine the type. 
Signed-off-by: Paul Van Eck <[email protected]> * [Communication Shared] Adding the mypy fixes (#42925) * Adding the mypy fixes * addressing the comments * addressing comments * Make docs happy * Updated docstring references --------- Co-authored-by: antisch <[email protected]> * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines --------- Signed-off-by: Paul Van Eck <[email protected]> Co-authored-by: Ankit Singhal <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: Paul Van Eck <[email protected]> Co-authored-by: Vinothini Dharmaraj <[email protected]> Co-authored-by: antisch <[email protected]> * add error logging for otel event emission * add input/output tokens for prompty evals * Jessli/convert - update param name to add underscore (#43411) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * exclude token counts from aggregation * add total token count to prompty output * fix prompty tests * remove fields from app insights config * make new evaluation result fields private, and add a toggle in evaluate * change output fields to be private * Jessli/convert parse annotation and add trace_id (#43463) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add 
error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * Jessli/convert add trace_id, response_id, conversation_id (#43469) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * refactor app insights push to prevent warnings * run black on code * move otel import to internal module * Jessli/convert expose sample data for sdk promty based evaluators (#43474) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * Jessli/convert remove token counts from metrics (#43477) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * 
remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * update * Jessli/convert remove useless lines and fix UT (#43480) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * update * fix UT * try changing prompty output to dict * change prompty output to dict * run black * fix relevance and prompty test * fix unit tests * fix prompty tests * fix similarity test * move groundedness to actual prompty impl * chore: Update assets.json * Jessli/convert Fix test failure (#43518) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * update * fix UT * fix tests * fix test * add extra attributes to app insights config, remove agent name/id/version/ from app insights config * pin otel<1.39.0 since breaking change coming in that version * implement scrubber for sensitive information * run black formatter * fix spelling for evaluation sdk * use non-deprecated path for 
emitting traces * remove upper bound on otel sdk * shuffle imports * Jessli/convert1021 Fxi bug (#43563) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * update * fix UT * fix tests * fix test * Jessli/convert (#43556) merge main * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * Fix column mapping bug for AOAI evaluators with custom data mapping (#43429) * fix nesting bug for custom data mapping * address comments * remove extra code and fix test case * run formatter * use dumps * Modify logic for message body on Microsoft.ApplicationInsights.MessageData to include default message for messages with empty body and export logs (#43091) * Modify logic in PR (#43060) to include default message for messages with empty body and export logs * Update CHANGELOG * Update logic as per updated spec * Addressed comments * Set-VcpkgWriteModeCache -- add token timeout param for cmake generate's that exceed 1 hour (this can happen in C++ API View) (#43470) 
Co-authored-by: Daniel Jurek <[email protected]> * update * fix UT * fix tests * Added Tests and Samples for Paginated Queries (#43472) * added tests and samples for paginated queries * Apply suggestions from code review Co-authored-by: Copilot <[email protected]> * added single partition pagination sample --------- Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> * [Test Proxy] Support AARCH64 platform (#43428) * Delete doc/dev/how_to_request_a_feature_in_sdk.md (#43415) this doc is outdated * fix test * [AutoRelease] t2-iothub-2025-10-03-03336(can only be merged by SDK owner) (#43230) * code and test * update pyproject.toml --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> * [AutoRelease] t2-redisenterprise-2025-10-17-18412(can only be merged by SDK owner) (#43476) * code and test * update changelog * update changelog * Update CHANGELOG.md --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> * Extend basic test for "project_client.agents" to do more operations (#43516) * Sync eng/common directory with azure-sdk-tools for PR 12478 (#43457) * Updated validate pkg template to use packageInfo * Fixed typo * Fixed the right variable to use * output debug log * Fixed errors in expression evaluation * removed debug code * Fixed an issue in pipeline * Updated condition for variable setting step * Join paths of the script path * Use join-path * return from the function rather than exit --------- Co-authored-by: ray chen <[email protected]> * Reorder error and warning log line processing (#43456) Co-authored-by: Wes Haggard <[email protected]> * [App Configuration] - Release 1.7.2 (#43520) * release 1.7.2 * update change log * Modify CODEOWNERS for Azure SDK ownership changes (#43524) Updated CODEOWNERS to reflect new ownership for Azure SDK components. 
* Migrate Confidential Ledger library from swagger to typespec codegen (#42664) * regen * add default cert endpoint with tsp * remove refs to old namespace * update async operation patch * fix operations patch * fix header impl * more header fixes * revert receipt directory removal * cspell * regen certificates under correct namespace * regen ledger client * update namespace name * revert certificate change * update shared files after regen * updates * delete extra files * cspell * match return type to current behavior * cspell * mypy * pylint * update docs * regen * regen * fix patch * Revert "mypy" This reverts commit 6351eadac629e4546e7c42242c52e1519b0863b3. * add info in tsp_location.yaml * regen * update patch files * update patch files * fix patch * update patch files * regen * update tsp-location.yaml * generate certificate client * update patch files * fixes * regen clients * update pyproject.toml deps * update assets * regen * revert test change * nit * fix test input * regen with new model * update tests * update tests * apiview props * regen * update tests * update assets * apiview props * temp relative package updates * fix name * fix ledger ci (#43181) * remove swagger * remove extra configs * wip revert package dep temporarily * update readme * fix config files * Revert "wip revert package dep temporarily" This reverts commit db553c4737919ee04582e316ba41635ebaa328b6. 
* move tests * add identity samples --------- Co-authored-by: catalinaperalta <[email protected]> * rm certificate files * update changelog * misc fixes * update shared reqs * test * pylint --------- Co-authored-by: catalinaperalta <[email protected]> * update scripts (#43527) Co-authored-by: helen229 <[email protected]> * [AutoPR azure-mgmt-mongocluster]-generated-from-SDK Generation - Python-5459673 (#43448) * Configurations: 'specification/mongocluster/resource-manager/Microsoft.DocumentDB/MongoCluster/tspconfig.yaml', API Version: 2025-09-01, SDK Release Type: stable, and CommitSHA: 'c5601446fc65494f18157aecbcc79cebcfbab1fb' in SpecRepo: 'https://github.com/Azure/azure-rest-api-specs' Pipeline run: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=5459673 Refer to https://eng.ms/docs/products/azure-developer-experience/develop/sdk-release/sdk-release-prerequisites to prepare for SDK release. * update changelog --------- Co-authored-by: ChenxiJiang333 <[email protected]> * App Configuration Provider - Key Vault Refresh (#41882) * Sync refresh changes * Key Vault Refresh * adding tests and fixing sync refresh * Updating Async * Fixed Async Tests * Updated tests and change log * Apply suggestions from code review Co-authored-by: Copilot <[email protected]> * Fixing merge issue * Updating comments * Updating secret refresh * Update _azureappconfigurationproviderasync.py * Fixing Optional Endpoint * fix mypy issue * fixing async test * mixing merge * fixing test after merge * Update testcase.py * Secret Provider Base * removing unused imports * updating exception * updating resolve key vault references * Review comments * fixing tests * tox updates * Updating Tests * Updating Async to be the same as sync * Fixing formatting * fixing tox and unneeded "" * fixing tox items * fix cspell + tests recording * Update test_async_secret_provider.py * Post Merge updates * Move cache to shared code * removed unneeded disabled * Update Secret Provider * Updating 
usage * Update assets.json * Updated to make secret refresh update dictionary * removing _secret_version_cache * Update assets.json * Update _secret_provider_base.py --------- Co-authored-by: Copilot <[email protected]> * Increment package version after release of azure-appconfiguration (#43531) * Patch `azure-template` back to `green` (#43533) * Update sdk/template/azure-template/pyproject.toml to use `repository` instead of `source` * added brackets for sql query keyword value (#43525) Co-authored-by: Andrew Mathew <[email protected]> * update changelog (#43532) Co-authored-by: catalinaperalta <[email protected]> * App Config Provider - Provider Refactor (#43196) * Code Cleanup * Move validation to shared file * Updating Header Check * Update test_azureappconfigurationproviderbase.py * moved async tests to aio folder * post merge updates --------- Co-authored-by: Ethan Winters <[email protected]> Co-authored-by: rads-1996 <[email protected]> Co-authored-by: Azure SDK Bot <[email protected]> Co-authored-by: Daniel Jurek <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: McCoy Patiño <[email protected]> Co-authored-by: Yuchao Yan <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Darren Cohen <[email protected]> Co-authored-by: ray chen <[email protected]> Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Zhiyuan Liang <[email protected]> Co-authored-by: Matthew Metcalf <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: helen229 <[email protected]> Co-authored-by: Scott Beddall <[email protected]> * Jessli/convert Fix bug (#43557) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * 
remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * Fix column mapping bug for AOAI evaluators with custom data mapping (#43429) * fix nesting bug for custom data mapping * address comments * remove extra code and fix test case * run formatter * use dumps * Modify logic for message body on Microsoft.ApplicationInsights.MessageData to include default message for messages with empty body and export logs (#43091) * Modify logic in PR (#43060) to include default message for messages with empty body and export logs * Update CHANGELOG * Update logic as per updated spec * Addressed comments * Set-VcpkgWriteModeCache -- add token timeout param for cmake generate's that exceed 1 hour (this can happen in C++ API View) (#43470) Co-authored-by: Daniel Jurek <[email protected]> * update * fix UT * fix tests * Added Tests and Samples for Paginated Queries (#43472) * added tests and samples for paginated queries * Apply suggestions from code review Co-authored-by: Copilot <[email protected]> * added single partition pagination sample --------- Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> * [Test Proxy] Support AARCH64 platform (#43428) * Delete doc/dev/how_to_request_a_feature_in_sdk.md (#43415) this doc is outdated * fix test * [AutoRelease] t2-iothub-2025-10-03-03336(can only be merged by SDK owner) (#43230) * code and test * update pyproject.toml --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> * [AutoRelease] t2-redisenterprise-2025-10-17-18412(can only be merged by SDK owner) (#43476) * 
code and test * update changelog * update changelog * Update CHANGELOG.md --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> * Extend basic test for "project_client.agents" to do more operations (#43516) * Sync eng/common directory with azure-sdk-tools for PR 12478 (#43457) * Updated validate pkg template to use packageInfo * Fixed typo * Fixed the right variable to use * output debug log * Fixed errors in expression evaluation * removed debug code * Fixed an issue in pipeline * Updated condition for variable setting step * Join paths of the script path * Use join-path * return from the function rather than exit --------- Co-authored-by: ray chen <[email protected]> * Reorder error and warning log line processing (#43456) Co-authored-by: Wes Haggard <[email protected]> * [App Configuration] - Release 1.7.2 (#43520) * release 1.7.2 * update change log * Modify CODEOWNERS for Azure SDK ownership changes (#43524) Updated CODEOWNERS to reflect new ownership for Azure SDK components. * Migrate Confidential Ledger library from swagger to typespec codegen (#42664) * regen * add default cert endpoint with tsp * remove refs to old namespace * update async operation patch * fix operations patch * fix header impl * more header fixes * revert receipt directory removal * cspell * regen certificates under correct namespace * regen ledger client * update namespace name * revert certificate change * update shared files after regen * updates * delete extra files * cspell * match return type to current behavior * cspell * mypy * pylint * update docs * regen * regen * fix patch * Revert "mypy" This reverts commit 6351eadac629e4546e7c42242c52e1519b0863b3. 
* add info in tsp_location.yaml * regen * update patch files * update patch files * fix patch * update patch files * regen * update tsp-location.yaml * generate certificate client * update patch files * fixes * regen clients * update pyproject.toml deps * update assets * regen * revert test change * nit * fix test input * regen with new model * update tests * update tests * apiview props * regen * update tests * update assets * apiview props * temp relative package updates * fix name * fix ledger ci (#43181) * remove swagger * remove extra configs * wip revert package dep temporarily * update readme * fix config files * Revert "wip revert package dep temporarily" This reverts commit db553c4737919ee04582e316ba41635ebaa328b6. * move tests * add identity samples --------- Co-authored-by: catalinaperalta <[email protected]> * rm certificate files * update changelog * misc fixes * update shared reqs * test * pylint --------- Co-authored-by: catalinaperalta <[email protected]> * update scripts (#43527) Co-authored-by: helen229 <[email protected]> * [AutoPR azure-mgmt-mongocluster]-generated-from-SDK Generation - Python-5459673 (#43448) * Configurations: 'specification/mongocluster/resource-manager/Microsoft.DocumentDB/MongoCluster/tspconfig.yaml', API Version: 2025-09-01, SDK Release Type: stable, and CommitSHA: 'c5601446fc65494f18157aecbcc79cebcfbab1fb' in SpecRepo: 'https://github.com/Azure/azure-rest-api-specs' Pipeline run: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=5459673 Refer to https://eng.ms/docs/products/azure-developer-experience/develop/sdk-release/sdk-release-prerequisites to prepare for SDK release. 
* update changelog --------- Co-authored-by: ChenxiJiang333 <[email protected]> * App Configuration Provider - Key Vault Refresh (#41882) * Sync refresh changes * Key Vault Refresh * adding tests and fixing sync refresh * Updating Async * Fixed Async Tests * Updated tests and change log * Apply suggestions from code review Co-authored-by: Copilot <[email protected]> * Fixing merge issue * Updating comments * Updating secret refresh * Update _azureappconfigurationproviderasync.py * Fixing Optional Endpoint * fix mypy issue * fixing async test * mixing merge * fixing test after merge * Update testcase.py * Secret Provider Base * removing unused imports * updating exception * updating resolve key vault references * Review comments * fixing tests * tox updates * Updating Tests * Updating Async to be the same as sync * Fixing formatting * fixing tox and unneeded "" * fixing tox items * fix cspell + tests recording * Update test_async_secret_provider.py * Post Merge updates * Move cache to shared code * removed unneeded disabled * Update Secret Provider * Updating usage * Update assets.json * Updated to make secret refresh update dictionary * removing _secret_version_cache * Update assets.json * Update _secret_provider_base.py --------- Co-authored-by: Copilot <[email protected]> * Increment package version after release of azure-appconfiguration (#43531) * Patch `azure-template` back to `green` (#43533) * Update sdk/template/azure-template/pyproject.toml to use `repository` instead of `source` * added brackets for sql query keyword value (#43525) Co-authored-by: Andrew Mathew <[email protected]> * update changelog (#43532) Co-authored-by: catalinaperalta <[email protected]> * App Config Provider - Provider Refactor (#43196) * Code Cleanup * Move validation to shared file * Updating Header Check * Update test_azureappconfigurationproviderbase.py * moved async tests to aio folder * post merge updates --------- Co-authored-by: Ethan Winters <[email protected]> 
Co-authored-by: rads-1996 <[email protected]> Co-authored-by: Azure SDK Bot <[email protected]> Co-authored-by: Daniel Jurek <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: McCoy Patiño <[email protected]> Co-authored-by: Yuchao Yan <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Darren Cohen <[email protected]> Co-authored-by: ray chen <[email protected]> Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Zhiyuan Liang <[email protected]> Co-authored-by: Matthew Metcalf <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: helen229 <[email protected]> Co-authored-by: Scott Beddall <[email protected]> * fix bug --------- Co-authored-by: Ethan Winters <[email protected]> Co-authored-by: rads-1996 <[email protected]> Co-authored-by: Azure SDK Bot <[email protected]> Co-authored-by: Daniel Jurek <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: McCoy Patiño <[email protected]> Co-authored-by: Yuchao Yan <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Darren Cohen <[email protected]> Co-authored-by: ray chen <[email protected]> Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Zhiyuan Liang <[email protected]> Co-authored-by: Matthew Metcalf <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: helen229 <[email protected]> Co-authored-by: Scott Beddall <[email protected]> * Jessli/convert1021 fix null value and black run (#43567) * add eval 
result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * update * fix UT * fix tests * fix test * Jessli/convert (#43556) merge main * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * Fix column mapping bug for AOAI evaluators with custom data mapping (#43429) * fix nesting bug for custom data mapping * address comments * remove extra code and fix test case * run formatter * use dumps * Modify logic for message body on Microsoft.ApplicationInsights.MessageData to include default message for messages with empty body and export logs (#43091) * Modify logic in PR (#43060) to include default message for messages with empty body and export logs * Update CHANGELOG * Update logic as per updated spec * Addressed comments * Set-VcpkgWriteModeCache -- add token timeout param for cmake generate's that exceed 1 hour (this can happen in C++ API View) (#43470) Co-authored-by: Daniel Jurek <[email protected]> * update * fix UT * fix tests * Added Tests and Samples for Paginated Queries 
(#43472) * added tests and samples for paginated queries * Apply suggestions from code review Co-authored-by: Copilot <[email protected]> * added single partition pagination sample --------- Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> * [Test Proxy] Support AARCH64 platform (#43428) * Delete doc/dev/how_to_request_a_feature_in_sdk.md (#43415) this doc is outdated * fix test * [AutoRelease] t2-iothub-2025-10-03-03336(can only be merged by SDK owner) (#43230) * code and test * update pyproject.toml --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> * [AutoRelease] t2-redisenterprise-2025-10-17-18412(can only be merged by SDK owner) (#43476) * code and test * update changelog * update changelog * Update CHANGELOG.md --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> * Extend basic test for "project_client.agents" to do more operations (#43516) * Sync eng/common directory with azure-sdk-tools for PR 12478 (#43457) * Updated validate pkg template to use packageInfo * Fixed typo * Fixed the right variable to use * output debug log * Fixed errors in expression evaluation * removed debug code * Fixed an issue in pipeline * Updated condition for variable setting step * Join paths of the script path * Use join-path * return from the function rather than exit --------- Co-authored-by: ray chen <[email protected]> * Reorder error and warning log line processing (#43456) Co-authored-by: Wes Haggard <[email protected]> * [App Configuration] - Release 1.7.2 (#43520) * release 1.7.2 * update change log * Modify CODEOWNERS for Azure SDK ownership changes (#43524) Updated CODEOWNERS to reflect new ownership for Azure SDK components. 
* Migrate Confidential Ledger library from swagger to typespec codegen (#42664) * regen * add default cert endpoint with tsp * remove refs to old namespace * update async operation patch * fix operations patch * fix header impl * more header fixes * revert receipt directory removal * cspell * regen certificates under correct namespace * regen ledger client * update namespace name * revert certificate change * update shared files after regen * updates * delete extra files * cspell * match return type to current behavior * cspell * mypy * pylint * update docs * regen * regen * fix patch * Revert "mypy" This reverts commit 6351eadac629e4546e7c42242c52e1519b0863b3. * add info in tsp_location.yaml * regen * update patch files * update patch files * fix patch * update patch files * regen * update tsp-location.yaml * generate certificate client * update patch files * fixes * regen clients * update pyproject.toml deps * update assets * regen * revert test change * nit * fix test input * regen with new model * update tests * update tests * apiview props * regen * update tests * update assets * apiview props * temp relative package updates * fix name * fix ledger ci (#43181) * remove swagger * remove extra configs * wip revert package dep temporarily * update readme * fix config files * Revert "wip revert package dep temporarily" This reverts commit db553c4737919ee04582e316ba41635ebaa328b6. 
* move tests * add identity samples --------- Co-authored-by: catalinaperalta <[email protected]> * rm certificate files * update changelog * misc fixes * update shared reqs * test * pylint --------- Co-authored-by: catalinaperalta <[email protected]> * update scripts (#43527) Co-authored-by: helen229 <[email protected]> * [AutoPR azure-mgmt-mongocluster]-generated-from-SDK Generation - Python-5459673 (#43448) * Configurations: 'specification/mongocluster/resource-manager/Microsoft.DocumentDB/MongoCluster/tspconfig.yaml', API Version: 2025-09-01, SDK Release Type: stable, and CommitSHA: 'c5601446fc65494f18157aecbcc79cebcfbab1fb' in SpecRepo: 'https://github.com/Azure/azure-rest-api-specs' Pipeline run: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=5459673 Refer to https://eng.ms/docs/products/azure-developer-experience/develop/sdk-release/sdk-release-prerequisites to prepare for SDK release. * update changelog --------- Co-authored-by: ChenxiJiang333 <[email protected]> * App Configuration Provider - Key Vault Refresh (#41882) * Sync refresh changes * Key Vault Refresh * adding tests and fixing sync refresh * Updating Async * Fixed Async Tests * Updated tests and change log * Apply suggestions from code review Co-authored-by: Copilot <[email protected]> * Fixing merge issue * Updating comments * Updating secret refresh * Update _azureappconfigurationproviderasync.py * Fixing Optional Endpoint * fix mypy issue * fixing async test * mixing merge * fixing test after merge * Update testcase.py * Secret Provider Base * removing unused imports * updating exception * updating resolve key vault references * Review comments * fixing tests * tox updates * Updating Tests * Updating Async to be the same as sync * Fixing formatting * fixing tox and unneeded "" * fixing tox items * fix cspell + tests recording * Update test_async_secret_provider.py * Post Merge updates * Move cache to shared code * removed unneeded disabled * Update Secret Provider * Updating 
usage * Update assets.json * Updated to make secret refresh update dictionary * removing _secret_version_cache * Update assets.json * Update _secret_provider_base.py --------- Co-authored-by: Copilot <[email protected]> * Increment package version after release of azure-appconfiguration (#43531) * Patch `azure-template` back to `green` (#43533) * Update sdk/template/azure-template/pyproject.toml to use `repository` instead of `source` * added brackets for sql query keyword value (#43525) Co-authored-by: Andrew Mathew <[email protected]> * update changelog (#43532) Co-authored-by: catalinaperalta <[email protected]> * App Config Provider - Provider Refactor (#43196) * Code Cleanup * Move validation to shared file * Updating Header Check * Update test_azureappconfigurationproviderbase.py * moved async tests to aio folder * post merge updates --------- Co-authored-by: Ethan Winters <[email protected]> Co-authored-by: rads-1996 <[email protected]> Co-authored-by: Azure SDK Bot <[email protected]> Co-authored-by: Daniel Jurek <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: McCoy Patiño <[email protected]> Co-authored-by: Yuchao Yan <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Darren Cohen <[email protected]> Co-authored-by: ray chen <[email protected]> Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Zhiyuan Liang <[email protected]> Co-authored-by: Matthew Metcalf <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: helen229 <[email protected]> Co-authored-by: Scott Beddall <[email protected]> * Jessli/convert Fix bug (#43557) * add eval result converter * Add result converter * update converter params to optional * add eval meta data * fix type * 
remove useless file * get eval meta data as input * fix build errors * remove useless import * resolve comments * update * update comments * fix checker failure * add error msg and error code * Surface evaluator error msg * update UT * fix usage * make eval_meta_data optional * remove useless lines * update param name to add underscore * parse updated annotation results * update trace_id * expose sample data for sdk evaluators * update * Fix column mapping bug for AOAI evaluators with custom data mapping (#43429) * fix nesting bug for custom data mapping * address comments * remove extra code and fix test case * run formatter * use dumps * Modify logic for message body on Microsoft.ApplicationInsights.MessageData to include default message for messages with empty body and export logs (#43091) * Modify logic in PR (#43060) to include default message for messages with empty body and export logs * Update CHANGELOG * Update logic as per updated spec * Addressed comments * Set-VcpkgWriteModeCache -- add token timeout param for cmake generate's that exceed 1 hour (this can happen in C++ API View) (#43470) Co-authored-by: Daniel Jurek <[email protected]> * update * fix UT * fix tests * Added Tests and Samples for Paginated Queries (#43472) * added tests and samples for paginated queries * Apply suggestions from code review Co-authored-by: Copilot <[email protected]> * added single partition pagination sample --------- Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> * [Test Proxy] Support AARCH64 platform (#43428) * Delete doc/dev/how_to_request_a_feature_in_sdk.md (#43415) this doc is outdated * fix test * [AutoRelease] t2-iothub-2025-10-03-03336(can only be merged by SDK owner) (#43230) * code and test * update pyproject.toml --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> * [AutoRelease] t2-redisenterprise-2025-10-17-18412(can only be merged by SDK owner) (#43476) * 
code and test * update changelog * update changelog * Update CHANGELOG.md --------- Co-authored-by: azure-sdk <PythonSdkPipelines> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> * Extend basic test for "project_client.agents" to do more operations (#43516) * Sync eng/common directory with azure-sdk-tools for PR 12478 (#43457) * Updated validate pkg template to use packageInfo * Fixed typo * Fixed the right variable to use * output debug log * Fixed errors in expression evaluation * removed debug code * Fixed an issue in pipeline * Updated condition for variable setting step * Join paths of the script path * Use join-path * return from the function rather than exit --------- Co-authored-by: ray chen <[email protected]> * Reorder error and warning log line processing (#43456) Co-authored-by: Wes Haggard <[email protected]> * [App Configuration] - Release 1.7.2 (#43520) * release 1.7.2 * update change log * Modify CODEOWNERS for Azure SDK ownership changes (#43524) Updated CODEOWNERS to reflect new ownership for Azure SDK components. * Migrate Confidential Ledger library from swagger to typespec codegen (#42664) * regen * add default cert endpoint with tsp * remove refs to old namespace * update async operation patch * fix operations patch * fix header impl * more header fixes * revert receipt directory removal * cspell * regen certificates under correct namespace * regen ledger client * update namespace name * revert certificate change * update shared files after regen * updates * delete extra files * cspell * match return type to current behavior * cspell * mypy * pylint * update docs * regen * regen * fix patch * Revert "mypy" This reverts commit 6351eadac629e4546e7c42242c52e1519b0863b3. 
* add info in tsp_location.yaml * regen * update patch files * update patch files * fix patch * update patch files * regen * update tsp-location.yaml * generate certificate client * update patch files * fixes * regen clients * update pyproject.toml deps * update assets * regen * revert test change * nit * fix test input * regen with new model * update tests * update tests * apiview props * regen * update tests * update assets * apiview props * temp relative package updates * fix name * fix ledger ci (#43181) * remove swagger * remove extra configs * wip revert package dep temporarily * update readme * fix config files * Revert "wip revert package dep temporarily" This reverts commit db553c4737919ee04582e316ba41635ebaa328b6. * move tests * add identity samples --------- Co-authored-by: catalinaperalta <[email protected]> * rm certificate files * update changelog * misc fixes * update shared reqs * test * pylint --------- Co-authored-by: catalinaperalta <[email protected]> * update scripts (#43527) Co-authored-by: helen229 <[email protected]> * [AutoPR azure-mgmt-mongocluster]-generated-from-SDK Generation - Python-5459673 (#43448) * Configurations: 'specification/mongocluster/resource-manager/Microsoft.DocumentDB/MongoCluster/tspconfig.yaml', API Version: 2025-09-01, SDK Release Type: stable, and CommitSHA: 'c5601446fc65494f18157aecbcc79cebcfbab1fb' in SpecRepo: 'https://github.com/Azure/azure-rest-api-specs' Pipeline run: https://dev.azure.com/azure-sdk/internal/_build/results?buildId=5459673 Refer to https://eng.ms/docs/products/azure-developer-experience/develop/sdk-release/sdk-release-prerequisites to prepare for SDK release. 
* update changelog --------- Co-authored-by: ChenxiJiang333 <[email protected]> * App Configuration Provider - Key Vault Refresh (#41882) * Sync refresh changes * Key Vault Refresh * adding tests and fixing sync refresh * Updating Async * Fixed Async Tests * Updated tests and change log * Apply suggestions from code review Co-authored-by: Copilot <[email protected]> * Fixing merge issue * Updating comments * Updating secret refresh * Update _azureappconfigurationproviderasync.py * Fixing Optional Endpoint * fix mypy issue * fixing async test * mixing merge * fixing test after merge * Update testcase.py * Secret Provider Base * removing unused imports * updating exception * updating resolve key vault references * Review comments * fixing tests * tox updates * Updating Tests * Updating Async to be the same as sync * Fixing formatting * fixing tox and unneeded "" * fixing tox items * fix cspell + tests recording * Update test_async_secret_provider.py * Post Merge updates * Move cache to shared code * removed unneeded disabled * Update Secret Provider * Updating usage * Update assets.json * Updated to make secret refresh update dictionary * removing _secret_version_cache * Update assets.json * Update _secret_provider_base.py --------- Co-authored-by: Copilot <[email protected]> * Increment package version after release of azure-appconfiguration (#43531) * Patch `azure-template` back to `green` (#43533) * Update sdk/template/azure-template/pyproject.toml to use `repository` instead of `source` * added brackets for sql query keyword value (#43525) Co-authored-by: Andrew Mathew <[email protected]> * update changelog (#43532) Co-authored-by: catalinaperalta <[email protected]> * App Config Provider - Provider Refactor (#43196) * Code Cleanup * Move validation to shared file * Updating Header Check * Update test_azureappconfigurationproviderbase.py * moved async tests to aio folder * post merge updates --------- Co-authored-by: Ethan Winters <[email protected]> 
Co-authored-by: rads-1996 <[email protected]> Co-authored-by: Azure SDK Bot <[email protected]> Co-authored-by: Daniel Jurek <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: McCoy Patiño <[email protected]> Co-authored-by: Yuchao Yan <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Darren Cohen <[email protected]> Co-authored-by: ray chen <[email protected]> Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Zhiyuan Liang <[email protected]> Co-authored-by: Matthew Metcalf <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: helen229 <[email protected]> Co-authored-by: Scott Beddall <[email protected]> * fix bug * fix null value --------- Co-authored-by: Ethan Winters <[email protected]> Co-authored-by: rads-1996 <[email protected]> Co-authored-by: Azure SDK Bot <[email protected]> Co-authored-by: Daniel Jurek <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: McCoy Patiño <[email protected]> Co-authored-by: Yuchao Yan <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Darren Cohen <[email protected]> Co-authored-by: ray chen <[email protected]> Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Zhiyuan Liang <[email protected]> Co-authored-by: Matthew Metcalf <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: helen229 <[email protected]> Co-authored-by: Scott Beddall <[email protected]> * add span id logging * [fix] enrich log attributes 
(#43522) * [fix] enrich log attributes * fix * fix * tune conflict --------- Co-authored-by: zyysurely <[email protected]> * fix up code * fix property --------- Signed-off-by: Paul Van Eck <[email protected]> Co-authored-by: Neehar Duvvuri <[email protected]> Co-authored-by: Jessie Li <[email protected]> Co-authored-by: Ankit Singhal <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: catalinaperalta <[email protected]> Co-authored-by: Paul Van Eck <[email protected]> Co-authored-by: Vinothini Dharmaraj <[email protected]> Co-authored-by: antisch <[email protected]> Co-authored-by: kdestin <[email protected]> Co-authored-by: Ethan Winters <[email protected]> Co-authored-by: rads-1996 <[email protected]> Co-authored-by: Azure SDK Bot <[email protected]> Co-authored-by: Daniel Jurek <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Andrew Mathew <[email protected]> Co-authored-by: Copilot <[email protected]> Co-authored-by: McCoy Patiño <[email protected]> Co-authored-by: Yuchao Yan <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: ChenxiJiang333 <[email protected]> Co-authored-by: Darren Cohen <[email protected]> Co-authored-by: ray chen <[email protected]> Co-authored-by: Wes Haggard <[email protected]> Co-authored-by: Zhiyuan Liang <[email protected]> Co-authored-by: Matthew Metcalf <[email protected]> Co-authored-by: helen229 <[email protected]> Co-authored-by: Scott Beddall <[email protected]> Co-authored-by: zyying <[email protected]> Co-authored-by: zyysurely <[email protected]>
1 parent 0aa98bf commit 14a41de

29 files changed, +1691 -58 lines changed

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/label_grader.py

Lines changed: 2 additions & 1 deletion
@@ -43,6 +43,7 @@ class AzureOpenAILabelGrader(AzureOpenAIGrader):
     """

     id = "azureai://built-in/evaluators/azure-openai/label_grader"
+    _type = "label_model"

     def __init__(
         self,
@@ -62,6 +63,6 @@ def __init__(
             model=model,
             name=name,
             passing_labels=passing_labels,
-            type="label_model",
+            type=AzureOpenAILabelGrader._type,
         )
         super().__init__(model_config=model_config, grader_config=grader, credential=credential, **kwargs)
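
The grader changes in this commit all follow one pattern: the string literal passed as `type=` moves to a class-level `_type` attribute, so the grader type can presumably be read from the class without constructing an instance. A minimal sketch of that pattern, assuming hypothetical stand-in classes (`BaseGrader`, `LabelGrader`, `StringCheckGrader`, and the `TYPE_TO_CLASS` lookup are illustrative and not part of the SDK):

```python
class BaseGrader:
    """Hypothetical stand-in for AzureOpenAIGrader; not the SDK class."""

    _type = "base"

    def __init__(self, **config):
        self.config = config


class LabelGrader(BaseGrader):
    _type = "label_model"  # single source of truth for the grader type

    def __init__(self, name):
        # The constructor reuses the class attribute instead of repeating the literal.
        super().__init__(name=name, type=LabelGrader._type)


class StringCheckGrader(BaseGrader):
    _type = "string_check"

    def __init__(self, name):
        super().__init__(name=name, type=StringCheckGrader._type)


def grader_type_of(grader_cls):
    """Read the grader type from the class, without constructing an instance."""
    return grader_cls._type


# Illustrative reverse lookup that the class attribute makes possible.
TYPE_TO_CLASS = {cls._type: cls for cls in (LabelGrader, StringCheckGrader)}
```

With the literal defined once per class, code that only holds the class (for example a result converter) can look up the type string and the constructor cannot drift from it.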

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/python_grader.py

Lines changed: 2 additions & 1 deletion
@@ -54,6 +54,7 @@ class AzureOpenAIPythonGrader(AzureOpenAIGrader):
     """

     id = "azureai://built-in/evaluators/azure-openai/python_grader"
+    _type = "python"

     def __init__(
         self,
@@ -79,7 +80,7 @@ def __init__(
             image_tag=image_tag,
             pass_threshold=pass_threshold,
             source=source,
-            type="python",
+            type=AzureOpenAIPythonGrader._type,
         )

         super().__init__(model_config=model_config, grader_config=grader, credential=credential, **kwargs)

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/score_model_grader.py

Lines changed: 2 additions & 1 deletion
@@ -49,6 +49,7 @@ class AzureOpenAIScoreModelGrader(AzureOpenAIGrader):
     """

     id = "azureai://built-in/evaluators/azure-openai/score_model_grader"
+    _type = "score_model"

     def __init__(
         self,
@@ -80,7 +81,7 @@ def __init__(
         self.pass_threshold = pass_threshold

         # Create OpenAI ScoreModelGrader instance
-        grader_kwargs = {"input": input, "model": model, "name": name, "type": "score_model"}
+        grader_kwargs = {"input": input, "model": model, "name": name, "type": AzureOpenAIScoreModelGrader._type}

         if range is not None:
             grader_kwargs["range"] = range

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/string_check_grader.py

Lines changed: 2 additions & 1 deletion
@@ -38,6 +38,7 @@ class AzureOpenAIStringCheckGrader(AzureOpenAIGrader):
     """

     id = "azureai://built-in/evaluators/azure-openai/string_check_grader"
+    _type = "string_check"

     def __init__(
         self,
@@ -60,6 +61,6 @@ def __init__(
             name=name,
             operation=operation,
             reference=reference,
-            type="string_check",
+            type=AzureOpenAIStringCheckGrader._type,
         )
         super().__init__(model_config=model_config, grader_config=grader, credential=credential, **kwargs)

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_aoai/text_similarity_grader.py

Lines changed: 2 additions & 1 deletion
@@ -43,6 +43,7 @@ class AzureOpenAITextSimilarityGrader(AzureOpenAIGrader):
     """

     id = "azureai://built-in/evaluators/azure-openai/text_similarity_grader"
+    _type = "text_similarity"

     def __init__(
         self,
@@ -74,6 +75,6 @@ def __init__(
             pass_threshold=pass_threshold,
             name=name,
             reference=reference,
-            type="text_similarity",
+            type=AzureOpenAITextSimilarityGrader._type,
         )
         super().__init__(model_config=model_config, grader_config=grader, credential=credential, **kwargs)

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_common/rai_service.py

Lines changed: 102 additions & 1 deletion
@@ -411,6 +411,25 @@ def parse_response( # pylint: disable=too-many-branches,too-many-statements
                 result[pm_metric_name + "_reason"] = (
                     parsed_response["reasoning"] if "reasoning" in parsed_response else ""
                 )
+                result[pm_metric_name + "_total_tokens"] = (
+                    parsed_response["totalTokenCount"] if "totalTokenCount" in parsed_response else ""
+                )
+                result[pm_metric_name + "_prompt_tokens"] = (
+                    parsed_response["inputTokenCount"] if "inputTokenCount" in parsed_response else ""
+                )
+                result[pm_metric_name + "_completion_tokens"] = (
+                    parsed_response["outputTokenCount"] if "outputTokenCount" in parsed_response else ""
+                )
+                result[pm_metric_name + "_finish_reason"] = (
+                    parsed_response["finish_reason"] if "finish_reason" in parsed_response else ""
+                )
+                result[pm_metric_name + "_sample_input"] = (
+                    parsed_response["sample_input"] if "sample_input" in parsed_response else ""
+                )
+                result[pm_metric_name + "_sample_output"] = (
+                    parsed_response["sample_output"] if "sample_output" in parsed_response else ""
+                )
+                result[pm_metric_name + "_model"] = parsed_response["model"] if "model" in parsed_response else ""
             return result
         if metric_name not in batch_response[0]:
             return {}
@@ -442,9 +461,39 @@ def parse_response( # pylint: disable=too-many-branches,too-many-statements
             # Add all attributes under the details.
             details = {}
             for key, value in parsed_response.items():
-                if key not in {"label", "reasoning", "version"}:
+                if key not in {
+                    "label",
+                    "reasoning",
+                    "version",
+                    "totalTokenCount",
+                    "inputTokenCount",
+                    "outputTokenCount",
+                    "finish_reason",
+                    "sample_input",
+                    "sample_output",
+                    "model",
+                }:
                     details[key.replace("-", "_")] = value
             result[metric_display_name + "_details"] = details
+        result[metric_display_name + "_total_tokens"] = (
+            parsed_response["totalTokenCount"] if "totalTokenCount" in parsed_response else ""
+        )
+        result[metric_display_name + "_prompt_tokens"] = (
+            parsed_response["inputTokenCount"] if "inputTokenCount" in parsed_response else ""
+        )
+        result[metric_display_name + "_completion_tokens"] = (
+            parsed_response["outputTokenCount"] if "outputTokenCount" in parsed_response else ""
+        )
+        result[metric_display_name + "_finish_reason"] = (
+            parsed_response["finish_reason"] if "finish_reason" in parsed_response else ""
+        )
+        result[metric_display_name + "_sample_input"] = (
+            parsed_response["sample_input"] if "sample_input" in parsed_response else ""
+        )
+        result[metric_display_name + "_sample_output"] = (
+            parsed_response["sample_output"] if "sample_output" in parsed_response else ""
+        )
+        result[metric_display_name + "_model"] = parsed_response["model"] if "model" in parsed_response else ""
         return result
     return _parse_content_harm_response(batch_response, metric_name, metric_display_name)

@@ -484,6 +533,13 @@ def _parse_content_harm_response(
     except Exception: # pylint: disable=broad-exception-caught
         harm_response = response[metric_name]

+    total_tokens = 0
+    prompt_tokens = 0
+    completion_tokens = 0
+    finish_reason = ""
+    sample_input = ""
+    sample_output = ""
+    model = ""
     if harm_response != "" and isinstance(harm_response, dict):
         # check if "output" is one key in harm_response
         if "output" in harm_response:
@@ -511,6 +567,44 @@ def _parse_content_harm_response(
             reason = harm_response["reason"]
         else:
             reason = ""
+
+        # get token_usage
+        if "totalTokenCount" in harm_response:
+            total_tokens = harm_response["totalTokenCount"]
+        else:
+            total_tokens = 0
+        if "inputTokenCount" in harm_response:
+            prompt_tokens = harm_response["inputTokenCount"]
+        else:
+            prompt_tokens = 0
+        if "outputTokenCount" in harm_response:
+            completion_tokens = harm_response["outputTokenCount"]
+        else:
+            completion_tokens = 0
+
+        # get finish_reason
+        if "finish_reason" in harm_response:
+            finish_reason = harm_response["finish_reason"]
+        else:
+            finish_reason = ""
+
+        # get sample_input
+        if "sample_input" in harm_response:
+            sample_input = harm_response["sample_input"]
+        else:
+            sample_input = ""
+
+        # get sample_output
+        if "sample_output" in harm_response:
+            sample_output = harm_response["sample_output"]
+        else:
+            sample_output = ""
+
+        # get model
+        if "model" in harm_response:
+            model = harm_response["model"]
+        else:
+            model = ""
     elif harm_response != "" and isinstance(harm_response, str):
         metric_value_match = re.findall(r"(\b[0-7])\b", harm_response)
         if metric_value_match:
@@ -537,6 +631,13 @@ def _parse_content_harm_response(
     result[key] = get_harm_severity_level(harm_score)
     result[key + "_score"] = harm_score
     result[key + "_reason"] = reason
+    result[key + "_total_tokens"] = total_tokens
+    result[key + "_prompt_tokens"] = prompt_tokens
+    result[key + "_completion_tokens"] = completion_tokens
+    result[key + "_finish_reason"] = finish_reason
+    result[key + "_sample_input"] = sample_input
+    result[key + "_sample_output"] = sample_output
+    result[key + "_model"] = model

     return result
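
The `rai_service.py` hunks repeat one idea many times: copy a fixed set of optional service fields into prefixed result keys, falling back to a default when a field is absent, and keep those fields out of `_details`. The sketch below restates that idea with `dict.get`. It is an illustration only: `extract_usage_fields`, `build_details`, and the two module-level tables are hypothetical names, not functions in the SDK, and the commit itself writes each assignment out explicitly.

```python
from typing import Any, Dict

# Service key -> result-key suffix, as they appear in the diff above.
_FIELD_SUFFIXES = {
    "totalTokenCount": "_total_tokens",
    "inputTokenCount": "_prompt_tokens",
    "outputTokenCount": "_completion_tokens",
    "finish_reason": "_finish_reason",
    "sample_input": "_sample_input",
    "sample_output": "_sample_output",
    "model": "_model",
}
_TOKEN_KEYS = {"totalTokenCount", "inputTokenCount", "outputTokenCount"}

# Keys the diff excludes from the "_details" dictionary.
_RESERVED_KEYS = {"label", "reasoning", "version", *_FIELD_SUFFIXES}


def extract_usage_fields(parsed: Dict[str, Any], prefix: str, token_default: Any = "") -> Dict[str, Any]:
    """Hypothetical helper: copy optional usage fields into prefixed result keys.

    parse_response in the diff defaults every field to "", while
    _parse_content_harm_response defaults the token counts to 0, hence
    the token_default parameter.
    """
    out: Dict[str, Any] = {}
    for service_key, suffix in _FIELD_SUFFIXES.items():
        default = token_default if service_key in _TOKEN_KEYS else ""
        # dict.get(k, d) is equivalent to `parsed[k] if k in parsed else d`.
        out[prefix + suffix] = parsed.get(service_key, default)
    return out


def build_details(parsed: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: non-reserved keys go under details, with '-' replaced by '_'."""
    return {key.replace("-", "_"): value for key, value in parsed.items() if key not in _RESERVED_KEYS}


# Example input with made-up values.
example = {"label": True, "totalTokenCount": 42, "model": "example-model", "code-injection": False}
usage = extract_usage_fields(example, "code_vulnerability")
details = build_details(example)
```

One difference worth keeping in mind when reading the diff: the label-style branches use an empty string for missing token counts, whereas the content-harm branch uses `0`, so consumers may see either type.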

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_constants.py

Lines changed: 85 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -93,6 +93,88 @@ class TokenScope(str, enum.Enum):
9393
AZURE_ML = "https://ml.azure.com/.default"
9494

9595

96+
class _EvaluatorMetricMapping:
97+
"""
98+
Static mapping of evaluator names to their metric names, based on assets.json.
99+
The 'builtin.' prefix is removed from the evaluator name keys.
100+
"""
101+
102+
EVALUATOR_NAME_METRICS_MAPPINGS = {
103+
"bleu_score": ["bleu"],
104+
"coherence": ["coherence"],
105+
"document_retrieval": [
106+
"ndcg@3",
107+
"xdcg@3",
108+
"fidelity",
109+
"top1_relevance",
110+
"top3_max_relevance",
111+
"holes",
112+
"holes_ratio",
113+
"total_retrieved_documents",
114+
"total_ground_truth_documents",
115+
],
116+
"f1_score": ["f1_score"],
117+
"fluency": ["fluency"],
118+
"gleu_score": ["gleu"],
119+
"meteor_score": ["meteor"],
120+
"relevance": ["relevance"],
121+
"response_completeness": ["response_completeness"],
122+
"rouge_score": ["rouge_precision", "rouge_recall", "rouge_f1_score"],
123+
"groundedness_pro": ["groundedness_pro"],
124+
"similarity": ["similarity"],
125+
"intent_resolution": ["intent_resolution"],
126+
"retrieval": ["retrieval"],
127+
"task_adherence": ["task_adherence"],
128+
"tool_call_accuracy": ["tool_call_accuracy"],
129+
"groundedness": ["groundedness"],
130+
"code_vulnerability": ["code_vulnerability"],
131+
"eci": ["eci"],
132+
"protected_material": ["protected_material"],
133+
"ungrounded_attributes": ["ungrounded_attributes"],
134+
"indirect_attack": ["xpia", "xpia_manipulated_content", "xpia_intrusion", "xpia_information_gathering"],
135+
"label_grader": ["label_model"],
136+
"string_check_grader": ["string_check"],
137+
"text_similarity_grader": ["similarity"],
138+
"score_model_grader": ["score_model"],
139+
"sexual": ["sexual"],
140+
"self_harm": ["self_harm"],
141+
"violence": ["violence"],
142+
"hate_unfairness": ["hate_unfairness"],
143+
}
144+
145+
EVAL_CLASS_NAME_MAP = {
146+
"BleuScoreEvaluator": "bleu_score",
147+
"CodeVulnerabilityEvaluator": "code_vulnerability",
148+
"CoherenceEvaluator": "coherence",
149+
"ContentSafetyEvaluator": "content_safety",
150+
"DocumentRetrievalEvaluator": "document_retrieval",
151+
"ECIEvaluator": "eci",
152+
"F1ScoreEvaluator": "f1_score",
153+
"FluencyEvaluator": "fluency",
154+
"GleuScoreEvaluator": "gleu_score",
155+
"GroundednessEvaluator": "groundedness",
156+
"GroundednessProEvaluator": "groundedness_pro",
157+
"HateUnfairnessEvaluator": "hate_unfairness",
158+
"IndirectAttackEvaluator": "indirect_attack",
159+
"IntentResolutionEvaluator": "intent_resolution",
160+
"MeteorScoreEvaluator": "meteor_score",
161+
"ProtectedMaterialEvaluator": "protected_material",
162+
"QAEvaluator": "qa",
163+
"RelevanceEvaluator": "relevance",
164+
"ResponseCompletenessEvaluator": "response_completeness",
165+
"RetrievalEvaluator": "retrieval",
166+
"RougeScoreEvaluator": "rouge_score",
167+
"SelfHarmEvaluator": "self_harm",
168+
"SexualEvaluator": "sexual",
169+
"SimilarityEvaluator": "similarity",
170+
"TaskAdherenceEvaluator": "task_adherence",
171+
"TaskCompletionEvaluator": "task_completion",
172+
"ToolCallAccuracyEvaluator": "tool_call_accuracy",
173+
"UngroundedAttributesEvaluator": "ungrounded_attributes",
174+
"ViolenceEvaluator": "violence",
175+
}
176+
177+
96178
DEFAULT_EVALUATION_RESULTS_FILE_NAME = "evaluation_results.json"
97179

98180
CONTENT_SAFETY_DEFECT_RATE_THRESHOLD_DEFAULT = 4
@@ -119,3 +201,6 @@ class TokenScope(str, enum.Enum):
 AOAI_COLUMN_NAME = "aoai"
 DEFAULT_OAI_EVAL_RUN_NAME = "AI_SDK_EVAL_RUN"
 DEFAULT_AOAI_API_VERSION = "2025-04-01-preview" # Unfortunately relying on preview version for now.
+
+# OpenTelemetry event names
+EVALUATION_EVENT_NAME = "gen_ai.evaluation.result"
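
The two tables in `_EvaluatorMetricMapping` chain together: a class name resolves to an evaluator name, which resolves to a list of metric names. How the SDK consumes them is not shown in this part of the diff, so the following is only a sketch of one plausible lookup. `metrics_for_class` is a hypothetical function, and the dictionaries are short excerpts copied from the mappings above.

```python
from typing import Dict, List

# Excerpts copied from the mappings in the diff above.
EVALUATOR_NAME_METRICS_MAPPINGS: Dict[str, List[str]] = {
    "bleu_score": ["bleu"],
    "rouge_score": ["rouge_precision", "rouge_recall", "rouge_f1_score"],
    "indirect_attack": ["xpia", "xpia_manipulated_content", "xpia_intrusion", "xpia_information_gathering"],
}
EVAL_CLASS_NAME_MAP: Dict[str, str] = {
    "BleuScoreEvaluator": "bleu_score",
    "RougeScoreEvaluator": "rouge_score",
    "IndirectAttackEvaluator": "indirect_attack",
    "QAEvaluator": "qa",
}


def metrics_for_class(class_name: str) -> List[str]:
    """Hypothetical lookup: evaluator class name -> metric names, or [] if unresolved."""
    evaluator_name = EVAL_CLASS_NAME_MAP.get(class_name)
    if evaluator_name is None:
        return []
    # Some evaluator names (for example "qa") have no entry in the metrics table.
    return EVALUATOR_NAME_METRICS_MAPPINGS.get(evaluator_name, [])
```

The fallback to an empty list matters because, in the diff, `EVAL_CLASS_NAME_MAP` contains `"content_safety"`, `"qa"`, and `"task_completion"`, none of which appear as keys in `EVALUATOR_NAME_METRICS_MAPPINGS`.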

sdk/evaluation/azure-ai-evaluation/azure/ai/evaluation/_evaluate/_batch_run/_run_submitter_client.py

Lines changed: 10 additions & 0 deletions
@@ -159,6 +159,16 @@ def get_run_summary(self, client_run: BatchClientRun) -> Dict[str, Any]:
             "completed_lines": total_lines - failed_lines,
             "failed_lines": failed_lines,
             "log_path": None,
+            "error_message": (
+                f"({run.result.error.blame.value}) {run.result.error.message}"
+                if run.result and run.result.error and run.result.error.blame
+                else None
+            ),
+            "error_code": (
+                f"{run.result.error.category.value}"
+                if run.result and run.result.error and run.result.error.category
+                else None
+            ),
         }

     @staticmethod
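
The two new summary keys are built from guarded attribute chains on `run.result.error`. The sketch below reproduces that logic in isolation so the guard conditions are easier to read. Everything in it is a stand-in: `Blame`, `Category`, `RunError`, `RunResult`, and `error_fields` are hypothetical, and the enum values are invented for the example rather than taken from the SDK.

```python
import enum
from dataclasses import dataclass
from typing import Dict, Optional


class Blame(enum.Enum):
    """Hypothetical blame enum; values are made up for this example."""

    USER = "UserError"
    SYSTEM = "SystemError"


class Category(enum.Enum):
    """Hypothetical error-category enum; values are made up for this example."""

    INVALID_VALUE = "INVALID_VALUE"


@dataclass
class RunError:
    message: str
    blame: Optional[Blame] = None
    category: Optional[Category] = None


@dataclass
class RunResult:
    error: Optional[RunError] = None


def error_fields(result: Optional[RunResult]) -> Dict[str, Optional[str]]:
    """Mirror the guards in the diff: each field is None unless its whole chain is present."""
    error = result.error if result else None
    return {
        "error_message": f"({error.blame.value}) {error.message}" if error and error.blame else None,
        "error_code": error.category.value if error and error.category else None,
    }


failed = RunResult(error=RunError("bad input", blame=Blame.USER, category=Category.INVALID_VALUE))
summary = error_fields(failed)
```

As written in the diff, the message guard keys on `blame`, so an error that carries a message but no blame would yield `error_message` of `None`; the sketch keeps that behavior.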
