40 changes: 33 additions & 7 deletions .openpublishing.redirection.ai.json
Original file line number Diff line number Diff line change
@@ -2,15 +2,22 @@
"redirections": [
{
"source_path_from_root": "/docs/ai/ai-extensions.md",
- "redirect_url": "/dotnet/ai/microsoft-extensions-ai"
+ "redirect_url": "/dotnet/ai/microsoft-extensions-ai",
+ "redirect_document_id": true
},
{
"source_path_from_root": "/docs/ai/conceptual/agents.md",
"redirect_url": "/dotnet/ai"
},
+ {
+ "source_path_from_root": "/docs/ai/conceptual/evaluation-libraries.md",
+ "redirect_url": "/dotnet/ai/evaluation/libraries",
+ "redirect_document_id": true
+ },
{
"source_path_from_root": "/docs/ai/get-started/dotnet-ai-overview.md",
- "redirect_url": "/dotnet/ai/overview"
+ "redirect_url": "/dotnet/ai/overview",
+ "redirect_document_id": true
},
{
"source_path_from_root": "/docs/ai/how-to/app-service-db-auth.md",
@@ -24,6 +31,11 @@
"source_path_from_root": "/docs/ai/how-to/work-with-local-models.md",
"redirect_url": "/dotnet/ai"
},
+ {
+ "source_path_from_root": "/docs/ai/quickstarts/evaluate-ai-response.md",
+ "redirect_url": "/dotnet/ai/evaluation/evaluate-ai-response",
+ "redirect_document_id": true
+ },
{
"source_path_from_root": "/docs/ai/quickstarts/get-started-azure-openai.md",
"redirect_url": "/dotnet/ai/quickstarts/build-chat-app"
@@ -38,27 +50,41 @@
},
{
"source_path_from_root": "/docs/ai/quickstarts/quickstart-assistants.md",
- "redirect_url": "/dotnet/ai/quickstarts/create-assistant"
+ "redirect_url": "/dotnet/ai/quickstarts/create-assistant",
+ "redirect_document_id": true
},
{
"source_path_from_root": "/docs/ai/quickstarts/quickstart-azure-openai-tool.md",
"redirect_url": "/dotnet/ai/quickstarts/use-function-calling"
},
{
"source_path_from_root": "/docs/ai/quickstarts/quickstart-local-ai.md",
- "redirect_url": "/dotnet/ai/quickstarts/chat-local-model"
+ "redirect_url": "/dotnet/ai/quickstarts/chat-local-model",
+ "redirect_document_id": true
},
{
"source_path_from_root": "/docs/ai/quickstarts/quickstart-openai-generate-images.md",
- "redirect_url": "/dotnet/ai/quickstarts/generate-images"
+ "redirect_url": "/dotnet/ai/quickstarts/generate-images",
+ "redirect_document_id": true
},
{
"source_path_from_root": "/docs/ai/quickstarts/quickstart-openai-summarize-text.md",
- "redirect_url": "/dotnet/ai/quickstarts/prompt-model"
+ "redirect_url": "/dotnet/ai/quickstarts/prompt-model",
+ "redirect_document_id": true
},
{
"source_path_from_root": "/docs/ai/tutorials/llm-eval.md",
- "redirect_url": "/dotnet/ai/quickstarts/evaluate-ai-response"
+ "redirect_url": "/dotnet/ai/evaluation/evaluate-ai-response"
+ },
+ {
+ "source_path_from_root": "/docs/ai/tutorials/evaluate-safety.md",
+ "redirect_url": "/dotnet/ai/evaluation/evaluate-safety",
+ "redirect_document_id": true
+ },
+ {
+ "source_path_from_root": "/docs/ai/tutorials/evaluate-with-reporting.md",
+ "redirect_url": "/dotnet/ai/evaluation/evaluate-with-reporting",
+ "redirect_document_id": true
}
]
}
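Review aid (not part of this PR): with this many entries touched, a quick structural check of the redirection file can catch a missing key or a source path listed twice. The sketch below is a hypothetical helper, not Open Publishing tooling. It assumes only the entry shape visible in the diff (`source_path_from_root`, `redirect_url`, optional boolean `redirect_document_id`), and the "one entry per source path" rule is an assumption made for illustration.

```python
import json
from collections import Counter

# Keys every entry in the diff above carries; "redirect_document_id" is optional.
REQUIRED_KEYS = ("source_path_from_root", "redirect_url")


def check_redirections(text):
    """Return a list of problems found in a redirection file's JSON text."""
    entries = json.loads(text)["redirections"]
    problems = []
    for index, entry in enumerate(entries):
        for key in REQUIRED_KEYS:
            if key not in entry:
                problems.append(f"entry {index}: missing {key}")
        flag = entry.get("redirect_document_id")
        if flag is not None and not isinstance(flag, bool):
            problems.append(f"entry {index}: redirect_document_id must be a boolean")
    # Assumption: a source path should be listed only once per file.
    counts = Counter(entry.get("source_path_from_root") for entry in entries)
    for path, count in counts.items():
        if path is not None and count > 1:
            problems.append(f"duplicate source path: {path}")
    return problems


# Two entries copied from the diff above, used as sample input.
SAMPLE = """
{
  "redirections": [
    {
      "source_path_from_root": "/docs/ai/tutorials/llm-eval.md",
      "redirect_url": "/dotnet/ai/evaluation/evaluate-ai-response"
    },
    {
      "source_path_from_root": "/docs/ai/tutorials/evaluate-safety.md",
      "redirect_url": "/dotnet/ai/evaluation/evaluate-safety",
      "redirect_document_id": true
    }
  ]
}
"""

print(check_redirections(SAMPLE))  # prints []
```

To run it against the real file, read `.openpublishing.redirection.ai.json` from disk and pass its text to `check_redirections`.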
6 changes: 3 additions & 3 deletions .openpublishing.redirection.csharp.json
Original file line number Diff line number Diff line change
@@ -46,7 +46,7 @@
},
{
"source_path_from_root": "/redirections/proposals/csharp-7.2/conditional-ref.md",
- "redirect_url": "/dotnet/csharp/language-reference/language-specification/expressions#1218-conditional-operator"
+ "redirect_url": "/dotnet/csharp/language-reference/language-specification/expressions#1219-conditional-operator"
},
{
"source_path_from_root": "/redirections/proposals/csharp-7.2/non-trailing-named-arguments.md",
@@ -94,7 +94,7 @@
},
{
"source_path_from_root": "/redirections/proposals/csharp-7.3/pattern-based-fixed.md",
- "redirect_url": "/dotnet/csharp/language-reference/language-specification/unsafe-code#237-the-fixed-statement"
+ "redirect_url": "/dotnet/csharp/language-reference/language-specification/unsafe-code#247-the-fixed-statement"
},
{
"source_path_from_root": "/redirections/proposals/csharp-7.3/ref-local-reassignment.md",
@@ -122,7 +122,7 @@
},
{
"source_path_from_root": "/redirections/proposals/csharp-8.0/null-coalescing-assignment.md",
- "redirect_url": "/dotnet/csharp/language-reference/language-specification/expressions#1221-assignment-operators"
+ "redirect_url": "/dotnet/csharp/language-reference/language-specification/expressions#1222-assignment-operators"
},
{
"source_path_from_root": "/redirections/proposals/csharp-8.0/async-streams.md",
Expand Down
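Review aid (not part of this PR): the three anchor updates above track renumbered sections of the language specification, since each anchor embeds its section number (`1218` became `1219`, `237` became `247`, `1221` became `1222`). The sketch below approximates how such an anchor could be derived from a heading so the new values can be sanity-checked. The slug rule (lowercase, drop punctuation, hyphenate spaces) is an assumption that happens to reproduce the anchors in this diff, not a documented algorithm, and the heading texts are inferred from the anchors rather than copied from the standard.

```python
import re


def heading_to_anchor(heading):
    """Approximate an anchor slug from a heading (assumed rule, see the note above)."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^a-z0-9 -]", "", slug)  # drop punctuation such as the "." in "12.19"
    return re.sub(r" +", "-", slug)          # runs of spaces become a single hyphen


# Heading texts are inferred from the anchors in the diff, not taken from the standard.
for heading in (
    "12.19 Conditional operator",
    "24.7 The fixed statement",
    "12.22 Assignment operators",
):
    print(heading_to_anchor(heading))
# prints:
# 1219-conditional-operator
# 247-the-fixed-statement
# 1222-assignment-operators
```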
7 changes: 4 additions & 3 deletions docfx.json
Original file line number Diff line number Diff line change
@@ -77,6 +77,7 @@
"csharp-8.0/readonly-instance-members.md",
"csharp-8.0/null-coalescing-assignment.md",
"csharp-8.0/async-streams.md",
+ "csharp-8.0/ranges.md",
"csharp-9.0/nullable-reference-types-specification.md",
"csharp-9.0/nullable-constructor-analysis.md",
"csharp-9.0/nullable-parameter-default-value-analysis.md",
@@ -622,6 +623,7 @@
"_csharpstandard/standard/classes.md": "Classes",
"_csharpstandard/standard/structs.md": "Structs",
"_csharpstandard/standard/arrays.md": "Arrays",
+ "_csharpstandard/standard/ranges.md": "Indexes and ranges",
"_csharpstandard/standard/interfaces.md": "Interfaces",
"_csharpstandard/standard/enums.md": "Enums",
"_csharpstandard/standard/delegates.md": "Delegates",
@@ -635,7 +637,6 @@
"_csharpstandard/standard/Bibliography.md": "Bibliography",
"_csharplang/proposals/csharp-8.0/patterns.md": "Recursive pattern matching",
"_csharplang/proposals/csharp-8.0/default-interface-methods.md": "Default interface methods",
- "_csharplang/proposals/csharp-8.0/ranges.md": "Ranges and indices",
"_csharplang/proposals/csharp-8.0/using.md": "Pattern based using and using declarations",
"_csharplang/proposals/csharp-9.0/covariant-returns.md": "Covariant return types",
"_csharplang/proposals/csharp-9.0/extending-partial-methods.md": "Extending partial methods",
@@ -749,7 +750,8 @@
"_csharpstandard/standard/namespaces.md": "This chapter defines namespaces, including how to declare them and how to use them.",
"_csharpstandard/standard/classes.md": "This chapter covers class declarations, including all member types that can be included in classes. This includes generic classes as well as non-generic classes.",
"_csharpstandard/standard/structs.md": "This chapter defines struct declarations. In many cases, the descriptions are defined using the differences between classes and structs.",
- "_csharpstandard/standard/arrays.md": "This chapter defines arrays. It includes the rules for array variance, multi-dimensional arrays and jagged arrays.",
+ "_csharpstandard/standard/arrays.md": "This chapter defines arrays. It includes the rules for array variance, multi-dimensional arrays, and jagged arrays.",
+ "_csharpstandard/standard/ranges.md": "This chapter defines the index and range operators for indexing into arrays, strings, and spans.",
"_csharpstandard/standard/interfaces.md": "This chapter defines interfaces. This includes interface declarations, implementing interfaces, and explicit interface implementation.",
"_csharpstandard/standard/enums.md": "This chapter defines the enum types in C#. Enums create a set of named constants and are represented by an underlying integral set of values.",
"_csharpstandard/standard/delegates.md": "This chapter defines delegates, which are objects that hold type safe function pointers.",
@@ -763,7 +765,6 @@
"_csharpstandard/standard/Bibliography.md": "This appendix lists external standards referenced in this specification.",
"_csharplang/proposals/csharp-8.0/patterns.md": "This feature specification describes recursive pattern matching, where patterns can nest inside other patterns.",
"_csharplang/proposals/csharp-8.0/default-interface-methods.md": "This feature specification describe the syntax updates necessary to support default interface methods. This includes declaring bodies in interface declarations, and supporting modifiers on declarations.",
- "_csharplang/proposals/csharp-8.0/ranges.md": "This feature specification describes the syntax for ranges and indices, which support indexing individual elements of a sequence or a range of a sequence from the start or end of that sequence.",
"_csharplang/proposals/csharp-8.0/using.md": "This feature specification supports pattern based using and using declarations to simplify resource cleanup.",
"_csharplang/proposals/csharp-9.0/covariant-returns.md": "This feature specification describes covariant return types, where overriding member declarations can return a type derived from the overridden member declaration.",
"_csharplang/proposals/csharp-9.0/extending-partial-methods.md": "This feature specification describes extensions to partial methods. These extensions enable source generators to create or call partial methods.",
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@ ms.topic: quickstart
In this quickstart, you create an MSTest app to evaluate the quality of a chat response from an OpenAI model. The test app uses the [Microsoft.Extensions.AI.Evaluation](https://www.nuget.org/packages/Microsoft.Extensions.AI.Evaluation) libraries.

> [!NOTE]
- > This quickstart demonstrates the simplest usage of the evaluation API. Notably, it doesn't demonstrate use of the [response caching](../conceptual/evaluation-libraries.md#cached-responses) and [reporting](../conceptual/evaluation-libraries.md#reporting) functionality, which are important if you're authoring unit tests that run as part of an "offline" evaluation pipeline. The scenario shown in this quickstart is suitable in use cases such as "online" evaluation of AI responses within production code and logging scores to telemetry, where caching and reporting aren't relevant. For a tutorial that demonstrates the caching and reporting functionality, see [Tutorial: Evaluate a model's response with response caching and reporting](../tutorials/evaluate-with-reporting.md)
+ > This quickstart demonstrates the simplest usage of the evaluation API. Notably, it doesn't demonstrate use of the [response caching](libraries.md#cached-responses) and [reporting](libraries.md#reporting) functionality, which are important if you're authoring unit tests that run as part of an "offline" evaluation pipeline. The scenario shown in this quickstart is suitable in use cases such as "online" evaluation of AI responses within production code and logging scores to telemetry, where caching and reporting aren't relevant. For a tutorial that demonstrates the caching and reporting functionality, see [Tutorial: Evaluate a model's response with response caching and reporting](evaluate-with-reporting.md)

## Prerequisites

@@ -103,4 +103,4 @@ If you no longer need them, delete the Azure OpenAI resource and GPT-4 model dep
## Next steps

- Evaluate the responses from different OpenAI models.
- - Add response caching and reporting to your evaluation code. For more information, see [Tutorial: Evaluate a model's response with response caching and reporting](../tutorials/evaluate-with-reporting.md).
+ - Add response caching and reporting to your evaluation code. For more information, see [Tutorial: Evaluate a model's response with response caching and reporting](evaluate-with-reporting.md).
Original file line number Diff line number Diff line change
@@ -110,7 +110,7 @@ Complete the following steps to create an MSTest project.
> [!NOTE]
> This code example passes the LLM <xref:Microsoft.Extensions.AI.IChatClient> as `originalChatClient` to <xref:Microsoft.Extensions.AI.Evaluation.Safety.ContentSafetyServiceConfigurationExtensions.ToChatConfiguration(Microsoft.Extensions.AI.Evaluation.Safety.ContentSafetyServiceConfiguration,Microsoft.Extensions.AI.IChatClient)>. The reason to include the LLM chat client here is to enable getting a chat response from the LLM, and notably, to enable response caching for it. (If you don't want to cache the LLM's response, you can create a separate, local <xref:Microsoft.Extensions.AI.IChatClient> to fetch the response from the LLM.) Instead of passing a <xref:Microsoft.Extensions.AI.IChatClient>, if you already have a <xref:Microsoft.Extensions.AI.Evaluation.ChatConfiguration> for an LLM from another reporting configuration, you can pass that instead, using the <xref:Microsoft.Extensions.AI.Evaluation.Safety.ContentSafetyServiceConfigurationExtensions.ToChatConfiguration(Microsoft.Extensions.AI.Evaluation.Safety.ContentSafetyServiceConfiguration,Microsoft.Extensions.AI.Evaluation.ChatConfiguration)> overload.
>
- > Similarly, if you configure both [LLM-based evaluators](../conceptual/evaluation-libraries.md#quality-evaluators) and [Azure AI Foundry Evaluation service&ndash;based evaluators](../conceptual/evaluation-libraries.md#safety-evaluators) in the reporting configuration, you also need to pass the LLM <xref:Microsoft.Extensions.AI.Evaluation.ChatConfiguration> to <xref:Microsoft.Extensions.AI.Evaluation.Safety.ContentSafetyServiceConfigurationExtensions.ToChatConfiguration(Microsoft.Extensions.AI.Evaluation.Safety.ContentSafetyServiceConfiguration,Microsoft.Extensions.AI.Evaluation.ChatConfiguration)>. Then it returns a <xref:Microsoft.Extensions.AI.Evaluation.ChatConfiguration> that can talk to both types of evaluators.
+ > Similarly, if you configure both [LLM-based evaluators](libraries.md#quality-evaluators) and [Azure AI Foundry Evaluation service&ndash;based evaluators](libraries.md#safety-evaluators) in the reporting configuration, you also need to pass the LLM <xref:Microsoft.Extensions.AI.Evaluation.ChatConfiguration> to <xref:Microsoft.Extensions.AI.Evaluation.Safety.ContentSafetyServiceConfigurationExtensions.ToChatConfiguration(Microsoft.Extensions.AI.Evaluation.Safety.ContentSafetyServiceConfiguration,Microsoft.Extensions.AI.Evaluation.ChatConfiguration)>. Then it returns a <xref:Microsoft.Extensions.AI.Evaluation.ChatConfiguration> that can talk to both types of evaluators.

1. Add a method to define the [chat options](xref:Microsoft.Extensions.AI.ChatOptions) and ask the model for a response to a given question.

@@ -148,6 +148,6 @@ To generate a report to view the evaluation results, see [Generate a report](eva

This tutorial covers the basics of evaluating content safety. As you create your test suite, consider the following next steps:

- - Configure additional evaluators, such as the [quality evaluators](../conceptual/evaluation-libraries.md#quality-evaluators). For an example, see the AI samples repo [quality and safety evaluation example](https://github.com/dotnet/ai-samples/blob/main/src/microsoft-extensions-ai-evaluation/api/reporting/ReportingExamples.Example10_RunningQualityAndSafetyEvaluatorsTogether.cs).
+ - Configure additional evaluators, such as the [quality evaluators](libraries.md#quality-evaluators). For an example, see the AI samples repo [quality and safety evaluation example](https://github.com/dotnet/ai-samples/blob/main/src/microsoft-extensions-ai-evaluation/api/reporting/ReportingExamples.Example10_RunningQualityAndSafetyEvaluatorsTogether.cs).
- Evaluate the content safety of generated images. For an example, see the AI samples repo [image response example](https://github.com/dotnet/ai-samples/blob/main/src/microsoft-extensions-ai-evaluation/api/reporting/ReportingExamples.Example09_RunningSafetyEvaluatorsAgainstResponsesWithImages.cs).
- In real-world evaluations, you might not want to validate individual results, since the LLM responses and evaluation scores can vary over time as your product (and the models used) evolve. You might not want individual evaluation tests to fail and block builds in your CI/CD pipelines when this happens. Instead, in such cases, it might be better to rely on the generated report and track the overall trends for evaluation scores across different scenarios over time (and only fail individual builds in your CI/CD pipelines when there's a significant drop in evaluation scores across multiple different tests).
Original file line number Diff line number Diff line change
@@ -155,6 +155,9 @@ Run the test using your preferred test workflow, for example, by using the CLI c
dotnet tool install --local Microsoft.Extensions.AI.Evaluation.Console
```

+ > [!TIP]
+ > You might need to create a manifest file first. For more information about that and installing local tools, see [Local tools](../../core/tools/dotnet-tool-install.md#local-tools).
+
1. Generate a report by running the following command:

```dotnetcli