Commit ebc3dce

Merge pull request #5459 from MicrosoftDocs/repo_sync_working_branch
Confirm merge from repo_sync_working_branch to main to sync with https://github.com/MicrosoftDocs/azure-ai-docs (branch main)
2 parents 164e22a + 0cb51b2 commit ebc3dce

6 files changed, 18 additions and 21 deletions

articles/ai-foundry/concepts/evaluation-evaluators/agent-evaluators.md

Lines changed: 3 additions & 3 deletions
@@ -70,7 +70,7 @@ intent_resolution(

 ### Intent resolution output

-The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason and additional fields can help you understand why the score is high or low.
+The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason and additional fields can help you understand why the score is high or low.

 ```python
 {
@@ -137,7 +137,7 @@ tool_call_accuracy(

 ### Tool call accuracy output

-The numerical score (passing rate of correct tool calls) is 0-1 and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason and tool call detail fields can help you understand why the score is high or low.
+The numerical score (passing rate of correct tool calls) is 0-1 and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason and tool call detail fields can help you understand why the score is high or low.

 ```python
 {
@@ -174,7 +174,7 @@ task_adherence(

 ### Task adherence output

-The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.
+The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.

 ```python
 {

articles/ai-foundry/concepts/evaluation-evaluators/general-purpose-evaluators.md

Lines changed: 3 additions & 3 deletions
@@ -59,7 +59,7 @@ coherence(

 ### Coherence output

-The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.
+The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.

 ```python
 {
@@ -88,7 +88,7 @@ fluency(

 ### Fluency output

-The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.
+The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.

 ```python
 {
@@ -127,7 +127,7 @@ qa_eval(

 ### QA output

-While F1 score outputs a numerical score on 0-1 float scale, the other evaluators output numerical scores on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.
+While F1 score outputs a numerical score on 0-1 float scale, the other evaluators output numerical scores on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.

 ```python
 {

articles/ai-foundry/concepts/evaluation-evaluators/rag-evaluators.md

Lines changed: 5 additions & 5 deletions
@@ -63,7 +63,7 @@ retrieval(

 ### Retrieval output

-The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (a default is set), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.
+The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (a default is set), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.

 ```python
 {
@@ -163,7 +163,7 @@ document_retrieval_evaluator(retrieval_ground_truth=retrieval_ground_truth, retr

 ### Document retrieval output

-All numerical scores have `high_is_better=True` except for `holes` and `holes_ratio` which have `high_is_better=False`. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise.
+All numerical scores have `high_is_better=True` except for `holes` and `holes_ratio` which have `high_is_better=False`. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise.

 ```python
 {
@@ -206,7 +206,7 @@ groundedness(

 ### Groundedness output

-The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.
+The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.

 ```python
 {
@@ -276,7 +276,7 @@ relevance(

 ### Relevance output

-The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.
+The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.

 ```python
 {
@@ -306,7 +306,7 @@ response_completeness(

 ### Response completeness output

-The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.
+The numerical score on a likert scale (integer 1 to 5) and a higher score is better. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.

 ```python
 {

articles/ai-foundry/concepts/evaluation-evaluators/textual-similarity-evaluators.md

Lines changed: 6 additions & 6 deletions
@@ -58,7 +58,7 @@ similarity(

 ### Similarity output

-The numerical score on a likert scale (integer 1 to 5) and a higher score means a higher degree of similarity. Given a numerical threshold (default to 3), we also output "pass" if the score <= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.
+The numerical score on a likert scale (integer 1 to 5) and a higher score means a higher degree of similarity. Given a numerical threshold (default to 3), we also output "pass" if the score >= threshold, or "fail" otherwise. Using the reason field can help you understand why the score is high or low.

 ```python
 {
@@ -87,7 +87,7 @@ f1_score(

 ### F1 score output

-The numerical score is a 0-1 float and a higher score is better. Given a numerical threshold (default to 0.5), we also output "pass" if the score <= threshold, or "fail" otherwise.
+The numerical score is a 0-1 float and a higher score is better. Given a numerical threshold (default to 0.5), we also output "pass" if the score >= threshold, or "fail" otherwise.

 ```python
 {
@@ -115,7 +115,7 @@ bleu_score(

 ### BLEU output

-The numerical score is a 0-1 float and a higher score is better. Given a numerical threshold (default to 0.5), we also output "pass" if the score <= threshold, or "fail" otherwise.
+The numerical score is a 0-1 float and a higher score is better. Given a numerical threshold (default to 0.5), we also output "pass" if the score >= threshold, or "fail" otherwise.

 ```python
 {
@@ -144,7 +144,7 @@ gleu_score(

 ### GLEU score output

-The numerical score is a 0-1 float and a higher score is better. Given a numerical threshold (default to 0.5), we also output "pass" if the score <= threshold, or "fail" otherwise.
+The numerical score is a 0-1 float and a higher score is better. Given a numerical threshold (default to 0.5), we also output "pass" if the score >= threshold, or "fail" otherwise.

 ```python
 {
@@ -173,7 +173,7 @@ rouge(

 ### ROUGE score output

-The numerical score is a 0-1 float and a higher score is better. Given a numerical threshold (default to 0.5), we also output "pass" if the score <= threshold, or "fail" otherwise.
+The numerical score is a 0-1 float and a higher score is better. Given a numerical threshold (default to 0.5), we also output "pass" if the score >= threshold, or "fail" otherwise.

 ```python
 {
@@ -208,7 +208,7 @@ meteor_score(

 ### METEOR score output

-The numerical score is a 0-1 float and a higher score is better. Given a numerical threshold (default to 0.5), we also output "pass" if the score <= threshold, or "fail" otherwise.
+The numerical score is a 0-1 float and a higher score is better. Given a numerical threshold (default to 0.5), we also output "pass" if the score >= threshold, or "fail" otherwise.

 ```python
 {

articles/search/search-region-support.md

Lines changed: 1 addition & 3 deletions
@@ -102,14 +102,12 @@ You can create an Azure AI Search service in any of the following Azure public r
 | Central India ||||||
 | Jio India West || ||||
 | South India | || | | |
-| Japan East <sup>1</sup> ||||||
+| Japan East ||||||
 | Japan West || ||| |
 | Korea Central ||||||
 | Korea South | | ||| |
 | Indonesia Central | || | | |

-<sup>1</sup> This region has capacity constraints on all tiers.
-
 ## Azure Government regions

 | Region | AI enrichment | Availability zones | Agentic retrieval | Semantic ranker | Query rewrite |

articles/search/search-sku-tier.md

Lines changed: 0 additions & 1 deletion
@@ -61,7 +61,6 @@ Currently, several regions are capacity-constrained for specific tiers and can't
 | Region | Disabled tier (SKU) due to over-capacity | Suggested alternative |
 |--------|------------------------------------------|-----------------------|
 | West US 2 | Basic, S1, S2, S3, L1, L2 | West US, West US 3|
-| Japan East | Basic, S1, S2, S3, L1, L2| Japan West|

 ## Feature availability by tier
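
The evaluator documentation hunks in this commit all correct the same rule: when a higher score is better, the result is "pass" if the score is at or above the threshold. A minimal Python sketch of that rule follows. The function name `label_score` is hypothetical and not part of any SDK, and treating `high_is_better=False` metrics (such as `holes`) as passing at or below the threshold is an assumption, not something this commit states.

```python
def label_score(score: float, threshold: float, high_is_better: bool = True) -> str:
    """Return "pass" or "fail" for a score compared against a threshold.

    Hypothetical helper for illustration only; it is not an SDK function.
    """
    if high_is_better:
        # Corrected rule from this commit: pass when score >= threshold.
        return "pass" if score >= threshold else "fail"
    # Assumption: for lower-is-better metrics the comparison is reversed.
    return "pass" if score <= threshold else "fail"


# Likert-scale evaluator (integer 1 to 5) with the documented default threshold of 3.
likert_results = [label_score(s, threshold=3) for s in (2, 3, 5)]

# 0-1 float evaluator (for example F1) with the documented default threshold of 0.5.
float_results = [label_score(s, threshold=0.5) for s in (0.4, 0.5, 0.9)]

print(likert_results)
print(float_results)
```

With the previous `<=` wording, a top Likert score of 5 against the default threshold of 3 would have been labeled "fail", which contradicts "a higher score is better".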
