
Commit d1324db
committed
Merge branch 'main' of https://github.com/MicrosoftDocs/azure-docs-pr into reloc-site-recovery
2 parents: c7cff21 + ccf7c36

28 files changed: +138 −98 lines

articles/ai-services/language-service/conversational-language-understanding/concepts/best-practices.md

Lines changed: 24 additions & 10 deletions
@@ -43,7 +43,7 @@ You also want to avoid mixing different schema designs. Do not build half of you
 
 ## Use standard training before advanced training
 
-[Standard training](../how-to/train-model.md#training-modes) is free and faster than Advanced training, making it useful to quickly understand the effect of changing your training set or schema while building the model. Once you are satisfied with the schema, consider using advanced training to get the best AIQ out of your model.
+[Standard training](../how-to/train-model.md#training-modes) is free and faster than Advanced training, making it useful to quickly understand the effect of changing your training set or schema while building the model. Once you're satisfied with the schema, consider using advanced training to get the best AIQ out of your model.
 
 ## Use the evaluation feature

@@ -73,17 +73,31 @@ To resolve this, you would label a learned component in your training data for a
 If you require the learned component, make sure that *ticket quantity* is only returned when the learned component predicts it in the right context. If you also require the prebuilt component, you can then guarantee that the returned *ticket quantity* entity is both a number and in the correct position.
 
-## Addressing casing inconsistencies
+## Addressing model inconsistencies
 
-If you have poor AI quality and determine the casing used in your training data is dissimilar to the testing data, you can use the `normalizeCasing` project setting. This normalizes the casing of utterances when training and testing the model. If you've migrated from LUIS, you might recognize that LUIS did this by default.
+If your model is overly sensitive to small grammatical changes, like casing or diacritics, you can systematically manipulate your dataset directly in Language Studio. Select the **Settings** tab on the left toolbar and locate the **Advanced project settings** section. First, you can ***Enable data transformation for casing***, which normalizes the casing of utterances when training, testing, and implementing your model. If you've migrated from LUIS, you might recognize that LUIS did this normalization by default. To access this feature via the API, set the `"normalizeCasing"` parameter to `true`, as in the following example:
 
 ```json
 {
     "projectFileVersion": "2022-10-01-preview",
     ...
     "settings": {
-        "confidenceThreshold": 0.5,
+        ...
         "normalizeCasing": true
+        ...
+    }
+    ...
+```
+
+Second, you can also use the **Advanced project settings** to ***Enable data augmentation for diacritics***, which generates variations of your training data to cover the diacritic variations used in natural language. This feature is available for all languages, but it's especially useful for Germanic and Slavic languages, where users often write words using classic English characters instead of the correct characters. For example, the phrase "Navigate to the sports channel" in French is "Accédez à la chaîne sportive". When this feature is enabled, the phrase "Accedez a la chaine sportive" (without diacritic characters) is also included in the training dataset. If you enable this feature, note that the utterance count of your training set increases, and you might need to adjust your training data size accordingly. The current maximum utterance count after augmentation is 25,000. To access this feature via the API, set the `"augmentDiacritics"` parameter to `true`, as in the following example:
+
+```json
+{
+    "projectFileVersion": "2022-10-01-preview",
+    ...
+    "settings": {
+        ...
+        "augmentDiacritics": true
+        ...
 }
 ...
 ```
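As a rough illustration of the variation this augmentation generates, stripping diacritics from an utterance can be sketched in a few lines of Python. This is only a conceptual sketch of the produced variant, not the service's actual implementation:

```python
import unicodedata

def strip_diacritics(text: str) -> str:
    """Return the text with diacritic marks removed (e.g. 'chaîne' -> 'chaine')."""
    # Decompose characters (NFD), drop the combining marks, then recompose (NFC).
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)

print(strip_diacritics("Accédez à la chaîne sportive"))  # Accedez a la chaine sportive
```

Each training utterance that contains diacritics would contribute one such stripped variant, which is why the utterance count grows after augmentation.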
@@ -125,9 +139,9 @@ Once the request is sent, you can track the progress of the training job in Lang
 
 Starting with model version 2023-04-15, conversational language understanding provides normalization in the inference layer that doesn't affect training.
 
-The normalization layer normalizes the classification confidence scores to a confined range. The range selected currently is from `[-a,a]` where "a" is the square root of the number of intents. As a result, the normalization depends on the number of intents in the app. If there is a very low number of intents, the normalization layer has a very small range to work with. With a fairly large number of intents, the normalization is more effective.
+The normalization layer normalizes the classification confidence scores to a confined range. The range currently selected is `[-a,a]`, where "a" is the square root of the number of intents. As a result, the normalization depends on the number of intents in the app. If there's a very low number of intents, the normalization layer has a very small range to work with. With a fairly large number of intents, the normalization is more effective.
 
-If this normalization doesn’t seem to help intents that are out of scope to the extent that the confidence threshold can be used to filter out of scope utterances, it might be related to the number of intents in the app. Consider adding more intents to the app, or if you are using an orchestrated architecture, consider merging apps that belong to the same domain together.
+If this normalization doesn't help enough for the confidence threshold to filter out-of-scope utterances, it might be related to the number of intents in the app. Consider adding more intents to the app, or, if you're using an orchestrated architecture, consider merging apps that belong to the same domain.
 
 ## Debugging composed entities
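The dependence on intent count described above can be made concrete with a small sketch. This only illustrates the stated `[-a,a]` formula; the service's internal normalization code isn't public:

```python
import math

def normalization_range(num_intents: int) -> tuple:
    """Range [-a, a] used to confine confidence scores, where a = sqrt(#intents)."""
    a = math.sqrt(num_intents)
    return (-a, a)

# Few intents -> narrow range; many intents -> wider, more effective normalization.
print(normalization_range(4))   # (-2.0, 2.0)
print(normalization_range(25))  # (-5.0, 5.0)
```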

@@ -146,7 +160,7 @@ Data in a conversational language understanding project can have two data sets.
 
 ## Custom parameters for target apps and child apps
 
-If you are using [orchestrated apps](./app-architecture.md), you may want to send custom parameter overrides for various child apps. The `targetProjectParameters` field allows users to send a dictionary representing the parameters for each target project. For example, consider an orchestrator app named `Orchestrator` orchestrating between a conversational language understanding app named `CLU1` and a custom question answering app named `CQA1`. If you want to send a parameter named "top" to the question answering app, you can use the above parameter.
+If you're using [orchestrated apps](./app-architecture.md), you might want to send custom parameter overrides for various child apps. The `targetProjectParameters` field lets you send a dictionary representing the parameters for each target project. For example, consider an orchestrator app named `Orchestrator` orchestrating between a conversational language understanding app named `CLU1` and a custom question answering app named `CQA1`. If you want to send a parameter named "top" to the question answering app, you can use the `targetProjectParameters` field as follows.
 
 ```console
 curl --request POST \
@@ -249,6 +263,6 @@ curl --location 'https://<your-resource>.cognitiveservices.azure.com/language/au
 Once the request is sent, you can track the progress of the training job in Language Studio as usual.
 
 Caveats:
-- The None Score threshold for the app (confidence threshold below which the topIntent is marked as None) when using this recipe should be set to 0. This is because this new recipe attributes a certain portion of the in domain probabiliities to out of domain so that the model is not incorrectly overconfident about in domain utterances. As a result, users may see slightly reduced confidence scores for in domain utterances as compared to the prod recipe.
-- This recipe is not recommended for apps with just two (2) intents, such as IntentA and None, for example.
-- This recipe is not recommended for apps with low number of utterances per intent. A minimum of 25 utterances per intent is highly recommended.
+- When using this recipe, the None score threshold for the app (the confidence threshold below which the topIntent is marked as None) should be set to 0. This is because this new recipe attributes a certain portion of the in-domain probabilities to out of domain so that the model isn't incorrectly overconfident about in-domain utterances. As a result, users may see slightly reduced confidence scores for in-domain utterances as compared to the prod recipe.
+- This recipe isn't recommended for apps with just two (2) intents, such as IntentA and None.
+- This recipe isn't recommended for apps with a low number of utterances per intent. A minimum of 25 utterances per intent is highly recommended.

articles/ai-services/language-service/language-detection/language-support.md

Lines changed: 18 additions & 18 deletions
@@ -132,7 +132,7 @@ If you have content expressed in a less frequently used language, you can try La
 | Tongan | `to` |
 | Turkish | `tr` |
 | Turkmen | `tk` |
-| Upper Sorbian | `hsb` |
+| Upper Sorbian | `hsb` |
 | Uyghur | `ug` |
 | Ukrainian | `uk` |
 | Urdu | `ur` |
@@ -164,23 +164,23 @@ If you have content expressed in a less frequently used language, you can try La
 
 ## Script detection
 
-| Language |Script code | Scripts |
-| --- | --- | --- |
-| Bengali (Bengali-Assamese) | `as` | `Latn`, `Beng` |
-| Bengali (Bangla) | `bn` | `Latn`, `Beng` |
-| Gujarati | `gu` | `Latn`, `Gujr` |
-| Hindi | `hi` | `Latn`, `Deva` |
-| Kannada | `kn` | `Latn`, `Knda` |
-| Malayalam | `ml` | `Latn`, `Mlym` |
-| Marathi | `mr` | `Latn`, `Deva` |
-| Oriya | `or` | `Latn`, `Orya` |
-| Gurmukhi | `pa` | `Latn`, `Guru` |
-| Tamil | `ta` | `Latn`, `Taml` |
-| Telugu | `te` | `Latn`, `Telu` |
-| Arabic | `ur` | `Latn`, `Arab` |
-| Cyrillic | `tt` | `Latn`, `Cyrl` |
-| Serbian `sr` | `Latn`, `Cyrl` |
-| Unified Canadian Aboriginal Syllabics | `iu` | `Latn`, `Cans` |
+| Language | Script code | Scripts |
+| ------------------------------------- | ----------- | -------------- |
+| Bengali (Bengali-Assamese) | `as` | `Latn`, `Beng` |
+| Bengali (Bangla) | `bn` | `Latn`, `Beng` |
+| Gujarati | `gu` | `Latn`, `Gujr` |
+| Hindi | `hi` | `Latn`, `Deva` |
+| Kannada | `kn` | `Latn`, `Knda` |
+| Malayalam | `ml` | `Latn`, `Mlym` |
+| Marathi | `mr` | `Latn`, `Deva` |
+| Oriya | `or` | `Latn`, `Orya` |
+| Gurmukhi | `pa` | `Latn`, `Guru` |
+| Tamil | `ta` | `Latn`, `Taml` |
+| Telugu | `te` | `Latn`, `Telu` |
+| Arabic | `ar` | `Latn`, `Arab` |
+| Cyrillic | `tt` | `Latn`, `Cyrl` |
+| Serbian | `sr` | `Latn`, `Cyrl` |
+| Unified Canadian Aboriginal Syllabics | `iu` | `Latn`, `Cans` |
 
 ## Next steps

articles/ai-services/language-service/named-entity-recognition/how-to/skill-parameters.md

Lines changed: 2 additions & 0 deletions
@@ -24,9 +24,11 @@ The “inclusionList” parameter allows for you to specify which of the NER ent
 
 The “exclusionList” parameter lets you specify which of the NER entity tags, listed here [link to Preview API table], you would like excluded from the entity list output in your inference JSON, which lists all words and categorizations recognized by the NER service. By default, all recognized entities are listed.
 
+<!--
 ## Example
 
 To do: work with Bidisha & Mikael to update with a good example
+-->
 
 ## overlapPolicy parameter

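Since the documented example above is still marked as a to-do, here's a conceptual sketch of what an exclusion list does, shown as client-side filtering purely for illustration (the service applies this filter server-side, and the entity dictionaries below are hypothetical):

```python
def filter_entities(entities, exclusion_list):
    """Drop recognized entities whose category appears in the exclusion list."""
    excluded = {category.lower() for category in exclusion_list}
    return [e for e in entities if e["category"].lower() not in excluded]

recognized = [
    {"text": "Contoso", "category": "Organization"},
    {"text": "Seattle", "category": "Location"},
    {"text": "May 2024", "category": "DateTime"},
]

# Excluding DateTime leaves only the Organization and Location entities.
print(filter_entities(recognized, ["DateTime"]))
```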
articles/ai-services/language-service/summarization/overview.md

Lines changed: 1 addition & 1 deletion
@@ -155,7 +155,7 @@ For more information, *see* [**Use native documents for language processing**](.
 # [Conversation summarization](#tab/conversation-summarization)
 
 * Conversation summarization takes structured text for analysis. For more information, see [data and service limits](../concepts/data-limits.md).
-* Conversation summarization accepts text in English. For more information, see [language support](language-support.md?tabs=conversation-summarization).
+* Conversation summarization works with various spoken languages. For more information, see [language support](language-support.md?tabs=conversation-summarization).
 
 # [Document summarization](#tab/document-summarization)

articles/ai-studio/how-to/deploy-models-phi-3.md

Lines changed: 10 additions & 10 deletions
@@ -34,16 +34,6 @@ The model belongs to the Phi-3 model family, and the Mini version comes in two v
 
 The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures. When assessed against benchmarks that test common sense, language understanding, math, code, long context and logical reasoning, Phi-3-Mini-4K-Instruct and Phi-3-Mini-128K-Instruct showcased a robust and state-of-the-art performance among models with less than 13 billion parameters.
 
-# [Phi-3-medium](#tab/phi-3-medium)
-Phi-3 Medium is a 14B parameters, lightweight, state-of-the-art open model. Phi-3-Medium was trained with Phi-3 datasets that include both synthetic data and the filtered, publicly-available websites data, with a focus on high quality and reasoning-dense properties.
-
-The model belongs to the Phi-3 model family, and the Medium version comes in two variants, 4K and 128K, which denote the context length (in tokens) that each model variant can support.
-
-- Phi-3-medium-4k-Instruct
-- Phi-3-medium-128k-Instruct
-
-The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures. When assessed against benchmarks that test common sense, language understanding, math, code, long context and logical reasoning, Phi-3-Medium-4k-Instruct and Phi-3-Medium-128k-Instruct showcased a robust and state-of-the-art performance among models with less than 13 billion parameters.
-
 # [Phi-3-small](#tab/phi-3-small)
 
 Phi-3-Small is a 7B parameters, lightweight, state-of-the-art open model. Phi-3-Small was trained with Phi-3 datasets that include both synthetic data and the filtered, publicly-available websites data, with a focus on high quality and reasoning-dense properties.

@@ -55,6 +45,16 @@ The model belongs to the Phi-3 model family, and the Small version comes in two
 
 The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures. When assessed against benchmarks that test common sense, language understanding, math, code, long context and logical reasoning, Phi-3-Small-8k-Instruct and Phi-3-Small-128k-Instruct showcased a robust and state-of-the-art performance among models with less than 13 billion parameters.
 
+# [Phi-3-medium](#tab/phi-3-medium)
+Phi-3 Medium is a 14B parameters, lightweight, state-of-the-art open model. Phi-3-Medium was trained with Phi-3 datasets that include both synthetic data and the filtered, publicly-available websites data, with a focus on high quality and reasoning-dense properties.
+
+The model belongs to the Phi-3 model family, and the Medium version comes in two variants, 4K and 128K, which denote the context length (in tokens) that each model variant can support.
+
+- Phi-3-medium-4k-Instruct
+- Phi-3-medium-128k-Instruct
+
+The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures. When assessed against benchmarks that test common sense, language understanding, math, code, long context and logical reasoning, Phi-3-Medium-4k-Instruct and Phi-3-Medium-128k-Instruct showcased a robust and state-of-the-art performance among models with less than 13 billion parameters.
+
 ---
 
 ## Deploy Phi-3 models as serverless APIs

articles/azure-monitor/logs/logs-dedicated-clusters.md

Lines changed: 0 additions & 5 deletions
@@ -595,11 +595,6 @@ After you create your cluster resource and it's fully provisioned, you can edit
 >[!IMPORTANT]
 >Cluster update should not include both identity and key identifier details in the same operation. If you need to update both, the update should be in two consecutive operations.
 
-<!--
-> [!NOTE]
-> The *billingType* property isn't supported in CLI.
--->
-
 #### [Portal](#tab/azure-portal)
 
 N/A

articles/defender-for-cloud/secure-score-security-controls.md

Lines changed: 3 additions & 3 deletions
@@ -49,7 +49,7 @@ Defender for Cloud calculates each control every eight hours for each Azure subs
 
 ### Example scores for a control
 
-The following example focuses on secure score recommendations for enabling multifactor authentication (MFA).
+The following example focuses on secure score recommendations for **Remediate vulnerabilities**.
 
 :::image type="content" source="./media/secure-score-security-controls/remediate-vulnerabilities-control.png" alt-text="Screenshot that shows secure score recommendations for multifactor authentication." lightbox="./media/secure-score-security-controls/remediate-vulnerabilities-control.png":::

@@ -59,8 +59,8 @@ This example illustrates the following fields in the recommendations.
 --- | ---
 **Remediate vulnerabilities** | A grouping of recommendations for discovering and resolving known vulnerabilities.
 **Max score** | The maximum number of points that you can gain by completing all recommendations within a control.<br/><br/> The maximum score for a control indicates the relative significance of that control and is fixed for every environment.<br/><br/>Use the values in this column to determine which issues to work on first.
-**Current score** | The current score for this control.<br/><br/> Current score = [Score per resource] * [Number of healthy resources]<br/><br/>Each control contributes to the total score. In this example, the control is contributing 2.00 points to current total score.
-**Potential score increase** | The remaining points available to you within the control. If you remediate all the recommendations in this control, your score increases by 9%.<br/><br/> Potential score increase = [Score per resource] * [Number of unhealthy resources]
+**Current score** | The current score for this control.<br/><br/> Current score = [Score per resource] * [Number of healthy resources]<br/><br/>Each control contributes to the total score. In this example, the control is contributing 3.33 points to the current total score.
+**Potential score increase** | The remaining points available to you within the control. If you remediate all the recommendations in this control, your score increases by 4%.<br/><br/> Potential score increase = [Score per resource] * [Number of unhealthy resources]
 **Insights** | Extra details for each recommendation, such as:<br/><br/> - :::image type="icon" source="media/secure-score-security-controls/preview-icon.png" border="false"::: **Preview recommendation**: This recommendation affects the secure score only when it's generally available.<br/><br/> - :::image type="icon" source="media/secure-score-security-controls/fix-icon.png" border="false"::: **Fix**: Resolve this issue.<br/><br/> - :::image type="icon" source="media/secure-score-security-controls/enforce-icon.png" border="false"::: **Enforce**: Automatically deploy a policy to fix this issue whenever someone creates a noncompliant resource.<br/><br/> - :::image type="icon" source="media/secure-score-security-controls/deny-icon.png" border="false"::: **Deny**: Prevent new resources from being created with this issue.
 
 ## Score calculation equations
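The formulas in the table above can be sketched as follows. The max score and resource counts here are hypothetical inputs chosen to reproduce the 3.33-point example, not values read from the service:

```python
def control_scores(max_score: float, healthy: int, unhealthy: int):
    """Apply the documented formulas: score per resource times resource counts."""
    score_per_resource = max_score / (healthy + unhealthy)
    current_score = score_per_resource * healthy
    potential_increase = score_per_resource * unhealthy
    return current_score, potential_increase

# A control worth 6 points with 5 healthy and 4 unhealthy resources.
current, potential = control_scores(6, 5, 4)
print(round(current, 2), round(potential, 2))  # 3.33 2.67
```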

articles/governance/policy/index.yml

Lines changed: 1 addition & 1 deletion
@@ -89,7 +89,7 @@ landingContent:
       url: ./concepts/regulatory-compliance.md
     - text: Azure Policy for Kubernetes
       url: ./concepts/policy-for-kubernetes.md
-    - text: Azure Automanage Machine Configuration
+    - text: Azure Machine Configuration
       url: ../machine-configuration/overview.md
 
   - title: Review & Remediate resources
