Commit cc2b074

Merge pull request #179089 from MicrosoftDocs/repo_sync_working_branch
Confirm merge from repo_sync_working_branch to master to sync with https://github.com/MicrosoftDocs/azure-docs (branch master)
2 parents ee55a02 + 4289855 commit cc2b074

File tree

5 files changed

+97
-63
lines changed

articles/azure-resource-manager/management/overview.md

Lines changed: 8 additions & 2 deletions
@@ -75,9 +75,15 @@ There are some important factors to consider when defining your resource group:
7575

7676
* The resources in a resource group can be located in different regions than the resource group.
7777

78-
* When creating a resource group, you need to provide a location for that resource group. You may be wondering, "Why does a resource group need a location? And, if the resources can have different locations than the resource group, why does the resource group location matter at all?" The resource group stores metadata about the resources. When you specify a location for the resource group, you're specifying where that metadata is stored. For compliance reasons, you may need to ensure that your data is stored in a particular region.
78+
* When you create a resource group, you need to provide a location for that resource group.
7979

80-
If the resource group's region is temporarily unavailable, you can't update resources in the resource group because the metadata is unavailable. The resources in other regions will still function as expected, but you can't update them. For more information about building reliable applications, see [Designing reliable Azure applications](/azure/architecture/checklist/resiliency-per-service).
80+
You may be wondering, "Why does a resource group need a location? And, if the resources can have different locations than the resource group, why does the resource group location matter at all?"
81+
82+
The resource group stores metadata about the resources. When you specify a location for the resource group, you're specifying where that metadata is stored. For compliance reasons, you may need to ensure that your data is stored in a particular region.
83+
84+
If a resource group's region is temporarily unavailable, you can't update resources in the resource group because the metadata is unavailable; global resources such as Azure Content Delivery Network, Azure Traffic Manager, and Azure Front Door are the exception. Resources in other regions still function as expected, but you can't update them.
85+
86+
For more information about building reliable applications, see [Designing reliable Azure applications](/azure/architecture/checklist/resiliency-per-service).
8187

8288
* A resource group can be used to scope access control for administrative actions. To manage a resource group, you can assign [Azure Policies](../../governance/policy/overview.md), [Azure roles](../../role-based-access-control/role-assignments-portal.md), or [resource locks](lock-resources.md).
8389

articles/cognitive-services/Speech-Service/how-to-custom-speech-test-and-train.md

Lines changed: 1 addition & 1 deletion
@@ -185,7 +185,7 @@ Additionally, you'll want to account for the following restrictions:
185185

186186
## Structured text data for training (Public Preview)
187187

188-
Often the expected utterances follow a certain pattern. One common pattern is that utterances only differ by words or phrases from a list. Examples of this could be “I have a question about `product`,” where `product` is a list of possible products. Or, “Make that `object` `color`,” where `object` is a list of geometric shapes and `color` is a list of colors. To simplify the creation of training data and to enable better modeling inside the Custom Language Model, you can use a structured text in markdown format to define lists of items and then reference these inside your training utterances. Additionally, the markdown format also supports specifying the phonetic pronunciation of words. The markdown format shares its format with the `.lu` markdown used to train Language Understanding models, in particular list entities and example utterances. For more information about the complete `.lu` markdown, see the <a href="/azure/bot-service/file-format/bot-builder-lu-file-format" target="_blank"> `.lu` file format</a>.
188+
Expected utterances often follow a certain pattern. One common pattern is that utterances differ only by words or phrases from a list. Examples include “I have a question about `product`,” where `product` is a list of possible products, or “Make that `object` `color`,” where `object` is a list of geometric shapes and `color` is a list of colors. To simplify the creation of training data and to enable better modeling inside the Custom Language Model, you can use structured text in markdown format to define lists of items and then reference them inside your training utterances. The markdown format also supports specifying the phonetic pronunciation of words. The markdown file should have a `.md` extension. The syntax is the same as the markdown used to train Language Understanding models, in particular for list entities and example utterances. For more information about the complete markdown syntax, see the <a href="/azure/bot-service/file-format/bot-builder-lu-file-format" target="_blank"> Language Understanding markdown</a>.
189189

190190
Here is an example of the markdown format:
191191

articles/expressroute/how-to-configure-custom-bgp-communities.md

Lines changed: 1 addition & 1 deletion
@@ -86,7 +86,7 @@ BGP communities are groupings of IP prefixes tagged with a community value. This
8686
1. Update the `VirtualNetworkCommunity` value for your virtual network.
8787
8888
```azurepowershell-interactive
89-
$vnet.BgpCommunities.VirtualNetworkCommunity = '12076:20002'
89+
$vnet.BgpCommunities = @{VirtualNetworkCommunity = '12076:20002'}
9090
$vnet | Set-AzVirtualNetwork
9191
```
9292

articles/machine-learning/how-to-troubleshoot-online-endpoints.md

Lines changed: 85 additions & 57 deletions
@@ -92,117 +92,145 @@ Add `--help` and/or `--debug` to commands to see more information. Include the `
9292

9393
Below is a list of common deployment errors that are reported as part of the deployment operation status.
9494

95-
### ERR_1100: Not enough quota
95+
* [OutOfQuota](#error-outofquota)
96+
* [OutOfCapacity](#error-outofcapacity)
97+
* [BadArgument](#error-badargument)
98+
* [ResourceNotReady](#error-resourcenotready)
99+
* [ResourceNotFound](#error-resourcenotfound)
100+
* [OperationCancelled](#error-operationcancelled)
101+
* [InternalServerError](#error-internalservererror)
96102

97-
Before deploying a model, you need to have enough compute quota. This quota defines how much virtual cores are available per subscription, per workspace, per SKU, and per region. Each deployment subtracts from available quota and adds it back after deletion, based on type of the SKU.
103+
### ERROR: OutOfQuota
98104

99-
A possible mitigation is to check if there are unused deployments that can be deleted. Or you can submit a [request for a quota increase](./how-to-manage-quotas.md).
105+
Below is a list of common resources that might run out of quota when using Azure services:
100106

101-
### ERR_1101: Out of capacity
107+
* [CPU](#cpu-quota)
108+
* [Role assignments](#role-assignment-quota)
109+
* [Endpoints](#endpoint-quota)
110+
* [Kubernetes](#kubernetes-quota)
111+
* [Other](#other-quota)
102112

103-
The specified VM Size failed to provision due to a lack of Azure Machine Learning capacity. Retry later or try deploying to a different region.
113+
#### CPU quota
104114

105-
### ERR_1102: No more role assignments
115+
Before deploying a model, you need to have enough compute quota. This quota defines how many virtual cores are available per subscription, per workspace, per SKU, and per region. Each deployment subtracts from the available quota and adds it back after deletion, based on the type of SKU.
106116

107-
Delete some unused role assignments in this subscription. You can check all role assignments in the Azure portal in the Access Control menu.
117+
A possible mitigation is to check if there are unused deployments that can be deleted. Or you can submit a [request for a quota increase](how-to-manage-quotas.md#request-quota-increases).
108118

109-
### ERR_1103: Endpoint quota reached
119+
#### Role assignment quota
110120

111-
Delete some unused endpoints in this subscription.
121+
Try to delete some unused role assignments in this subscription. You can check all role assignments in the Azure portal in the Access Control menu.
112122

113-
### ERR_1200: Unable to download user container image
123+
#### Endpoint quota
114124

115-
During deployment creation after the compute provisioning, Azure tries to pull the user container image from the workspace private Azure Container Registry (ACR). There could be two possible issues.
125+
Try to delete some unused endpoints in this subscription.
116126

117-
- The user container image isn't found.
127+
#### Kubernetes quota
118128

119-
Make sure container image is available in workspace ACR.
120-
For example, if image is `testacr.azurecr.io/azureml/azureml_92a029f831ce58d2ed011c3c42d35acb:latest` check the repository with
121-
`az acr repository show-tags -n testacr --repository azureml/azureml_92a029f831ce58d2ed011c3c42d35acb --orderby time_desc --output table`.
129+
The requested CPU or memory couldn't be satisfied. Please adjust your request or the cluster.
122130

123-
- There's a permission issue accessing ACR.
131+
#### Other quota
124132

125-
To pull the image, Azure uses [managed identities](../active-directory/managed-identities-azure-resources/overview.md) to access ACR.
133+
To run the `score.py` provided as part of the deployment, Azure creates a container that includes all the resources that the `score.py` needs, and runs the scoring script on that container.
126134

127-
- If you created the associated endpoint with SystemAssigned, then Azure role-based access control (RBAC) permission is automatically granted, and no further permissions are needed.
128-
- If you created the associated endpoint with UserAssigned, then the user's managed identity must have AcrPull permission for the workspace ACR.
135+
If your container could not start, scoring could not happen. The container might be requesting more resources than the `instance_type` can support. If so, consider updating the `instance_type` of the online deployment.
129136

130-
To get more details about this error, run:
137+
To get the exact reason for an error, run:
131138

132139
```azurecli
133140
az ml online-deployment get-logs -e <endpoint-name> -n <deployment-name> -l 100
134141
```
135142

136-
### ERR_1300: Unable to download user model\code artifacts
143+
### ERROR: OutOfCapacity
144+
145+
The specified VM Size failed to provision due to a lack of Azure Machine Learning capacity. Retry later or try deploying to a different region.
137146

138-
After provisioning the compute resource, during deployment creation, Azure tries to mount the user model and code artifacts into the user container from the workspace storage account.
147+
### ERROR: BadArgument
139148

140-
- User model\code artifacts not found.
149+
Below is a list of reasons you might run into this error:
141150

142-
- Make sure model and code artifacts are registered to the same workspace as the deployment. Use the `show` command to show details for a model or code artifact in a workspace. For example:
143-
144-
```azurecli
145-
az ml model show --name <model-name>
146-
az ml code show --name <code-name> --version <version>
147-
```
151+
* [Resource request was greater than limits](#resource-requests-greater-than-limits)
152+
* [Unable to download resources](#unable-to-download-resources)
153+
154+
#### Resource requests greater than limits
155+
156+
Requests for resources must be less than or equal to limits. If you don't set limits, we set default values when you attach your compute to an Azure Machine Learning workspace. You can check limits in the Azure portal or by using the `az ml compute show` command.
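
The "requests must be less than or equal to limits" rule can be checked locally before submitting a deployment. The following is a minimal sketch, not part of the Azure CLI: it assumes CPU values are written in Kubernetes-style notation (`500m` for millicores, or a plain number of cores), and the function names `cpu_to_millicores` and `cpu_request_within_limit` are hypothetical helpers introduced only for illustration.

```shell
# Convert a CPU quantity ("500m" or "1.5") to millicores.
# Assumption: values use Kubernetes-style notation.
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;                                            # already millicores
    *)  awk -v v="$1" 'BEGIN { printf "%.0f\n", v * 1000 }' ;;      # cores -> millicores
  esac
}

# Succeeds (exit status 0) when the request is less than or equal to the limit.
cpu_request_within_limit() {
  [ "$(cpu_to_millicores "$1")" -le "$(cpu_to_millicores "$2")" ]
}

if cpu_request_within_limit "500m" "1"; then
  echo "request fits within the limit"
else
  echo "request exceeds the limit"
fi
```

The actual limits still need to come from the Azure portal or `az ml compute show`, as described above; the sketch only covers the comparison.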
148157

149-
- You can also check if the blobs are present in the workspace storage account.
158+
#### Unable to download resources
150159

151-
For example, if the blob is `https://foobar.blob.core.windows.net/210212154504-1517266419/WebUpload/210212154504-1517266419/GaussianNB.pkl` you can use this command to check if it exists: `az storage blob exists --account-name foobar --container-name 210212154504-1517266419 --name WebUpload/210212154504-1517266419/GaussianNB.pkl --subscription <sub-name>`
160+
After provisioning the compute resource, during deployment creation, Azure tries to pull the user container image from the workspace private Azure Container Registry (ACR) and mount the user model and code artifacts into the user container from the workspace storage account.
152161

153-
- Permission issue accessing ACR.
162+
First, check whether there is a permissions issue accessing ACR or the workspace storage account.
154163

155-
To pull blobs, Azure uses [managed identities](../active-directory/managed-identities-azure-resources/overview.md) to access the storage account.
164+
To pull blobs, Azure uses [managed identities](../active-directory/managed-identities-azure-resources/overview.md) to access the storage account.
156165

157166
- If you created the associated endpoint with SystemAssigned, Azure role-based access control (RBAC) permission is automatically granted, and no further permissions are needed.
158167

159168
- If you created the associated endpoint with UserAssigned, the user's managed identity must have Storage blob data reader permission on the workspace storage account.
160169

161-
To get more details about this error, run:
170+
During this process, you can run into a few different issues depending on which stage the operation failed at:
162171

163-
```azurecli
164-
az ml online-deployment get-logs -e <endpoint-name> -n <deployment-name> -l 100
165-
```
172+
* [Unable to download user container image](#unable-to-download-user-container-image)
173+
* [Unable to download user model or code artifacts](#unable-to-download-user-model-or-code-artifacts)
166174

167-
### ERR_1350: Unable to download user model, not enough space on the disk
175+
To get more details about these errors, run:
168176

169-
This issue happens when the size of the model is bigger than the available disk space. Try an SKU with more disk space.
177+
```azurecli
178+
az ml online-deployment get-logs -e <endpoint-name> -n <deployment-name> -l 100
179+
```
170180

171-
### ERR_2100: Unable to start user container
181+
#### Unable to download user container image
172182

173-
To run the `score.py` provided as part of the deployment, Azure creates a container that includes all the resources that the `score.py` needs, and runs the scoring script on that container.
183+
It is possible that the user container image could not be found.
174184

175-
This error means that this container couldn't start, which means scoring could not happen. It could be that the container is requesting more resources than what `instance_type` could support. If so, consider updating the `instance_type` of the online deployment.
185+
Make sure the container image is available in the workspace ACR.
176186

177-
To get the exact reason for an error, run:
187+
For example, if the image is `testacr.azurecr.io/azureml/azureml_92a029f831ce58d2ed011c3c42d35acb:latest`, check the repository with
188+
`az acr repository show-tags -n testacr --repository azureml/azureml_92a029f831ce58d2ed011c3c42d35acb --orderby time_desc --output table`.
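
The registry name and repository passed to `az acr repository show-tags` can be derived from the image reference. The following is a minimal sketch, assuming the reference has the form `<registry>.azurecr.io/<repository>:<tag>`; the variable names are illustrative, and the command is only printed for review, not run.

```shell
# Image reference from the example above.
image='testacr.azurecr.io/azureml/azureml_92a029f831ce58d2ed011c3c42d35acb:latest'

registry_host="${image%%/*}"          # everything before the first "/"
registry_name="${registry_host%%.*}"  # first DNS label of the login server
repo_and_tag="${image#*/}"            # everything after the first "/"
repository="${repo_and_tag%:*}"       # drop the trailing ":<tag>"

# Build the command as a string and print it so it can be checked before running.
show_tags_cmd="az acr repository show-tags -n $registry_name --repository $repository --orderby time_desc --output table"
echo "$show_tags_cmd"
```

An image reference that includes a digest (`@sha256:...`) or has no tag does not fit this pattern and would need different handling.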
178189

179-
```azurecli
180-
az ml online-deployment get-logs -e <endpoint-name> -n <deployment-name> -l 100
181-
```
190+
#### Unable to download user model or code artifacts
182191

183-
### ERR_2101: Kubernetes unschedulable
192+
It is possible that the user model or code artifacts can't be found.
184193

185-
The requested CPU or memory can't be satisfied. Please adjust your request or the cluster.
194+
Make sure the model and code artifacts are registered to the same workspace as the deployment. Use the `show` command to show details for a model or code artifact in a workspace.
186195

187-
### ERR_2102: Resources requests invalid
196+
- For example:
197+
198+
```azurecli
199+
az ml model show --name <model-name>
200+
az ml code show --name <code-name> --version <version>
201+
```
202+
203+
You can also check if the blobs are present in the workspace storage account.
188204

189-
Requests for resources must be less than or equal to limits. If you don't set limits, we set default values when you attach your compute to an Azure Machine Learning workspace. You can check limits in the Azure portal or by using the `az ml compute show` command.
205+
- For example, if the blob is `https://foobar.blob.core.windows.net/210212154504-1517266419/WebUpload/210212154504-1517266419/GaussianNB.pkl`, you can use this command to check if it exists:
190206

191-
### ERR_2200: User container has crashed\terminated
207+
`az storage blob exists --account-name foobar --container-name 210212154504-1517266419 --name WebUpload/210212154504-1517266419/GaussianNB.pkl --subscription <sub-name>`
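
The `--account-name`, `--container-name`, and `--name` values come straight from the blob URL. The following is a minimal sketch of that split, assuming a URL of the form `https://<account>.blob.core.windows.net/<container>/<blob-name>`; the variable names are illustrative, and the command is only printed, not run.

```shell
# Blob URL from the example above.
blob_url='https://foobar.blob.core.windows.net/210212154504-1517266419/WebUpload/210212154504-1517266419/GaussianNB.pkl'

no_scheme="${blob_url#https://}"   # strip the scheme
blob_host="${no_scheme%%/*}"       # host name of the storage endpoint
account_name="${blob_host%%.*}"    # storage account is the first DNS label
blob_path="${no_scheme#*/}"        # <container>/<blob-name>
container_name="${blob_path%%/*}"  # first path segment
blob_name="${blob_path#*/}"        # remainder; may itself contain "/"

# Print the command for review; add --subscription <sub-name> as needed.
echo "az storage blob exists --account-name $account_name --container-name $container_name --name $blob_name"
```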
192208

193-
To run the `score.py` provided as part of the deployment, Azure creates a container that includes all the resources that the `score.py` needs, and runs the scoring script on that container. The error in this scenario is that this container is crashing when running, which means scoring couldn't happen. This error happens when:
209+
### ERROR: ResourceNotReady
210+
211+
To run the `score.py` provided as part of the deployment, Azure creates a container that includes all the resources that the `score.py` needs, and runs the scoring script on that container. This error means that the container crashes while running, so scoring can't happen. This error happens when:
194212

195213
- There's an error in `score.py`. Use `get-logs` to help diagnose common problems:
196-
- A package that was imported but is not in the conda environment
197-
- A syntax error
198-
- A failure in the `init()` method
214+
- A package that was imported but is not in the conda environment.
215+
- A syntax error.
216+
- A failure in the `init()` method.
199217
- If `get-logs` isn't producing any logs, it usually means that the container has failed to start. To debug this issue, try [deploying locally](https://github.com/MicrosoftDocs/azure-docs/blob/master/articles/machine-learning/how-to-troubleshoot-online-endpoints.md#deploy-locally) instead.
200218
- Readiness or liveness probes are not set up correctly.
201219
- There's an error in the environment setup of the container, such as a missing dependency.
202220

203-
### ERR_5000: Internal error
221+
### ERROR: ResourceNotFound
222+
223+
This error occurs when Azure Resource Manager can't find a required resource. For example, you receive this error if a storage account is referenced but can't be found at the specified path. Double-check the exact path and the spelling of the name of each resource that you supplied.
224+
225+
For more information, see [Resolve resource not found errors](../azure-resource-manager/troubleshooting/error-not-found.md).
226+
227+
### ERROR: OperationCancelled
228+
229+
Azure operations have a certain priority level and are executed from highest to lowest. This error occurs when your operation is overridden by another operation that has a higher priority. Retrying the operation might allow it to be performed without cancellation.
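
Retrying can be scripted with a small wrapper. The following is a minimal sketch, not an Azure CLI feature: `retry` is a hypothetical helper that reruns any command a fixed number of times with a fixed pause between attempts, and the attempt count and delay in the usage comment are arbitrary illustrative values.

```shell
# Run a command up to max_attempts times, pausing delay_seconds between attempts.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
  max_attempts="$1"
  delay_seconds="$2"
  shift 2
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    "$@" && return 0                                             # stop at the first success
    [ "$attempt" -lt "$max_attempts" ] && sleep "$delay_seconds" # no pause after the last try
    attempt=$((attempt + 1))
  done
  return 1                                                       # every attempt failed
}

# Hypothetical usage: substitute the operation that was cancelled.
# retry 3 30 <your-az-command>
```

A fixed delay keeps the sketch short; a growing delay between attempts is a common alternative when the competing operation may take a while to finish.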
230+
231+
### ERROR: InternalServerError
204232

205-
While we do our best to provide a stable and reliable service, sometimes things don't go according to plan. If you get this error, it means something isn't right on our side and we need to fix it. Submit a [customer support ticket](https://portal.azure.com/#blade/Microsoft_Azure_Support/HelpAndSupportBlade/newsupportrequest) with all related information and we'll address the issue.
233+
Although we do our best to provide a stable and reliable service, sometimes things don't go according to plan. If you get this error, it means that something isn't right on our side, and we need to fix it. Submit a [customer support ticket](https://portal.azure.com/#blade/Microsoft_Azure_Support/HelpAndSupportBlade/newsupportrequest) with all related information and we'll address the issue.
206234

207235
## Autoscaling issues
208236
