You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/machine-learning/how-to-high-availability-machine-learning.md
+9-5Lines changed: 9 additions & 5 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -20,23 +20,24 @@ To maximize your uptime, plan ahead to maintain business continuity and prepare
20
20
Microsoft strives to ensure that Azure services are always available. However, unplanned service outages may occur. We recommend having a disaster recovery plan in place for handling regional service outages. In this article, you'll learn how to:
21
21
22
22
* Plan for a multi-regional deployment of Azure Machine Learning and associated resources.
23
+
* Maximize chances to recover logs, notebooks, docker images, and other metadata.
23
24
* Design for high availability of your solution.
24
25
* Initiate a failover to another region.
25
26
26
27
> [!NOTE]
27
-
> Azure Machine Learning itself does not provide automatic failover or disaster recovery.
28
+
> Azure Machine Learning itself does not provide automatic failover or disaster recovery. Backup and restore of workspace metadata such as run history is unavailable.
28
29
29
30
In case you have accidentally deleted your workspace or corresponding components, this article also provides you with currently supported recovery options.
30
31
31
32
## Understand Azure services for Azure Machine Learning
32
33
33
-
Azure Machine Learning depends on multiple Azure services and has several layers. Some of these services are provisioned in your (customer) subscription. You're responsible for the high-availability configuration of these services. Other services are created in a Microsoft subscription and managed by Microsoft.
34
+
Azure Machine Learning depends on multiple Azure services. Some of these services are provisioned in your subscription. You're responsible for the high-availability configuration of these services. Other services are created in a Microsoft subscription and are managed by Microsoft.
34
35
35
36
Azure services include:
36
37
37
38
***Azure Machine Learning infrastructure**: A Microsoft-managed environment for the Azure Machine Learning workspace.
38
39
39
-
***Associated resources**: Resources provisioned in your subscription during Azure Machine Learning workspace creation. These resources include Azure Storage, Azure Key Vault, Azure Container Registry, and Application Insights. You're responsible for configuring high-availability settings for these resources.
40
+
***Associated resources**: Resources provisioned in your subscription during Azure Machine Learning workspace creation. These resources include Azure Storage, Azure Key Vault, Azure Container Registry, and Application Insights.
40
41
* Default storage has data such as model, training log data, and dataset.
41
42
* Key Vault has credentials for Azure Storage, Container Registry, and data stores.
42
43
* Container Registry has a Docker image for training and inferencing environments.
@@ -136,7 +137,10 @@ By keeping your data storage isolated from the default storage the workspace use
136
137
* Attach the same storage instances as datastores to the primary and secondary workspaces.
137
138
* Make use of geo-replication for data storage accounts and maximize your uptime.
138
139
139
-
### Manage machine learning artifacts as code
140
+
### Manage machine learning assets as code
141
+
142
+
> [!NOTE]
143
+
> Backup and restore of workspace metadata such as run history, models and environments is unavailable. Specifying assets and configurations as code using YAML specs, will help you re-recreate assets across workspaces in case of a disaster.
140
144
141
145
Jobs in Azure Machine Learning are defined by a job specification. This specification includes dependencies on input artifacts that are managed on a workspace-instance level, including environments, datasets, and compute. For multi-region job submission and deployments, we recommend the following practices:
142
146
@@ -198,4 +202,4 @@ If you accidentally deleted your workspace it is currently not possible to recov
198
202
199
203
## Next steps
200
204
201
-
To deploy Azure Machine Learning with associated resources with your high-availability settings, use an [Azure Resource Manager template](https://github.com/Azure/azure-quickstart-templates/tree/master/quickstarts/microsoft.machinelearningservices/).
205
+
To learn about repeatable infrastructure deployments with Azure Machine Learning, use an [Azure Resource Manager template](https://docs.microsoft.com/azure/machine-learning/tutorial-create-secure-workspace-template).
0 commit comments