Commit f296a3d (parent 41437de): New article start

File tree

1 file changed

+43
-0
lines changed

1 file changed

+43
-0
lines changed
Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
---
title: Distributed training methods
titleSuffix: Azure Machine Learning
description: Learn what distributed training is and how Azure Machine Learning supports it for training deep learning models.
services: machine-learning
ms.service: machine-learning
author: nibaccam
ms.author: nibaccam
ms.subservice: core
ms.topic: conceptual
ms.date: 03/23/2020
---
# Distributed training with Azure Machine Learning

## What is distributed training?
Distributed training refers to scaling model training by sharing the workload across multiple GPU or CPU nodes working in parallel, whether that's multiple GPUs on one machine or multiple machines in a cluster. Spreading the work this way accelerates the training of machine learning models, especially deep learning models with large datasets.

Azure Machine Learning integrates with two common open-source frameworks that support distributed training: PyTorch and TensorFlow.
There are different types of distributed training; the two most common are data parallelism, where the training data is partitioned across nodes that each hold a copy of the model, and model parallelism, where the model itself is split across nodes.

If you are looking to simply train your model without leveraging distributed training, see [Azure Machine Learning SDK for Python](#python-sdk) for the different ways to train models.
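As a loose illustration of the data-parallel idea, the following framework-free sketch simulates two nodes that each compute a gradient on their own shard of the data and then average the gradients before updating the shared model. The linear model, loss, data, and node count are illustrative assumptions, not part of Azure Machine Learning.

```python
# Sketch of data parallelism: each simulated "node" computes a gradient on
# its own data shard; the gradients are then averaged (the "all-reduce"
# step) and applied to the shared parameter.

def gradient(w, shard):
    # Gradient of mean squared error for a 1-D linear model y = w * x.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def train_step(w, shards, lr=0.01):
    # In a real cluster each gradient is computed in parallel on its node.
    grads = [gradient(w, shard) for shard in shards]
    avg_grad = sum(grads) / len(grads)  # all-reduce: average across nodes
    return w - lr * avg_grad

# Data for y = 3x, split across two simulated nodes.
data = [(x, 3 * x) for x in range(1, 9)]
shards = [data[:4], data[4:]]

w = 0.0
for _ in range(200):
    w = train_step(w, shards)
print(round(w, 2))  # converges toward 3.0
```

Because every node sees the same averaged gradient, the replicas stay in sync; the speedup comes from each node only touching its own shard of the data.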
## When to use distributed training
Use distributed training when your workload strains the compute power of a single machine, for example:

- Your data is too large to fit in your local machine's RAM.
- Loading and processing the data on a single machine becomes time consuming and cumbersome.
## What does Azure Machine Learning support?

Azure Machine Learning offers estimators for PyTorch and TensorFlow that support distributed training:

- [Distributed training with TensorFlow](how-to-train-tensorflow.md#distributed-training)
- [Distributed training with PyTorch](how-to-train-pytorch.md#distributed-training)
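As a rough sketch, submitting a distributed TensorFlow job with an estimator might look like the following; the workspace config, compute target name, script path, and node counts are illustrative assumptions for a sketch against the SDK's `TensorFlow` estimator with `MpiConfiguration`, not a definitive recipe.

```python
# Hypothetical submission of a distributed TensorFlow training job.
# Assumes a configured workspace and an existing compute cluster
# named 'gpu-cluster' with a training script at ./src/train.py.
from azureml.core import Experiment, Workspace
from azureml.core.runconfig import MpiConfiguration
from azureml.train.dnn import TensorFlow

ws = Workspace.from_config()

estimator = TensorFlow(
    source_directory='./src',
    entry_script='train.py',
    compute_target='gpu-cluster',
    node_count=2,                       # two machines...
    process_count_per_node=1,           # ...one worker process each
    distributed_training=MpiConfiguration(),
    use_gpu=True)

run = Experiment(ws, 'distributed-tf').submit(estimator)
```

Each node then runs `train.py` as one rank of an MPI job, so the script itself is responsible for sharding data and averaging gradients (for example, via Horovod).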
## Next steps

Learn how to [Set up training environments](how-to-set-up-training-targets.md).
