Commit f4912bf

committed
PM + team feedback
1 parent 3026a6d commit f4912bf

1 file changed

+3
-2
lines changed

articles/machine-learning/concept-distributed-training.md

Lines changed: 3 additions & 2 deletions
@@ -19,13 +19,13 @@ In distributed training the workload to train a model is split up and shared amo
 
 ## Deep learning and distributed training
 
-There are two main types of distributed training: [data parallelism](#data-parallelism) and [model parallelism](#model-parallelism). For distributed training on deep learning models, the [Azure Machine Learning SDK in Python](https://docs.microsoft.com/python/api/overview/azure/ml/intro?view=azure-ml-py) supports integrations with popular frameworks, PyTorch and TensorFlow. Both frameworks employ data parallelism for distributed training, and leverage [horovod](https://horovod.readthedocs.io/en/latest/summary_include.html) for optimizing compute speeds.
+There are two main types of distributed training: [data parallelism](#data-parallelism) and [model parallelism](#model-parallelism). For distributed training on deep learning models, the [Azure Machine Learning SDK in Python](https://docs.microsoft.com/python/api/overview/azure/ml/intro?view=azure-ml-py) supports integrations with popular frameworks, PyTorch and TensorFlow. Both frameworks employ data parallelism for distributed training, and can leverage [horovod](https://horovod.readthedocs.io/en/latest/summary_include.html) for optimizing compute speeds.
 
 * [Distributed training with PyTorch](how-to-train-pytorch.md#distributed-training)
 
 * [Distributed training with TensorFlow](how-to-train-tensorflow.md#distributed-training)
 
-For training ML models that don't require distributed training, see [train models with Azure Machine Learning](concept-train-machine-learning-model.md#python-sdk) for the different ways to train models using the Python SDK.
+For ML models that don't require distributed training, see [train models with Azure Machine Learning](concept-train-machine-learning-model.md#python-sdk) for the different ways to train models using the Python SDK.
 
 ## Data parallelism
 
@@ -42,5 +42,6 @@ In model parallelism, worker nodes only need to synchronize the shared parameter
 ## Next steps
 
 * Learn how to [set up training environments](how-to-set-up-training-targets.md) with the Python SDK.
+* For a technical example, see the [reference architecture scenario](https://docs.microsoft.com/azure/architecture/reference-architectures/ai/training-deep-learning).
 * [Train ML models with TensorFlow](how-to-train-tensorflow.md).
 * [Train ML models with PyTorch](how-to-train-pytorch.md).
