Commit f61a1aa

Merge pull request #114372 from hyoshioka0128/patch-700
Typo "Tensorflow"→"TensorFlow"
2 parents f84d3b5 + 380c540 commit f61a1aa

1 file changed: +8 -8 lines changed

articles/synapse-analytics/machine-learning/tutorial-horovod-tensorflow.md

Lines changed: 8 additions & 8 deletions
@@ -1,6 +1,6 @@
 ---
-title: 'Tutorial: Distributed training with Horovod and Tensorflow'
-description: Tutorial on how to run distributed training with the Horovod Runner and Tensorflow
+title: 'Tutorial: Distributed training with Horovod and TensorFlow'
+description: Tutorial on how to run distributed training with the Horovod Runner and TensorFlow
 ms.service: synapse-analytics
 ms.subservice: machine-learning
 ms.topic: tutorial
@@ -9,11 +9,11 @@ author: midesa
 ms.author: midesa
 ---
 
-# Tutorial: Distributed Training with Horovod Runner and Tensorflow (Preview)
+# Tutorial: Distributed Training with Horovod Runner and TensorFlow (Preview)
 
 [Horovod](https://github.com/horovod/horovod) is a distributed training framework for libraries like TensorFlow and PyTorch. With Horovod, users can scale up an existing training script to run on hundreds of GPUs in just a few lines of code.
 
-Within Azure Synapse Analytics, users can quickly get started with Horovod using the default Apache Spark 3 runtime.For Spark ML pipeline applications using Tensorflow, users can use ```HorovodRunner```. This notebook uses an Apache Spark dataframe to perform distributed training of a distributed neural network (DNN) model on MNIST dataset. This tutorial leverages Tensorflow and the ```HorovodRunner``` to run the training process.
+Within Azure Synapse Analytics, users can quickly get started with Horovod using the default Apache Spark 3 runtime.For Spark ML pipeline applications using TensorFlow, users can use ```HorovodRunner```. This notebook uses an Apache Spark dataframe to perform distributed training of a distributed neural network (DNN) model on MNIST dataset. This tutorial leverages TensorFlow and the ```HorovodRunner``` to run the training process.
 
 ## Prerequisites
 
@@ -22,7 +22,7 @@ Within Azure Synapse Analytics, users can quickly get started with Horovod using
 
 ## Configure the Apache Spark session
 
-At the start of the session, we will need to configure a few Apache Spark settings. In most cases, we only needs to set the ```numExecutors``` and ```spark.rapids.memory.gpu.reserve```. For very large models, users may also need to configure the ```spark.kryoserializer.buffer.max``` setting. For Tensorflow models, users will need to set the ```spark.executorEnv.TF_FORCE_GPU_ALLOW_GROWTH``` to be true.
+At the start of the session, we will need to configure a few Apache Spark settings. In most cases, we only needs to set the ```numExecutors``` and ```spark.rapids.memory.gpu.reserve```. For very large models, users may also need to configure the ```spark.kryoserializer.buffer.max``` setting. For TensorFlow models, users will need to set the ```spark.executorEnv.TF_FORCE_GPU_ALLOW_GROWTH``` to be true.
 
 In the example below, you can see how the Spark configurations can be passed with the ```%%configure``` command. The detailed meaning of each parameter is explained in the [Apache Spark configuration documentation](https://spark.apache.org/docs/latest/configuration.html). The values provided below are the suggested, best practice values for Azure Synapse GPU-large pools.
 
@@ -126,7 +126,7 @@ def get_dataset(rank=0, size=1):
 
 ## Define DNN model
 
-Once we have finished processing our dataset, we can now define our Tensorflow model. The same code could also be used to train a single-node Tensorflow model.
+Once we have finished processing our dataset, we can now define our TensorFlow model. The same code could also be used to train a single-node TensorFlow model.
 
 ```python
 # Define the TensorFlow model without any Horovod-specific parameters
@@ -153,7 +153,7 @@ def get_model():
 
 ## Define a training function for a single node
 
-First, we will train our Tensorflow model on the driver node of the Apache Spark pool. Once we have finished the training process, we will evaluate the model and print the loss and accuracy scores.
+First, we will train our TensorFlow model on the driver node of the Apache Spark pool. Once we have finished the training process, we will evaluate the model and print the loss and accuracy scores.
 
 ```python
 
@@ -346,4 +346,4 @@ To ensure the Spark instance is shut down, end any connected sessions(notebooks)
 ## Next steps
 
 * [Check out Synapse sample notebooks](https://github.com/Azure-Samples/Synapse/tree/main/MachineLearning)
-* [Learn more about GPU-enabled Apache Spark pools](../spark/apache-spark-gpu-concept.md)
+* [Learn more about GPU-enabled Apache Spark pools](../spark/apache-spark-gpu-concept.md)
