Commit 79ac4b2

Merge pull request #6333 from s-polly/main
Freshness on tensorboard article
2 parents 08c95ba + dce08dc commit 79ac4b2

1 file changed: +31 -32 lines changed

articles/machine-learning/v1/how-to-monitor-tensorboard.md

Lines changed: 31 additions & 32 deletions
@@ -5,15 +5,14 @@ description: Launch TensorBoard to visualize experiment job histories and identi
 services: machine-learning
 ms.service: azure-machine-learning
 ms.subservice: mlops
-ms.author: amipatel
-author: amibp
+ms.author: scottpolly
+author: s-polly
 ms.reviewer: ssalgado
-ms.date: 10/21/2021
+ms.date: 07/31/2025
 ms.topic: how-to
 ms.custom: UpdateFrequency5, sdkv1
 ---

-[//]: # (needs PM review; Do URL Links names change if it includes 'Run')

 # Visualize experiment jobs and metrics with TensorBoard and Azure Machine Learning

@@ -26,37 +25,36 @@ In this article, you learn how to view your experiment jobs and metrics in Tenso
 [TensorBoard](/python/api/azureml-tensorboard/azureml.tensorboard) is a suite of web applications for inspecting and understanding your experiment structure and performance.

 How you launch TensorBoard with Azure Machine Learning experiments depends on the type of experiment:
-+ If your experiment natively outputs log files that are consumable by TensorBoard, such as PyTorch, Chainer and TensorFlow experiments, then you can [launch TensorBoard directly](#launch-tensorboard) from experiment's job history.
++ If your experiment natively outputs log files that are consumable by TensorBoard, such as PyTorch, Chainer, and TensorFlow experiments, then you can [launch TensorBoard directly](#launch-tensorboard) from the experiment's job history.

-+ For experiments that don't natively output TensorBoard consumable files, such as like Scikit-learn or Azure Machine Learning experiments, use [the `export_to_tensorboard()` method](#option-2-export-history-as-log-to-view-in-tensorboard) to export the job histories as TensorBoard logs and launch TensorBoard from there.
++ For experiments that don't natively output TensorBoard-consumable files, such as Scikit-learn or Azure Machine Learning experiments, use [the `export_to_tensorboard()` method](#option-2-export-history-as-log-to-view-in-tensorboard) to export the job histories as TensorBoard logs and launch TensorBoard from there.

 > [!TIP]
-> The information in this document is primarily for data scientists and developers who want to monitor the model training process. If you are an administrator interested in monitoring resource usage and events from Azure Machine Learning, such as quotas, completed training jobs, or completed model deployments, see [Monitoring Azure Machine Learning](../monitor-azure-machine-learning.md).
+> The information in this document is primarily for data scientists and developers who want to monitor the model training process. If you're an administrator interested in monitoring resource usage and events from Azure Machine Learning, such as quotas, completed training jobs, or completed model deployments, see [Monitoring Azure Machine Learning](../monitor-azure-machine-learning.md).

 ## Prerequisites

-* To launch TensorBoard and view your experiment job histories, your experiments need to have previously enabled logging to track its metrics and performance.
+* To launch TensorBoard and view your experiment job histories, your experiments need to have previously enabled logging to track their metrics and performance.
 * The code in this document can be run in either of the following environments:
     * Azure Machine Learning compute instance - no downloads or installation necessary
-        * Complete [Create resources to get started](../quickstart-create-resources.md) to create a dedicated notebook server pre-loaded with the SDK and the sample repository.
-        * In the samples folder on the notebook server, find two completed and expanded notebooks by navigating to these directories:
+        * Complete [Create resources to get started](../quickstart-create-resources.md) to create a dedicated notebook server preloaded with the SDK and the sample repository.
+        * In the samples folder on the notebook server, find two completed and expanded notebooks by navigating to these directories:
             * **SDK v1 > how-to-use-azureml > track-and-monitor-experiments > tensorboard > export-run-history-to-tensorboard > export-run-history-to-tensorboard.ipynb**
             * **SDK v1 > how-to-use-azureml > track-and-monitor-experiments > tensorboard > tensorboard > tensorboard.ipynb**
     * Your own Jupyter notebook server
-        * [Install the Azure Machine Learning SDK](/python/api/overview/azure/ml/install) with the `tensorboard` extra
+        * [Install the Azure Machine Learning SDK](/python/api/overview/azure/ml/install) with the `tensorboard` extra
         * [Create an Azure Machine Learning workspace](../quickstart-create-resources.md).
         * [Create a workspace configuration file](how-to-configure-environment.md).

 ## Option 1: Directly view job history in TensorBoard

-This option works for experiments that natively outputs log files consumable by TensorBoard, such as PyTorch, Chainer, and TensorFlow experiments. If that is not the case of your experiment, use [the `export_to_tensorboard()` method](#option-2-export-history-as-log-to-view-in-tensorboard) instead.
+This option works for experiments that natively output log files consumable by TensorBoard, such as PyTorch, Chainer, and TensorFlow experiments. If that isn't the case for your experiment, use [the `export_to_tensorboard()` method](#option-2-export-history-as-log-to-view-in-tensorboard) instead.

-The following example code uses the [MNIST demo experiment](https://raw.githubusercontent.com/tensorflow/tensorflow/r1.8/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py) from TensorFlow's repository in a remote compute target, Azure Machine Learning Compute. Next, we will configure and start a job for training the TensorFlow model, and then
-start TensorBoard against this TensorFlow experiment.
+The following example code uses the [MNIST demo experiment](https://raw.githubusercontent.com/tensorflow/tensorflow/r1.8/tensorflow/examples/tutorials/mnist/mnist_with_summaries.py) from TensorFlow's repository in a remote compute target, Azure Machine Learning Compute. Next, we configure and start a job for training the TensorFlow model, and then start TensorBoard against this TensorFlow experiment.

 ### Set experiment name and create project folder

-Here we name the experiment and create its folder.
+Here we name the experiment and create its folder.

 ```python
 from os import path, makedirs
@@ -72,7 +70,7 @@ if not path.exists(exp_dir):

 ### Download TensorFlow demo experiment code

-TensorFlow's repository has an MNIST demo with extensive TensorBoard instrumentation. We do not, nor need to, alter any of this demo's code for it to work with Azure Machine Learning. In the following code, we download the MNIST code and save it in our newly created experiment folder.
+TensorFlow's repository has an MNIST demo with extensive TensorBoard instrumentation. We don't need to alter any of this demo's code for it to work with Azure Machine Learning. In the following code, we download the MNIST code and save it in our newly created experiment folder.

 ```python
 import requests
@@ -82,13 +80,13 @@ tf_code = requests.get("https://raw.githubusercontent.com/tensorflow/tensorflow/
 with open(os.path.join(exp_dir, "mnist_with_summaries.py"), "w") as file:
     file.write(tf_code.text)
 ```
-Throughout the MNIST code file, mnist_with_summaries.py, notice that there are lines that call `tf.summary.scalar()`, `tf.summary.histogram()`, `tf.summary.FileWriter()` etc. These methods group, log, and tag key metrics of your experiments into job history. The `tf.summary.FileWriter()` is especially important as it serializes the data from your logged experiment metrics, which allows for TensorBoard to generate visualizations off of them.
+Throughout the MNIST code file, mnist_with_summaries.py, notice that there are lines that call `tf.summary.scalar()`, `tf.summary.histogram()`, `tf.summary.FileWriter()` etc. These methods group, log, and tag key metrics of your experiments into job history. The `tf.summary.FileWriter()` is especially important as it serializes the data from your logged experiment metrics, which allows TensorBoard to generate visualizations from them.

 ### Configure experiment

-In the following, we configure our experiment and set up directories for logs and data. These logs will be uploaded to the job history, which TensorBoard accesses later.
+In the following, we configure our experiment and set up directories for logs and data. These logs are uploaded to the job history, which TensorBoard accesses later.

-> [!Note]
+> [!NOTE]
 > For this TensorFlow example, you will need to install TensorFlow on your local machine. Further, the TensorBoard module (that is, the one included with TensorFlow) must be accessible to this notebook's kernel, as the local machine is what runs TensorBoard.

 ```Python
@@ -116,7 +114,8 @@ exp = Experiment(ws, experiment_name)
 ```

 ### Create a cluster for your experiment
-We create an AmlCompute cluster for this experiment, however your experiments can be created in any environment and you are still able to launch TensorBoard against the experiment job history.
+
+We create an AmlCompute cluster for this experiment, however your experiments can be created in any environment and you can still launch TensorBoard against the experiment job history.

 ```Python
 from azureml.core.compute import ComputeTarget, AmlCompute
@@ -164,14 +163,14 @@ run = exp.submit(src)

 ### Launch TensorBoard

-You can launch TensorBoard during your run or after it completes. In the following, we create a TensorBoard object instance, `tb`, that takes the experiment job history loaded in the `job`, and then launches TensorBoard with the `start()` method.
+You can launch TensorBoard during your run or after it completes. In the following, we create a TensorBoard object instance, `tb`, that takes the experiment job history loaded in the `run`, and then launch TensorBoard with the `start()` method.

-The [TensorBoard constructor](/python/api/azureml-tensorboard/azureml.tensorboard.tensorboard) takes an array of jobs, so be sure and pass it in as a single-element array.
+The [TensorBoard constructor](/python/api/azureml-tensorboard/azureml.tensorboard.tensorboard) takes an array of runs, so be sure to pass it in as a single-element array.

 ```python
 from azureml.tensorboard import Tensorboard

-tb = Tensorboard([job])
+tb = Tensorboard([run])

 # If successful, start() returns a string with the URI of the instance.
 tb.start()
@@ -180,7 +179,7 @@ tb.start()
 tb.stop()
 ```
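The single-element array requirement described for the `Tensorboard` constructor is easy to trip over when a variable may hold either one run or several. A minimal sketch, assuming a hypothetical helper named `as_run_list` (it is not part of the Azure Machine Learning SDK), that normalizes either shape into a list before it is passed on:

```python
def as_run_list(run_or_runs):
    """Hypothetical helper: return a list whether given one run or a list/tuple of runs."""
    # Only list and tuple are treated as collections; any other object,
    # including a single run, is wrapped in a one-element list.
    if isinstance(run_or_runs, (list, tuple)):
        return list(run_or_runs)
    return [run_or_runs]


# Stand-in values are used here so the sketch runs without the SDK.
print(as_run_list("run-a"))
print(as_run_list(("run-a", "run-b")))
```

With the SDK available, the call in the article would then read `Tensorboard(as_run_list(run))`, which is equivalent to `Tensorboard([run])` for a single run.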

-> [!Note]
+> [!NOTE]
 > While this example used TensorFlow, TensorBoard can be used as easily with PyTorch or Chainer. TensorFlow must be available on the machine running TensorBoard, but is not necessary on the machine doing PyTorch or Chainer computations.


@@ -203,7 +202,7 @@ exp = Experiment(ws, experiment_name)
 root_run = exp.start_logging()
 ```

-Here we load the diabetes dataset-- a built-in small dataset that comes with scikit-learn, and split it into test and training sets.
+Here we load the diabetes dataset, a built-in small dataset that comes with scikit-learn, and split it into test and training sets.

 ```Python
 from sklearn.datasets import load_diabetes
@@ -245,9 +244,9 @@ for alpha in tqdm(alphas):

 ### Export jobs to TensorBoard

-With the SDK's [export_to_tensorboard()](/python/api/azureml-tensorboard/azureml.tensorboard.export) method, we can export the job history of our Azure machine learning experiment into TensorBoard logs, so we can view them via TensorBoard.
+With the SDK's [export_to_tensorboard()](/python/api/azureml-tensorboard/azureml.tensorboard.export) method, we can export the job history of our Azure Machine Learning experiment into TensorBoard logs, so we can view them via TensorBoard.

-In the following code, we create the folder `logdir` in our current working directory. This folder is where we will export our experiment job history and logs from `root_run` and then mark that job as completed.
+In the following code, we create the folder `logdir` in our current working directory. This folder is where we export our experiment job history and logs from `root_run` and then mark that job as completed.

 ```Python
 from azureml.tensorboard.export import export_to_tensorboard
@@ -267,7 +266,7 @@ export_to_tensorboard(root_run, logdir)
 root_run.complete()
 ```
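The export step writes into a local `logdir` folder in the current working directory. A minimal sketch of preparing that folder with the standard library only, assuming a hypothetical helper `prepare_logdir` and a hypothetical folder name `exportedTBlogs`; the `export_to_tensorboard(root_run, logdir)` call itself needs the SDK and isn't repeated here:

```python
from pathlib import Path


def prepare_logdir(name="exportedTBlogs", base=None):
    """Create the log folder under `base` (default: current working directory) if missing."""
    root = Path(base) if base is not None else Path.cwd()
    log_path = root / name
    # exist_ok=True makes the call safe to rerun, for example in a notebook cell.
    log_path.mkdir(parents=True, exist_ok=True)
    return log_path
```

The returned path could then be passed as the `logdir` argument, for example `logdir = str(prepare_logdir())`.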

-> [!Note]
+> [!NOTE]
 > You can also export a particular run to TensorBoard by specifying the name of the run `export_to_tensorboard(run_name, logdir)`

 ### Start and stop TensorBoard
@@ -276,22 +275,22 @@ Once our job history for this experiment is exported, we can launch TensorBoard
 ```Python
 from azureml.tensorboard import Tensorboard

-# The TensorBoard constructor takes an array of jobs, so be sure and pass it in as a single-element array here
+# The TensorBoard constructor takes an array of jobs, so be sure to pass it in as a single-element array here
 tb = Tensorboard([], local_root=logdir, port=6006)

 # If successful, start() returns a string with the URI of the instance.
 tb.start()
 ```

-When you're done, make sure to call the [stop()](/python/api/azureml-tensorboard/azureml.tensorboard.tensorboard#stop--) method of the TensorBoard object. Otherwise, TensorBoard will continue to run until you shut down the notebook kernel.
+When you're done, make sure to call the [stop()](/python/api/azureml-tensorboard/azureml.tensorboard.tensorboard#stop--) method of the TensorBoard object. Otherwise, TensorBoard continues to run until you shut down the notebook kernel.

 ```python
 tb.stop()
 ```
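Because a forgotten `stop()` leaves TensorBoard running until the kernel shuts down, one option is to pair `start()` and `stop()` in a context manager. A minimal sketch, assuming only that the object exposes `start()` and `stop()` as in the code above; the helper name `running` and the stand-in class `FakeBoard` are hypothetical and not part of the SDK:

```python
from contextlib import contextmanager


@contextmanager
def running(board):
    """Start `board`, yield whatever start() returns, and always call stop() on exit."""
    uri = board.start()
    try:
        yield uri
    finally:
        # Runs even if the body raises, so the server is not left behind.
        board.stop()


class FakeBoard:
    """Hypothetical stand-in so the sketch runs without the Azure Machine Learning SDK."""

    def __init__(self):
        self.active = False

    def start(self):
        self.active = True
        return "http://localhost:6006/"

    def stop(self):
        self.active = False


board = FakeBoard()
with running(board) as uri:
    print("TensorBoard at", uri, "active:", board.active)
print("active after block:", board.active)
```

With the real object, the same pattern would read `with running(Tensorboard([], local_root=logdir, port=6006)) as uri:`.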

 ## Next steps

-In this how-to you, created two experiments and learned how to launch TensorBoard against their job histories to identify areas for potential tuning and retraining.
+In this how-to, you created two experiments and learned how to launch TensorBoard against their job histories to identify areas for potential tuning and retraining.

-* If you are satisfied with your model, head over to our [How to deploy a model](how-to-deploy-and-where.md) article.
+* If you're satisfied with your model, head over to our [How to deploy a model](how-to-deploy-and-where.md) article.
 * Learn more about [hyperparameter tuning](../how-to-tune-hyperparameters.md).