articles/machine-learning/how-to-interactive-jobs.md
Interactive training is supported on **Azure Machine Learning Compute Clusters**.
## Prerequisites
- Review [getting started with training on Azure Machine Learning](./how-to-train-model.md).
- To use this feature in Azure Machine Learning Studio, enable the "Debug & monitor your training jobs" flight via the [preview panel](./how-to-enable-preview-features.md#how-do-i-enable-preview-features).
- To use **VS Code**, [follow this guide](how-to-setup-vs-code.md) to set up the Azure Machine Learning extension.
- Make sure your job environment has the `openssh-server` and `ipykernel ~=6.0` packages installed (all Azure Machine Learning curated training environments have these packages installed by default).
- Interactive applications can't be enabled on distributed training runs where the distribution type is anything other than PyTorch, TensorFlow, or MPI. Custom distributed training setups (configuring multi-node training without using the above distribution frameworks) aren't currently supported.
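All curated training environments already ship `openssh-server` and `ipykernel`. If you bring a custom image instead, a hypothetical Dockerfile sketch of adding them (the base image tag below is a placeholder, not a value from this article):

```dockerfile
# Hypothetical sketch: add the packages interactive jobs need
# to a custom training image (base image is a placeholder).
FROM mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest

# openssh-server is a system package, installed via apt
RUN apt-get update \
    && apt-get install -y --no-install-recommends openssh-server \
    && rm -rf /var/lib/apt/lists/*

# ipykernel is a Python package, pinned to the 6.x series
RUN pip install 'ipykernel~=6.0'
```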
By specifying interactive applications at job creation, you can connect directly to the node running your job.
6. Review and create the job.
If you don't see the above options, make sure you have enabled the "Debug & monitor your training jobs" flight via the [preview panel](./how-to-enable-preview-features.md#how-do-i-enable-preview-features).
# [Python SDK](#tab/python)
1. Define the interactive services you want to use for your job. Make sure to replace `your compute name` with your own value. If you want to use your own custom environment, follow the examples in [this tutorial](how-to-manage-environments-v2.md) to create a custom environment.
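   As a concrete illustration of this step, a minimal sketch assuming the `azure-ai-ml` v2 SDK (the service names, source folder, environment, and compute name below are placeholders, not values from this article):

   ```python
   # Sketch, assuming the azure-ai-ml v2 SDK; all names are placeholders.
   from azure.ai.ml import command
   from azure.ai.ml.entities import JobService

   # Build a command job with interactive services attached.
   job = command(
       code="./src",                   # local folder containing main.py
       command="python main.py",
       environment="AzureML-tensorflow-2.7-ubuntu20.04-py38-cuda11-gpu@latest",
       compute="<your compute name>",  # replace with your own value
       services={
           "my_jupyterlab": JobService(job_service_type="jupyter_lab"),
           "my_vscode": JobService(job_service_type="vs_code"),
       },
   )
   ```

   Submitting the job (for example with `ml_client.jobs.create_or_update(job)`) then provisions the listed applications on the job's compute node.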
Clicking the applications in the panel opens a new tab for the applications.
:::image type="content" source="media/interactive-jobs/interactive-jobs-right-panel.png" alt-text="Screenshot of interactive jobs right panel information. Information content varies depending on the user's data":::
It might take a few minutes to start the job and the training applications specified during job creation. If you don't see the above options, make sure you have enabled the "Debug & monitor your training jobs" flight via the [preview panel](./how-to-enable-preview-features.md#how-do-i-enable-preview-features).
# [Python SDK](#tab/python)
- Once the job is submitted, you can use `ml_client.jobs.show_services("<job name>", <compute node index>)` to view the interactive service endpoints.
You can access the applications only when they are in **Running** status, and only the job owner is authorized to access the applications.
### Interact with the applications
When you click on the endpoints to interact with your job, you're taken to the user container under your working directory, where you can access your code, inputs, outputs, and logs. If you run into any issues while connecting to the applications, you can find the interactive capability and application logs in **system_logs->interactive_capability** under the **Outputs + logs** tab.
:::image type="content" source="./media/interactive-jobs/interactive-logs.png" alt-text="Screenshot of interactive jobs interactive logs panel location.":::
- You can open a terminal from Jupyter Lab and start interacting within the job container. You can also directly iterate on your training script with Jupyter Lab.
- If you have logged TensorFlow events for your job, you can use TensorBoard to monitor the metrics while your job is running.
:::image type="content" source="./media/interactive-jobs/tensorboard-open.png" alt-text="Screenshot of interactive jobs TensorBoard panel when first opened. This information varies depending on customer data":::
If you don't see the above options, make sure you have enabled the "Debug & monitor your training jobs" flight via the [preview panel](./how-to-enable-preview-features.md#how-do-i-enable-preview-features).
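For jobs submitted through the CLI, the TensorBoard service and the directory it reads events from are typically declared in the job YAML. A hypothetical fragment (the service name and log path below are placeholders, not values from this article):

```yaml
services:
  my_tensorboard:
    type: tensor_board
    log_dir: "outputs/tblogs"   # directory your training code writes TF events to
```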
### End job
Once you're done with interactive training, go to the job details page to cancel the job, which releases the compute resource. Alternatively, use `az ml job cancel -n <your job name>` in the CLI or `ml_client.jobs.cancel("<job name>")` in the SDK.