Commit 0ed2f91

Merge pull request #222751 from ishinzhang/patch-1
make minor changes to profiling doc
2 parents 7444b4e + f2408a8 commit 0ed2f91

1 file changed: +2 -5 lines changed

articles/machine-learning/how-to-use-pipeline-ui.md

Lines changed: 2 additions & 5 deletions
@@ -246,12 +246,9 @@ Status and definitions:
 |------|--------------|-------------|----------|
 | Not started | Job is submitted from client side and accepted in Azure ML services. Time spent in this stage is mainly in Azure ML service scheduling and preprocessing. | If there's no backend service issue, this time should be very short.| Open support case via Azure portal. |
 |Preparing | In this status, job is pending for some preparation on job dependencies, for example, environment image building.| If you're using curated or registered custom environment, this time should be very short. | Check image building log. |
-|Inqueue | Job is pending for compute resource allocation. Time spent in this stage is mainly depending on the status of your compute cluster or job yield policy for scope job.| If you're using a cluster with enough compute resource, this time should be short. | Check with workspace admin whether to increase the max nodes of the target compute or change the job to another less busy compute. |
+|Inqueue | Job is pending for compute resource allocation. Time spent in this stage is mainly depending on the status of your compute cluster.| If you're using a cluster with enough compute resource, this time should be short. | Check with workspace admin whether to increase the max nodes of the target compute or change the job to another less busy compute. |
 |Running | Job is executing on remote compute. Time spent in this stage is mainly in two parts: <br> Runtime preparation: image pulling, docker starting and data preparation (mount or download). <br> User script execution. | This status is expected to be most time consuming one. | 1. Go to the source code check if there's any user error. <br> 2. View the monitoring tab of compute metrics (CPU, memory, networking etc.) to identify the bottleneck. <br> 3. Try online debug with [interactive endpoints](how-to-interactive-jobs.md) if the job is running or locally debug of your code. |
-| Finalizing | Job is in post processing after execution complete. Time spent in this stage is mainly for some post processes like: output uploading, metric/logs uploading and resources clean up.| It will be short for command job. However, might be very long for PRS/MPI job because for a distributed job, the finalizing status is from the first node starting finalizing to the last node done finalizing. | Change your step run output mode from upload to mount if you find unexpected long finalizing time, or open support case via Azure portal. |
-
-
-Along with the profiling, you can also use the *Output + logs* (on the details page), the Common Runtime enabled monitoring metric for PRS/MPI jobs.
+| Finalizing | Job is in post processing after execution complete. Time spent in this stage is mainly for some post processes like: output uploading, metric/logs uploading and resources clean up.| It will be short for command job. However, might be very long for PRS/MPI job because for a distributed job, the finalizing status is from the first node starting finalizing to the last node done finalizing. | Change your step job output mode from upload to mount if you find unexpected long finalizing time, or open support case via Azure portal. |
 
 ### Different view of Gantt chart
 