 nodes="all" # For distributed jobs, use the `nodes` property to pick which node you want to enable interactive services on. If `nodes` are not selected, by default, interactive applications are only enabled on the head node. Values are "all", or compute node index (for ex. "0", "1" etc.)
 ),
-"My_vscode": JobService(
-job_service_type="vs_code",
+"My_vscode": VsCodeJobService(
 nodes="all"
 ),
-"My_tensorboard": JobService(
-job_service_type="tensor_board",
+"My_tensorboard": TensorBoardJobService(
 nodes="all",
-properties={
-"logDir": "output/tblogs" # relative path of Tensorboard logs (same as in your training script)
+logDir="output/tblogs" # relative path of Tensorboard logs (same as in your training script)
 }
 ),
-"My_ssh": JobService(
-job_service_type="ssh",
-sshPublicKeys="<add-public-key>",
+"My_ssh": SshJobService(
+ssh_Public_Keys="<add-public-key>",
 nodes="all"
-properties={
-"sshPublicKeys":"<add-public-key>"
 }
 ),
 }
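The `nodes` comment in this hunk allows either "all" or a compute node index, and the diff names four `job_service_type` strings (`vs_code`, `tensor_board`, `jupyter_lab`, `ssh`). As a rough illustration only, a local pre-check of a services mapping could look like the sketch below. It uses plain dictionaries rather than the SDK classes, and `check_services`, `is_valid_nodes`, and the sample service names are hypothetical helpers, not part of any Azure package.

```python
# Hypothetical local pre-check; this is NOT an Azure SDK or CLI API.
# The service type strings are the ones that appear in this diff.
KNOWN_SERVICE_TYPES = {"vs_code", "tensor_board", "jupyter_lab", "ssh"}


def is_valid_nodes(value):
    """Return True for "all" or a non-negative node index such as "0" or "1"."""
    if value == "all":
        return True
    return isinstance(value, str) and value.isascii() and value.isdigit()


def check_services(services):
    """Collect human-readable problems found in a services mapping."""
    problems = []
    for name, spec in services.items():
        service_type = spec.get("job_service_type")
        if service_type not in KNOWN_SERVICE_TYPES:
            problems.append(f"{name}: unknown job_service_type {service_type!r}")
        # Per the comment in the diff, omitting `nodes` means head node only,
        # so a missing value is accepted here.
        nodes = spec.get("nodes")
        if nodes is not None and not is_valid_nodes(nodes):
            problems.append(
                f"{name}: nodes must be 'all' or a node index, got {nodes!r}"
            )
    return problems


# Sample mapping (names and values are made up for illustration).
services = {
    "my_vs_code": {"job_service_type": "vs_code", "nodes": "all"},
    "my_tensor_board": {
        "job_service_type": "tensor_board",
        "nodes": "0",
        "log_dir": "output/tblogs",
    },
    "my_ssh": {"job_service_type": "ssh", "nodes": "first"},
}

for problem in check_services(services):
    print(problem)
```

Running the check before submission only catches the two mistakes it knows about (an unrecognized service type and a malformed `nodes` value); the service itself remains the authority on what is accepted.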
@@ -131,43 +124,43 @@ If you don't see the above options, make sure you have enabled the "Debug & moni
 
 # [Azure CLI](#tab/azurecli)
 
-1. 1. Create a job yaml `job.yaml` with below sample content. Make sure to replace `your compute name` with your own value. If you want to use custom environment, follow the examples in [this tutorial](how-to-manage-environments-v2.md) to create a custom environment.
+1. Create a job yaml `job.yaml` with below sample content. Make sure to replace `your compute name` with your own value. If you want to use custom environment, follow the examples in [this tutorial](how-to-manage-environments-v2.md) to create a custom environment.
 ```dotnetcli
-code: src
-command:
-python train.py
-# you can add a command like "sleep 1h" to reserve the compute resource is reserved after the script finishes running.
-nodes: all # For distributed jobs, use the `nodes` property to pick which node you want to enable interactive services on. If `nodes` are not selected, by default, interactive applications are only enabled on the head node. Values are "all", or compute node index (for ex. "0", "1" etc.)
-my_tensor_board:
-job_service_type: tensor_board
-log_dir: "output/tblogs" # relative path of Tensorboard logs (same as in your training script)
-nodes: all
-my_jupyter_lab:
-job_service_type: jupyter_lab
-nodes: all
-my_ssh:
-job_service_type: ssh
-ssh_public_keys: <paste the entire pub key content>
-nodes: all
-```
-
-The `services` section specifies the training applications you want to interact with.
-
-You can put `sleep <specific time>` at the end of the command to specify the amount of time you want to reserve the compute resource. The format follows:
-* sleep 1s
-* sleep 1m
-* sleep 1h
-* sleep 1d
-
-You can also use the `sleep infinity` command that would keep the job alive indefinitely.
+code: src
+command:
+python train.py
+# you can add a command like "sleep 1h" to reserve the compute resource is reserved after the script finishes running.
+nodes: all # For distributed jobs, use the `nodes` property to pick which node you want to enable interactive services on. If `nodes` are not selected, by default, interactive applications are only enabled on the head node. Values are "all", or compute node index (for ex. "0", "1" etc.)
+my_tensor_board:
+job_service_type: tensor_board
+log_dir: "output/tblogs" # relative path of Tensorboard logs (same as in your training script)
+nodes: all
+my_jupyter_lab:
+job_service_type: jupyter_lab
+nodes: all
+my_ssh:
+job_service_type: ssh
+ssh_public_keys: <paste the entire pub key content>
+nodes: all
+```
+
+The `services` section specifies the training applications you want to interact with.
+
+You can put `sleep <specific time>` at the end of the command to specify the amount of time you want to reserve the compute resource. The format follows:
+* sleep 1s
+* sleep 1m
+* sleep 1h
+* sleep 1d
+
+You can also use the `sleep infinity` command that would keep the job alive indefinitely.
 
-> [!NOTE]
-> If you use `sleep infinity`, you will need to manually [cancel the job](./how-to-interactive-jobs.md#end-job) to let go of the compute resource (and stop billing).
+> [!NOTE]
+> If you use `sleep infinity`, you will need to manually [cancel the job](./how-to-interactive-jobs.md#end-job) to let go of the compute resource (and stop billing).
 
 2. Run command `az ml job create --file <path to your job yaml file> --workspace-name <your workspace name> --resource-group <your resource group name> --subscription <sub-id>` to submit your training job. For more details on running a job via CLIv2, check out this [article](./how-to-train-model.md).
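Each flag in the step 2 command is a separate argument followed by its value. One way to avoid run-together flags is to assemble the command as an argument list, as in the minimal sketch below. The helper name `build_submit_command` and the sample values are hypothetical; only the four flags quoted in step 2 are used, and the sketch prints the command rather than invoking the Azure CLI.

```python
import shlex


def build_submit_command(job_file, workspace, resource_group, subscription):
    """Return the `az ml job create` call from step 2 as an argument list."""
    return [
        "az", "ml", "job", "create",
        "--file", job_file,
        "--workspace-name", workspace,
        "--resource-group", resource_group,
        "--subscription", subscription,
    ]


# Placeholder values; substitute your own workspace details.
argv = build_submit_command("job.yaml", "my-workspace", "my-rg", "my-sub-id")

# shlex.join quotes any argument that needs it, so the printed line can be
# pasted into a POSIX shell as-is.
print(shlex.join(argv))
```

Keeping the arguments in a list also means the same value can be handed to `subprocess.run` without going through a shell, which sidesteps quoting problems in paths that contain spaces.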
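The `sleep` durations listed in this hunk use the suffixes `s`, `m`, `h`, and `d`, with `sleep infinity` as the open-ended case. To estimate how long the compute stays reserved, the durations can be converted to seconds, as in this small sketch. `sleep_seconds` is a hypothetical helper that handles only the single-suffix forms shown in the list.

```python
import math
import re

# Seconds per unit for the suffixes shown in the list: s, m, h, d.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_SLEEP_PATTERN = re.compile(r"sleep (\d+)([smhd])")


def sleep_seconds(command):
    """Convert a command such as "sleep 1h" into a number of seconds.

    "sleep infinity" maps to math.inf, since the job then stays alive
    until it is cancelled manually.
    """
    text = command.strip()
    if text == "sleep infinity":
        return math.inf
    match = _SLEEP_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported sleep command: {command!r}")
    count, unit = match.groups()
    return int(count) * _UNIT_SECONDS[unit]


for example in ("sleep 1s", "sleep 1m", "sleep 1h", "sleep 1d", "sleep infinity"):
    print(example, "->", sleep_seconds(example))
```

An infinite result is a reminder that the job keeps holding the compute resource (and billing continues) until someone cancels it.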
@@ -264,4 +257,4 @@ To submit a job with a debugger attached and the execution paused, you can use d
 
 ## Next steps
 
-+ Learn more about [how and where to deploy a model](./how-to-deploy-online-endpoints.md).
++ Learn more about [how and where to deploy a model](./how-to-deploy-online-endpoints.md).