articles/machine-learning/concept-train-model-git-integration.md
57 additions & 54 deletions
@@ -9,7 +9,7 @@ ms.topic: conceptual
9
9
author: ositanachi
10
10
ms.author: osiotugo
11
11
ms.reviewer: larryfr
12
-
ms.date: 06/11/2024
12
+
ms.date: 06/12/2024
13
13
ms.custom: sdkv2, build-2023
14
14
---
15
15
# Git integration for Azure Machine Learning
@@ -18,13 +18,14 @@ ms.custom: sdkv2, build-2023
18
18
19
19
Azure Machine Learning fully supports Git repositories for tracking work. You can clone repositories directly onto your shared workspace file system, use Git on your local workstation, or use Git from a continuous integration and continuous deployment (CI/CD) pipeline.
20
20
21
-
When you submit an Azure Machine Learning training job that has source files from a local Git repository, information about the repo is tracked as part of the training job. Because it's tracked from the local Git repo, the Git information isn't tied to any specific central repository. Your repository can be cloned from any Git-compatible service, such as GitHub, GitLab, Bitbucket, or Azure DevOps.
21
+
When you submit an Azure Machine Learning training job that has source files from a local Git repository, information about the repo is tracked as part of the training job. Because the information comes from the local Git repo, it isn't tied to any specific central repository. Your repository can be cloned from any Git-compatible service, such as GitHub, GitLab, Bitbucket, or Azure DevOps.
22
22
23
23
> [!TIP]
24
24
> You can use Visual Studio Code to interact with Git through a graphical user interface. To connect to an Azure Machine Learning remote compute instance by using Visual Studio Code, see [Launch Visual Studio Code integrated with Azure Machine Learning (preview)](how-to-launch-vs-code-remote.md).
25
25
>
26
26
> For more information on Visual Studio Code version control features, see [Use Version Control in Visual Studio Code](https://code.visualstudio.com/docs/editor/versioncontrol) and [Work with GitHub in Visual Studio Code](https://code.visualstudio.com/docs/editor/github).
Azure Machine Learning provides a shared file system for all users in a workspace. The best way to clone a Git repository into this file share is to create a compute instance and [open a terminal](./how-to-access-terminal.md). In the terminal, you have access to a full Git client and can clone and work with Git by using the Git CLI. For more information about the Git CLI, see [Git CLI](https://git-scm.com/docs/gitcli).
@@ -33,90 +34,92 @@ You can clone any Git repository you can authenticate to, such as GitHub, Azure
33
34
34
35
There are some differences between cloning to the local file system of the compute instance or cloning to the shared file system, mounted as the *~/cloudfiles/code/* directory. In general, cloning to the local file system provides better performance than cloning to the mounted file system. However, if you delete and recreate the compute instance, the local file system is lost, while the mounted shared file system remains.
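To decide where a clone lives, it can help to check whether a path is on the mounted shared file system. The following Python code is a minimal sketch, assuming the share is mounted at *~/cloudfiles/code/* as described above. The helper name `is_on_shared_file_system` is hypothetical and isn't part of any Azure Machine Learning SDK.

```python
from pathlib import Path

# Assumption: the shared file system is mounted at ~/cloudfiles/code.
SHARED_MOUNT = Path.home() / "cloudfiles" / "code"


def is_on_shared_file_system(path, mount=SHARED_MOUNT):
    """Return True if `path` is the mount directory or is located inside it."""
    resolved = Path(path).expanduser().resolve()
    mount = Path(mount).expanduser().resolve()
    return resolved == mount or mount in resolved.parents


# A clone under the mount survives deleting and recreating the compute instance.
print(is_on_shared_file_system(SHARED_MOUNT / "my-repo"))
# A clone elsewhere in the home directory is on the local file system.
print(is_on_shared_file_system(Path.home() / "my-repo"))
```

The check only compares paths. It doesn't verify that the directory is actually mounted.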
35
36
36
-
## Clone Git repositories with SSH
37
+
## Clone a Git repository with SSH
37
38
38
-
You can clone a repo by using Secure Shell Protocol (SSH). The following sections describe how to clone a repo by using SSH. To use SSH, you need to authenticate your Git account with SSH by using an SSH key.
39
+
You can clone a repo by using Secure Shell (SSH) protocol. The following sections describe how to clone a repo by using SSH. To use SSH, you need to authenticate your Git account with SSH by using an SSH key.
39
40
40
41
### Generate and save a new SSH key
41
42
42
-
To generate a new SSH key:
43
+
In the Azure Machine Learning studio **Notebook** page, open a terminal window and run the following command, substituting your email address.
43
44
44
-
1. In the Azure Machine Learning studio **Notebook** page, open a terminal window and run the following command, substituting your email address.
The command returns the output `Generating public/private rsa key pair.` and generates a new SSH key with the provided email as a label.
49
+
The command returns the following output:
51
50
52
-
1. At the following prompt, make sure the location is `/home/azureuser/.ssh`, or specify that location, and then press Enter.
51
+
```bash
52
+
Generating public/private ed25519 key pair.
53
+
Enter file in which to save the key (/home/azureuser/.ssh/id_ed25519):
54
+
```
53
55
54
-
```bash
55
-
Enter a file in which to save the key (/home/azureuser/.ssh/id_rsa): [Press enter]
56
-
```
56
+
Make sure the location in the preceding output is `/home/azureuser/.ssh`, or change it to that location, and then press Enter.
57
57
58
-
The key file saves on the compute instance, and is accessible only to the compute instance owner.
58
+
It's best to add a passphrase to your SSH key for added security. At the following prompts, enter a secure passphrase.
59
59
60
-
1. It's best to add a passphrase to your SSH key for added security. At the following prompt, enter a secure passphrase.
60
+
```bash
61
+
Enter passphrase (empty for no passphrase):
62
+
Enter same passphrase again:
63
+
```
61
64
62
-
```bash
63
-
> Enter passphrase (empty for no passphrase): [Type a passphrase]
64
-
> Enter same passphrase again: [Type passphrase again]
65
-
```
65
+
When you press Enter, the `ssh-keygen` command generates a new SSH key with the provided email address as a label. The key file is saved on the compute instance and is accessible only to the compute instance owner.
66
66
67
67
### Add the public key to your Git account
68
68
69
-
1. In your terminal window, run the following command to copy the contents of your public key file. If you renamed the key, replace `id_rsa.pub` with the public key file name.
69
+
In your terminal window, run the following command to display the contents of your public key file. If you renamed the key, replace `id_ed25519.pub` with the public key file name.
70
70
71
-
```bash
72
-
cat ~/.ssh/id_rsa.pub
73
-
```
71
+
```bash
72
+
cat ~/.ssh/id_ed25519.pub
73
+
```
74
74
75
-
1. To add the SSH key to your Git account, refer to the following instructions depending on your Git service:
Git clones the repo and sets up the origin remote to connect with SSH for future Git commands.
101
-
102
-
#### Verify fingerprint
102
+
1. SSH might display the server's SSH fingerprint and ask you to verify it, as in the following example.
103
103
104
-
SSH might display the server's SSH fingerprint and ask you to verify it, as in the following example.
104
+
```bash
105
+
The authenticity of host 'github.com (000.00.000.0)' can't be established.
106
+
ECDSA key fingerprint is SHA256:0000000000000000000/00000000/00000000.
107
+
Are you sure you want to continue connecting (yes/no/[fingerprint])?
108
+
```
105
109
106
-
```bash
107
-
The authenticity of host 'example.com (000.00.255.112)' can't be established.
108
-
RSA key fingerprint is SHA256:000000000000000000000000000000000.
109
-
Are you sure you want to continue connecting (yes/no)? yes
110
-
Warning: Permanently added 'github.com,000.00.255.112' (RSA) to the list of known hosts.
111
-
```
110
+
SSH displays this fingerprint when it connects to an unknown host to protect you from [man-in-the-middle attacks](/previous-versions/windows/it-pro/windows-2000-server/cc959354(v=technet.10)#man-in-the-middle-attack). You should verify that the displayed fingerprint matches one of the fingerprints in the SSH public keys page, and then respond *yes*.
112
111
113
-
SSH displays this fingerprint when it connects to an unknown host to protect you from [man-in-the-middle attacks](/previous-versions/windows/it-pro/windows-2000-server/cc959354(v=technet.10)#man-in-the-middle-attack). You should verify that the displayed fingerprint matches one of the fingerprints in the SSH public keys page.
112
+
1. SSH displays a response like the following example:
114
113
115
-
When you're asked if you want to continue connecting, enter *yes*. Once you accept the host's fingerprint, SSH doesn't prompt you again unless the fingerprint changes.
114
+
```bash
115
+
Warning: Permanently added 'github.com,000.00.000.0' (ECDSA) to the list of known hosts.
116
+
Enter passphrase for key '/home/azureuser/.ssh/id_ed25519':
117
+
```
118
+
1. Enter your passphrase. Git clones the repo and sets up the origin remote to connect with SSH for future Git commands. Once you accept the host's fingerprint, SSH doesn't prompt you again unless the fingerprint changes.
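The fingerprint comparison in the preceding steps can be cross-checked locally when you have a public key in the usual `<type> <base64-blob> [comment]` format. The following Python code is a minimal sketch of how an OpenSSH-style SHA256 fingerprint is derived: it hashes the decoded key blob and prints the digest as unpadded Base64. The function name and the sample key line are hypothetical placeholders, not real keys.

```python
import base64
import hashlib


def sha256_fingerprint(public_key_line):
    """Compute an OpenSSH-style SHA256 fingerprint.

    Expects a line in the "<type> <base64-blob> [comment]" format,
    such as the contents of a .pub file.
    """
    fields = public_key_line.split()
    key_blob = base64.b64decode(fields[1])
    digest = hashlib.sha256(key_blob).digest()
    # OpenSSH displays the digest as Base64 without trailing padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# Placeholder key material for illustration only. This is not a valid key.
sample_blob = base64.b64encode(b"placeholder key bytes").decode("ascii")
sample_line = "ssh-ed25519 " + sample_blob + " user@example.com"
print(sha256_fingerprint(sample_line))
```

Compare the result with the fingerprint that your Git service publishes. Treat a mismatch as a reason to stop, not as something to work around.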
116
119
117
120
## Track code that comes from Git repositories
118
121
119
-
When you submit a training job from the Python SDK or Machine Learning CLI, the files needed to train the model are uploaded to your workspace. If the `git` command is available on your development environment, the upload process checks if the files are stored in a Git repository, and uploads any Git repository information as part of the training job.
122
+
When you submit a training job from the Python SDK or Machine Learning CLI, the files needed to train the model are uploaded to your workspace. If the `git` command is available on your development environment, the upload process checks if the source files are stored in a Git repository, and if so, uploads any Git repository information as part of the training job.
120
123
121
124
The following information is sent for jobs that use an estimator, machine learning pipeline, or script run. The information is stored in the following training job properties:
122
125
@@ -130,7 +133,7 @@ The following information is sent for jobs that use an estimator, machine learni
130
133
If the `git` command isn't available on your development environment, or your training files aren't located in a Git repository, no Git-related information is tracked.
131
134
132
135
> [!TIP]
133
-
> To check if the `git` command is available on your development environment, run the `git --version` command in a command line interface. If Git is installed and in your path, you receive a response similar to `git version 2.4.1`. For information on installing Git on your development environment, see the [Git website](https://git-scm.com/).
136
+
> To check if the `git` command is available on your development environment, run the `git --version` command in a command line interface. If Git is installed and in your path, you receive a response similar to `git version 2.43.0`. For information on installing Git on your development environment, see the [Git website](https://git-scm.com/).
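The same two conditions can be checked from Python before you submit a job. The following code is a minimal sketch that assumes only the standard library. The function names `git_version` and `is_in_git_repo` are hypothetical helpers, and the checks approximate, rather than reproduce, what the upload process does.

```python
import shutil
import subprocess


def git_version():
    """Return the output of `git --version`, or None if git isn't on the PATH."""
    git_path = shutil.which("git")
    if git_path is None:
        return None
    result = subprocess.run(
        [git_path, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() if result.returncode == 0 else None


def is_in_git_repo(path="."):
    """Return True if `path` is inside a Git working tree."""
    git_path = shutil.which("git")
    if git_path is None:
        return False
    result = subprocess.run(
        [git_path, "rev-parse", "--is-inside-work-tree"],
        cwd=path, capture_output=True, text=True, check=False,
    )
    return result.returncode == 0 and result.stdout.strip() == "true"


print(git_version())
print(is_in_git_repo("."))
```

If either check fails, expect no Git-related information on the submitted job.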
134
137
135
138
## View Git information
136
139
@@ -147,7 +150,7 @@ In your Azure Machine Learning workspace in Azure Machine Learning studio:
147
150
1. Expand **logs** > **azureml**.
148
151
1. Select the link that begins with **###_azure**.
149
152
150
-
The logged information contains text similar to the following JSON code:
153
+
The logged information contains JSON code similar to the following example:
151
154
152
155
```json
153
156
"properties": {
@@ -168,23 +171,23 @@ The logged information contains text similar to the following JSON code:
168
171
169
172
### Python SDK V2
170
173
171
-
After you submit a training run, a [Job](/python/api/azure-ai-ml/azure.ai.ml.entities.job) object is returned. The `properties` attribute of this object contains the logged Git information. For example, the following code retrieves the commit hash:
174
+
After you submit a training run, a [Job](/python/api/azure-ai-ml/azure.ai.ml.entities.job) object is returned. The `properties` attribute of this object contains the logged Git information. For example, you can run the following command to retrieve the commit hash:
172
175
173
176
```python
174
177
job.properties["azureml.git.commit"]
175
178
```
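To see all the logged Git information at once, you can filter the properties by key prefix. The following code is a minimal sketch that assumes `job.properties` behaves like a dictionary with string keys and that the Git-related keys share the `azureml.git.` prefix, as `azureml.git.commit` does. The function name `git_properties` and the sample values are hypothetical, and a plain dictionary stands in for a real job.

```python
def git_properties(properties):
    """Return only the entries whose keys start with 'azureml.git.'."""
    prefix = "azureml.git."
    return {
        key: value for key, value in properties.items() if key.startswith(prefix)
    }


# Stand-in for `job.properties`. The values are placeholders, not real output.
example_properties = {
    "azureml.git.commit": "0000000",
    "example.other.property": "value",
}

print(git_properties(example_properties))
```

With a real job, pass `job.properties` instead of the sample dictionary.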
176
179
177
180
### Azure CLI V2
178
181
179
-
Run the `az ml job show` command to display the `GitCommit` property. For example:
182
+
Run the `az ml job show` command with the `--query` argument to display the Git information. For example, the following query retrieves the `GitCommit` property:
180
183
181
184
```azurecli
182
-
az ml job show --name my_job_id --query "{GitCommit:properties.azureml.git.commit}"
185
+
az ml job show --name my-job-id --query "{GitCommit:properties.azureml.git.commit}" --resource-group my-resource-group --workspace-name my-workspace
183
186
```
184
187
185
188
## Related content
186
189
187
190
- [Access a compute instance terminal in your workspace](how-to-access-terminal.md)
188
191
- [Launch Visual Studio Code integrated with Azure Machine Learning (preview)](how-to-launch-vs-code-remote.md)
189
-
- [Use Version Control in VS Code](https://code.visualstudio.com/docs/editor/versioncontrol)
190
-
- [Work with GitHub in VS Code](https://code.visualstudio.com/docs/editor/github)
192
+
- [Use Version Control in Visual Studio Code](https://code.visualstudio.com/docs/editor/versioncontrol)
193
+
- [Work with GitHub in Visual Studio Code](https://code.visualstudio.com/docs/editor/github)