articles/machine-learning/concept-train-model-git-integration.md
Lines changed: 35 additions & 47 deletions
Original file line number
Diff line number
Diff line change
@@ -18,40 +18,38 @@ ms.custom: sdkv2, build-2023
18
18
19
19
Azure Machine Learning fully supports Git repositories for tracking work. You can clone repositories directly onto your shared workspace file system, use Git on your local workstation, or use Git from a continuous integration and continuous deployment (CI/CD) pipeline.
20
20
21
-
When you submit a job to Azure Machine Learning, if source files are stored in a local git repository, information about the repo is tracked as part of the training process. Because Azure Machine Learning tracks the information from the local git repo, it isn't tied to any specific central repository. Your repository can be cloned from GitHub, GitLab, Bitbucket, Azure DevOps, or any other Git-compatible service.
21
+
When you submit an Azure Machine Learning training job that has source files from a local Git repository, information about the repo is tracked as part of the training job. Because it's tracked from the local Git repo, the Git information isn't tied to any specific central repository. Your repository can be cloned from any Git-compatible service, such as GitHub, GitLab, Bitbucket, or Azure DevOps.
22
22
23
23
> [!TIP]
24
24
> You can use Visual Studio Code to interact with Git through a graphical user interface. To connect to an Azure Machine Learning remote compute instance by using Visual Studio Code, see [Launch Visual Studio Code integrated with Azure Machine Learning (preview)](how-to-launch-vs-code-remote.md).
25
25
>
26
-
> For more information on Visual Studio Code version control features, see [Use Version Control in VS Code](https://code.visualstudio.com/docs/editor/versioncontrol) and [Work with GitHub in VS Code](https://code.visualstudio.com/docs/editor/github).
26
+
> For more information on Visual Studio Code version control features, see [Use Version Control in Visual Studio Code](https://code.visualstudio.com/docs/editor/versioncontrol) and [Work with GitHub in Visual Studio Code](https://code.visualstudio.com/docs/editor/github).
27
27
28
-
## Clone Git repositories in a workspace file system
28
+
## Git repositories in a workspace file system
29
29
30
-
Azure Machine Learning provides a shared file system for all users in a workspace. To clone a Git repository into this file share, you can create a compute instance and open a terminal. Once you open the terminal, you have access to a full Git client and can clone and work with Git via the Git CLI experience.
30
+
Azure Machine Learning provides a shared file system for all users in a workspace. The best way to clone a Git repository into this file share is to create a compute instance and [open a terminal](./how-to-access-terminal.md). In the terminal, you have access to a full Git client and can clone and work with Git by using the Git CLI. For more information about the Git CLI, see [Git CLI](https://git-scm.com/docs/gitcli).
31
31
32
-
You can clone any Git repository you can authenticate to, such as a GitHub, Azure Repos, or BitBucket repo. It's best to clone the repository into your user directory, so that other users don't collide directly on your working branch.
32
+
You can clone any Git repository you can authenticate to, such as GitHub, Azure Repos, or BitBucket repos. It's best to clone the repository into your user directory, so that other users don't collide directly on your working branch.
33
33
34
-
There's a performance difference between cloning to the local file system of the compute instance or cloning to the filesystem mounted as the *~/cloudfiles/code* directory. In general, cloning to the local filesystem provides better performance than cloning to the mounted filesystem. However, if you delete and recreate the compute instance, the local filesystem is lost, whereas the mounted filesystem is kept.
35
-
36
-
For more information about the Git CLI, see [Git CLI](https://git-scm.com/docs/gitcli).
34
+
There are some differences between cloning to the local file system of the compute instance or cloning to the shared file system, mounted as the *~/cloudfiles/code/* directory. In general, cloning to the local file system provides better performance than cloning to the mounted file system. However, if you delete and recreate the compute instance, the local file system is lost, while the mounted shared file system remains.
37
35
38
36
## Clone Git repositories with SSH
39
37
40
-
You can clone a repo by using HTTPS or SSH. The following sections describe how to clone a repo by using SSH. To use SSH, you need to authenticate your Git account with SSH by using an SSH key.
38
+
You can clone a repo by using Secure Shell Protocol (SSH). The following sections describe how to clone a repo by using SSH. To use SSH, you need to authenticate your Git account with SSH by using an SSH key.
41
39
42
40
### Generate and save a new SSH key
43
41
44
42
To generate a new SSH key:
45
43
46
-
1. In the Azure Machine Learning studio **Notebook** page, [open a terminal window](./how-to-access-terminal.md) and run the following command, substituting your email address.
44
+
1. In the Azure Machine Learning studio **Notebook** page, open a terminal window and run the following command, substituting your email address.
The command returns the output `Generating public/private rsa key pair.` and generates a new SSH key with the provided email as a label.
53
51
54
-
1. At the following prompt, make sure the default location is `/home/azureuser/.ssh` or specify that location, and then press Enter.
52
+
1. At the following prompt, make sure the location is `/home/azureuser/.ssh`, or specify that location, and then press Enter.
55
53
56
54
```bash
57
55
Enter a file in which to save the key (/home/azureuser/.ssh/id_rsa): [Press enter]
@@ -61,25 +59,25 @@ To generate a new SSH key:
61
59
62
60
1. It's best to add a passphrase to your SSH key for added security. At the following prompt, enter a secure passphrase.
63
61
64
-
```bash
65
-
> Enter passphrase (empty for no passphrase): [Type a passphrase]
66
-
> Enter same passphrase again: [Type passphrase again]
67
-
```
62
+
```bash
63
+
> Enter passphrase (empty for no passphrase): [Type a passphrase]
64
+
> Enter same passphrase again: [Type passphrase again]
65
+
```
68
66
69
67
### Add the public key to your Git account
70
68
71
-
1. In your terminal window, copy the contents of your public key file. If you renamed the key, replace `id_rsa.pub` with the public key file name.
69
+
1. In your terminal window, run the following command to copy the contents of your public key file. If you renamed the key, replace `id_rsa.pub` with the public key file name.
72
70
73
-
```bash
74
-
cat ~/.ssh/id_rsa.pub
75
-
```
71
+
```bash
72
+
cat ~/.ssh/id_rsa.pub
73
+
```
76
74
77
75
1. To add the SSH key to your Git account, refer to the following instructions depending on your Git service:
> To copy and paste in the terminal window, use these keyboard shortcuts depending on your operating system:
@@ -106,44 +104,33 @@ Git clones the repo and sets up the origin remote to connect with SSH for future
106
104
SSH might display the server's SSH fingerprint and ask you to verify it, as in the following example.
107
105
108
106
```bash
109
-
The authenticity of host 'example.com (192.30.255.112)' can't be established.
110
-
RSA key fingerprint is SHA256:nThbg6kXUpJWGl7E1IGOCspRomTxdCARLviKw6E5SY8.
107
+
The authenticity of host 'example.com (000.00.255.112)' can't be established.
108
+
RSA key fingerprint is SHA256:000000000000000000000000000000000.
111
109
Are you sure you want to continue connecting (yes/no)? yes
112
-
Warning: Permanently added 'github.com,192.30.255.112' (RSA) to the list of known hosts.
110
+
Warning: Permanently added 'github.com,000.00.255.112' (RSA) to the list of known hosts.
113
111
```
114
112
115
-
SSH displays this fingerprint when it connects to an unknown host to protect you from [man-in-the-middle attacks](/previous-versions/windows/it-pro/windows-2000-server/cc959354(v=technet.10)). You should verify that the displayed fingerprint matches one of the fingerprints in the SSH public keys page.
113
+
SSH displays this fingerprint when it connects to an unknown host to protect you from [man-in-the-middle attacks](/previous-versions/windows/it-pro/windows-2000-server/cc959354(v=technet.10)#man-in-the-middle-attack). You should verify that the displayed fingerprint matches one of the fingerprints in the SSH public keys page.
116
114
117
115
When you're asked if you want to continue connecting, enter *yes*. Once you accept the host's fingerprint, SSH doesn't prompt you again unless the fingerprint changes.
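The fingerprint check described above can be sketched in Python. This is a minimal illustration, not part of Azure Machine Learning or of any Git service's tooling: it assumes you already have the host's public key as a single `<key-type> <base64-blob> [comment]` line from a source you trust, and the function names `ssh_sha256_fingerprint` and `fingerprint_matches` are hypothetical. In most cases it's simpler to compare the displayed fingerprint directly with the fingerprints your Git service publishes.

```python
import base64
import hashlib


def ssh_sha256_fingerprint(public_key_line: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint for one public key line."""
    # A public key line has the form "<key-type> <base64-blob> [comment]".
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key-type> <base64-blob> [comment]'")
    key_blob = base64.b64decode(fields[1])
    digest = hashlib.sha256(key_blob).digest()
    # OpenSSH prints the digest as base64 without the trailing '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


def fingerprint_matches(public_key_line: str, displayed: str) -> bool:
    """Compare a computed fingerprint with the one SSH displayed."""
    return ssh_sha256_fingerprint(public_key_line) == displayed.strip()
```

If `fingerprint_matches` returns `False` for a key you trust, don't accept the host at the prompt.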
118
116
119
117
## Track code that comes from Git repositories
120
118
121
-
When you submit a training job from the Python SDK or Machine Learning CLI, the files needed to train the model are uploaded to your workspace. If the `git` command is available on your development environment, the upload process checks if the files are stored in a Git repository, and uploads information from the Git repository as part of the training job.
119
+
When you submit a training job from the Python SDK or Machine Learning CLI, the files needed to train the model are uploaded to your workspace. If the `git` command is available on your development environment, the upload process checks if the files are stored in a Git repository, and uploads any Git repository information as part of the training job.
122
120
123
-
The following information is sent for jobs that use an estimator, machine learning pipeline, or script run. The information is stored in the following properties for the training job:
121
+
The following information is sent for jobs that use an estimator, machine learning pipeline, or script run. The information is stored in the following training job properties:
124
122
125
123
| Property | Git command used to get the value | Description |
126
124
| ----- | ----- | ----- |
127
-
|`azureml.git.repository_uri`|`git ls-remote --get-url`| The URI that your repository was cloned from. |
128
-
|`azureml.git.branch`|`git symbolic-ref --short HEAD`| The active branch when the job was submitted. |
129
-
|`azureml.git.commit`|`git rev-parse HEAD`| The commit hash of the code that was submitted for the job. |
125
+
|`azureml.git.repository_uri` or `mlflow.source.git.repoURL`|`git ls-remote --get-url`| The URI that your repository was cloned from. |
126
+
|`azureml.git.branch` or `mlflow.source.git.branch`|`git symbolic-ref --short HEAD`| The active branch when the job was submitted. |
127
+
|`azureml.git.commit` or `mlflow.source.git.commit`|`git rev-parse HEAD`| The commit hash of the code that was submitted for the job. |
130
128
|`azureml.git.dirty`|`git status --porcelain .`|`True` if the branch or commit is dirty, otherwise `false`. |
131
-
|`mlflow.source.git.repoURL`|`git ls-remote --get-url`| The URI that your repository was cloned from. |
132
-
|`mlflow.source.git.branch`|`git symbolic-ref --short HEAD`| The active branch when the job was submitted. |
133
-
|`mlflow.source.git.commit`|`git rev-parse HEAD`| The commit hash of the code that was submitted for the job. |
134
129
135
-
If your training files aren't located in a Git repository on your development environment, or the `git` command isn't available, no Git-related information is tracked.
130
+
If the `git` command isn't available on your development environment, or your training files aren't located in a Git repository, no Git-related information is tracked.
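To make the table concrete, the following Python sketch runs the same Git commands locally and collects their output under the `azureml.git.*` property names. It only approximates the idea and isn't the Azure Machine Learning upload process itself. The helper names `run_git` and `collect_git_properties` are hypothetical, and two behaviors are assumptions: the dirty flag is represented as a Python `bool`, and a command that fails is skipped.

```python
import subprocess

# Property names and Git commands taken from the table above.
GIT_PROPERTY_COMMANDS = {
    "azureml.git.repository_uri": ["git", "ls-remote", "--get-url"],
    "azureml.git.branch": ["git", "symbolic-ref", "--short", "HEAD"],
    "azureml.git.commit": ["git", "rev-parse", "HEAD"],
    "azureml.git.dirty": ["git", "status", "--porcelain", "."],
}


def run_git(args, cwd="."):
    """Run one Git command; return its stripped output, or None if it fails."""
    try:
        result = subprocess.run(
            args, cwd=cwd, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # Git isn't installed, or the directory isn't inside a repository.
        return None
    return result.stdout.strip()


def collect_git_properties(cwd=".", runner=run_git):
    """Collect Git properties for cwd; an empty dict means nothing to track."""
    properties = {}
    for name, args in GIT_PROPERTY_COMMANDS.items():
        output = runner(args, cwd)
        if output is None:
            continue  # Assumption: skip any property whose command fails.
        if name == "azureml.git.dirty":
            # Any porcelain output means there are uncommitted changes.
            properties[name] = bool(output)
        else:
            properties[name] = output
    return properties
```

Calling `collect_git_properties()` from a directory inside a cloned repository returns a dictionary keyed by the property names in the table. If the `git` command is missing, or the directory isn't inside a repository, it returns an empty dictionary, which mirrors the case where no Git-related information is tracked.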
136
131
137
132
> [!TIP]
138
-
> To check if the `git` command is available on your development environment, run the following command in a command line interface:
139
-
>
140
-
>```
141
-
> git --version
142
-
>```
143
-
>
144
-
> If Git is installed and in your path, you receive a response similar to `git version 2.4.1`.
145
-
146
-
For more information on installing Git on your development environment, see the [Git website](https://git-scm.com/).
133
+
> To check if the `git` command is available on your development environment, run the `git --version` command in a command line interface. If Git is installed and in your path, you receive a response similar to `git version 2.4.1`. For information on installing Git on your development environment, see the [Git website](https://git-scm.com/).
147
134
148
135
## View Git information
149
136
@@ -156,7 +143,8 @@ In your Azure Machine Learning workspace in Azure Machine Learning studio:
156
143
1. Select the **Jobs** page.
157
144
1. Select an experiment.
158
145
1. Select a job from the **Display name** column.
159
-
1. Select **Outputs + logs**, from the top menu, and then expand the **logs** and **azureml** entries.
146
+
1. Select **Outputs + logs** from the top menu.
147
+
1. Expand **logs** > **azureml**.
160
148
1. Select the link that begins with **###_azure**.
161
149
162
150
The logged information contains text similar to the following JSON code: