Commit e8a1a6d

committed
edits
1 parent 7ea9af4 commit e8a1a6d

1 file changed

+35
-47
lines changed

articles/machine-learning/concept-train-model-git-integration.md

Lines changed: 35 additions & 47 deletions
@@ -18,40 +18,38 @@ ms.custom: sdkv2, build-2023
1818

1919
Azure Machine Learning fully supports Git repositories for tracking work. You can clone repositories directly onto your shared workspace file system, use Git on your local workstation, or use Git from a continuous integration and continuous deployment (CI/CD) pipeline.
2020

21-
When you submit a job to Azure Machine Learning, if source files are stored in a local git repository, information about the repo is tracked as part of the training process. Because Azure Machine Learning tracks the information from the local git repo, it isn't tied to any specific central repository. Your repository can be cloned from GitHub, GitLab, Bitbucket, Azure DevOps, or any other Git-compatible service.
21+
When you submit an Azure Machine Learning training job that has source files from a local Git repository, information about the repo is tracked as part of the training job. Because it's tracked from the local Git repo, the Git information isn't tied to any specific central repository. Your repository can be cloned from any Git-compatible service, such as GitHub, GitLab, Bitbucket, or Azure DevOps.
2222

2323
> [!TIP]
2424
> You can use Visual Studio Code to interact with Git through a graphical user interface. To connect to an Azure Machine Learning remote compute instance by using Visual Studio Code, see [Launch Visual Studio Code integrated with Azure Machine Learning (preview)](how-to-launch-vs-code-remote.md).
2525
>
26-
> For more information on Visual Studio Code version control features, see [Use Version Control in VS Code](https://code.visualstudio.com/docs/editor/versioncontrol) and [Work with GitHub in VS Code](https://code.visualstudio.com/docs/editor/github).
26+
> For more information on Visual Studio Code version control features, see [Use Version Control in Visual Studio Code](https://code.visualstudio.com/docs/editor/versioncontrol) and [Work with GitHub in Visual Studio Code](https://code.visualstudio.com/docs/editor/github).
2727
28-
## Clone Git repositories in a workspace file system
28+
## Git repositories in a workspace file system
2929

30-
Azure Machine Learning provides a shared file system for all users in a workspace. To clone a Git repository into this file share, you can create a compute instance and open a terminal. Once you open the terminal, you have access to a full Git client and can clone and work with Git via the Git CLI experience.
30+
Azure Machine Learning provides a shared file system for all users in a workspace. The best way to clone a Git repository into this file share is to create a compute instance and [open a terminal](./how-to-access-terminal.md). In the terminal, you have access to a full Git client and can clone and work with Git by using the Git CLI. For more information about the Git CLI, see [Git CLI](https://git-scm.com/docs/gitcli).
3131

32-
You can clone any Git repository you can authenticate to, such as a GitHub, Azure Repos, or BitBucket repo. It's best to clone the repository into your user directory, so that other users don't collide directly on your working branch.
32+
You can clone any Git repository you can authenticate to, such as GitHub, Azure Repos, or Bitbucket repos. It's best to clone the repository into your user directory, so that other users don't collide directly on your working branch.
3333

34-
There's a performance difference between cloning to the local file system of the compute instance or cloning to the filesystem mounted as the *~/cloudfiles/code* directory. In general, cloning to the local filesystem provides better performance than cloning to the mounted filesystem. However, if you delete and recreate the compute instance, the local filesystem is lost, whereas the mounted filesystem is kept.
35-
36-
For more information about the Git CLI, see [Git CLI](https://git-scm.com/docs/gitcli).
34+
There are some differences between cloning to the local file system of the compute instance and cloning to the shared file system, which is mounted as the *~/cloudfiles/code/* directory. In general, cloning to the local file system provides better performance than cloning to the mounted file system. However, if you delete and recreate the compute instance, the local file system is lost, while the mounted shared file system remains.
3735

3836
## Clone Git repositories with SSH
3937

40-
You can clone a repo by using HTTPS or SSH. The following sections describe how to clone a repo by using SSH. To use SSH, you need to authenticate your Git account with SSH by using an SSH key.
38+
You can clone a repo by using Secure Shell Protocol (SSH). The following sections describe how to clone a repo by using SSH. To use SSH, you need to authenticate your Git account with SSH by using an SSH key.
4139

4240
### Generate and save a new SSH key
4341

4442
To generate a new SSH key:
4543

46-
1. In the Azure Machine Learning studio **Notebook** page, [open a terminal window](./how-to-access-terminal.md) and run the following command, substituting your email address.
44+
1. In the Azure Machine Learning studio **Notebook** page, open a terminal window and run the following command, substituting your email address.
4745

4846
```bash
4947
ssh-keygen -t rsa -b 4096 -C "[email protected]"
5048
```
5149

5250
The command returns the output `Generating public/private rsa key pair.` and generates a new SSH key with the provided email as a label.
5351

54-
1. At the following prompt, make sure the default location is `/home/azureuser/.ssh` or specify that location, and then press Enter.
52+
1. At the following prompt, make sure the location is `/home/azureuser/.ssh`, or specify that location, and then press Enter.
5553

5654
```bash
5755
Enter a file in which to save the key (/home/azureuser/.ssh/id_rsa): [Press enter]
@@ -61,25 +59,25 @@ To generate a new SSH key:
6159

6260
1. It's best to add a passphrase to your SSH key for added security. At the following prompt, enter a secure passphrase.
6361

64-
```bash
65-
> Enter passphrase (empty for no passphrase): [Type a passphrase]
66-
> Enter same passphrase again: [Type passphrase again]
67-
```
62+
```bash
63+
> Enter passphrase (empty for no passphrase): [Type a passphrase]
64+
> Enter same passphrase again: [Type passphrase again]
65+
```
6866

6967
### Add the public key to your Git account
7068

71-
1. In your terminal window, copy the contents of your public key file. If you renamed the key, replace `id_rsa.pub` with the public key file name.
69+
1. In your terminal window, run the following command to copy the contents of your public key file. If you renamed the key, replace `id_rsa.pub` with the public key file name.
7270

73-
```bash
74-
cat ~/.ssh/id_rsa.pub
75-
```
71+
```bash
72+
cat ~/.ssh/id_rsa.pub
73+
```
7674

7775
1. To add the SSH key to your Git account, refer to the following instructions depending on your Git service:
7876

79-
- [GitHub](https://docs.github.com/github/authenticating-to-github/adding-a-new-ssh-key-to-your-github-account)
77+
- [GitHub](https://docs.github.com/github/authenticating-to-github/adding-a-new-ssh-key-to-your-github-account#adding-a-new-ssh-key-to-your-account)
8078
- [GitLab](https://docs.gitlab.com/ee/user/ssh.html#add-an-ssh-key-to-your-gitlab-account)
81-
- [Azure DevOps](/azure/devops/repos/git/use-ssh-keys-to-authenticate#step-2--add-the-public-key-to-azure-devops-servicestfs) Start at **Step 2**.
82-
- [BitBucket](https://support.atlassian.com/bitbucket-cloud/docs/set-up-an-ssh-key/#SetupanSSHkey-ssh2). Follow **Step 4**.
79+
- [Azure DevOps](/azure/devops/repos/git/use-ssh-keys-to-authenticate#step-2-add-the-public-key-to-azure-devops)
80+
- [Bitbucket](https://support.atlassian.com/bitbucket-cloud/docs/configure-ssh-and-two-step-verification/)
8381

8482
> [!TIP]
8583
> To copy and paste in the terminal window, use these keyboard shortcuts depending on your operating system:
@@ -106,44 +104,33 @@ Git clones the repo and sets up the origin remote to connect with SSH for future
106104
SSH might display the server's SSH fingerprint and ask you to verify it, as in the following example.
107105

108106
```bash
109-
The authenticity of host 'example.com (192.30.255.112)' can't be established.
110-
RSA key fingerprint is SHA256:nThbg6kXUpJWGl7E1IGOCspRomTxdCARLviKw6E5SY8.
107+
The authenticity of host 'example.com (000.00.255.112)' can't be established.
108+
RSA key fingerprint is SHA256:000000000000000000000000000000000.
111109
Are you sure you want to continue connecting (yes/no)? yes
112-
Warning: Permanently added 'github.com,192.30.255.112' (RSA) to the list of known hosts.
110+
Warning: Permanently added 'example.com,000.00.255.112' (RSA) to the list of known hosts.
113111
```
114112
115-
SSH displays this fingerprint when it connects to an unknown host to protect you from [man-in-the-middle attacks](/previous-versions/windows/it-pro/windows-2000-server/cc959354(v=technet.10)). You should verify that the displayed fingerprint matches one of the fingerprints in the SSH public keys page.
113+
SSH displays this fingerprint when it connects to an unknown host to protect you from [man-in-the-middle attacks](/previous-versions/windows/it-pro/windows-2000-server/cc959354(v=technet.10)#man-in-the-middle-attack). You should verify that the displayed fingerprint matches one of the fingerprints in the SSH public keys page.
116114
117115
When you're asked if you want to continue connecting, enter *yes*. Once you accept the host's fingerprint, SSH doesn't prompt you again unless the fingerprint changes.
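
To make the fingerprint check above more concrete, the following is a minimal sketch of how a `SHA256:` fingerprint can be derived from a public key line, so that it can be compared with the value SSH displays. The function name `sha256_fingerprint` is hypothetical, and the sketch assumes the usual `<type> <base64 blob> [comment]` public key layout. It isn't part of Azure Machine Learning or of any Git service.

```python
import base64
import hashlib


def sha256_fingerprint(public_key_line):
    """Compute an OpenSSH-style SHA256 fingerprint from a public key line."""
    # Assumed layout of the line: "<type> <base64 blob> [comment]".
    blob = base64.b64decode(public_key_line.split()[1])
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as base64 without the trailing "=" padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")
```

If the computed value differs from the fingerprint that SSH displays, answer *no* at the prompt and check the host before you connect.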
118116

119117
## Track code that comes from Git repositories
120118

121-
When you submit a training job from the Python SDK or Machine Learning CLI, the files needed to train the model are uploaded to your workspace. If the `git` command is available on your development environment, the upload process checks if the files are stored in a Git repository, and uploads information from the Git repository as part of the training job.
119+
When you submit a training job from the Python SDK or Machine Learning CLI, the files needed to train the model are uploaded to your workspace. If the `git` command is available on your development environment, the upload process checks if the files are stored in a Git repository, and uploads any Git repository information as part of the training job.
122120

123-
The following information is sent for jobs that use an estimator, machine learning pipeline, or script run. The information is stored in the following properties for the training job:
121+
The following information is sent for jobs that use an estimator, machine learning pipeline, or script run. The information is stored in the following training job properties:
124122

125123
| Property | Git command used to get the value | Description |
126124
| ----- | ----- | ----- |
127-
| `azureml.git.repository_uri` | `git ls-remote --get-url` | The URI that your repository was cloned from. |
128-
| `azureml.git.branch` | `git symbolic-ref --short HEAD` | The active branch when the job was submitted. |
129-
| `azureml.git.commit` | `git rev-parse HEAD` | The commit hash of the code that was submitted for the job. |
125+
| `azureml.git.repository_uri` or `mlflow.source.git.repoURL` | `git ls-remote --get-url` | The URI that your repository was cloned from. |
126+
| `azureml.git.branch` or `mlflow.source.git.branch` | `git symbolic-ref --short HEAD` | The active branch when the job was submitted. |
127+
| `azureml.git.commit` or `mlflow.source.git.commit` | `git rev-parse HEAD` | The commit hash of the code that was submitted for the job. |
130128
| `azureml.git.dirty` | `git status --porcelain .` | `True` if the branch or commit is dirty, otherwise `false`. |
131-
| `mlflow.source.git.repoURL` | `git ls-remote --get-url` | The URI that your repository was cloned from. |
132-
| `mlflow.source.git.branch` | `git symbolic-ref --short HEAD` | The active branch when the job was submitted. |
133-
| `mlflow.source.git.commit` | `git rev-parse HEAD` | The commit hash of the code that was submitted for the job. |
134129

135-
If your training files aren't located in a Git repository on your development environment, or the `git` command isn't available, no Git-related information is tracked.
130+
If the `git` command isn't available on your development environment, or your training files aren't located in a Git repository, no Git-related information is tracked.
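
As a rough illustration of the table above, the following sketch runs the same Git commands locally and collects their output under the `azureml.git.*` property names. The helpers `run_git` and `collect_git_properties` are hypothetical names used only for this example. This isn't the code that the SDK or CLI runs, only an approximation of the behavior described here.

```python
import subprocess

# Property names and Git commands as listed in the table above.
GIT_PROPERTY_COMMANDS = {
    "azureml.git.repository_uri": ["git", "ls-remote", "--get-url"],
    "azureml.git.branch": ["git", "symbolic-ref", "--short", "HEAD"],
    "azureml.git.commit": ["git", "rev-parse", "HEAD"],
    "azureml.git.dirty": ["git", "status", "--porcelain", "."],
}


def run_git(args, cwd="."):
    """Run one Git command and return its stripped output, or None on failure."""
    try:
        result = subprocess.run(
            args, cwd=cwd, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # Git isn't installed, or the command failed (for example, not a repo).
        return None
    return result.stdout.strip()


def collect_git_properties(cwd=".", runner=run_git):
    """Hypothetical helper: build a property dict from the commands above."""
    properties = {}
    for name, args in GIT_PROPERTY_COMMANDS.items():
        output = runner(args, cwd)
        if output is None:
            continue  # Skip anything Git couldn't report.
        if name.endswith(".dirty"):
            # Any porcelain output means there are uncommitted changes.
            output = str(bool(output))
        properties[name] = output
    return properties


if __name__ == "__main__":
    for key, value in collect_git_properties().items():
        print(f"{key} = {value}")
```

When Git isn't installed or the directory isn't a repository, every command fails and the sketch returns an empty dictionary, which mirrors the case where no Git-related information is tracked.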
136131

137132
> [!TIP]
138-
> To check if the `git` command is available on your development environment, run the following command in a command line interface:
139-
>
140-
> ```
141-
> git --version
142-
> ```
143-
>
144-
> If Git is installed and in your path, you receive a response similar to `git version 2.4.1`.
145-
146-
For more information on installing Git on your development environment, see the [Git website](https://git-scm.com/).
133+
> To check if the `git` command is available on your development environment, run the `git --version` command in a command line interface. If Git is installed and in your path, you receive a response similar to `git version 2.4.1`. For information on installing Git on your development environment, see the [Git website](https://git-scm.com/).
147134

148135
## View Git information
149136

@@ -156,7 +143,8 @@ In your Azure Machine Learning workspace in Azure Machine Learning studio:
156143
1. Select the **Jobs** page.
157144
1. Select an experiment.
158145
1. Select a job from the **Display name** column.
159-
1. Select **Outputs + logs**, from the top menu, and then expand the **logs** and **azureml** entries.
146+
1. Select **Outputs + logs** from the top menu.
147+
1. Expand **logs** > **azureml**.
160148
1. Select the link that begins with **###_azure**.
161149

162150
The logged information contains text similar to the following JSON code:
@@ -188,10 +176,10 @@ job.properties["azureml.git.commit"]
188176

189177
### Azure CLI V2
190178

191-
Run the `az ml job show` command to display the `GitCommit:properties`. For example:
179+
Run the `az ml job show` command to display the `GitCommit` property. For example:
192180

193181
```azurecli
194-
az ml job show --name my_job_id --query "{GitCommit:properties."""azureml.git.commit"""}"
182+
az ml job show --name my_job_id --query "{GitCommit:properties.\"azureml.git.commit\"}"
195183
```
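
The property name `azureml.git.commit` is a single key that contains dots, not a path through nested objects, which is worth keeping in mind when you write a query or post-process the output. The following sketch reads the commit from the JSON text of a job. It assumes the JSON has a top-level `properties` object, as the query above implies, and `git_commit_from_job_json` is a hypothetical helper name.

```python
import json


def git_commit_from_job_json(job_json):
    """Return the commit hash from a job's JSON, or None if it wasn't tracked."""
    job = json.loads(job_json)
    # "azureml.git.commit" is one key containing dots, so look it up directly
    # instead of walking properties -> azureml -> git -> commit.
    return job.get("properties", {}).get("azureml.git.commit")


# Illustrative input only; real job output contains many more fields.
sample = '{"name": "my_job_id", "properties": {"azureml.git.commit": "0123abc"}}'
print(git_commit_from_job_json(sample))
```

The helper returns `None` when the job has no Git information, for example when the `git` command wasn't available at submission time.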
196184

197185
## Related content
