Commit eb90f64

edits
1 parent e8a1a6d commit eb90f64

1 file changed

+57
-54
lines changed

articles/machine-learning/concept-train-model-git-integration.md

Lines changed: 57 additions & 54 deletions
Original file line number | Diff line number | Diff line change
@@ -9,7 +9,7 @@ ms.topic: conceptual
99
author: ositanachi
1010
ms.author: osiotugo
1111
ms.reviewer: larryfr
12-
ms.date: 06/11/2024
12+
ms.date: 06/12/2024
1313
ms.custom: sdkv2, build-2023
1414
---
1515
# Git integration for Azure Machine Learning
@@ -18,13 +18,14 @@ ms.custom: sdkv2, build-2023
1818

1919
Azure Machine Learning fully supports Git repositories for tracking work. You can clone repositories directly onto your shared workspace file system, use Git on your local workstation, or use Git from a continuous integration and continuous deployment (CI/CD) pipeline.
2020

21-
When you submit an Azure Machine Learning training job that has source files from a local Git repository, information about the repo is tracked as part of the training job. Because it's tracked from the local Git repo, the Git information isn't tied to any specific central repository. Your repository can be cloned from any Git-compatible service, such as GitHub, GitLab, Bitbucket, or Azure DevOps.
21+
When you submit an Azure Machine Learning training job that has source files from a local Git repository, information about the repo is tracked as part of the training job. Because the information comes from the local Git repo, it isn't tied to any specific central repository. Your repository can be cloned from any Git-compatible service, such as GitHub, GitLab, Bitbucket, or Azure DevOps.
2222

2323
> [!TIP]
2424
> You can use Visual Studio Code to interact with Git through a graphical user interface. To connect to an Azure Machine Learning remote compute instance by using Visual Studio Code, see [Launch Visual Studio Code integrated with Azure Machine Learning (preview)](how-to-launch-vs-code-remote.md).
2525
>
2626
> For more information on Visual Studio Code version control features, see [Use Version Control in Visual Studio Code](https://code.visualstudio.com/docs/editor/versioncontrol) and [Work with GitHub in Visual Studio Code](https://code.visualstudio.com/docs/editor/github).
2727
28+
<a name="#clone-git-repositories-into-your-workspace-file-system"></a>
2829
## Git repositories in a workspace file system
2930

3031
Azure Machine Learning provides a shared file system for all users in a workspace. The best way to clone a Git repository into this file share is to create a compute instance and [open a terminal](./how-to-access-terminal.md). In the terminal, you have access to a full Git client and can clone and work with Git by using the Git CLI. For more information about the Git CLI, see [Git CLI](https://git-scm.com/docs/gitcli).
@@ -33,90 +34,92 @@ You can clone any Git repository you can authenticate to, such as GitHub, Azure
3334

3435
There are some differences between cloning to the local file system of the compute instance or cloning to the shared file system, mounted as the *~/cloudfiles/code/* directory. In general, cloning to the local file system provides better performance than cloning to the mounted file system. However, if you delete and recreate the compute instance, the local file system is lost, while the mounted shared file system remains.
3536

36-
## Clone Git repositories with SSH
37+
## Clone a Git repository with SSH
3738

38-
You can clone a repo by using Secure Shell Protocol (SSH). The following sections describe how to clone a repo by using SSH. To use SSH, you need to authenticate your Git account with SSH by using an SSH key.
39+
You can clone a repo by using the Secure Shell (SSH) protocol. The following sections describe how to clone a repo by using SSH. To use SSH, you need to authenticate your Git account with SSH by using an SSH key.
3940

4041
### Generate and save a new SSH key
4142

42-
To generate a new SSH key:
43+
In the Azure Machine Learning studio **Notebook** page, open a terminal window and run the following command, substituting your email address.
4344

44-
1. In the Azure Machine Learning studio **Notebook** page, open a terminal window and run the following command, substituting your email address.
45-
46-
```bash
47-
ssh-keygen -t rsa -b 4096 -C "[email protected]"
48-
```
45+
```bash
46+
ssh-keygen -t ed25519 -C "[email protected]"
47+
```
4948

50-
The command returns the output `Generating public/private rsa key pair.` and generates a new SSH key with the provided email as a label.
49+
The command returns the following output:
5150

52-
1. At the following prompt, make sure the location is `/home/azureuser/.ssh`, or specify that location, and then press Enter.
51+
```bash
52+
Generating public/private ed25519 key pair.
53+
Enter file in which to save the key (/home/azureuser/.ssh/id_ed25519):
54+
```
5355

54-
```bash
55-
Enter a file in which to save the key (/home/azureuser/.ssh/id_rsa): [Press enter]
56-
```
56+
Make sure the location in the preceding output is `/home/azureuser/.ssh`, or change it to that location, and then press Enter.
5757

58-
The key file saves on the compute instance, and is accessible only to the compute instance owner.
58+
It's best to add a passphrase to your SSH key for added security. At the following prompts, enter a secure passphrase.
5959

60-
1. It's best to add a passphrase to your SSH key for added security. At the following prompt, enter a secure passphrase.
60+
```bash
61+
Enter passphrase (empty for no passphrase):
62+
Enter same passphrase again:
63+
```
6164

62-
```bash
63-
> Enter passphrase (empty for no passphrase): [Type a passphrase]
64-
> Enter same passphrase again: [Type passphrase again]
65-
```
65+
When you press Enter, the `ssh-keygen` command generates a new SSH key with the provided email address as a label. The key file is saved on the compute instance and is accessible only to the compute instance owner.
6666

6767
### Add the public key to your Git account
6868

69-
1. In your terminal window, run the following command to copy the contents of your public key file. If you renamed the key, replace `id_rsa.pub` with the public key file name.
69+
In your terminal window, run the following command to display the contents of your public key file. If you renamed the key, replace `id_ed25519.pub` with the public key file name.
7070

71-
```bash
72-
cat ~/.ssh/id_rsa.pub
73-
```
71+
```bash
72+
cat ~/.ssh/id_ed25519.pub
73+
```
7474

75-
1. To add the SSH key to your Git account, refer to the following instructions depending on your Git service:
76-
77-
- [GitHub](https://docs.github.com/github/authenticating-to-github/adding-a-new-ssh-key-to-your-github-account#adding-a-new-ssh-key-to-your-account)
78-
- [GitLab](https://docs.gitlab.com/ee/user/ssh.html#add-an-ssh-key-to-your-gitlab-account)
79-
- [Azure DevOps](/azure/devops/repos/git/use-ssh-keys-to-authenticate#step-2-add-the-public-key-to-azure-devops)
80-
- [BitBucket](https://support.atlassian.com/bitbucket-cloud/docs/configure-ssh-and-two-step-verification/)
75+
Copy the output.
8176

8277
> [!TIP]
8378
> To copy and paste in the terminal window, use these keyboard shortcuts depending on your operating system:
8479
>
85-
> - Windows: Ctrl+Insert to copy, Ctrl+Shift+V or Ctrl+Shift+Insert to paste.
80+
> - Windows: Ctrl+C or Ctrl+Insert to copy, Ctrl+V or Ctrl+Shift+V to paste.
8681
> - macOS: Cmd+C to copy and Cmd+V to paste.
8782
>
8883
> Some browsers might not support clipboard permissions properly.
8984
85+
Add the SSH key to your Git account by using the following instructions, depending on your Git service:
86+
87+
- [GitHub](https://docs.github.com/github/authenticating-to-github/adding-a-new-ssh-key-to-your-github-account#adding-a-new-ssh-key-to-your-account)
88+
- [GitLab](https://docs.gitlab.com/ee/user/ssh.html#add-an-ssh-key-to-your-gitlab-account)
89+
- [Azure DevOps](/azure/devops/repos/git/use-ssh-keys-to-authenticate#step-2-add-the-public-key-to-azure-devops)
90+
- [Bitbucket](https://support.atlassian.com/bitbucket-cloud/docs/configure-ssh-and-two-step-verification/)
91+
9092
### Clone the Git repository with SSH
9193

92-
1. Copy the SSH Git clone URL from the Git repo.
94+
1. Copy the SSH Git clone URL from the Git repo you want to clone.
9395

94-
1. Run the following `git clone` command, using your SSH Git repo URL. For example:
96+
1. Run the following `git clone` command, using your SSH Git clone URL. For example:
9597

9698
```bash
9799
git clone [email protected]:GitUser/azureml-example.git
98100
```
99101

100-
Git clones the repo and sets up the origin remote to connect with SSH for future Git commands.
101-
102-
#### Verify fingerprint
102+
1. SSH might display the server's SSH fingerprint and ask you to verify it, as in the following example.
103103

104-
SSH might display the server's SSH fingerprint and ask you to verify it, as in the following example.
104+
```bash
105+
The authenticity of host 'github.com (000.00.000.0)' can't be established.
106+
ECDSA key fingerprint is SHA256:0000000000000000000/00000000/00000000.
107+
Are you sure you want to continue connecting (yes/no/[fingerprint])?
108+
```
105109
106-
```bash
107-
The authenticity of host 'example.com (000.00.255.112)' can't be established.
108-
RSA key fingerprint is SHA256:000000000000000000000000000000000.
109-
Are you sure you want to continue connecting (yes/no)? yes
110-
Warning: Permanently added 'github.com,000.00.255.112' (RSA) to the list of known hosts.
111-
```
110+
SSH displays this fingerprint when it connects to an unknown host to protect you from [man-in-the-middle attacks](/previous-versions/windows/it-pro/windows-2000-server/cc959354(v=technet.10)#man-in-the-middle-attack). You should verify that the displayed fingerprint matches one of the fingerprints in the SSH public keys page, and then respond *yes*.
112111
113-
SSH displays this fingerprint when it connects to an unknown host to protect you from [man-in-the-middle attacks](/previous-versions/windows/it-pro/windows-2000-server/cc959354(v=technet.10)#man-in-the-middle-attack). You should verify that the displayed fingerprint matches one of the fingerprints in the SSH public keys page.
112+
1. SSH displays a response like the following example:
114113
115-
When you're asked if you want to continue connecting, enter *yes*. Once you accept the host's fingerprint, SSH doesn't prompt you again unless the fingerprint changes.
114+
```bash
115+
Warning: Permanently added 'github.com,000.00.000.0' (ECDSA) to the list of known hosts.
116+
Enter passphrase for key '/home/azureuser/.ssh/id_ed25519':
117+
```
118+
1. Enter your passphrase. Git clones the repo and sets up the origin remote to connect with SSH for future Git commands. Once you accept the host's fingerprint, SSH doesn't prompt you again unless the fingerprint changes.
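
The fingerprint that SSH displays can also be recomputed from a public key line, which might help when you compare it against the value your Git service publishes. The following Python sketch assumes the OpenSSH `SHA256:` format, which is the unpadded Base64 encoding of the SHA-256 digest of the decoded key blob. The function name `ssh_fingerprint` and the sample key blob are illustrative and aren't part of any Azure Machine Learning API.

```python
import base64
import hashlib


def ssh_fingerprint(public_key_line: str) -> str:
    """Return the OpenSSH-style SHA256 fingerprint of one public key line.

    The line is expected to look like '<key-type> <base64-blob> [comment]',
    which is the layout of files such as ~/.ssh/id_ed25519.pub.
    """
    fields = public_key_line.split()
    if len(fields) < 2:
        raise ValueError("expected '<key-type> <base64-blob> [comment]'")
    blob = base64.b64decode(fields[1])
    digest = hashlib.sha256(blob).digest()
    # OpenSSH prints the digest as Base64 without the trailing '=' padding.
    return "SHA256:" + base64.b64encode(digest).decode("ascii").rstrip("=")


# Illustrative input only: this blob is not a real key.
sample_blob = base64.b64encode(b"not-a-real-key").decode("ascii")
sample_line = "ssh-ed25519 " + sample_blob + " user@example"
print(ssh_fingerprint(sample_line))
```

For your own key, the result should match the fingerprint that `ssh-keygen -lf ~/.ssh/id_ed25519.pub` reports.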
116119
117120
## Track code that comes from Git repositories
118121
119-
When you submit a training job from the Python SDK or Machine Learning CLI, the files needed to train the model are uploaded to your workspace. If the `git` command is available on your development environment, the upload process checks if the files are stored in a Git repository, and uploads any Git repository information as part of the training job.
122+
When you submit a training job from the Python SDK or Machine Learning CLI, the files needed to train the model are uploaded to your workspace. If the `git` command is available on your development environment, the upload process checks if the source files are stored in a Git repository, and if so, uploads any Git repository information as part of the training job.
120123
121124
The following information is sent for jobs that use an estimator, machine learning pipeline, or script run. The information is stored in the following training job properties:
122125
@@ -130,7 +133,7 @@ The following information is sent for jobs that use an estimator, machine learni
130133
If the `git` command isn't available on your development environment, or your training files aren't located in a Git repository, no Git-related information is tracked.
131134
132135
> [!TIP]
133-
> To check if the `git` command is available on your development environment, run the `git --version` command in a command line interface. If Git is installed and in your path, you receive a response similar to `git version 2.4.1`. For information on installing Git on your development environment, see the [Git website](https://git-scm.com/).
136+
> To check if the `git` command is available on your development environment, run the `git --version` command in a command line interface. If Git is installed and in your path, you receive a response similar to `git version 2.43.0`. For information on installing Git on your development environment, see the [Git website](https://git-scm.com/).
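
The same check can be scripted before you submit a job. The following Python sketch is only a local approximation and doesn't show how Azure Machine Learning itself detects Git. The helper names `git_version` and `head_commit` are hypothetical. It reports whether `git` is on the path and, if so, which commit the given directory currently points to.

```python
import shutil
import subprocess
from typing import Optional


def git_version() -> Optional[str]:
    """Return the output of `git --version`, or None if git isn't on the path."""
    if shutil.which("git") is None:
        return None
    result = subprocess.run(["git", "--version"], capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None


def head_commit(directory: str) -> Optional[str]:
    """Return the HEAD commit hash for a directory, or None if unavailable.

    None covers three cases: git isn't installed, the directory isn't inside
    a Git repository, or the repository has no commits yet.
    """
    if shutil.which("git") is None:
        return None
    result = subprocess.run(
        ["git", "-C", directory, "rev-parse", "HEAD"],
        capture_output=True,
        text=True,
    )
    return result.stdout.strip() if result.returncode == 0 else None


print(git_version())
print(head_commit("."))
```

If both calls return a value, the training files in that directory are in a state where Git information can be tracked.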
134137
135138
## View Git information
136139
@@ -147,7 +150,7 @@ In your Azure Machine Learning workspace in Azure Machine Learning studio:
147150
1. Expand **logs** > **azureml**.
148151
1. Select the link that begins with **###_azure**.
149152
150-
The logged information contains text similar to the following JSON code:
153+
The logged information contains JSON code similar to the following example:
151154
152155
```json
153156
"properties": {
@@ -168,23 +171,23 @@ The logged information contains text similar to the following JSON code:
168171
169172
### Python SDK V2
170173
171-
After you submit a training run, a [Job](/python/api/azure-ai-ml/azure.ai.ml.entities.job) object is returned. The `properties` attribute of this object contains the logged Git information. For example, the following code retrieves the commit hash:
174+
After you submit a training run, a [Job](/python/api/azure-ai-ml/azure.ai.ml.entities.job) object is returned. The `properties` attribute of this object contains the logged Git information. For example, you can run the following command to retrieve the commit hash:
172175
173176
```python
174177
job.properties["azureml.git.commit"]
175178
```
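
To look at all of the logged Git values at once, you can filter the properties by key prefix. The following sketch assumes that every Git-related key shares the `azureml.git.` prefix seen in the commit key above. The helper name `git_properties` and the sample dictionary are hypothetical stand-ins for a real `job.properties` value.

```python
from typing import Dict

GIT_PREFIX = "azureml.git."  # Assumed from the "azureml.git.commit" key.


def git_properties(properties: Dict[str, str]) -> Dict[str, str]:
    """Return only the Git-related entries, with the common prefix removed."""
    return {
        key[len(GIT_PREFIX):]: value
        for key, value in properties.items()
        if key.startswith(GIT_PREFIX)
    }


# Stand-in for job.properties; the values are placeholders, not real output.
sample_properties = {
    "azureml.git.commit": "0000000000000000000000000000000000000000",
    "unrelated.property": "placeholder",
}
print(git_properties(sample_properties))
```

With a real job, you would pass `job.properties` instead of the sample dictionary. If no Git information was tracked, the result is an empty dictionary.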
176179
177180
### Azure CLI V2
178181
179-
Run the `az ml job show` command to display the `GitCommit` property. For example:
182+
Run the `az ml job show` command with the `--query` argument to display the Git information. For example, the following query retrieves the `GitCommit` property:
180183
181184
```azurecli
182-
az ml job show --name my_job_id --query "{GitCommit:properties.azureml.git.commit}"
185+
az ml job show --name my-job-id --query "{GitCommit:properties.azureml.git.commit}" --resource-group my-resource-group --workspace-name my-workspace
183186
```
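
If you prefer to post-process the CLI output in a script, one option is to request JSON and read the property in Python instead of using `--query`. The following is a minimal sketch under two assumptions: that `az ml job show --output json` returns an object with a top-level `properties` field, and that the commit is stored under the `azureml.git.commit` key. The function names and the sample JSON are illustrative.

```python
import json
from typing import List, Optional


def build_show_command(job_name: str, resource_group: str, workspace: str) -> List[str]:
    """Build the argument list for `az ml job show` with JSON output."""
    return [
        "az", "ml", "job", "show",
        "--name", job_name,
        "--resource-group", resource_group,
        "--workspace-name", workspace,
        "--output", "json",
    ]


def commit_from_job_json(job_json: str) -> Optional[str]:
    """Extract the Git commit hash from the JSON text of a job, if present."""
    job = json.loads(job_json)
    return job.get("properties", {}).get("azureml.git.commit")


# Placeholder JSON that mimics the assumed shape of the CLI output.
sample_output = '{"name": "my-job-id", "properties": {"azureml.git.commit": "0000000"}}'
print(build_show_command("my-job-id", "my-resource-group", "my-workspace"))
print(commit_from_job_json(sample_output))
```

In practice, you would run the built command, for example with `subprocess.run`, and pass its standard output to `commit_from_job_json`.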
184187
185188
## Related content
186189
187190
- [Access a compute instance terminal in your workspace](how-to-access-terminal.md)
188191
- [Launch Visual Studio Code integrated with Azure Machine Learning (preview)](how-to-launch-vs-code-remote.md)
189-
- [Use Version Control in VS Code](https://code.visualstudio.com/docs/editor/versioncontrol)
190-
- [Work with GitHub in VS Code](https://code.visualstudio.com/docs/editor/github)
192+
- [Use Version Control in Visual Studio Code](https://code.visualstudio.com/docs/editor/versioncontrol)
193+
- [Work with GitHub in Visual Studio Code](https://code.visualstudio.com/docs/editor/github)
