Commit 7f5d6e4

edits for clarity
1 parent 40faa95 commit 7f5d6e4

1 file changed: +8 -8 lines changed

articles/machine-learning/tutorial-explore-data.md

Lines changed: 8 additions & 8 deletions
@@ -9,15 +9,15 @@ ms.topic: tutorial
 ms.reviewer: None
 author: s-polly
 ms.author: scottpolly
-ms.date: 07/25/2024
+ms.date: 08/05/2025
 #Customer intent: As a data scientist, I want to know how to prototype and develop machine learning models on a cloud workstation.
 ---
 
 # Tutorial: Upload, access, and explore your data in Azure Machine Learning
 
 [!INCLUDE [sdk v2](includes/machine-learning-sdk-v2.md)]
 
-In this tutorial, you learn how to:
+In this tutorial, you:
 
 > [!div class="checklist"]
 > * Upload your data to cloud storage
@@ -43,7 +43,7 @@ A machine learning project typically starts with exploratory data analysis (EDA)
 
 ## Download the data used in this tutorial
 
-For data ingestion, Azure Data Explorer handles raw data in [these formats](/azure/data-explorer/ingestion-supported-formats). This tutorial uses a [CSV-format credit card client data sample](https://azuremlexamples.blob.core.windows.net/datasets/credit_card/default_of_credit_card_clients.csv). The steps take place in an Azure Machine Learning resource. In that resource, you'll create a local folder with the suggested name of **data**, directly under the folder where this notebook is located.
+For data ingestion, Azure Data Explorer handles raw data in [these formats](/azure/data-explorer/ingestion-supported-formats). This tutorial uses a [CSV-format credit card client data sample](https://azuremlexamples.blob.core.windows.net/datasets/credit_card/default_of_credit_card_clients.csv). The steps take place in an Azure Machine Learning resource. In that resource, you create a local folder with the suggested name of **data**, directly under the folder where this notebook is located.
 
 > [!NOTE]
 > This tutorial depends on data placed in an Azure Machine Learning resource folder location. For this tutorial, 'local' means a folder location in that Azure Machine Learning resource.
@@ -72,7 +72,7 @@ For more information about the data in the UC Irvine Machine Learning Repository
 
 ## Create a handle to the workspace
 
-Before you explore the code, you need a way to reference your workspace. You'll create `ml_client` as a handle to the workspace. You then use `ml_client` to manage resources and jobs.
+Before you explore the code, you need a way to reference your workspace. You create `ml_client` as a handle to the workspace. You then use `ml_client` to manage resources and jobs.
 
 In the next cell, enter your Subscription ID, Resource Group name, and Workspace name. To find these values:
 
@@ -99,7 +99,7 @@ ml_client = MLClient(
 ```
 
 > [!NOTE]
-> Creating MLClient will not connect to the workspace. The client initialization is lazy and waits for the first time it needs to make a call. This happens in the next code cell.
+> Creating MLClient won't connect to the workspace. The client initialization is lazy and waits for the first time it needs to make a call. This happens in the next code cell.
 
 ## Upload data to cloud storage
 
@@ -117,11 +117,11 @@ Data asset creation also creates a *reference* to the data source location, alon
 
 The next notebook cell creates the data asset. The code sample uploads the raw data file to the designated cloud storage resource.
 
-Each time you create a data asset, you need a unique version for it. If the version already exists, you'll get an error. In this code, you use "initial" for the first read of the data. If that version already exists, the code doesn't recreate it.
+Each time you create a data asset, you need a unique version for it. If the version already exists, you get an error. In this code, you use "initial" for the first read of the data. If that version already exists, the code doesn't recreate it.
 
 You can also omit the **version** parameter. In this case, a version number is generated for you, starting with 1 and incrementing from there.
 
-This tutorial uses the name "initial" as the first version. The [Create production machine learning pipelines](tutorial-pipeline-python-sdk.md) tutorial also uses this version of the data, so you use a value that you'll see again in that tutorial.
+This tutorial uses the name "initial" as the first version. The [Create production machine learning pipelines](tutorial-pipeline-python-sdk.md) tutorial also uses this version of the data, so you use a value that you see again in that tutorial.
 
 ```python
 from azure.ai.ml.entities import Data
@@ -183,7 +183,7 @@ df = pd.read_csv("azureml://subscriptions/<subid>/resourcegroups/<rgname>/worksp
 
 However, as mentioned previously, it can become difficult to remember these URIs. Additionally, you must manually substitute all **<_substring_>** values in the **pd.read_csv** command with the real values for your resources.
 
-You'll want to create data assets for frequently accessed data. Here's an easier way to access the CSV file in Pandas:
+You want to create data assets for frequently accessed data. Here's an easier way to access the CSV file in Pandas:
 
 > [!IMPORTANT]
 > In a notebook cell, execute this code to install the `azureml-fsspec` Python library in your Jupyter kernel:
