---
title: 'Migrate data management from SDK v1 to v2'
titleSuffix: Azure Machine Learning
description: Migrate data management from v1 to v2 of Azure Machine Learning SDK
services: machine-learning
ms.service: machine-learning
ms.subservice: mldata
ms.topic: reference
author: SturgeonMi
ms.author: xunwan
ms.date: 09/16/2022
ms.reviewer: sgilley
ms.custom: migration
---

# Migrate data management from SDK v1 to v2

In V1, an Azure Machine Learning dataset can be either a `FileDataset` or a `TabularDataset`.
In V2, an Azure Machine Learning data asset can be a `uri_folder`, `uri_file`, or `mltable`.
Conceptually, `FileDataset` maps to `uri_folder` and `uri_file`, and `TabularDataset` maps to `mltable`.

* URIs (`uri_folder`, `uri_file`) - a Uniform Resource Identifier that references a storage location on your local computer or in the cloud, making it easy to access data in your jobs.
* MLTable - a method to abstract the schema definition for tabular data, so that consumers of the data can more easily materialize the table into a Pandas, Dask, or Spark dataframe.
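To make the URI forms concrete, here's a small illustrative helper (not part of the SDK) that recognizes the path styles used throughout this article:

```python
# Illustrative only: classify the path styles AzureML v2 accepts for data assets.
# The categories mirror this article's examples; the helper is not an SDK API.
from urllib.parse import urlparse

def classify_data_path(path: str) -> str:
    """Return which kind of storage location a v2 data path points to."""
    scheme = urlparse(path).scheme
    if scheme == "https":
        return "blob"        # https://<account_name>.blob.core.windows.net/<container_name>/<path>
    if scheme == "abfss":
        return "adls_gen2"   # abfss://<file_system>@<account_name>.dfs.core.windows.net/<path>
    if scheme == "azureml":
        return "datastore"   # azureml://datastores/<data_store_name>/paths/<path>
    return "local"           # e.g. './data/animals'

print(classify_data_path("abfss://fs@acct.dfs.core.windows.net/animals/"))  # adls_gen2
```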

This article compares common data scenarios in SDK v1 and SDK v2.

## Create a `FileDataset`/URI type data asset

* SDK v1 - Create a `FileDataset`

    ```python
    from azureml.core import Workspace, Datastore, Dataset

    # get an existing workspace and retrieve a datastore by name
    workspace = Workspace.from_config()
    datastore = Datastore.get(workspace, 'your datastore name')

    # create a FileDataset pointing to files in the 'animals' folder and its subfolders recursively
    datastore_paths = [(datastore, 'animals')]
    animal_ds = Dataset.File.from_files(path=datastore_paths)

    # create a FileDataset from image and label files behind public web URLs
    web_paths = ['https://azureopendatastorage.blob.core.windows.net/mnist/train-images-idx3-ubyte.gz',
                 'https://azureopendatastorage.blob.core.windows.net/mnist/train-labels-idx1-ubyte.gz']
    mnist_ds = Dataset.File.from_files(path=web_paths)
    ```

* SDK v2
    * Create a `URI_FOLDER` type data asset

        ```python
        from azure.ai.ml.entities import Data
        from azure.ai.ml.constants import AssetTypes

        # Supported paths include:
        # local: './<path>'
        # blob: 'https://<account_name>.blob.core.windows.net/<container_name>/<path>'
        # ADLS gen2: 'abfss://<file_system>@<account_name>.dfs.core.windows.net/<path>/'
        # Datastore: 'azureml://datastores/<data_store_name>/paths/<path>'

        my_path = '<path>'

        my_data = Data(
            path=my_path,
            type=AssetTypes.URI_FOLDER,
            description="<description>",
            name="<name>",
            version='<version>'
        )

        ml_client.data.create_or_update(my_data)
        ```

    * Create a `URI_FILE` type data asset

        ```python
        from azure.ai.ml.entities import Data
        from azure.ai.ml.constants import AssetTypes

        # Supported paths include:
        # local: './<path>/<file>'
        # blob: 'https://<account_name>.blob.core.windows.net/<container_name>/<path>/<file>'
        # ADLS gen2: 'abfss://<file_system>@<account_name>.dfs.core.windows.net/<path>/<file>'
        # Datastore: 'azureml://datastores/<data_store_name>/paths/<path>/<file>'

        my_path = '<path>/<file>'

        my_data = Data(
            path=my_path,
            type=AssetTypes.URI_FILE,
            description="<description>",
            name="<name>",
            version="<version>"
        )

        ml_client.data.create_or_update(my_data)
        ```

## Create a `TabularDataset`/`mltable` type data asset

* SDK v1

    ```python
    from azureml.core import Workspace, Datastore, Dataset

    datastore_name = 'your datastore name'

    # get an existing workspace
    workspace = Workspace.from_config()

    # retrieve an existing datastore in the workspace by name
    datastore = Datastore.get(workspace, datastore_name)

    # create a TabularDataset from 3 file paths in the datastore
    datastore_paths = [(datastore, 'weather/2018/11.csv'),
                       (datastore, 'weather/2018/12.csv'),
                       (datastore, 'weather/2019/*.csv')]

    weather_ds = Dataset.Tabular.from_delimited_files(path=datastore_paths)
    ```

* SDK v2 - Create an `mltable` data asset via YAML definition

    ```yaml
    type: mltable

    paths:
      - pattern: ./*.txt
    transformations:
      - read_delimited:
          delimiter: ','
          encoding: ascii
          header: all_files_same_headers
    ```

    ```python
    from azure.ai.ml.entities import Data
    from azure.ai.ml.constants import AssetTypes

    # my_path must point to the folder containing the MLTable artifact (MLTable file + data)
    # Supported paths include:
    # local: './<path>'
    # blob: 'https://<account_name>.blob.core.windows.net/<container_name>/<path>'
    # ADLS gen2: 'abfss://<file_system>@<account_name>.dfs.core.windows.net/<path>/'
    # Datastore: 'azureml://datastores/<data_store_name>/paths/<path>'

    my_path = '<path>'

    my_data = Data(
        path=my_path,
        type=AssetTypes.MLTABLE,
        description="<description>",
        name="<name>",
        version='<version>'
    )

    ml_client.data.create_or_update(my_data)
    ```

## Use data in an experiment/job

* SDK v1

    ```python
    from azureml.core import ScriptRunConfig

    src = ScriptRunConfig(source_directory=script_folder,
                          script='train_titanic.py',
                          # pass dataset as an input with friendly name 'titanic'
                          arguments=['--input-data', titanic_ds.as_named_input('titanic')],
                          compute_target=compute_target,
                          environment=myenv)

    # Submit the run configuration for your training run
    run = experiment.submit(src)
    run.wait_for_completion(show_output=True)
    ```

* SDK v2

    ```python
    from azure.ai.ml import command
    from azure.ai.ml import Input, Output
    from azure.ai.ml.constants import AssetTypes

    # Possible Asset Types for Data:
    # AssetTypes.URI_FILE
    # AssetTypes.URI_FOLDER
    # AssetTypes.MLTABLE

    # Possible Paths for Data:
    # Blob: https://<account_name>.blob.core.windows.net/<container_name>/<folder>/<file>
    # Datastore: azureml://datastores/<data_store_name>/paths/<folder>/<file>
    # Data Asset: azureml:<my_data>:<version>

    my_job_inputs = {
        "raw_data": Input(type=AssetTypes.URI_FOLDER, path="<path>")
    }

    my_job_outputs = {
        "prep_data": Output(type=AssetTypes.URI_FOLDER, path="<path>")
    }

    job = command(
        code="./src",  # local path where the code is stored
        command="python process_data.py --raw_data ${{inputs.raw_data}} --prep_data ${{outputs.prep_data}}",
        inputs=my_job_inputs,
        outputs=my_job_outputs,
        environment="<environment_name>:<version>",
        compute="cpu-cluster",
    )

    # submit the command
    returned_job = ml_client.create_or_update(job)
    # get a URL for the status of the job
    returned_job.services["Studio"].endpoint
    ```

## Mapping of key functionality in SDK v1 and SDK v2

|Functionality in SDK v1|Rough mapping in SDK v2|
|-|-|
|[Method/API in SDK v1](/python/api/azureml-core/azureml.data)|[Method/API in SDK v2](/python/api/azure-ai-ml/azure.ai.ml.entities)|

## Next steps

For more information, see:
* [Data in Azure Machine Learning](concept-data.md?tabs=uri-file-example%2Ccli-data-create-example)
* [Create data assets](how-to-create-data-assets.md?tabs=CLI)
* [Read and write data in a job](how-to-read-write-data-v2.md)
* [V2 datastore operations](/python/api/azure-ai-ml/azure.ai.ml.operations.datastoreoperations)