Commit 41ea9b7

eedorenko authored and dtzar committed
2.0.0 (#50)
Major upgrade please see release notes
1 parent b013780 commit 41ea9b7

76 files changed: +920 -2897 lines changed

.env.example

Lines changed: 36 additions & 0 deletions
@@ -0,0 +1,36 @@
+# Azure Subscription Variables
+WORKSPACE_NAME = ''
+RESOURCE_GROUP = ''
+SUBSCRIPTION_ID = ''
+LOCATION = ''
+TENANT_ID = ''
+
+# Azure ML Workspace Variables
+EXPERIMENT_NAME = ''
+SCRIPT_FOLDER = './'
+BLOB_STORE_NAME = ''
+# Remote VM Config
+REMOTE_VM_NAME = ''
+REMOTE_VM_USERNAME = ''
+REMOTE_VM_PASSWORD = ''
+REMOTE_VM_IP = ''
+# AML Compute Cluster Config
+AML_CLUSTER_NAME = ''
+AML_CLUSTER_VM_SIZE = ''
+AML_CLUSTER_MAX_NODES = ''
+AML_CLUSTER_MIN_NODES = ''
+AML_CLUSTER_PRIORITY = 'lowpriority'
+# Training Config
+MODEL_NAME = ''
+# AML Pipeline Config
+TRAINING_PIPELINE_NAME = ''
+PIPELINE_CONDA_PATH = 'aml_config/conda_dependencies.yml'
+MODEL_PATH = ''
+# Image config
+IMAGE_NAME = ''
+IMAGE_DESCRIPTION = ''
+IMAGE_VERSION = ''
+# ACI Config
+ACI_CPU_CORES = ''
+ACI_MEM_GB = ''
+ACI_DESCRIPTION = ''
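
As an illustration of how a file in this `KEY = 'value'` layout could be consumed, here is a minimal standard-library sketch. It is not taken from the commit: `parse_env_text` and `get_setting` are hypothetical helper names, and the repository's own scripts may load these values differently (for example through a dotenv package or pipeline variables).

```python
import os
from typing import Dict, Optional


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse lines of the form KEY = 'value', skipping comments and blanks."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def get_setting(values: Dict[str, str], name: str,
                default: Optional[str] = None) -> Optional[str]:
    """Prefer a real environment variable, then the file value, then the default.

    Empty strings (the placeholders in .env.example) count as unset.
    """
    return os.environ.get(name) or values.get(name) or default


sample = (
    "# AML Compute Cluster Config\n"
    "AML_CLUSTER_PRIORITY = 'lowpriority'\n"
    "# Training Config\n"
    "MODEL_NAME = ''\n"
)
settings = parse_env_text(sample)
```

Treating empty placeholders as unset is a design choice of this sketch; it lets a copied `.env.example` fall back to defaults until the values are filled in.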

.gitignore

Lines changed: 2 additions & 0 deletions
@@ -103,3 +103,5 @@ venv.bak/
 
 # mypy
 .mypy_cache/
+
+.DS_Store

.pipelines/azdo-base-pipeline.yml

Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+# this pipeline should be ignored for now
+parameters:
+  pipelineType: 'training'
+
+steps:
+- script: |
+    flake8 --output-file=$(Build.BinariesDirectory)/lint-testresults.xml --format junit-xml
+  workingDirectory: '$(Build.SourcesDirectory)'
+  displayName: 'Run code quality tests'
+  enabled: 'true'
+
+- script: |
+    pytest --junitxml=$(Build.BinariesDirectory)/unit-testresults.xml $(Build.SourcesDirectory)/tests/unit
+  displayName: 'Run unit tests'
+  enabled: 'true'
+  env:
+    SP_APP_SECRET: '$(SP_APP_SECRET)'
+
+- task: PublishTestResults@2
+  condition: succeededOrFailed()
+  inputs:
+    testResultsFiles: '$(Build.BinariesDirectory)/*-testresults.xml'
+    testRunTitle: 'Linting & Unit tests'
+    failTaskOnFailedTests: true
+  displayName: 'Publish linting and unit test results'
+  enabled: 'true'

.pipelines/azdo-ci-build-train.yml

Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+pr: none
+trigger:
+  branches:
+    include:
+    - master
+
+pool:
+  vmImage: 'ubuntu-latest'
+
+container: mcr.microsoft.com/mlops/python:latest
+
+
+variables:
+- group: devopsforai-aml-vg
+
+
+steps:
+- template: azdo-base-pipeline.yml
+
+- bash: |
+    # Invoke the Python building and publishing a training pipeline
+    python3 $(Build.SourcesDirectory)/ml_service/pipelines/build_train_pipeline.py
+  failOnStderr: 'false'
+  env:
+    SP_APP_SECRET: '$(SP_APP_SECRET)'
+  displayName: 'Train model using AML with Remote Compute'
+  enabled: 'true'
+
+- task: CopyFiles@2
+  displayName: 'Copy Files to: $(Build.ArtifactStagingDirectory)'
+  inputs:
+    SourceFolder: '$(Build.SourcesDirectory)'
+    TargetFolder: '$(Build.ArtifactStagingDirectory)'
+    Contents: |
+      ml_service/pipelines/?(run_train_pipeline.py|*.json)
+      code/scoring/**
+
+
+- task: PublishBuildArtifacts@1
+  displayName: 'Publish Artifact'
+  inputs:
+    ArtifactName: 'mlops-pipelines'
+    publishLocation: 'container'
+    pathtoPublish: '$(Build.ArtifactStagingDirectory)'
+    TargetPath: '$(Build.ArtifactStagingDirectory)'
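
The `Contents` patterns in the `CopyFiles@2` step above decide which files end up in the `mlops-pipelines` artifact. The sketch below approximates that selection in plain Python to make the intent easier to see. It is an assumption-laden illustration, not the task's real matcher: `is_copied` is a hypothetical helper, and edge cases such as dotfiles or the "zero occurrences" branch of `?(...)` are ignored.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def is_copied(path: str) -> bool:
    """Approximate the two Contents patterns for a repo-relative POSIX path.

    Pattern 1: ml_service/pipelines/?(run_train_pipeline.py|*.json)
               -> run_train_pipeline.py or any *.json directly in that folder.
    Pattern 2: code/scoring/**
               -> anything under code/scoring at any depth.
    """
    p = PurePosixPath(path)
    if p.parent.as_posix() == "ml_service/pipelines":
        return p.name == "run_train_pipeline.py" or fnmatchcase(p.name, "*.json")
    return path.startswith("code/scoring/")


# Hypothetical file names, used only to exercise the helper.
candidates = [
    "ml_service/pipelines/run_train_pipeline.py",
    "ml_service/pipelines/build_train_pipeline.py",
    "ml_service/pipelines/settings.json",
    "code/scoring/score.py",
]
selected = [c for c in candidates if is_copied(c)]
```

Under these assumptions `build_train_pipeline.py` is left out of the artifact, which fits the step order: it has already run in the build, while `run_train_pipeline.py` and the JSON files are staged for later use.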

.pipelines/azdo-pr-build-train.yml

Lines changed: 18 additions & 0 deletions
@@ -0,0 +1,18 @@
+trigger: none
+pr:
+  branches:
+    include:
+    - master
+
+pool:
+  vmImage: 'ubuntu-latest'
+
+container: mcr.microsoft.com/mlops/python:latest
+
+
+variables:
+- group: devopsforai-aml-vg
+
+
+steps:
+- template: azdo-base-pipeline.yml

README.md

Lines changed: 14 additions & 18 deletions
Original file line numberDiff line numberDiff line change
@@ -1,9 +1,18 @@
1+
---
2+
page_type: sample
3+
languages:
4+
- python
5+
products:
6+
- azure
7+
- azure-machine-learning-service
8+
- azure-devops
9+
---
10+
111
# MLOps with Azure ML
212

313

414
[![Build Status](https://dev.azure.com/customai/DevopsForAI-AML/_apis/build/status/Microsoft.MLOpsPython?branchName=master)](https://dev.azure.com/customai/DevopsForAI-AML/_build/latest?definitionId=25&branchName=master)
515

6-
### Author: Praneet Solanki | Richin Jain
716

817
MLOps will help you to understand how to build the Continuous Integration and Continuous Delivery pipeline for a ML/AI project. We will be using the Azure DevOps Project for build and release/deployment pipelines along with Azure ML services for model retraining pipeline, model management and operationalization.
918

@@ -25,20 +34,15 @@ To deploy this solution in your subscription, follow the manual instructions in
2534

2635
This reference architecture shows how to implement continuous integration (CI), continuous delivery (CD), and retraining pipeline for an AI application using Azure DevOps and Azure Machine Learning. The solution is built on the scikit-learn diabetes dataset but can be easily adapted for any AI scenario and other popular build systems such as Jenkins and Travis.
2736

28-
![Architecture](/docs/images/Architecture_DevOps_AI.png)
37+
![Architecture](/docs/images/main-flow.png)
2938

3039

3140
## Architecture Flow
3241

3342
### Train Model
3443
1. Data Scientist writes/updates the code and push it to git repo. This triggers the Azure DevOps build pipeline (continuous integration).
35-
2. Once the Azure DevOps build pipeline is triggered, it runs following types of tasks:
36-
- Run for new code: Every time new code is committed to the repo, the build pipeline performs data sanity tests and unit tests on the new code.
37-
- One-time run: These tasks runs only for the first time the build pipeline runs. It will programatically create an [Azure ML Service Workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace), provision [Azure ML Compute](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute) (used for model training compute), and publish an [Azure ML Pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines). This published Azure ML pipeline is the model training/retraining pipeline.
38-
39-
> Note: The Publish Azure ML pipeline task currently runs for every code change
40-
41-
3. The Azure ML Retraining pipeline is triggered once the Azure DevOps build pipeline completes. All the tasks in this pipeline runs on Azure ML Compute created earlier. Following are the tasks in this pipeline:
44+
2. Once the Azure DevOps build pipeline is triggered, it performs code quality checks, data sanity tests, unit tests, builds an [Azure ML Pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines) and publishes it in an [Azure ML Service Workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace).
45+
3. The [Azure ML Pipeline](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-ml-pipelines) is triggered once the Azure DevOps build pipeline completes. All the tasks in this pipeline runs on Azure ML Compute. Following are the tasks in this pipeline:
4246

4347
- **Train Model** task executes model training script on Azure ML Compute. It outputs a [model](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#model) file which is stored in the [run history](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#run).
4448

@@ -50,16 +54,8 @@ This reference architecture shows how to implement continuous integration (CI),
5054

5155
Once you have registered your ML model, you can use Azure ML + Azure DevOps to deploy it.
5256

53-
The **Package Model** task packages the new model along with the scoring file and its python dependencies into a [docker image](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#image) and pushes it to [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-intro). This image is used to deploy the model as [web service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#web-service).
54-
55-
The **Deploy Model** task handles deploying your Azure ML model to the cloud (ACI or AKS).
56-
This pipeline deploys the model scoring image into Staging/QA and PROD environments.
57-
58-
In the Staging/QA environment, one task creates an [Azure Container Instance](https://docs.microsoft.com/en-us/azure/container-instances/container-instances-overview) and deploys the scoring image as a [web service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#web-service) on it.
59-
60-
The second task invokes the web service by calling its REST endpoint with dummy data.
57+
[Azure DevOps release pipeline](https://docs.microsoft.com/en-us/azure/devops/pipelines/release/?view=azure-devops) packages the new model along with the scoring file and its python dependencies into a [docker image](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#image) and pushes it to [Azure Container Registry](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-intro). This image is used to deploy the model as [web service](https://docs.microsoft.com/en-us/azure/machine-learning/service/concept-azure-machine-learning-architecture#web-service) across QA and Prod environments. The QA environment is running on top of [Azure Container Instances (ACI)](https://azure.microsoft.com/en-us/services/container-instances/) and the Prod environemt is built with [Azure Kubernetes Service (AKS)](https://docs.microsoft.com/en-us/azure/aks/intro-kubernetes).
6158

62-
5. The deployment in production is a [gated release](https://docs.microsoft.com/en-us/azure/devops/pipelines/release/approvals/gates?view=azure-devops). This means that once the model web service deployment in the Staging/QA environment is successful, a notification is sent to approvers to manually review and approve the release. Once the release is approved, the model scoring web service is deployed to [Azure Kubernetes Service(AKS)](https://docs.microsoft.com/en-us/azure/aks/intro-kubernetes) and the deployment is tested.
6359

6460
### Repo Details
6561

aml_config/conda_dependencies.yml

Lines changed: 0 additions & 50 deletions
This file was deleted.

aml_config/config.json

Lines changed: 0 additions & 6 deletions
This file was deleted.

aml_config/security_config.json

Lines changed: 0 additions & 15 deletions
This file was deleted.

aml_service/00-WorkSpace.py

Lines changed: 0 additions & 64 deletions
This file was deleted.
