Commit 312dfdf

remove alpha from pipeline parameters (#169)
1 parent 8ae6701 commit 312dfdf

5 files changed: +31 -26 lines changed

.pipelines/diabetes_regression-ci-build-train.yml

Lines changed: 1 addition & 11 deletions
@@ -62,30 +62,20 @@ stages:
             echo "##vso[task.setvariable variable=AMLPIPELINEID;isOutput=true]$AMLPIPELINEID"
         name: 'getpipelineid'
         displayName: 'Get Pipeline ID'
-      - bash: |
-          # Generate a hyperparameter value as a random number between 0 and 1.
-          # A random value is used here to make the Azure ML dashboards "interesting" when testing
-          # the solution sample.
-          alpha=$(printf "0.%03d\n" $((($RANDOM*1000)/32767)))
-          echo "Alpha: $alpha"
-          echo "##vso[task.setvariable variable=ALPHA;isOutput=true]$alpha"
-        name: 'getalpha'
-        displayName: 'Generate random value for hyperparameter alpha'
     - job: "Run_ML_Pipeline"
       dependsOn: "Get_Pipeline_ID"
       displayName: "Trigger ML Training Pipeline"
       pool: server
       variables:
         AMLPIPELINE_ID: $[ dependencies.Get_Pipeline_ID.outputs['getpipelineid.AMLPIPELINEID'] ]
-        ALPHA: $[ dependencies.Get_Pipeline_ID.outputs['getalpha.ALPHA'] ]
       steps:
       - task: ms-air-aiagility.vss-services-azureml.azureml-restApi-task.MLPublishedPipelineRestAPITask@0
         displayName: 'Invoke ML pipeline'
         inputs:
           azureSubscription: '$(WORKSPACE_SVC_CONNECTION)'
           PipelineId: '$(AMLPIPELINE_ID)'
           ExperimentName: '$(EXPERIMENT_NAME)'
-          PipelineParameters: '"ParameterAssignments": {"model_name": "$(MODEL_NAME)", "hyperparameter_alpha": "$(ALPHA)"}'
+          PipelineParameters: '"ParameterAssignments": {"model_name": "$(MODEL_NAME)"}'
     - job: "Training_Run_Report"
       dependsOn: "Run_ML_Pipeline"
       condition: always()
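
For reference, the arithmetic in the removed `getalpha` step can be mirrored in Python to see which values it produced. This is only a sketch and not part of the commit: the function name `alpha_from_random` is hypothetical, and it assumes Bash's `$RANDOM` yields integers from 0 to 32767 inclusive.

```python
def alpha_from_random(random_value: int) -> str:
    """Mirror printf "0.%03d" $((($RANDOM*1000)/32767)) from the removed step.

    random_value stands in for Bash's $RANDOM (assumed range: 0..32767).
    """
    if not 0 <= random_value <= 32767:
        raise ValueError("random_value must be in 0..32767")
    # Bash arithmetic is integer-only, so use floor division here as well.
    scaled = (random_value * 1000) // 32767
    # %03d pads to at least three digits but does not truncate longer values.
    return "0.%03d" % scaled


if __name__ == "__main__":
    for sample in (0, 16384, 32767):
        print(sample, alpha_from_random(sample))
```

Under that assumption, the largest input scales to 1000, so the formatted string is `0.1000` rather than a value close to 1. A fixed value kept in a config file avoids this kind of edge case and makes training runs reproducible.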

diabetes_regression/config.json

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+{
+    "training":
+    {
+        "alpha": 0.4
+    },
+    "evaluation":
+    {
+
+    },
+    "scoring":
+    {
+
+    }
+}
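
The three sections of this file can be read in the same way from the training, evaluation and scoring scripts. A minimal sketch, assuming the layout shown in this diff; the helper name `section_params` and its `defaults` argument are hypothetical and do not exist in the repository:

```python
import json

# Inline copy of the config.json added in this commit (whitespace condensed).
CONFIG_TEXT = """
{
    "training": {"alpha": 0.4},
    "evaluation": {},
    "scoring": {}
}
"""


def section_params(config: dict, section: str, defaults: dict) -> dict:
    """Return the defaults, overridden by whatever the section defines."""
    merged = dict(defaults)
    # A missing section is treated like an empty one.
    merged.update(config.get(section, {}))
    return merged


config = json.loads(CONFIG_TEXT)
training = section_params(config, "training", {"alpha": 0.5})
evaluation = section_params(config, "evaluation", {})
print(training)
print(evaluation)
```

Keeping the defaults next to the script that consumes them means an empty section, such as `evaluation` here, still yields a usable parameter dictionary.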

diabetes_regression/training/train.py

Lines changed: 12 additions & 9 deletions
@@ -32,6 +32,7 @@
 from sklearn.metrics import mean_squared_error
 from sklearn.model_selection import train_test_split
 from sklearn.externals import joblib
+import json
 
 
 def train_model(run, data, alpha):
@@ -62,13 +63,6 @@ def main():
         help="Name of the Model",
         default="sklearn_regression_model.pkl",
     )
-    parser.add_argument(
-        "--alpha",
-        type=float,
-        default=0.5,
-        help=("Ridge regression regularization strength hyperparameter; "
-              "must be a positive float.")
-    )
 
     parser.add_argument(
         "--dataset_name",
@@ -79,14 +73,23 @@ def main():
 
     print("Argument [build_id]: %s" % args.build_id)
     print("Argument [model_name]: %s" % args.model_name)
-    print("Argument [alpha]: %s" % args.alpha)
     print("Argument [dataset_name]: %s" % args.dataset_name)
 
     model_name = args.model_name
     build_id = args.build_id
-    alpha = args.alpha
     dataset_name = args.dataset_name
 
+    print("Getting training parameters")
+
+    with open("config.json") as f:
+        pars = json.load(f)
+    try:
+        alpha = pars["training"]["alpha"]
+    except KeyError:
+        alpha = 0.5
+
+    print("Parameter alpha: %s" % alpha)
+
     run = Run.get_context()
     ws = run.experiment.workspace
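
The added block in `train.py` catches only `KeyError`, and `open("config.json")` is resolved against the current working directory, so a missing or malformed file would still raise. A more defensive variant might look like the sketch below. The function name `load_alpha` and its parameters are hypothetical; the fallback of 0.5 is the one used in this diff.

```python
import json


def load_alpha(path="config.json", default=0.5):
    """Read training.alpha from a JSON config, falling back to a default.

    Falls back when the file cannot be opened, is not valid JSON,
    or does not contain the training.alpha key.
    """
    try:
        with open(path) as f:
            pars = json.load(f)
        return pars["training"]["alpha"]
    except (OSError, json.JSONDecodeError, KeyError, TypeError):
        # OSError: file missing or unreadable.
        # JSONDecodeError: file is not valid JSON.
        # KeyError / TypeError: expected structure is absent.
        return default


if __name__ == "__main__":
    print("Parameter alpha: %s" % load_alpha())
```

Falling back silently can hide a misconfigured file, so in a real pipeline it may be preferable to log which branch was taken or to fail the run outright.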
9295

docs/getting_started.md

Lines changed: 4 additions & 3 deletions
@@ -86,6 +86,8 @@ For instructions on how to set up a local development environment, refer to the
 
 For using Azure DevOps Pipelines all other variables are stored in the file `.pipelines/diabetes_regression-variables.yml`. Using the default values as a starting point, adjust the variables to suit your requirements.
 
+**Note:** In `diabetes_regression` folder you can find `config.json` file that we would recommend to use in order to provide parameters for training, evaluation and scoring scripts. An example of a such parameter is a hyperparameter of a training algorithm: in our case it's the ridge regression [*alpha* hyperparameter](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html). We don't provide any special serializers for this config file. So, it's up to you which template to support there.
+
 Up until now you should have:
 
 * Forked (or cloned) the repo
@@ -120,7 +122,7 @@ Check out the newly created resources in the [Azure Portal](portal.azure.com):
 (Optional) To remove the resources created for this project you can use the [/environment_setup/iac-remove-environment.yml](../environment_setup/iac-remove-environment.yml) definition or you can just delete the resource group in the [Azure Portal](portal.azure.com).
 
 **Note:** The training ML pipeline uses a [sample diabetes dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_diabetes.html) as training data. If you want to use your own dataset, you need to [create and register a datastore](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-access-data#azure-machine-learning-studio) in your ML workspace and upload the datafile (e.g. [diabetes.csv](./data/diabetes.csv)) to the corresponding blob container. You can also define a datastore in the ML Workspace with [az cli](https://docs.microsoft.com/en-us/cli/azure/ext/azure-cli-ml/ml/datastore?view=azure-cli-latest#ext-azure-cli-ml-az-ml-datastore-attach-blob).
-You'll also need to configure DATASTORE_NAME and DATAFILE_NAME variables in ***devopsforai-aml-vg*** variable group.
+You'll also need to configure DATASTORE_NAME and DATAFILE_NAME variables in ***devopsforai-aml-vg*** variable group.
 
 
 ## Create an Azure DevOps Azure ML Workspace Service Connection
@@ -187,7 +189,7 @@ specified).
 **Note:** If the model evaluation determines that the new model does not perform better than the previous one then the new model will not be registered and the pipeline will be cancelled.
 
 * The third stage of the pipeline, **Deploy to ACI**, deploys the model to the QA environment in [Azure Container Instances](https://azure.microsoft.com/en-us/services/container-instances/). It then runs a *smoke test* to validate the deployment, i.e. sends a sample query to the scoring web service and verifies that it returns a response in the expected format.
-
+
 Wait until the pipeline finishes and verify that there is a new model in the **ML Workspace**:
 
 ![trained model](./images/trained-model.png)
@@ -247,7 +249,6 @@ Make sure your webapp has the credentials to pull the image from the Azure Conta
 
 * The provided pipeline definition YAML file is a sample starting point, which you should tailor to your processes and environment.
 * You should edit the pipeline definition to remove unused stages. For example, if you are deploying to ACI and AKS, you should delete the unused `Deploy_Webapp` stage.
-* The sample pipeline generates a random value for a model hyperparameter (ridge regression [*alpha*](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.Ridge.html)) to generate 'interesting' charts when testing the sample. In a real application you should use fixed hyperparameter values. You can [tune hyperparameter values using Azure ML](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-tune-hyperparameters), and manage their values in Azure DevOps Variable Groups.
 * You may wish to enable [manual approvals](https://docs.microsoft.com/en-us/azure/devops/pipelines/process/approvals) before the deployment stages.
 * You can install additional Conda or pip packages by modifying the YAML environment configurations under the `diabetes_regression` directory. Make sure to use fixed version numbers for all packages to ensure reproducibility, and use the same versions across environments.
 * You can explore aspects of model observability in the solution, such as:

ml_service/pipelines/diabetes_regression_build_train_pipeline.py

Lines changed: 0 additions & 3 deletions
@@ -44,8 +44,6 @@ def main():
         name="model_name", default_value=e.model_name)
     build_id_param = PipelineParameter(
         name="build_id", default_value=e.build_id)
-    hyperparameter_alpha_param = PipelineParameter(
-        name="hyperparameter_alpha", default_value=0.5)
 
     dataset_name = ""
     if (e.datastore_name is not None and e.datafile_name is not None):
@@ -66,7 +64,6 @@ def main():
         arguments=[
             "--build_id", build_id_param,
             "--model_name", model_name_param,
-            "--alpha", hyperparameter_alpha_param,
             "--dataset_name", dataset_name,
         ],
         runconfig=run_config,
