# by the `fit` method, since this allows us to access the transformed
# dataframe. For other models we could use the `transform` method, but
# the GradientBoostingRegressor does not have a `transform` method.
X_transformed, _ = pipeline.price_pipe._fit(X_train, y_train)
@solegalli previously I was able to use the _fit method to access the transformed dataframe of the pipeline. This method seems to have been updated (possibly here: scikit-learn/scikit-learn@88ce8cd) and is throwing an error.
Any idea how to access the transformed dataframe for the GradientBoostingRegressor (since it doesn't have a transform method?)
Not straightforward. Here are some potential solutions:
https://stackoverflow.com/questions/54332654/how-to-analyse-the-intermediate-steps-of-sklearn-pipeline
https://stackoverflow.com/questions/48743032/get-intermediate-data-state-in-scikit-learn-pipeline?
In short, I think you need to apply all the transformations one after the other, just up to the one before the model.
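Applying the fitted transformers one after the other can be done by iterating over the pipeline's `steps` attribute, stopping before the final estimator. A minimal sketch with a toy pipeline (the scalers and data here are assumptions for illustration, not the course pipeline):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X, y = make_regression(n_samples=20, n_features=3, random_state=0)

pipe = Pipeline([
    ("scaler", StandardScaler()),
    ("minmax", MinMaxScaler()),
    ("model", GradientBoostingRegressor(random_state=0)),
])
pipe.fit(X, y)

# Apply each fitted transformer in order, stopping before the model,
# to recover the data exactly as the regressor sees it.
X_t = X
for name, step in pipe.steps[:-1]:
    X_t = step.transform(X_t)

print(X_t.shape)  # same number of rows as X, transformed features
```

Because `MinMaxScaler` was fitted on the same data, every column of `X_t` spans exactly [0, 1] here; the loop produces the same result as the slicing approach discussed below.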
Ok, I asked the sklearn community through the email list and one of the developers told me to do this:
Using slicing: model[:-1].transform(X)
That worked perfectly :)
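For anyone following along, the slicing trick works because `pipeline[:-1]` returns a new `Pipeline` containing every step except the last, and that sub-pipeline exposes `transform` even when the final estimator does not. A minimal, self-contained sketch (the scaler and data are assumptions for illustration):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

pipe = make_pipeline(StandardScaler(), GradientBoostingRegressor(n_estimators=5, random_state=0))
pipe.fit(X, y)

# pipe[:-1] is itself a Pipeline of every step except the regressor,
# so .transform works even though GradientBoostingRegressor has no transform.
X_transformed = pipe[:-1].transform(X)
print(X_transformed.mean(axis=0))  # standardized: ~0 mean per column
```

Note that pipeline slicing requires scikit-learn >= 0.21.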
from gradient_boosting_model import pipeline
from gradient_boosting_model.config.core import config


def test_pipeline_drops_unnecessary_features(pipeline_inputs):
    # Given
    X_train, X_test, y_train, y_test = pipeline_inputs
    assert config.model_config.drop_features in X_train.columns

    pipeline.price_pipe.fit(X_train, y_train)

    # When
    # We access the transformed inputs with slicing
    transformed_inputs = pipeline.price_pipe[:-1].transform(X_train)

    # Then
    assert config.model_config.drop_features in X_train.columns
    assert config.model_config.drop_features not in transformed_inputs.columns


def test_pipeline_transforms_temporal_features(pipeline_inputs):
    # Given
    X_train, X_test, y_train, y_test = pipeline_inputs
    pipeline.price_pipe.fit(X_train, y_train)

    # When
    # We access the transformed inputs with slicing
    transformed_inputs = pipeline.price_pipe[:-1].transform(X_train)

    # Then
    assert (
        transformed_inputs.iloc[0]["YearRemodAdd"]
        == X_train.iloc[0]["YrSold"] - X_train.iloc[0]["YearRemodAdd"]
    )
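The temporal test above expects the pipeline to replace `YearRemodAdd` with the years elapsed relative to `YrSold`. A minimal sketch of such a transformer (the class and parameter names here are assumptions for illustration; the course package's actual transformer may differ):

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class ElapsedYearsTransformer(BaseEstimator, TransformerMixin):
    """Replace each temporal variable with the years elapsed
    until the reference variable (hypothetical name)."""

    def __init__(self, variables, reference_variable):
        self.variables = variables
        self.reference_variable = reference_variable

    def fit(self, X, y=None):
        # Stateless transformer: nothing to learn.
        return self

    def transform(self, X):
        X = X.copy()  # never mutate the caller's dataframe
        for var in self.variables:
            X[var] = X[self.reference_variable] - X[var]
        return X


df = pd.DataFrame({"YrSold": [2010, 2008], "YearRemodAdd": [2000, 1999]})
out = ElapsedYearsTransformer(["YearRemodAdd"], "YrSold").transform(df)
print(out["YearRemodAdd"].tolist())  # [10, 9]
```

Because `transform` copies the input, the original `df` is left untouched, which is exactly what the first assertion in the drop-features test relies on.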
@@ -1,2 +1,3 @@
 flask>=1.1.1,<1.2.0
+markupsafe==2.0.1  # https://github.com/aws/aws-sam-cli/issues/3661
Key Section 5 fix
feature_engine>=0.3.1,<0.4.0
joblib>=0.14.1,<0.15.0
numpy>=1.20.0,<1.21.0
pandas>=1.3.5,<1.4.0
Key section 4 fix 1
 # old model for testing purposes
 # source code: https://github.com/trainindata/deploying-machine-learning-models/tree/master/packages/regression_model
-tid-regression-model>=2.0.20,<2.1.0
+tid-regression-model==3.1.2
Key section 4 fix 2
@@ -1,4 +1,5 @@
 Flask>=1.1.1,<1.2.0
+markupsafe==2.0.1  # https://github.com/aws/aws-sam-cli/issues/3661
key fix for section 9
@@ -1,12 +1,13 @@
 # ML Model
-tid-gradient-boosting-model>=0.1.18,<0.2.0
+tid-gradient-boosting-model>=0.3.0,<0.4.0
Key fix for section 9 (2/2)
Note this required publishing a new version of the gradient boosting model (so that the scikit-learn versions are compatible).
Force-pushed from 1716ef4 to 72ae086
Update: For students checking this PR, the fixes have now been applied to the commit history. Please update your local copy from the current master branch.
Fix for the "Unit Testing a Production ML Model" section