
Commit c19494f

Merge pull request #7 from MicrosoftCloudEssentials-LearningHub/step4-inprogress
finishing model
2 parents 6a07ef6 + 8fd3c63 commit c19494f

3 files changed

+1045
-44
lines changed

README.md

Lines changed: 2 additions & 5 deletions
@@ -5,7 +5,7 @@ Costa Rica
55
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
66
[brown9804](https://github.com/brown9804)
77

8-
Last updated: 2025-04-29
8+
Last updated: 2025-05-06
99

1010
------------------------------------------
1111

@@ -15,7 +15,7 @@ Last updated: 2025-04-29
1515
- Terraform [Demonstration: Deploying Azure Resources for a Data Platform (Microsoft Fabric)](./infrastructure/msFabric/)
1616
- Terraform [Demonstration: Deploying Azure Resources for an ML Platform](./infrastructure/azMachineLearning/)
1717
- [Demonstration: How to integrate AI in Microsoft Fabric](./msFabric-AI_integration/)
18-
- [Demonstration: Creating a Machine Learning Model](./azML-modelcreation/)
18+
- [Demonstration: Creating a Machine Learning Model](./azML-modelcreation/) - in progress
1919

2020
> Azure Machine Learning (PaaS) is a cloud-based platform from Microsoft designed to help `data scientists and machine learning engineers build, train, deploy, and manage machine learning models at scale`. It supports the `entire machine learning lifecycle, from data preparation and experimentation to deployment and monitoring.` It provides powerful tools for `both code-first and low-code users`, including Jupyter notebooks, drag-and-drop interfaces, and automated machine learning (AutoML). `Azure ML integrates seamlessly with other Azure services and supports popular frameworks like TensorFlow, PyTorch, and Scikit-learn.`
2121
@@ -284,9 +284,6 @@ Read more about [Endpoints for inference in production](https://learn.microsoft.
284284
</details>
285285

286286

287-
288-
289-
290287
<div align="center">
291288
<h3 style="color: #4CAF50;">Total Visitors</h3>
292289
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>

azML-modelcreation/README.md

Lines changed: 207 additions & 39 deletions
@@ -5,19 +5,34 @@ Costa Rica
55
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
66
[brown9804](https://github.com/brown9804)
77

8-
Last updated: 2025-04-29
8+
Last updated: 2025-05-06
99

1010
------------------------------------------
1111

1212

1313
<details>
1414
<summary><b>List of References </b> (Click to expand)</summary>
1515

16+
- [AutoML Regression](https://learn.microsoft.com/en-us/azure/machine-learning/component-reference-v2/regression?view=azureml-api-2)
17+
- [Evaluate automated machine learning experiment results](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-understand-automated-ml?view=azureml-api-2)
18+
- [Evaluate Model component](https://learn.microsoft.com/en-us/azure/machine-learning/component-reference/evaluate-model?view=azureml-api-2)
19+
1620
</details>
1721

1822
<details>
1923
<summary><b>Table of Contents </b> (Click to expand)</summary>
2024

25+
- [Step 1: Set Up Your Azure ML Workspace](#step-1-set-up-your-azure-ml-workspace)
26+
- [Step 2: Create a Compute Instance](#step-2-create-a-compute-instance)
27+
- [Step 3: Prepare Your Data](#step-3-prepare-your-data)
28+
- [Step 4: Create a New Notebook or Script](#step-4-create-a-new-notebook-or-script)
29+
- [Step 5: Load and Explore the Data](#step-5-load-and-explore-the-data)
30+
- [Step 6: Train Your Model](#step-6-train-your-model)
31+
- [Step 7: Evaluate the Model](#step-7-evaluate-the-model)
32+
- [Step 8: Register the Model](#step-8-register-the-model)
33+
- [Step 9: Deploy the Model](#step-9-deploy-the-model)
34+
- [Step 10: Test the Endpoint](#step-10-test-the-endpoint)
35+
2136
</details>
2237

2338
## Step 1: Set Up Your Azure ML Workspace
@@ -69,86 +84,239 @@ https://github.com/user-attachments/assets/c199156f-96cf-4ed0-a8b5-c88db3e7a552
6984

7085
https://github.com/user-attachments/assets/f8cbd32c-94fc-43d3-a7a8-00f63cdc543d
7186

87+
## Step 4: Create a New Notebook or Script
7288

73-
### **4. Create a New Notebook or Script**
7489
- Use the compute instance to open a **Jupyter notebook** or create a Python script.
7590
- Import necessary libraries:
91+
7692
```python
7793
import pandas as pd
7894
from sklearn.model_selection import train_test_split
7995
from sklearn.ensemble import RandomForestClassifier
8096
from sklearn.metrics import accuracy_score
8197
```
8298

83-
---
99+
https://github.com/user-attachments/assets/16650584-11cb-48fb-928d-c032e519c14b
100+
101+
## Step 5: Load and Explore the Data
102+
103+
> Load the dataset and perform basic EDA (exploratory data analysis):
84104
85-
### **5. Load and Explore the Data**
86-
- Load the dataset and perform basic EDA (exploratory data analysis):
87105
```python
88-
data = pd.read_csv('your_dataset.csv')
89-
print(data.head())
106+
import mltable
107+
from azure.ai.ml import MLClient
108+
from azure.identity import DefaultAzureCredential
109+
110+
ml_client = MLClient.from_config(credential=DefaultAzureCredential())
111+
data_asset = ml_client.data.get("employee_data", version="1")
112+
113+
tbl = mltable.load(f'azureml:/{data_asset.id}')
114+
115+
df = tbl.to_pandas_dataframe()
116+
df
90117
```
91118

92-
---
119+
https://github.com/user-attachments/assets/5fa65d95-8502-4ab7-ba0d-dfda66378cc2
93120

94-
### **6. Train Your Model**
95-
- Split the data and train a model:
96-
```python
97-
X = data.drop('target', axis=1)
98-
y = data['target']
99-
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
121+
## Step 6: Train Your Model
122+
123+
> Split the data and train a model:
100124
101-
model = RandomForestClassifier()
125+
```python
# Step 1: Preprocessing
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Encode categorical columns
label_encoder = LabelEncoder()
df['Department'] = label_encoder.fit_transform(df['Department'])

# Drop non-informative or high-cardinality columns
if 'Name' in df.columns:
    df = df.drop(columns=['Name'])  # 'Name' is likely not predictive

# Optional: Check for missing values
if df.isnull().sum().any():
    df = df.dropna()  # or use df.ffill() for forward-fill imputation

# Step 2: Define Features and Target
X = df.drop('Salary', axis=1)  # Features: Age and Department
y = df['Salary']  # Target: Salary

# Optional: Feature Scaling (especially useful for models sensitive to scale)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Step 3: Split the Data
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    X_scaled, y, test_size=0.2, random_state=42
)

# Step 4: Train a Regression Model
from sklearn.ensemble import RandomForestRegressor

model = RandomForestRegressor(
    n_estimators=100,
    max_depth=None,
    random_state=42,
    n_jobs=-1  # Use all available cores
)
model.fit(X_train, y_train)
```
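
To make the label-encoding step in the training code above more concrete, here is a minimal pure-Python sketch of the mapping it produces. It assumes the sorted-class behavior of scikit-learn's `LabelEncoder`; the `label_encode` helper and the department names are hypothetical and not part of the original notebook.

```python
def label_encode(values):
    """Map each distinct label to an integer code, assigning codes in sorted label order."""
    classes = sorted(set(values))
    mapping = {label: code for code, label in enumerate(classes)}
    return [mapping[value] for value in values], mapping


# Hypothetical department names, for illustration only
departments = ["Sales", "HR", "IT", "HR"]
codes, mapping = label_encode(departments)
print(mapping)
print(codes)
```

The integer codes carry an arbitrary ordering. Tree-based models such as a random forest usually tolerate this, but one-hot encoding may be a better fit for linear models.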
104167

105-
---
168+
https://github.com/user-attachments/assets/2176c795-5fda-4746-93c7-8b137b526a09
169+
170+
## Step 7: Evaluate the Model
171+
172+
> Check performance:
106173
107-
### **7. Evaluate the Model**
108-
- Check performance:
109174
```python
175+
# Step 5: Make Predictions
110176
predictions = model.predict(X_test)
111-
print("Accuracy:", accuracy_score(y_test, predictions))
177+
178+
# Step 6: Evaluate the Model
179+
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
180+
import numpy as np
181+
182+
mae = mean_absolute_error(y_test, predictions)
183+
mse = mean_squared_error(y_test, predictions)
184+
rmse = np.sqrt(mse)
185+
r2 = r2_score(y_test, predictions)
186+
187+
print("Model Evaluation Metrics")
188+
print(f"Mean Absolute Error (MAE): {mae:.2f}")
189+
print(f"Mean Squared Error (MSE): {mse:.2f}")
190+
print(f"Root Mean Squared Error (RMSE): {rmse:.2f}")
191+
print(f"R² Score: {r2:.2f}")
112192
```
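
To show what the metrics printed above measure, the following sketch recomputes them from their definitions in plain Python. The `regression_metrics` helper and the salary values are made up for illustration; the notebook itself relies on `sklearn.metrics`.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, RMSE and R² for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n           # average absolute error
    mse = sum(e * e for e in errors) / n            # average squared error
    rmse = math.sqrt(mse)                           # same units as the target
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"mae": mae, "mse": mse, "rmse": rmse, "r2": r2}


# Hypothetical salaries, for illustration only
actual = [50000.0, 60000.0, 70000.0, 80000.0]
predicted = [52000.0, 58000.0, 71000.0, 77000.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

MAE and RMSE are expressed in the same units as `Salary`, which tends to make them easier to interpret than MSE. R² compares the model against always predicting the mean of the test targets.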
113193

114-
---
194+
<img width="550" alt="image" src="https://github.com/user-attachments/assets/6aa19680-cadb-4fe4-a419-a626942e15f9" />
195+
196+
> Distribution of prediction errors:
197+
198+
```python
199+
import matplotlib.pyplot as plt
200+
201+
# Plot 1: Distribution of prediction errors
202+
errors = y_test - predictions
203+
plt.figure(figsize=(10, 6))
204+
plt.hist(errors, bins=30, color='skyblue', edgecolor='black')
205+
plt.title('Distribution of Prediction Errors')
206+
plt.xlabel('Prediction Error')
207+
plt.ylabel('Frequency')
208+
plt.grid(True)
209+
plt.show()
210+
211+
# Plot 2: Predicted vs Actual values
212+
plt.figure(figsize=(10, 6))
213+
plt.scatter(y_test, predictions, alpha=0.3, color='darkorange')
214+
plt.plot([y_test.min(), y_test.max()], [y_test.min(), y_test.max()], 'k--', lw=2)
215+
plt.title('Predicted vs Actual Salary')
216+
plt.xlabel('Actual Salary')
217+
plt.ylabel('Predicted Salary')
218+
plt.grid(True)
219+
plt.show()
220+
```
221+
222+
<img width="550" alt="image" src="https://github.com/user-attachments/assets/d8ec1f2c-eb97-4106-9cee-809849d02796">
223+
224+
## Step 8: Register the Model
225+
226+
> Save and register the model in Azure ML:
115227
116-
### **8. Register the Model**
117-
- Save and register the model in Azure ML:
118228
```python
119229
import joblib
120230
joblib.dump(model, 'model.pkl')
121-
231+
122232
from azureml.core import Workspace, Model
123233
ws = Workspace.from_config()
124-
Model.register(workspace=ws, model_path="model.pkl", model_name="my_model")
234+
Model.register(workspace=ws, model_path="model.pkl", model_name="my_model_RegressionModel")
125235
```
126236

127-
---
237+
https://github.com/user-attachments/assets/a82ff03e-437c-41bc-85fa-8b9903384a5b
238+
239+
240+
> [!TIP]
241+
> Click [here](./src/0_ml-model-creation.ipynb) to read the script used.
242+
243+
## Step 9: Deploy the Model
244+
245+
> Create the Scoring Script:
246+
247+
```python
import json

import joblib
import numpy as np
from azureml.core.model import Model


def init():
    # Load the registered model once, when the service container starts
    global model
    model_path = Model.get_model_path("my_model_RegressionModel")
    model = joblib.load(model_path)


def run(data):
    try:
        # The request body may arrive as a raw JSON string; parse it if so
        payload = json.loads(data) if isinstance(data, str) else data
        input_data = np.array(payload["data"])
        result = model.predict(input_data)
        return result.tolist()
    except Exception as e:
        # Return the error text so the caller can see what went wrong
        return str(e)
```
265+
266+
https://github.com/user-attachments/assets/cdc64857-3bde-4ec9-957d-5399d9447813
267+
268+
> Create the Environment File (env.yml):
269+
270+
https://github.com/user-attachments/assets/8e7c37a2-e32b-4630-8516-f95926c374c0
271+
272+
> Create a new notebook:
273+
274+
https://github.com/user-attachments/assets/1b3e5602-dc64-4c39-be72-ed1cbd74361e
275+
276+
> Create an **inference configuration** and deploy to a web service:
128277
129-
### **9. Deploy the Model**
130-
- Create an **inference configuration** and deploy to a web service:
131278
```python
279+
from azureml.core import Workspace
132280
from azureml.core.environment import Environment
133-
from azureml.core.model import InferenceConfig
281+
from azureml.core.model import InferenceConfig, Model
134282
from azureml.core.webservice import AciWebservice
135-
136-
env = Environment.from_conda_specification(name="myenv", file_path="env.yml")
283+
284+
# Load the workspace
285+
ws = Workspace.from_config()
286+
287+
# Get the registered model
288+
registered_model = Model(ws, name="my_model_RegressionModel")
289+
290+
# Create environment from requirements.txt (no conda)
291+
env = Environment.from_pip_requirements(
292+
    name="regression-env",
293+
    file_path="requirements.txt"  # Make sure this file exists in your working directory
294+
)
295+
296+
# Define inference configuration
137297
inference_config = InferenceConfig(entry_script="score.py", environment=env)
138-
298+
299+
# Define deployment configuration
139300
deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)
140-
service = Model.deploy(workspace=ws,
141-
name="my-service",
142-
models=[model],
143-
inference_config=inference_config,
144-
deployment_config=deployment_config)
301+
302+
# Deploy the model
303+
service = Model.deploy(
304+
    workspace=ws,
305+
    name="regression-model-service",
306+
    models=[registered_model],
307+
    inference_config=inference_config,
308+
    deployment_config=deployment_config
309+
)
310+
145311
service.wait_for_deployment(show_output=True)
312+
print(f"Scoring URI: {service.scoring_uri}")
146313
```
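
The deployment code above expects a `requirements.txt` in the working directory, but the commit does not show its contents. The sketch below writes and then reads back such a file. The package list is an assumption based on what the scoring script imports, and the `write_requirements` and `read_requirements` helpers are hypothetical.

```python
import tempfile
from pathlib import Path

# Assumed package list for illustration; it is not taken from the original repository.
ASSUMED_PACKAGES = ["azureml-defaults", "scikit-learn", "joblib", "numpy"]


def write_requirements(directory, packages=ASSUMED_PACKAGES):
    """Write one package name per line to <directory>/requirements.txt."""
    path = Path(directory) / "requirements.txt"
    path.write_text("\n".join(packages) + "\n", encoding="utf-8")
    return path


def read_requirements(directory):
    """Return the non-empty, non-comment lines of requirements.txt, or raise if it is missing."""
    path = Path(directory) / "requirements.txt"
    if not path.is_file():
        raise FileNotFoundError(f"{path} not found; the environment definition needs it")
    lines = path.read_text(encoding="utf-8").splitlines()
    return [ln.strip() for ln in lines if ln.strip() and not ln.strip().startswith("#")]


# Demonstrate in a temporary directory so nothing in the project is overwritten
with tempfile.TemporaryDirectory() as tmp:
    write_requirements(tmp)
    found = read_requirements(tmp)
print(found)
```

Pinning versions that match the training environment (for example, the same scikit-learn release used to pickle the model) is likely to reduce unpickling problems at deployment time.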
147314

148-
---
149315

150-
### **10. Test the Endpoint**
151-
- Once deployed, you can send HTTP requests to the endpoint to get predictions.
316+
317+
## Step 10: Test the Endpoint
318+
319+
> Once deployed, you can send HTTP requests to the endpoint to get predictions.
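
As a rough sketch of such a request, the snippet below builds a JSON POST using only the standard library. The URI is a hypothetical placeholder for the value printed as `Scoring URI` after deployment. The `"data"` key matches what the scoring script reads. The feature rows are made-up values and would need the same column order and preprocessing the model saw during training.

```python
import json
import urllib.request

# Hypothetical placeholder; replace with the scoring URI printed after deployment
SCORING_URI = "http://example.invalid/score"


def build_request(rows, scoring_uri=SCORING_URI):
    """Build a JSON POST request whose body has the shape {"data": [[...], ...]}."""
    body = json.dumps({"data": rows}).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Made-up feature rows, for illustration only
request = build_request([[35, 2], [42, 0]])

# Uncomment to call a live endpoint:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

The network call is left commented out so the snippet can be run without a deployed service. If the endpoint has authentication enabled, an authorization header would also be needed.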
152320
153321

154322
