Commit e8e346d

modls
1 parent f412e5a commit e8e346d

14 files changed, 4294 additions and 0 deletions.

going_modular/05_pytorch_going_modular_cell_mode.ipynb

Lines changed: 1394 additions & 0 deletions
Large diffs are not rendered by default.

going_modular/05_pytorch_going_modular_script_mode.ipynb

Lines changed: 2376 additions & 0 deletions
Large diffs are not rendered by default.

going_modular/README.md

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
# 05. PyTorch Going Modular

The main goal of section [05. PyTorch Going Modular](https://www.learnpytorch.io/05_pytorch_going_modular/) is to: **turn useful notebook code cells into reusable Python scripts (`.py` files)**.

This directory contains all the necessary materials for doing so.

They break down as follows:
* `going_modular/` - directory of Python helper scripts for running PyTorch code (generated by `05_pytorch_going_modular_script_mode.ipynb`).
* `models/` - trained PyTorch models that come as a result of running notebook 05. Going Modular Part 1 and Part 2.
* [`05_pytorch_going_modular_cell_mode.ipynb`](https://github.com/mrdbourke/pytorch-deep-learning/blob/main/going_modular/05_pytorch_going_modular_cell_mode.ipynb) - Part 1/2 notebook for teaching the materials for section 05. This notebook takes the most useful code from notebook 04 and streamlines it.
* [`05_pytorch_going_modular_script_mode.ipynb`](https://github.com/mrdbourke/pytorch-deep-learning/blob/main/going_modular/05_pytorch_going_modular_script_mode.ipynb) - Part 2/2 notebook for teaching the materials for section 05. This notebook turns the most useful code cells from Part 1 into the Python scripts contained in `going_modular/`.

For this section, we're going to see how the Part 1 notebook (cell mode) turns into the Part 2 notebook (script mode).

Doing this will result in us having a directory with the same structure as the `going_modular/` directory above.

going_modular/going_modular/README.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
# Going Modular Scripts

The Python scripts in this directory were generated using the notebook [05. Going Modular Part 2 (script mode)](https://github.com/mrdbourke/pytorch-deep-learning/blob/main/going_modular/05_pytorch_going_modular_script_mode.ipynb).

They break down as follows:
* `data_setup.py` - a file to prepare and download data if needed.
* `engine.py` - a file containing various training functions.
* `model_builder.py` - a file to create a PyTorch TinyVGG model.
* `train.py` - a file to leverage all other files and train a target PyTorch model.
* `utils.py` - a file dedicated to helpful utility functions.
* **Extra:** `predictions.py` - a file for making predictions with a trained PyTorch model and input image (the main function, `pred_and_plot_image()`, was originally created in [06. PyTorch Transfer Learning section 6](https://www.learnpytorch.io/06_pytorch_transfer_learning/#6-make-predictions-on-images-from-the-test-set)).

For an explanation of how this was done, refer to section [05. PyTorch Going Modular of the learnpytorch.io book](https://www.learnpytorch.io/05_pytorch_going_modular/).
Binary file not shown.
Binary file not shown.
Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
"""
Contains functionality for creating PyTorch DataLoaders for
image classification data.
"""
import os

from torchvision import datasets, transforms
from torch.utils.data import DataLoader

NUM_WORKERS = os.cpu_count()

def create_dataloaders(
    train_dir: str,
    test_dir: str,
    transform: transforms.Compose,
    batch_size: int,
    num_workers: int=NUM_WORKERS
):
    """Creates training and testing DataLoaders.

    Takes in a training directory and testing directory path and turns
    them into PyTorch Datasets and then into PyTorch DataLoaders.

    Args:
        train_dir: Path to training directory.
        test_dir: Path to testing directory.
        transform: torchvision transforms to perform on training and testing data.
        batch_size: Number of samples per batch in each of the DataLoaders.
        num_workers: An integer for number of workers per DataLoader.

    Returns:
        A tuple of (train_dataloader, test_dataloader, class_names).
        Where class_names is a list of the target classes.
        Example usage:
            train_dataloader, test_dataloader, class_names = \
                create_dataloaders(train_dir="path/to/train_dir",
                                   test_dir="path/to/test_dir",
                                   transform=some_transform,
                                   batch_size=32,
                                   num_workers=4)
    """
    # Use ImageFolder to create dataset(s)
    train_data = datasets.ImageFolder(train_dir, transform=transform)
    test_data = datasets.ImageFolder(test_dir, transform=transform)

    # Get class names
    class_names = train_data.classes

    # Turn images into data loaders
    train_dataloader = DataLoader(
        train_data,
        batch_size=batch_size,
        shuffle=True,
        num_workers=num_workers,
        pin_memory=True,
    )
    test_dataloader = DataLoader(
        test_data,
        batch_size=batch_size,
        shuffle=False,
        num_workers=num_workers,
        pin_memory=True,
    )

    return train_dataloader, test_dataloader, class_names
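
`create_dataloaders()` relies on `datasets.ImageFolder`, which expects one subdirectory per class inside `train_dir` and `test_dir` and exposes the sorted subdirectory names as `classes`. The sketch below mimics only that class-discovery step using the standard library. The `list_class_names()` and `make_demo_layout()` helpers and the `pizza`/`steak`/`sushi` folder names are hypothetical, and this is not torchvision's implementation.

```python
import tempfile
from pathlib import Path
from typing import List


def list_class_names(data_dir: str) -> List[str]:
    """Return the sorted names of the immediate subdirectories of data_dir.

    Hypothetical helper: it approximates how class names are derived from
    a one-folder-per-class layout. It does not load any images.
    """
    return sorted(p.name for p in Path(data_dir).iterdir() if p.is_dir())


def make_demo_layout(root: Path) -> Path:
    """Create an empty train/<class>/ layout under root (illustrative names)."""
    train_dir = root / "train"
    for class_name in ("sushi", "pizza", "steak"):
        (train_dir / class_name).mkdir(parents=True)
    return train_dir


with tempfile.TemporaryDirectory() as tmp:
    demo_train_dir = make_demo_layout(Path(tmp))
    class_names = list_class_names(str(demo_train_dir))
    print(class_names)  # ['pizza', 'steak', 'sushi']
```

The order of `class_names` matters because the integer labels in each batch index into this list.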

going_modular/going_modular/engine.py

Lines changed: 195 additions & 0 deletions
@@ -0,0 +1,195 @@
"""
Contains functions for training and testing a PyTorch model.
"""
import torch

from tqdm.auto import tqdm
from typing import Dict, List, Tuple

def train_step(model: torch.nn.Module,
               dataloader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               optimizer: torch.optim.Optimizer,
               device: torch.device) -> Tuple[float, float]:
    """Trains a PyTorch model for a single epoch.

    Turns a target PyTorch model to training mode and then
    runs through all of the required training steps (forward
    pass, loss calculation, optimizer step).

    Args:
        model: A PyTorch model to be trained.
        dataloader: A DataLoader instance for the model to be trained on.
        loss_fn: A PyTorch loss function to minimize.
        optimizer: A PyTorch optimizer to help minimize the loss function.
        device: A target device to compute on (e.g. "cuda" or "cpu").

    Returns:
        A tuple of training loss and training accuracy metrics.
        In the form (train_loss, train_accuracy). For example:

        (0.1112, 0.8743)
    """
    # Put model in train mode
    model.train()

    # Setup train loss and train accuracy values
    train_loss, train_acc = 0, 0

    # Loop through data loader data batches
    for batch, (X, y) in enumerate(dataloader):
        # Send data to target device
        X, y = X.to(device), y.to(device)

        # 1. Forward pass
        y_pred = model(X)

        # 2. Calculate and accumulate loss
        loss = loss_fn(y_pred, y)
        train_loss += loss.item()

        # 3. Optimizer zero grad
        optimizer.zero_grad()

        # 4. Loss backward
        loss.backward()

        # 5. Optimizer step
        optimizer.step()

        # Calculate and accumulate accuracy metric across all batches
        y_pred_class = torch.argmax(torch.softmax(y_pred, dim=1), dim=1)
        train_acc += (y_pred_class == y).sum().item()/len(y_pred)

    # Adjust metrics to get average loss and accuracy per batch
    train_loss = train_loss / len(dataloader)
    train_acc = train_acc / len(dataloader)
    return train_loss, train_acc

def test_step(model: torch.nn.Module,
              dataloader: torch.utils.data.DataLoader,
              loss_fn: torch.nn.Module,
              device: torch.device) -> Tuple[float, float]:
    """Tests a PyTorch model for a single epoch.

    Turns a target PyTorch model to "eval" mode and then performs
    a forward pass on a testing dataset.

    Args:
        model: A PyTorch model to be tested.
        dataloader: A DataLoader instance for the model to be tested on.
        loss_fn: A PyTorch loss function to calculate loss on the test data.
        device: A target device to compute on (e.g. "cuda" or "cpu").

    Returns:
        A tuple of testing loss and testing accuracy metrics.
        In the form (test_loss, test_accuracy). For example:

        (0.0223, 0.8985)
    """
    # Put model in eval mode
    model.eval()

    # Setup test loss and test accuracy values
    test_loss, test_acc = 0, 0

    # Turn on inference context manager
    with torch.inference_mode():
        # Loop through DataLoader batches
        for batch, (X, y) in enumerate(dataloader):
            # Send data to target device
            X, y = X.to(device), y.to(device)

            # 1. Forward pass
            test_pred_logits = model(X)

            # 2. Calculate and accumulate loss
            loss = loss_fn(test_pred_logits, y)
            test_loss += loss.item()

            # Calculate and accumulate accuracy
            test_pred_labels = test_pred_logits.argmax(dim=1)
            test_acc += ((test_pred_labels == y).sum().item()/len(test_pred_labels))

    # Adjust metrics to get average loss and accuracy per batch
    test_loss = test_loss / len(dataloader)
    test_acc = test_acc / len(dataloader)
    return test_loss, test_acc

def train(model: torch.nn.Module,
          train_dataloader: torch.utils.data.DataLoader,
          test_dataloader: torch.utils.data.DataLoader,
          optimizer: torch.optim.Optimizer,
          loss_fn: torch.nn.Module,
          epochs: int,
          device: torch.device) -> Dict[str, List]:
    """Trains and tests a PyTorch model.

    Passes a target PyTorch model through train_step() and test_step()
    functions for a number of epochs, training and testing the model
    in the same epoch loop.

    Calculates, prints and stores evaluation metrics throughout.

    Args:
        model: A PyTorch model to be trained and tested.
        train_dataloader: A DataLoader instance for the model to be trained on.
        test_dataloader: A DataLoader instance for the model to be tested on.
        optimizer: A PyTorch optimizer to help minimize the loss function.
        loss_fn: A PyTorch loss function to calculate loss on both datasets.
        epochs: An integer indicating how many epochs to train for.
        device: A target device to compute on (e.g. "cuda" or "cpu").

    Returns:
        A dictionary of training and testing loss as well as training and
        testing accuracy metrics. Each metric has a value in a list for
        each epoch.
        In the form: {train_loss: [...],
                      train_acc: [...],
                      test_loss: [...],
                      test_acc: [...]}
        For example if training for epochs=2:
                     {train_loss: [2.0616, 1.0537],
                      train_acc: [0.3945, 0.3945],
                      test_loss: [1.2641, 1.5706],
                      test_acc: [0.3400, 0.2973]}
    """
    # Create empty results dictionary
    results = {"train_loss": [],
               "train_acc": [],
               "test_loss": [],
               "test_acc": []
    }

    # Make sure the model is on the target device
    model.to(device)

    # Loop through training and testing steps for a number of epochs
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model=model,
                                           dataloader=train_dataloader,
                                           loss_fn=loss_fn,
                                           optimizer=optimizer,
                                           device=device)
        test_loss, test_acc = test_step(model=model,
                                        dataloader=test_dataloader,
                                        loss_fn=loss_fn,
                                        device=device)

        # Print out what's happening
        print(
            f"Epoch: {epoch+1} | "
            f"train_loss: {train_loss:.4f} | "
            f"train_acc: {train_acc:.4f} | "
            f"test_loss: {test_loss:.4f} | "
            f"test_acc: {test_acc:.4f}"
        )

        # Update results dictionary
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)
        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)

    # Return the filled results at the end of the epochs
    return results
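
`train_step()` and `test_step()` report accuracy as the mean of per-batch accuracies: each batch contributes `correct / batch_size`, and the running total is divided by `len(dataloader)`. When the final batch is smaller than the others, that mean can differ from the fraction of all samples classified correctly. The pure-Python sketch below illustrates the gap with made-up per-batch counts; the function names and numbers are illustrative assumptions and are not part of `engine.py`.

```python
from typing import List, Tuple

# Each tuple is (number_correct, batch_size) for one batch; values are made up.
Batch = Tuple[int, int]


def mean_of_batch_accuracies(batches: List[Batch]) -> float:
    """Average the per-batch accuracies (accumulate, then divide by batch count)."""
    return sum(correct / size for correct, size in batches) / len(batches)


def sample_weighted_accuracy(batches: List[Batch]) -> float:
    """Total correct predictions divided by total number of samples."""
    total_correct = sum(correct for correct, _ in batches)
    total_samples = sum(size for _, size in batches)
    return total_correct / total_samples


# Two full batches of 32 samples and a final partial batch of 4 samples.
batches = [(24, 32), (24, 32), (1, 4)]

per_batch = mean_of_batch_accuracies(batches)
per_sample = sample_weighted_accuracy(batches)
print(f"mean of batch accuracies: {per_batch:.4f}")
print(f"sample-weighted accuracy: {per_sample:.4f}")
```

When every batch has the same size (for example, when the dataset size is a multiple of `batch_size`), the two values coincide.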
Lines changed: 56 additions & 0 deletions
@@ -0,0 +1,56 @@
"""
Contains PyTorch model code to instantiate a TinyVGG model.
"""
import torch
from torch import nn

class TinyVGG(nn.Module):
    """Creates the TinyVGG architecture.

    Replicates the TinyVGG architecture from the CNN explainer website in PyTorch.
    See the original architecture here: https://poloclub.github.io/cnn-explainer/

    Args:
        input_shape: An integer indicating number of input channels.
        hidden_units: An integer indicating number of hidden units between layers.
        output_shape: An integer indicating number of output units.
    """
    def __init__(self, input_shape: int, hidden_units: int, output_shape: int) -> None:
        super().__init__()
        self.conv_block_1 = nn.Sequential(
            nn.Conv2d(in_channels=input_shape,
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=0),
            nn.ReLU(),
            nn.Conv2d(in_channels=hidden_units,
                      out_channels=hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2,
                         stride=2)
        )
        self.conv_block_2 = nn.Sequential(
            nn.Conv2d(hidden_units, hidden_units, kernel_size=3, padding=0),
            nn.ReLU(),
            nn.Conv2d(hidden_units, hidden_units, kernel_size=3, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            # Where did this in_features shape come from?
            # It's because each layer of our network compresses and changes the shape of our input data.
            nn.Linear(in_features=hidden_units*13*13,
                      out_features=output_shape)
        )

    def forward(self, x: torch.Tensor):
        x = self.conv_block_1(x)
        x = self.conv_block_2(x)
        x = self.classifier(x)
        return x
        # return self.classifier(self.conv_block_2(self.conv_block_1(x))) # <- leverage the benefits of operator fusion
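
The `13*13` in the classifier's `in_features` follows from the layer arithmetic. Assuming 64x64 input images (an assumption: the input size is not stated in this file), each unpadded 3x3 convolution trims 2 pixels from the height and width, and each 2x2 max pool with stride 2 halves them. The sketch below traces one spatial dimension through both blocks; the helper names and the `hidden_units` value are hypothetical.

```python
def conv2d_out(size: int, kernel_size: int = 3, stride: int = 1, padding: int = 0) -> int:
    """Output length along one spatial dimension for a convolution (dilation 1)."""
    return (size + 2 * padding - kernel_size) // stride + 1


def maxpool2d_out(size: int, kernel_size: int = 2, stride: int = 2) -> int:
    """Output length along one spatial dimension for max pooling (no padding)."""
    return (size - kernel_size) // stride + 1


def tinyvgg_feature_size(input_size: int) -> int:
    """Trace one spatial dimension through conv_block_1 and conv_block_2."""
    size = input_size
    for _ in range(2):  # both conv blocks share the same layer layout
        size = conv2d_out(size)     # first 3x3 conv, padding=0
        size = conv2d_out(size)     # second 3x3 conv, padding=0
        size = maxpool2d_out(size)  # 2x2 max pool, stride 2
    return size


side = tinyvgg_feature_size(64)  # assumed 64x64 input: 64 -> 62 -> 60 -> 30 -> 28 -> 26 -> 13
hidden_units = 10                # illustrative value
print(side)                        # 13
print(hidden_units * side * side)  # 1690
```

A different input resolution changes this value (for example, a 32x32 input would leave a 5x5 feature map), so `in_features` would need to be adjusted to match.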
