
Commit d5d6341

Merge branch 'main' into christian/test-model-metric-data
2 parents 1add669 + 365d389 commit d5d6341

37 files changed, with 3840 additions and 512 deletions

.github/workflows/test.yml

Lines changed: 1 addition & 1 deletion
@@ -25,5 +25,5 @@ jobs:
 
       - name: Run tests
         run: |
-          PYTHONPATH=. pytest tests
+          PYTHONPATH=. pytest .
         shell: bash -el {0}

.gitignore

Lines changed: 12 additions & 5 deletions
@@ -1,14 +1,20 @@
 __pycache__/
 .ipynb_checkpoints/
-Data/
-Results/
-Experiments/
+Data/*
+Results/*
+Experiments/*
 _build/
-bin/
-wandb/
+bin/*
+wandb/*
 wandb_api.py
 doc/autoapi
 
+#Magnus specific
+job*
+env2/*
+ruffian.sh
+localtest.sh
+
 # Byte-compiled / optimized / DLL files
 __pycache__/
 *.py[cod]
@@ -147,6 +153,7 @@ ENV/
 env.bak/
 venv.bak/
 
+
 # Spyder project settings
 .spyderproject
 .spyproject

.python-version

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+3.12

doc/Magnus_page.md

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+Magnus Individual Task
+======================
+
+# Magnus Størdal Individual Task
+
+## Task overview
+In addition to the overall task, I was tasked with implementing a three-layer linear network, a dataset loader for the SVHN dataset, and an entropy metric.
+
+## Network Implementation In-Depth
+For the network part I was tasked with making a three-layer linear network where each layer consists of 133 neurons. This is a fairly straightforward implementation where we make a custom class which inherits from the PyTorch Module class. The class has two methods: the __init__ method and a forward method. When we make an instance of the class we can call the instance like we would call a function and have it run the forward method.
+
+The network is initialized with the following arguments:
+* image_shape
+* num_classes
+* nr_channels
+
+The num_classes argument is used to define the number of output neurons. Each dataset has somewhere between 5 and 10 classes, so there isn't a single output size that works well for all of them.
+
+As each layer is a linear layer, we need to initialize the network with respect to the image size. We are working with datasets which are either greyscale or colour images and can have any height and width. Therefore we have the image_shape argument, which provides the image height and width, and the nr_channels argument, which states the number of channels. With these values we initialize the first layer accordingly, that is, with an input size of height * width * channels.
+
+The forward method in this class has an assertion making sure the input has four dimensions: batch size, channels, height and width.
+Each input is flattened over the channel, height and width dimensions, then passed through each layer, and the resulting logits are returned.
+
+
+## SVHN Dataset In-Depth
+
+
+
+
+## Entropy Metric In-Depth
+
+The EntropyPrediction class' main job is to take some inputs and return the Shannon entropy of those inputs. The class has four methods with the following jobs:
+* __init__ : Initializes the class.
+* __call__ : Main method, which calculates and stores the batch-wise Shannon entropy.
+* __returnmetric__ : Returns the collected metric.
+* __reset__ : Removes all the values stored up to that point, readying the instance for storing values from a new epoch.
+
+The class is initialized with a single parameter called "averages". This is inspired by other PyTorch and NumPy implementations and controls how values from different batches or within batches will be combined. The __init__ method checks the value of this argument with an assertion; it must be one of three strings, as we only allow "mean", "sum" and "none" as methods of combining the different entropy values. We'll come back to the specifics below.
+Furthermore, this method also sets up the storage for the Shannon entropy values collected as inputs are passed into the __call__ method.
+
+In __call__ we get both true labels and model logit scores for each sample in the batch as input. We're calculating Shannon entropy, not KL-divergence, so the true labels aren't needed.
+With permission I've used the scipy implementation to calculate entropy here. We apply a softmax over the logit values, then calculate the Shannon entropy, and make sure to remove any NaN or Inf values which might arise from a perfect guess/distribution.
+
+Next we have the __returnmetric__ method, which is used to retrieve the stored metric. Here the averages argument comes into play.
+Depending on what was chosen as the averaging method when initializing the class, one of the following operations is applied to the stored values:
+* Mean: Calculate the mean of the stored entropy values.
+* Sum: Sum the stored entropy values.
+* None: Do nothing with the stored entropy values.
+The value(s) are then returned.
+
+Lastly we have the __reset__ method, which simply empties the variable storing the entropy values to prepare it for the next epoch.
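
As a rough illustration of the three-layer linear network described in the section above, a minimal PyTorch sketch might look like the following. The class name LinearNet and the exact layer wiring are assumptions; only the 133-neuron layer width, the image_shape, num_classes and nr_channels arguments, and the flatten-then-forward behaviour come from the write-up. Activation functions are omitted here because the description does not mention them.

```python
# Illustrative sketch only -- LinearNet and its wiring are assumptions; the
# 133-neuron layers and the constructor arguments follow the description above.
import torch as th
import torch.nn as nn


class LinearNet(nn.Module):
    def __init__(self, image_shape, num_classes, nr_channels):
        super().__init__()
        height, width = image_shape
        in_features = height * width * nr_channels  # first-layer input size
        self.fc1 = nn.Linear(in_features, 133)
        self.fc2 = nn.Linear(133, 133)
        self.fc3 = nn.Linear(133, num_classes)  # num_classes output neurons

    def forward(self, x):
        # The description requires a 4-D input: (batch, channels, height, width).
        assert x.ndim == 4, "expected (batch, channels, height, width)"
        x = th.flatten(x, start_dim=1)  # flatten channel, height and width
        x = self.fc1(x)
        x = self.fc2(x)
        return self.fc3(x)  # raw logits
```

With image_shape=(28, 28), num_classes=10 and nr_channels=1, an instance of this sketch would map a (B, 1, 28, 28) batch to (B, 10) logits.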

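Similarly, a minimal sketch of the EntropyPrediction metric as described above, assuming scipy.stats.entropy and scipy.special.softmax and the method names from the write-up; how values are stored and how NaN/Inf entries are dropped is guessed here, not taken from the actual implementation.

```python
# Minimal sketch of the EntropyPrediction metric described above; storage details
# and NaN/Inf handling are assumptions based on the write-up, not the actual code.
import numpy as np
from scipy.special import softmax
from scipy.stats import entropy


class EntropyPrediction:
    def __init__(self, averages="mean"):
        # Only these three aggregation modes are accepted, as described above.
        assert averages in ("mean", "sum", "none")
        self.averages = averages
        self.values = []

    def __call__(self, y_true, y_logits):
        # True labels are accepted for interface consistency but unused:
        # Shannon entropy depends only on the predicted distribution.
        probs = softmax(np.asarray(y_logits), axis=1)
        ent = entropy(probs, axis=1)  # per-sample Shannon entropy
        ent = ent[np.isfinite(ent)]  # drop any NaN/Inf values
        self.values.append(ent)

    def __returnmetric__(self):
        stored = np.concatenate(self.values) if self.values else np.array([])
        if self.averages == "mean":
            return stored.mean()
        if self.averages == "sum":
            return stored.sum()
        return stored  # "none": return the raw per-sample values

    def __reset__(self):
        self.values = []
```

Per the description, __call__ would be invoked once per batch with (labels, logits), __returnmetric__ reduces the stored values according to averages, and __reset__ clears them before the next epoch.
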
doc/index.md

Lines changed: 4 additions & 0 deletions
@@ -12,4 +12,8 @@ culpa qui officia deserunt mollit anim id est laborum.
 :caption: Some caption
 
 about.md
+Magnus_page.md
 :::
+
+Individual Sections
+===================

docker/Dockerfile

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+FROM pytorch/pytorch:2.4.1-cuda11.8-cudnn9-runtime
+WORKDIR /tmp/
+COPY requirements.txt .
+RUN apt-get update
+RUN pip install -r requirements.txt
+RUN apt-get install ffmpeg libsm6 libxext6 -y git
+RUN pip install ftfy regex tqdm

docker/createdocker.sh

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+#!/bin/sh
+
+sudo chmod 666 /var/run/docker.sock
+
+docker build docker -t seilmast/colabexam:latest
+docker push seilmast/colabexam:latest

docker/requirements.txt

Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
+annotated-types==0.7.0
+asttokens==3.0.0
+certifi==2024.12.14
+charset-normalizer==3.4.1
+click==8.1.8
+comm==0.2.2
+debugpy==1.8.12
+decorator==5.1.1
+docker-pycreds==0.4.0
+executing==2.2.0
+filelock==3.13.1
+fsspec==2024.6.1
+gitdb==4.0.12
+GitPython==3.1.44
+h5py==3.12.1
+idna==3.10
+iniconfig==2.0.0
+ipykernel==6.29.5
+ipython==8.31.0
+jedi==0.19.2
+Jinja2==3.1.4
+jupyter_client==8.6.3
+jupyter_core==5.7.2
+MarkupSafe==2.1.5
+matplotlib-inline==0.1.7
+mpmath==1.3.0
+nest-asyncio==1.6.0
+networkx==3.3
+numpy==2.1.2
+packaging==24.2
+parso==0.8.4
+pexpect==4.9.0
+pillow==11.0.0
+platformdirs==4.3.6
+pluggy==1.5.0
+prompt_toolkit==3.0.50
+protobuf==5.29.3
+psutil==6.1.1
+ptyprocess==0.7.0
+pure_eval==0.2.3
+pydantic==2.10.6
+pydantic_core==2.27.2
+Pygments==2.19.1
+pytest==8.3.4
+python-dateutil==2.9.0.post0
+PyYAML==6.0.2
+pyzmq==26.2.1
+requests==2.32.3
+scipy==1.15.1
+sentry-sdk==2.20.0
+setproctitle==1.3.4
+six==1.17.0
+smmap==5.0.2
+stack-data==0.6.3
+sympy==1.13.1
+tornado==6.4.2
+traitlets==5.14.3
+typing_extensions==4.12.2
+urllib3==2.3.0
+wandb==0.19.5
+wcwidth==0.2.13

environment.yml

Lines changed: 5 additions & 1 deletion
@@ -9,7 +9,8 @@ dependencies:
   - sphinx-autobuild
   - sphinx-rtd-theme
   - pip
-  - h5py
+  - h5py==3.12.1
+  - hdf5==1.14.4
   - black
   - isort
   - jupyterlab
@@ -19,6 +20,9 @@ dependencies:
   - ruff
   - scalene
   - tqdm
+  - scipy
+  - wandb
+  - scikit-learn
   - pip:
     - torch
     - torchvision

main.py

Lines changed: 67 additions & 41 deletions
@@ -1,12 +1,11 @@
-from pathlib import Path
-
 import numpy as np
 import torch as th
 import torch.nn as nn
 import wandb
 from torch.utils.data import DataLoader
 from torchvision import transforms
 from tqdm import tqdm
+from wandb_api import WANDB_API
 
 from utils import MetricWrapper, createfolders, get_args, load_data, load_model
 
@@ -24,39 +23,45 @@ def main():
     ------
 
     """
+
     args = get_args()
 
     createfolders(args.datafolder, args.resultfolder, args.modelfolder)
 
     device = args.device
 
-    if args.dataset.lower() in ["usps_0-6", "uspsh5_7_9"]:
-        augmentations = transforms.Compose(
+    if "usps" in args.dataset.lower():
+        transform = transforms.Compose(
             [
-                transforms.Resize((16, 16)),
+                transforms.Resize((28, 28)),
                 transforms.ToTensor(),
             ]
         )
     else:
-        augmentations = transforms.Compose([transforms.ToTensor()])
+        transform = transforms.Compose([transforms.ToTensor()])
 
-    # Dataset
-    traindata = load_data(
-        args.dataset,
-        train=True,
-        data_path=args.datafolder,
-        download=args.download_data,
-        transform=augmentations,
-    )
-    validata = load_data(
+    traindata, validata, testdata = load_data(
         args.dataset,
-        train=False,
-        data_path=args.datafolder,
-        download=args.download_data,
-        transform=augmentations,
+        data_dir=args.datafolder,
+        transform=transform,
+        val_size=args.val_size,
     )
 
-    metrics = MetricWrapper(*args.metric, num_classes=traindata.num_classes)
+    train_metrics = MetricWrapper(
+        *args.metric,
+        num_classes=traindata.num_classes,
+        macro_averaging=args.macro_averaging,
+    )
+    val_metrics = MetricWrapper(
+        *args.metric,
+        num_classes=traindata.num_classes,
+        macro_averaging=args.macro_averaging,
+    )
+    test_metrics = MetricWrapper(
+        *args.metric,
+        num_classes=traindata.num_classes,
+        macro_averaging=args.macro_averaging,
+    )
 
     # Find the shape of the data, if is 2D, add a channel dimension
     data_shape = traindata[0][0].shape
@@ -81,6 +86,9 @@ def main():
     valiloader = DataLoader(
         validata, batch_size=args.batchsize, shuffle=False, pin_memory=True
     )
+    testloader = DataLoader(
+        testdata, batch_size=args.batchsize, shuffle=False, pin_memory=True
+    )
 
     criterion = nn.CrossEntropyLoss()
     optimizer = th.optim.Adam(model.parameters(), lr=args.learning_rate)
@@ -105,16 +113,20 @@ def main():
             optimizer.step()
             optimizer.zero_grad(set_to_none=True)
 
-            preds = th.argmax(logits, dim=1)
-            metrics(y, preds)
+            train_metrics(y, logits)
 
             break
-        print(metrics.accumulate())
+        print(train_metrics.accumulate())
         print("Dry run completed successfully.")
-        exit(0)
-
-        wandb.login(key=WANDB_API)
-        wandb.init(entity="ColabCode", project="Jan", tags=[args.modelname, args.dataset])
+        exit()
+
+    # wandb.login(key=WANDB_API)
+    wandb.init(
+        entity="ColabCode",
+        project=args.run_name,
+        tags=[args.modelname, args.dataset],
+        config=args,
+    )
     wandb.watch(model)
 
     for epoch in range(args.epoch):
@@ -132,35 +144,49 @@ def main():
             optimizer.zero_grad(set_to_none=True)
             trainingloss.append(loss.item())
 
-            preds = th.argmax(logits, dim=1)
-            metrics(y, preds)
-
-            wandb.log(metrics.accumulate(str_prefix="Train "))
-            metrics.reset()
+            train_metrics(y, logits)
 
-        evalloss = []
-        # Eval loop start
+        valloss = []
+        # Validation loop start
         model.eval()
         with th.no_grad():
            for x, y in tqdm(valiloader, desc="Validation"):
                x, y = x.to(device), y.to(device)
                logits = model.forward(x)
                loss = criterion(logits, y)
-               evalloss.append(loss.item())
+               valloss.append(loss.item())
 
-               preds = th.argmax(logits, dim=1)
-               metrics(y, preds)
-
-        wandb.log(metrics.accumulate(str_prefix="Evaluation "))
-        metrics.reset()
+               val_metrics(y, logits)
 
         wandb.log(
             {
                 "Epoch": epoch,
                 "Train loss": np.mean(trainingloss),
-                "Evaluation Loss": np.mean(evalloss),
+                "Validation loss": np.mean(valloss),
             }
+            | train_metrics.__getmetrics__(str_prefix="Train ")
+            | val_metrics.__getmetrics__(str_prefix="Validation ")
         )
+        train_metrics.__resetmetrics__()
+        val_metrics.__resetmetrics__()
+
+    testloss = []
+    model.eval()
+    with th.no_grad():
+        for x, y in tqdm(testloader, desc="Testing"):
+            x, y = x.to(device), y.to(device)
+            logits = model.forward(x)
+            loss = criterion(logits, y)
+            testloss.append(loss.item())
+
+            preds = th.argmax(logits, dim=1)
+            test_metrics(y, preds)
+
+    wandb.log(
+        {"Epoch": 1, "Test loss": np.mean(testloss)}
+        | test_metrics.__getmetrics__(str_prefix="Test ")
+    )
+    test_metrics.__resetmetrics__()
 
 
 if __name__ == "__main__":
