Commit 0305d46

Merge branch 'main' into usps-inputshape
2 parents: 3254c29 + 725cb0b

33 files changed: +3628 −543 lines

.github/workflows/test.yml

Lines changed: 1 addition & 1 deletion
@@ -25,5 +25,5 @@ jobs:
 
       - name: Run tests
         run: |
-          PYTHONPATH=. pytest tests
+          PYTHONPATH=. pytest 4
         shell: bash -el {0}

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -9,8 +9,10 @@ wandb/*
 wandb_api.py
 
 #Magnus specific
-docker/*
 job*
+env2/*
+ruffian.sh
+localtest.sh
 
 # Byte-compiled / optimized / DLL files
 __pycache__/
@@ -150,6 +152,7 @@ ENV/
 env.bak/
 venv.bak/
 
+
 # Spyder project settings
 .spyderproject
 .spyproject

.python-version

Lines changed: 1 addition & 0 deletions
3.12

doc/Magnus_page.md

Lines changed: 51 additions & 0 deletions
Magnus Individual Task
======================

# Magnus Størdal Individual Task

## Task overview
In addition to the overall task, I was tasked with implementing a three-layer linear network, a dataset loader for the SVHN dataset, and an entropy metric.

## Network Implementation In-Depth
For the network part I was tasked with making a three-layer linear network where each layer consists of 133 neurons. This is a fairly straightforward implementation: we make a custom class which inherits from the PyTorch Module class, giving it two methods, the __init__ method and the forward method. When we make an instance of the class, we can call the instance like we would call a function and have it run the forward method.

The network is initialized with the following arguments:
* image_shape
* num_classes
* nr_channels

The num_classes argument is used to define the number of output neurons. Each dataset has somewhere between 5 and 10 classes, and as such there isn't a single output size that works well for all of them.

As each layer is a linear layer, we need to initialize the network with respect to the image size. We are working with datasets of either greyscale or color images, which can be of any height and width. Therefore we have the image_shape argument, which provides the image height and width, and the nr_channels argument, which states the number of channels. With these values we initialize the first layer accordingly, that is, with an input size of height * width * channels (for example, a 3-channel 32x32 color image gives 32 * 32 * 3 = 3072 input features).

The forward method in this class has an assertion making sure the input has four dimensions, these being batch size, channels, height and width.
Each input is flattened over the channel, height and width dimensions. The flattened inputs are then passed through each layer and the resulting logits are returned.
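
A minimal sketch of such a network follows. The class name, the ReLU activations between the layers, and the exact layer layout are illustrative assumptions on my part, not necessarily the repository's actual implementation:

```python
import torch
import torch.nn as nn


class ThreeLayerLinear(nn.Module):  # hypothetical name, for illustration only
    def __init__(self, image_shape, num_classes, nr_channels):
        super().__init__()
        height, width = image_shape
        in_features = height * width * nr_channels  # first-layer input size
        self.layers = nn.Sequential(
            nn.Linear(in_features, 133),
            nn.ReLU(),  # activations are assumed; the text only specifies linear layers
            nn.Linear(133, 133),
            nn.ReLU(),
            nn.Linear(133, num_classes),  # num_classes output neurons
        )

    def forward(self, x):
        # Assert the input has four dimensions: batch, channels, height, width
        assert x.dim() == 4, "expected input of shape (B, C, H, W)"
        x = x.flatten(start_dim=1)  # flatten channel, height and width
        return self.layers(x)  # raw logits


# Example: a batch of eight 28x28 greyscale images, ten classes
model = ThreeLayerLinear((28, 28), num_classes=10, nr_channels=1)
logits = model(torch.randn(8, 1, 28, 28))  # shape (8, 10)
```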

## SVHN Dataset In-Depth

## Entropy Metric In-Depth

The EntropyPrediction class's main job is to take some inputs and return the Shannon entropy of those inputs. The class has four methods with the following jobs:
* __init__ : Initializes the class.
* __call__ : The main method, used to calculate and store the batch-wise Shannon entropy.
* __returnmetric__ : Returns the collected metric.
* __reset__ : Removes all values stored up to that point, readying the instance for storing values from a new epoch.

The class is initialized with a single parameter called "averages". This is inspired by other PyTorch and NumPy implementations and controls how values from different batches, or within a batch, are combined. The __init__ method checks the value of this argument with an assertion; it must be one of three strings, as we only allow "mean", "sum" and "none" as ways of combining the different entropy values. We'll come back to the specifics below.
Furthermore, this method also sets up the storage for the Shannon entropy values collected as values are passed into the __call__ method.

In __call__ we get both the true labels and the model's logit scores for each sample in the batch as input. Since we're calculating Shannon entropy, not KL-divergence, the true labels aren't needed.
With permission I've used the SciPy implementation to calculate entropy here. We apply a softmax over the logit values, then calculate the Shannon entropy, and make sure to remove any NaN or Inf values which might arise from a perfect guess/distribution.
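
For illustration, the per-sample computation in __call__ could look roughly like this (a sketch assuming scipy.stats.entropy; the variable names are mine):

```python
import numpy as np
import torch
from scipy.stats import entropy

logits = torch.randn(4, 10)  # e.g. a batch of 4 samples with 10 classes
probs = torch.softmax(logits, dim=1).detach().cpu().numpy()
h = entropy(probs, axis=1)  # one Shannon entropy value per sample
h = np.nan_to_num(h)  # guard against NaN/Inf from degenerate distributions
```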

Next we have the __returnmetric__ method, which is used to retrieve the stored metric. Here the averages argument comes into play.
Depending on what was chosen as the averaging method when initializing the class, one of the following operations is applied to the stored values:
* Mean: Calculate the mean of the stored entropy values.
* Sum: Sum the stored entropy values.
* None: Do nothing with the stored entropy values.
Then the value(s) are returned.

Lastly we have the __reset__ method, which simply empties the variable storing the entropy values, preparing it for the next epoch.
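
Putting the four methods together, a minimal sketch of the whole class could look like the following. This is again an illustration under the assumptions above, not the repository's exact code:

```python
import numpy as np
import torch
from scipy.stats import entropy


class EntropyPrediction:
    def __init__(self, averages="mean"):
        # Only three aggregation modes are allowed
        assert averages in ("mean", "sum", "none")
        self.averages = averages
        self.values = []  # entropy values stored over the epoch

    def __call__(self, y_true, logits):
        # y_true is accepted but unused: Shannon entropy only needs the
        # predicted distribution (unlike KL-divergence)
        probs = torch.softmax(logits, dim=1).detach().cpu().numpy()
        h = np.nan_to_num(entropy(probs, axis=1))  # strip NaN/Inf
        self.values.extend(h.tolist())

    def __returnmetric__(self):
        if self.averages == "mean":
            return float(np.mean(self.values))
        if self.averages == "sum":
            return float(np.sum(self.values))
        return list(self.values)  # "none": return the raw values

    def __reset__(self):
        self.values = []  # ready for a new epoch
```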

doc/index.md

Lines changed: 4 additions & 0 deletions
@@ -12,4 +12,8 @@ culpa qui officia deserunt mollit anim id est laborum.
 :caption: Some caption
 
 about.md
+Magnus_page.md
 :::
+
+Individual Sections
+===================

docker/Dockerfile

Lines changed: 7 additions & 0 deletions
FROM pytorch/pytorch:2.4.1-cuda11.8-cudnn9-runtime
WORKDIR /tmp/
COPY requirements.txt .
RUN apt-get update
RUN pip install -r requirements.txt
RUN apt-get install ffmpeg libsm6 libxext6 -y git
RUN pip install ftfy regex tqdm

docker/createdocker.sh

Lines changed: 6 additions & 0 deletions
#!/bin/sh

sudo chmod 666 /var/run/docker.sock

docker build docker -t seilmast/colabexam:latest
docker push seilmast/colabexam:latest

docker/requirements.txt

Lines changed: 61 additions & 0 deletions
annotated-types==0.7.0
asttokens==3.0.0
certifi==2024.12.14
charset-normalizer==3.4.1
click==8.1.8
comm==0.2.2
debugpy==1.8.12
decorator==5.1.1
docker-pycreds==0.4.0
executing==2.2.0
filelock==3.13.1
fsspec==2024.6.1
gitdb==4.0.12
GitPython==3.1.44
h5py==3.12.1
idna==3.10
iniconfig==2.0.0
ipykernel==6.29.5
ipython==8.31.0
jedi==0.19.2
Jinja2==3.1.4
jupyter_client==8.6.3
jupyter_core==5.7.2
MarkupSafe==2.1.5
matplotlib-inline==0.1.7
mpmath==1.3.0
nest-asyncio==1.6.0
networkx==3.3
numpy==2.1.2
packaging==24.2
parso==0.8.4
pexpect==4.9.0
pillow==11.0.0
platformdirs==4.3.6
pluggy==1.5.0
prompt_toolkit==3.0.50
protobuf==5.29.3
psutil==6.1.1
ptyprocess==0.7.0
pure_eval==0.2.3
pydantic==2.10.6
pydantic_core==2.27.2
Pygments==2.19.1
pytest==8.3.4
python-dateutil==2.9.0.post0
PyYAML==6.0.2
pyzmq==26.2.1
requests==2.32.3
scipy==1.15.1
sentry-sdk==2.20.0
setproctitle==1.3.4
six==1.17.0
smmap==5.0.2
stack-data==0.6.3
sympy==1.13.1
tornado==6.4.2
traitlets==5.14.3
typing_extensions==4.12.2
urllib3==2.3.0
wandb==0.19.5
wcwidth==0.2.13

environment.yml

Lines changed: 4 additions & 1 deletion
@@ -9,7 +9,8 @@ dependencies:
   - sphinx-autobuild
   - sphinx-rtd-theme
   - pip
-  - h5py
+  - h5py==3.12.1
+  - hdf5==1.14.4
   - black
   - isort
   - jupyterlab
@@ -20,6 +21,8 @@ dependencies:
   - scalene
   - tqdm
   - scipy
+  - wandb
+  - scikit-learn
   - pip:
     - torch
     - torchvision

main.py

Lines changed: 60 additions & 31 deletions
@@ -5,6 +5,7 @@
 from torch.utils.data import DataLoader
 from torchvision import transforms
 from tqdm import tqdm
+from wandb_api import WANDB_API
 
 from utils import MetricWrapper, createfolders, get_args, load_data, load_model
 
@@ -29,7 +30,9 @@ def main():
 
     device = args.device
 
+
     if "usps" in args.dataset.lower():
+
         transform = transforms.Compose(
             [
                 transforms.Resize((28, 28)),
@@ -39,23 +42,29 @@ def main():
     else:
         transform = transforms.Compose([transforms.ToTensor()])
 
-    # Dataset
-    traindata = load_data(
-        args.dataset,
-        train=True,
-        data_path=args.datafolder,
-        download=args.download_data,
-        transform=transform,
-    )
-    validata = load_data(
+    traindata, validata, testdata = load_data(
         args.dataset,
-        train=False,
-        data_path=args.datafolder,
-        download=args.download_data,
+        data_dir=args.datafolder,
         transform=transform,
+        val_size=args.val_size,
+
     )
 
-    metrics = MetricWrapper(*args.metric, num_classes=traindata.num_classes)
+    train_metrics = MetricWrapper(
+        *args.metric,
+        num_classes=traindata.num_classes,
+        macro_averaging=args.macro_averaging,
+    )
+    val_metrics = MetricWrapper(
+        *args.metric,
+        num_classes=traindata.num_classes,
+        macro_averaging=args.macro_averaging,
+    )
+    test_metrics = MetricWrapper(
+        *args.metric,
+        num_classes=traindata.num_classes,
+        macro_averaging=args.macro_averaging,
+    )
 
     # Find the shape of the data, if is 2D, add a channel dimension
     data_shape = traindata[0][0].shape
@@ -80,6 +89,9 @@ def main():
     valiloader = DataLoader(
         validata, batch_size=args.batchsize, shuffle=False, pin_memory=True
     )
+    testloader = DataLoader(
+        testdata, batch_size=args.batchsize, shuffle=False, pin_memory=True
+    )
 
     criterion = nn.CrossEntropyLoss()
     optimizer = th.optim.Adam(model.parameters(), lr=args.learning_rate)
@@ -104,22 +116,23 @@ def main():
             optimizer.step()
             optimizer.zero_grad(set_to_none=True)
 
-            metrics(y, logits)
+            train_metrics(y, logits)
 
             break
-        print(metrics.accumulate())
+        print(train_metrics.accumulate())
         print("Dry run completed successfully.")
         exit()
 
     # wandb.login(key=WANDB_API)
     wandb.init(
-        entity="ColabCode-org",
-        # entity="FYS-8805 Exam",
-        project="Test",
+        entity="ColabCode",
+        project=args.run_name,
         tags=[args.modelname, args.dataset],
+        config=args,
+
     )
     wandb.watch(model)
-    exit()
+
     for epoch in range(args.epoch):
         # Training loop start
         trainingloss = []
@@ -135,33 +148,49 @@ def main():
             optimizer.zero_grad(set_to_none=True)
             trainingloss.append(loss.item())
 
-            metrics(y, logits)
+            train_metrics(y, logits)
 
-        wandb.log(metrics.accumulate(str_prefix="Train "))
-        metrics.reset()
-
-        evalloss = []
-        # Eval loop start
+        valloss = []
+        # Validation loop start
         model.eval()
        with th.no_grad():
             for x, y in tqdm(valiloader, desc="Validation"):
                 x, y = x.to(device), y.to(device)
                 logits = model.forward(x)
                 loss = criterion(logits, y)
-                evalloss.append(loss.item())
-
-                metrics(y, logits)
+                valloss.append(loss.item())
 
-        wandb.log(metrics.accumulate(str_prefix="Evaluation "))
-        metrics.reset()
+                val_metrics(y, logits)
 
         wandb.log(
             {
                 "Epoch": epoch,
                 "Train loss": np.mean(trainingloss),
-                "Evaluation Loss": np.mean(evalloss),
+                "Validation loss": np.mean(valloss),
             }
+            | train_metrics.__getmetrics__(str_prefix="Train ")
+            | val_metrics.__getmetrics__(str_prefix="Validation ")
         )
+        train_metrics.__resetmetrics__()
+        val_metrics.__resetmetrics__()
+
+        testloss = []
+        model.eval()
+        with th.no_grad():
+            for x, y in tqdm(testloader, desc="Testing"):
+                x, y = x.to(device), y.to(device)
+                logits = model.forward(x)
+                loss = criterion(logits, y)
+                testloss.append(loss.item())
+
+                preds = th.argmax(logits, dim=1)
+                test_metrics(y, preds)
+
+        wandb.log(
+            {"Epoch": 1, "Test loss": np.mean(testloss)}
+            | test_metrics.__getmetrics__(str_prefix="Test ")
+        )
+        test_metrics.__resetmetrics__()
 
 
 if __name__ == "__main__":
