Commit 7551c14
Merge branch 'main' of github.com:SFI-Visual-Intelligence/Collaborative-Coding-Exam into johan/devbranch
2 parents 7211299 + c7ebfa0
File tree: 9 files changed, +267 -13 lines

CITATION.cff

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+cff-version: 1.2.0
+message: "If you use this software, please consider citing it as below."
+authors:
+  - family-names: "Thrun"
+    given-names: "Solveig"
+    orcid: "https://orcid.org/0009-0006-7349-9449"
+  - family-names: "Salomonsen"
+    given-names: "Christian"
+    orcid: "https://orcid.org/0009-0007-4958-4544"
+  - family-names: "Størdal"
+    given-names: "Magnus"
+    orcid: "https://orcid.org/0009-0008-5226-8128"
+  - family-names: "Zavadil"
+    given-names: "Jan"
+    orcid: "https://orcid.org/0000-0001-8502-0059"
+  - family-names: "Mylius-Kroken"
+    given-names: "Johan"
+    orcid: "https://orcid.org/0009-0005-8580-372X"
+title: "Collaborative Coding Exam"
+version: 1.1.0
+date-released: 2025-02-26
+url: "https://github.com/SFI-Visual-Intelligence/Collaborative-Coding-Exam"

CollaborativeCoding/load_metric.py

Lines changed: 6 additions & 3 deletions
@@ -29,7 +29,7 @@ class MetricWrapper(nn.Module):
 Methods
 -------
 __call__(y_true, y_pred)
-    Passes the true and predicted labels to the metric functions.
+    Passes the true labels and predicted logits to the metric functions.
 getmetrics(str_prefix: str = None)
     Retrieves the dictionary of computed metrics; optionally, all keys can be prefixed with a string.
 resetmetric()
@@ -40,10 +40,13 @@ class MetricWrapper(nn.Module):
 >>> from CollaborativeCoding import MetricWrapperProposed
 >>> metrics = MetricWrapperProposed(2, "entropy", "f1", "precision")
 >>> y_true = [0, 1, 0, 1]
->>> y_pred = [0, 1, 1, 0]
+>>> y_pred = [[0.8, -1.9],
+...           [0.1, 9.0],
+...           [-1.9, -0.1],
+...           [1.9, 1.8]]
 >>> metrics(y_true, y_pred)
 >>> metrics.getmetrics()
-{'entropy': 0.6931471805599453, 'f1': 0.5, 'precision': 0.5}
+{'entropy': 0.3292665, 'f1': 0.5, 'precision': 0.5}
 >>> metrics.resetmetric()
 >>> metrics.getmetrics()
 {'entropy': [], 'f1': [], 'precision': []}
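As a quick, standalone sanity check (not from the repository) that the updated entropy value is the mean Shannon entropy of the softmaxed logits:

```python
import numpy as np
from scipy.special import softmax
from scipy.stats import entropy

y_pred = np.array([[0.8, -1.9], [0.1, 9.0], [-1.9, -0.1], [1.9, 1.8]])
probs = softmax(y_pred, axis=1)        # logits -> per-sample class probabilities
print(entropy(probs, axis=1).mean())   # ~0.3292665, matching the docstring
```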

README.md

Lines changed: 95 additions & 2 deletions
@@ -3,6 +3,22 @@
 # Collaborative-Coding-Exam
 Repository for final evaluation in the FYS-8805 Reproducible Research and Collaborative coding course
 
+## **Table of Contents**
+1. [Project Description](#project-description)
+2. [Installation](#installation)
+3. [Usage](#usage)
+4. [Results](#results)
+5. [Citing](#citing)
+
+## Project Description
+This project involves collaborative work on a digit classification task, where each participant works on distinct but interconnected components within a shared codebase.
+The main goal is to develop and train digit classification models collaboratively, with a focus on leveraging shared resources and learning efficient experimentation practices.
+
+### Key Aspects of the Project:
+- **Individual and Joint Tasks:** Each participant has separate tasks, such as implementing a digit classification dataset, a neural network model, and an evaluation metric. However, all models and datasets must be compatible, since we train and evaluate using our partners' models and datasets.
+- **Shared Environment:** Alongside our individual tasks, we collaborate on joint tasks such as the main file and the training and evaluation loops. Additionally, we use a shared Weights and Biases environment for experiment management.
+- **Documentation and Package Management:** To ensure proper documentation and ease of use, we set up Sphinx documentation and made the repository pip-installable.
+- **High-Performance Computing:** A key learning objective of this project is to gain experience with running experiments on high-performance computing (HPC) resources. To this end, we trained all models on a cluster.
+
 ## Installation
 
 Install from:
@@ -25,9 +41,37 @@ python -c "import CollaborativeCoding"
 
 ## Usage
 
-TODO: Fill in
+To train a classification model using this code, follow these steps:
+
+### 1) Create a directory for the results
+Before running the training script, ensure the results directory exists:
+
+`mkdir -p "<RESULTS_DIRECTORY>"`
+
+### 2) Run the following command for training, evaluation, and testing
+
+`python3 main.py --modelname "<MODEL_NAME>" --dataset "<DATASET_NAME>" --metric "<METRIC_1>" "<METRIC_2>" ... "<METRIC_N>" --resultfolder "<RESULTS_DIRECTORY>" --run_name "<RUN_NAME>" --device "<DEVICE>"`
+
+Replace the placeholders with your desired values (a worked example follows below):
+
+- `<MODEL_NAME>`: Choose from the available models (`"MagnusModel", "ChristianModel", "SolveigModel", "JanModel", "JohanModel"`).
+- `<DATASET_NAME>`: The following datasets are supported: `"svhn", "usps_0-6", "usps_7-9", "mnist_0-3", "mnist_4-9"`.
+- `<METRIC_1> ... <METRIC_N>`: Specify one or more evaluation metrics (`"entropy", "f1", "recall", "precision", "accuracy"`).
+- `<RESULTS_DIRECTORY>`: Folder where all model outputs, logs, and checkpoints are saved.
+- `<RUN_NAME>`: Name for the WANDB project.
 
-### Running on a k8s cluster
+- `<DEVICE>`: Device to run on (`"cuda", "cpu", "mps"`).
+
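For example, a hypothetical invocation combining options from the lists above (the result folder and run name are placeholders of our choosing):

`python3 main.py --modelname "JanModel" --dataset "mnist_0-3" --metric "entropy" "f1" "accuracy" --resultfolder "results" --run_name "jan-mnist03" --device "cuda"`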
+## Running on a k8s cluster
 
 In your job manifest, include:
 
@@ -46,3 +90,52 @@ to pull the latest build, or check the [packages](https://github.com/SFI-Visual-
 
 > [!NOTE]
 > The container is built for a `linux/amd64` architecture to properly build CUDA 12. For other architectures, please build the Docker image locally.
+
+## Results
+### JanModel & MNIST_0-3
+This section reports the results of using the "JanModel" model on the MNIST_0-3 dataset, which contains the MNIST digits 0 through 3 (four classes in total).
+For this experiment we use all five available metrics and train for a total of 20 epochs.
+
+We achieve a very good fit on the data. Below are the results for the described run:
+
+| Dataset Split | Loss  | Entropy | Accuracy | Precision | Recall | F1    |
+|---------------|-------|---------|----------|-----------|--------|-------|
+| Train         | 0.000 | 0.000   | 1.000    | 1.000     | 1.000  | 1.000 |
+| Validation    | 0.035 | 0.006   | 0.991    | 0.991     | 0.991  | 0.991 |
+| Test          | 0.024 | 0.004   | 0.994    | 0.994     | 0.994  | 0.994 |
+
+### MagnusModel & SVHN
+The MagnusModel was trained on the SVHN dataset, using all five metrics.
+Employing micro-averaging for the F1 score, accuracy, recall, and precision, the model was fine-tuned over 20 epochs.
+A learning rate of 0.001 and a batch size of 64 were selected to optimize the training process.
+
+The table below presents the detailed results, showing the model's performance across these metrics.
+
+| Dataset Split | Loss  | Entropy | Accuracy | Precision | Recall | F1    |
+|---------------|-------|---------|----------|-----------|--------|-------|
+| Train         | 1.007 | 0.998   | 0.686    | 0.686     | 0.686  | 0.686 |
+| Validation    | 1.019 | 0.995   | 0.680    | 0.680     | 0.680  | 0.680 |
+| Test          | 1.196 | 0.985   | 0.634    | 0.634     | 0.634  | 0.634 |
+
+## Citing
+Please consider citing this repository if you end up using it for your work.
+Several citation methods can be found under the "About" section.
+For a BibTeX citation, please use
+```
+@software{Thrun_Collaborative_Coding_Exam_2025,
+  author = {Thrun, Solveig and Salomonsen, Christian and Størdal, Magnus and Zavadil, Jan and Mylius-Kroken, Johan},
+  month = feb,
+  title = {{Collaborative Coding Exam}},
+  url = {https://github.com/SFI-Visual-Intelligence/Collaborative-Coding-Exam},
+  version = {1.1.0},
+  year = {2025}
+}
+```
+
+For APA, please use
+```
+Thrun, S., Salomonsen, C., Størdal, M., Zavadil, J., & Mylius-Kroken, J. (2025). Collaborative Coding Exam (Version 1.1.0) [Computer software]. https://github.com/SFI-Visual-Intelligence/Collaborative-Coding-Exam
+```

doc/Jan_page.md

Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
+# Jan Individual Task
+
+## Task Overview
+In addition to the overall task, I was assigned the implementation of a multi-layer perceptron model, a dataset loader for a subset of the MNIST dataset, and an accuracy metric.
+
+## Network Implementation In-Depth
+For the network part, I was tasked with making a simple MLP model for image classification tasks. The model consists of two hidden layers with 100 neurons each, each followed by a LeakyReLU activation. The implementation is a custom class that inherits from the PyTorch `nn.Module` class, which gives it two methods: `__init__` and `forward`. When we create an instance of the class, we can call the instance like a function, which runs the `forward` method.
+
+The network is initialized with the following parameters:
+* `image_shape`
+* `num_classes`
+
+The `image_shape` argument provides the shape of the input image (channels, height, width), which is used to correctly set the input size of the first layer. The `num_classes` argument defines the number of output neurons, corresponding to the number of classes in the dataset.
+
+The `forward` method processes the input as follows (see the sketch after this list):
+1. Flattens the input image.
+2. Passes the flattened input through the first fully connected layer (`fc1`).
+3. Applies a LeakyReLU activation function.
+4. Passes the result through the second fully connected layer (`fc2`).
+5. Applies another LeakyReLU activation function.
+6. Passes the result through the output layer (`out`).
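A minimal sketch of the described network. The layer names (`fc1`, `fc2`, `out`), hidden sizes, and constructor arguments come from the text above; the class name and everything else are assumptions, not the repository's exact code.

```python
import torch
from torch import nn


class JanModel(nn.Module):  # class name assumed from the README's model list
    def __init__(self, image_shape, num_classes):
        super().__init__()
        c, h, w = image_shape                   # (channels, height, width)
        self.fc1 = nn.Linear(c * h * w, 100)    # first hidden layer, 100 neurons
        self.fc2 = nn.Linear(100, 100)          # second hidden layer, 100 neurons
        self.out = nn.Linear(100, num_classes)  # one output neuron per class
        self.act = nn.LeakyReLU()

    def forward(self, x):
        x = x.flatten(start_dim=1)  # 1. flatten the input image
        x = self.act(self.fc1(x))   # 2-3. fc1 + LeakyReLU
        x = self.act(self.fc2(x))   # 4-5. fc2 + LeakyReLU
        return self.out(x)          # 6. output layer (logits)
```

For instance, `JanModel((1, 28, 28), 4)(torch.zeros(8, 1, 28, 28))` would return an `(8, 4)` tensor of logits.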
+## MNIST Dataset In-Depth
+For the dataset part, I was tasked with creating a custom dataset class that loads the subset of the MNIST dataset containing digits 0 to 3. This involved creating a class that inherits from the PyTorch `Dataset` class.
+
+The class is initialized with the following parameters:
+* `data_path`
+* `sample_ids`
+* `train` (optional, default is False)
+* `transform` (optional, default is None)
+* `nr_channels` (optional, default is 1)
+
+The `data_path` argument stores the path to the four binary files containing the MNIST dataset. Verifying that these files are present, and downloading them if necessary, is handled by the `Downloader` class. The `sample_ids` parameter contains the indices of the images and their respective labels to be loaded from the MNIST dataset; filtering and random splitting of these indices is performed within the `load_data` function. `train` is a boolean flag indicating whether to load data from the training part of MNIST (for the training and validation splits) or from the testing part (for the test split). `transform` is a callable, created with `torchvision.transforms.Compose`, to be applied to the images. `nr_channels` is not used in this dataset and is only included for compatibility with other functions.
+
+The class has two main methods (sketched below):
+* `__len__`: Returns the number of samples in the dataset.
+* `__getitem__`: Retrieves the image and label at the specified index.
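A minimal sketch of the dataset class under the assumptions above; the class name and the `_read_sample` helper are hypothetical stand-ins, since the real class reads the MNIST binary files via the `Downloader`.

```python
from torch.utils.data import Dataset


class MNISTDataset0_3(Dataset):  # class name assumed
    def __init__(self, data_path, sample_ids, train=False, transform=None, nr_channels=1):
        self.data_path = data_path      # location of the four MNIST binary files
        self.sample_ids = sample_ids    # indices selected/split by load_data
        self.train = train              # training vs. testing part of MNIST
        self.transform = transform      # e.g. torchvision.transforms.Compose([...])
        self.nr_channels = nr_channels  # unused; kept for compatibility

    def __len__(self):
        return len(self.sample_ids)

    def __getitem__(self, idx):
        # The real implementation reads the image/label stored at
        # self.sample_ids[idx] from the binary files under self.data_path;
        # _read_sample is a hypothetical stand-in for that logic.
        image, label = self._read_sample(self.sample_ids[idx])
        if self.transform is not None:
            image = self.transform(image)
        return image, label
```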
+
+## Accuracy Metric In-Depth
+For the metric part, I was tasked with creating an accuracy metric class. The `Accuracy` class computes the accuracy of a model's predictions. The class is initialized with the following parameters:
+* `num_classes`
+* `macro_averaging` (optional, default is False)
+
+The `num_classes` argument specifies the number of classes in the classification task. The `macro_averaging` argument is a boolean flag specifying whether to compute the accuracy using macro or micro averaging.
+
+The class has the following methods:
+* `forward`: Stores the true and predicted labels computed on a batch level.
+* `_macro_acc`: Computes the macro-averaged accuracy on stored values.
+* `_micro_acc`: Computes the micro-averaged accuracy on stored values.
+* `__returnmetric__`: Returns the computed accuracy, based on the averaging method, for all stored predictions.
+* `__reset__`: Resets the stored true and predicted labels.
+
+The `forward` method takes the true and predicted labels as input and stores them. The `_macro_acc` method computes the macro-averaged accuracy by averaging the per-class accuracies, while the `_micro_acc` method computes the micro-averaged accuracy as the overall fraction of correct predictions. The `__returnmetric__` method returns the computed accuracy based on the chosen averaging method, and the `__reset__` method resets the stored true and predicted labels to prepare for the next epoch. A sketch follows below.
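A minimal sketch of the two averaging modes, assuming predictions arrive as label tensors; only the method names and parameters come from the text above.

```python
import torch


class Accuracy:
    def __init__(self, num_classes, macro_averaging=False):
        self.num_classes = num_classes
        self.macro_averaging = macro_averaging
        self.y_true, self.y_pred = [], []

    def forward(self, y_true, y_pred):
        # store batch-level labels for later aggregation
        self.y_true.append(torch.as_tensor(y_true))
        self.y_pred.append(torch.as_tensor(y_pred))

    def _micro_acc(self):
        y, p = torch.cat(self.y_true), torch.cat(self.y_pred)
        return (y == p).float().mean().item()  # overall fraction correct

    def _macro_acc(self):
        y, p = torch.cat(self.y_true), torch.cat(self.y_pred)
        accs = [(p[y == c] == c).float().mean()
                for c in range(self.num_classes) if (y == c).any()]
        return torch.stack(accs).mean().item()  # mean of per-class accuracies

    def __returnmetric__(self):
        return self._macro_acc() if self.macro_averaging else self._micro_acc()

    def __reset__(self):
        self.y_true, self.y_pred = [], []
```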

doc/Magnus_page.md

Lines changed: 51 additions & 1 deletion
@@ -21,9 +21,28 @@ Each input is flattened over the channel, height and width channels. Then they a
 
 
 ## SVHN Dataset In-Depth
+The dataloader I was tasked with making loads the well-known SVHN dataset. This is an RGB dataset of real-life digits taken from house numbers. The class inherits from the torch `Dataset` class and has four methods:
+* `__init__`: Initializes the instance of the class.
+* `_create_h5py`: Creates the h5 object containing data from the downloaded .mat files, for ease of use.
+* `__len__`: Method needed by the `DataLoader` class. Returns the length of the dataset.
+* `__getitem__`: Method needed by the `DataLoader` class. Loads an image-label pair, applies any defined image transformations, and returns both image and label.
 
 
 
+The `__init__` method takes a few arguments:
+* `data_path` (Path): Path where the data is stored, or where it is to be downloaded to.
+* `train` (bool): Which set to use. If true we use the training set of SVHN, and if false we use the test set.
+* `transform`: The transform functions to be applied to the returned image.
+* `nr_channels`: How many channels to use. Can be either 1 or 3, corresponding to greyscale or RGB images respectively.
+
+In the init we check for the existence of the SVHN dataset. If it does not exist, we run the `_create_h5py` method, which is explained below. Then the labels are loaded into memory, as they are needed for the `__len__` method among other things.
+
+The `_create_h5py` method downloads a given SVHN set (train or test). We also change the label 10 to 0, as the SVHN labels start at 1, with 10 representing images of the digit zero. After the download, we create two .h5 files: one with the labels and one with the images.
+
+Lastly, in `__getitem__` we take an index (a number between 0 and the length of the label array). We load the image h5 file and retrieve the row corresponding to the index.
+We then convert the image to a Pillow Image object, then apply the defined transforms before returning the image and label (a sketch of this follows below).
+
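A sketch of the label remapping and `__getitem__` behaviour described above; the h5 file layout and dataset key are assumptions, not the repository's exact code.

```python
import h5py
import numpy as np
from PIL import Image


def remap_svhn_labels(labels: np.ndarray) -> np.ndarray:
    # SVHN stores the digit zero as label 10; map it back to 0
    labels = labels.copy()
    labels[labels == 10] = 0
    return labels


def get_svhn_item(images_h5_path, labels, index, transform=None):
    with h5py.File(images_h5_path, "r") as f:
        row = f["images"][index]  # dataset key "images" is assumed
    image = Image.fromarray(row)  # convert the raw row to a Pillow Image
    if transform is not None:
        image = transform(image)  # apply the defined transforms
    return image, labels[index]
```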
 
 ## Entropy Metric In-Depth
 The EntropyPrediction class' main job is to take some inputs from the MetricWrapper class and store the batchwise Shannon entropy metric of those inputs. The class has four methods with the following jobs:
@@ -41,4 +60,35 @@ With permission I've used the scipy implementation to calculate entropy here. We
 
 Next we have the `__returnmetric__` method, which is used to retrieve the stored metric. This returns the mean over all stored values; effectively, this is the average Shannon entropy of the dataset.
 
-Lastly we have the __reset__ method which simply emptied the variable which stores the entropy values to prepare it for the next epoch.
+Lastly we have the `__reset__` method, which simply empties the variable that stores the entropy values, to prepare it for the next epoch. A sketch of the class follows below.
+
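A sketch of the described bookkeeping, using the scipy entropy implementation mentioned above; the softmax step and all names beyond the described methods are assumptions.

```python
import numpy as np
from scipy.special import softmax
from scipy.stats import entropy


class EntropyPrediction:
    def __init__(self):
        self.values = []

    def forward(self, y_true, logits):
        probs = softmax(np.asarray(logits), axis=-1)         # logits -> probabilities
        self.values.append(entropy(probs, axis=-1).mean())   # mean batch entropy

    def __returnmetric__(self):
        return float(np.mean(self.values))  # average entropy over stored batches

    def __reset__(self):
        self.values = []  # empty the store for the next epoch
```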
+## More on implementation choices
+It should be noted that a lot of our decisions came from a top-down perspective. Many of our classes have design choices that accommodate the wrappers, which handle the initialization and dataflow of the different metrics, dataloaders, and models.
+All in all, we've made sure you don't really need to interact with the code beyond setting up the correct arguments for the run, which is great for consistency.
+
+## Challenges
+### Running someone else's code
+This section answers the question of what I found easy or difficult about running another person's code.
+
+I found it quite easy to run others' code. We had quite good tests, and once every test passed, I only had one error, with the F1 score not handling an unexpected edge case. To fix this I raised an issue, and it was fixed shortly after.
+
+One thing I did find a bit difficult was when people changed integral parts of the common code, such as wrappers or loader functions (usually for the better), but did not raise an issue or notify anyone about the change. It caused some moments of confusion, but in the end we sorted it out through weekly meetings where we agreed on design choices and on how to handle loading of the different modules.
+
+The issues mentioned above also led to a week or so during which there was always a test failing, and the person whose code was failing did not have time to work on it for a few days.
+
+### Someone running my code
+This section answers the question of what I found easy or difficult about having someone run my code.
+
+I did not experience anyone having issues with my code. After I fixed all issues and tests related to my code, it seems to have run fine, and no issues have been raised to my awareness about this.
+
+## Tools
+This section answers the question of which tools from the course I used during the home exam.
+
+For this exam I used quite a few tools from the course.
+I had never used pytest and test functions while writing code before. This was quite fun to learn how to use, and having GitHub Actions also run the same tests was a great addition.
+
+We used GitHub Actions for quite a few things: checking code formatting, generating documentation, and running the code tests.
+
+Using Sphinx for documentation was also a great tool. It turns out it's possible to write docstrings in such a way that the documentation is automatically generated for you. This has helped reduce the documentation workload a lot, and makes writing proper docstrings worthwhile.
doc/about.md

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 # About this code
 
-Work is still in progress ...
+This project was created as part of a Collaborative Coding and Reproducible Research special curriculum, held at UiT in February 2025.

doc/index.md

Lines changed: 4 additions & 4 deletions
@@ -8,12 +8,12 @@ fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
 culpa qui officia deserunt mollit anim id est laborum.
 
 :::{toctree}
-:maxdepth: 2
-:caption: Some caption
+:maxdepth: 1
+:caption: Table of contents
 
 about.md
 Magnus_page.md
+Jan_page.md
 :::
 
-Individual Sections
-===================
+
pyproject.toml

Lines changed: 3 additions & 2 deletions
@@ -1,14 +1,15 @@
 [project]
 name = "collaborative-coding-exam"
-version = "0.1.0"
+version = "1.1.0"
 description = "Exam project in the collaborative coding course."
 readme = "README.md"
-requires-python = ">=3.11.5"
+requires-python = ">=3.11.5, <3.13"
 dependencies = [
     "black>=25.1.0",
     "h5py>=3.12.1",
     "isort>=6.0.0",
     "jupyterlab>=4.3.5",
+    "myst-parser>=4.0.1",
     "numpy>=2.2.2",
     "pandas>=2.2.3",
     "pip>=25.0",
