This project involves collaborative work on a digit classification task, where each participant works on distinct but interconnected components within a shared codebase. <br>
The main goal is to develop and train digit classification models collaboratively, with a focus on leveraging shared resources and learning efficient experimentation practices.
### Key Aspects of the Project:
- **Individual and Joint Tasks:** Each participant has separate tasks, such as implementing a digit classification dataset, a neural network model, and an evaluation metric. However, all models and datasets must be compatible, since each of us can only train and evaluate using our partners' models and datasets.
- **Shared Environment:** Alongside our individual tasks, we collaborate on joint tasks such as the main file and the training and evaluation loops. Additionally, we use a shared Weights and Biases environment for experiment management.
- **Documentation and Package Management:** To ensure proper documentation and ease of use, we set up Sphinx documentation and made the repository pip-installable.
- **High-Performance Computing:** A key learning objective of this project is to gain experience with running experiments on high-performance computing (HPC) resources. To this end, we trained all models on a cluster.
<br> Replace placeholders with your desired values:

- `<MODEL_NAME>`: You can choose from different models (`"MagnusModel", "ChristianModel", "SolveigModel", "JanModel", "JohanModel"`).

- `<DATASET_NAME>`: The following datasets are supported (`"svhn", "usps_0-6", "usps_7-9", "mnist_0-3", "mnist_4-9"`)

- `<METRIC_1> ... <METRIC_N>`: Specify one or more evaluation metrics (`"entropy", "f1", "recall", "precision", "accuracy"`)

- `<RESULTS_DIRECTORY>`: Folder where all model outputs, logs, and checkpoints are saved

- `<RUN_NAME>`: Name for WANDB project

- `<DEVICE>`: `"cuda", "cpu", "mps"`

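As a rough illustration of how the placeholders above constrain a run, here is a minimal sketch of an argument parser that accepts only the listed values. The flag names (`--model`, `--dataset`, `--metrics`, `--results-dir`, `--run-name`, `--device`) and the defaults are assumptions made for this example and may not match the project's actual command-line interface; only the allowed values are taken from the list above.

```python
import argparse

# Allowed values copied from the placeholder list above.
MODELS = ["MagnusModel", "ChristianModel", "SolveigModel", "JanModel", "JohanModel"]
DATASETS = ["svhn", "usps_0-6", "usps_7-9", "mnist_0-3", "mnist_4-9"]
METRICS = ["entropy", "f1", "recall", "precision", "accuracy"]
DEVICES = ["cuda", "cpu", "mps"]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with hypothetical flag names (not the project's real CLI)."""
    parser = argparse.ArgumentParser(description="Illustrative placeholder validation")
    parser.add_argument("--model", choices=MODELS, required=True)
    parser.add_argument("--dataset", choices=DATASETS, required=True)
    # nargs="+" accepts one or more metrics; each one is checked against METRICS.
    parser.add_argument("--metrics", choices=METRICS, nargs="+", required=True)
    parser.add_argument("--results-dir", default="results")  # assumed default
    parser.add_argument("--run-name", default="demo")  # assumed default
    parser.add_argument("--device", choices=DEVICES, default="cpu")  # assumed default
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is self-contained.
args = build_parser().parse_args(
    ["--model", "JanModel", "--dataset", "mnist_0-3", "--metrics", "f1", "accuracy"]
)
print(args.model, args.dataset, args.metrics, args.device)
```

Passing a value outside the listed choices makes `argparse` exit with an error message, which catches typos in model or dataset names before any training starts.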
## Running on a k8s cluster
In your job manifest, include:
@@ -62,14 +92,31 @@ to pull the latest build, or check the [packages](https://github.com/SFI-Visual-
> The container is built for the `linux/amd64` architecture so that it builds properly with CUDA 12. For other architectures, please build the Docker image locally.

## Results

### JanModel & MNIST_0-3
This section reports the results of training the model "JanModel" on the dataset MNIST_0-3, which contains the MNIST digits 0 to 3 (four classes in total).
For this experiment we use all five available metrics and train for a total of 20 epochs.
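For readers unfamiliar with the metrics, the following is a minimal pure-Python sketch of accuracy and macro-averaged precision, recall, and F1 for a four-class problem such as MNIST_0-3. It is not the repository's implementation: the function name `macro_scores` is hypothetical, macro averaging is an assumption (the project's metrics may average differently), and the entropy metric is left out because its exact definition is not given here.

```python
def macro_scores(y_true, y_pred, num_classes):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Hypothetical helper for illustration; assumes integer labels in
    range(num_classes) and equal-length, non-empty inputs.
    """
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)

    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in pairs)  # true positives
        fp = sum(t != c and p == c for t, p in pairs)  # false positives
        fn = sum(t == c and p != c for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / num_classes,
        "recall": sum(recalls) / num_classes,
        "f1": sum(f1s) / num_classes,
    }


# Toy labels (not real results): one digit "3" is misclassified as "0".
scores = macro_scores([0, 1, 2, 3, 0, 1, 2, 3], [0, 1, 2, 3, 0, 1, 2, 0], num_classes=4)
print(scores)
```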

We achieve a great fit on the data. Below are the results for the described run:

| Dataset Split | Loss | Entropy | Accuracy | Precision | Recall | F1 |