This repository implements and compares the accuracy of various machine learning approaches to classifying a dataset generated from MNIST images, in which two digits are vertically stacked (56×28) and each image is labelled by the sum of its constituent digits. It first evaluates the performance of a fully connected neural network before comparing it to alternative methods, each with relative strengths and weaknesses in handling the high-dimensional dataset.
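As a minimal sketch of how one such sample might be constructed (the function name and inputs here are illustrative, not taken from the repository):

```python
import numpy as np

def make_stacked_sample(digit_a, digit_b, label_a, label_b):
    """Vertically stack two 28x28 digit images into a single 56x28 image,
    labelled by the sum of the two digits (so labels range 0-18)."""
    image = np.vstack([digit_a, digit_b])  # shape (56, 28)
    label = label_a + label_b              # integer in [0, 18]
    return image, label

# Example with placeholder arrays standing in for MNIST digits
top, bottom = np.zeros((28, 28)), np.ones((28, 28))
img, lbl = make_stacked_sample(top, bottom, 3, 4)
# img.shape == (56, 28), lbl == 7
```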
This repository forms part of the submission for the MPhil in Data Intensive Science's M1 Machine Learning Course at the University of Cambridge.
The notebooks in this repository serve as walkthroughs for the analysis performed. They include derivations of the mathematical implementations, explain the key choices made, and present the main results. Four notebooks are provided:
| Notebook | Description |
|---|---|
| Notebook 1 | Constructs the training, validation, and test datasets by combining MNIST digits into structured (56×28) images. Implements a fully connected neural network with hyperparameter optimisation, training history, and performance metrics. This notebook establishes the performance benchmark for all subsequent models. |
| Notebook 2 | Investigates alternative machine learning approaches including Support Vector Machines, Decision Trees, Random Forests, and AdaBoost. Includes hyperparameter tuning and a detailed comparison of classification performance against the neural network baseline. |
| Notebook 3 | Examines the performance of weak linear classifiers trained on both combined (56×28) images and on individual (28×28) digits. The latter approach uses probability convolution to infer the sum. |
| Notebook 4 | Applies dimensionality reduction, namely t-distributed Stochastic Neighbor Embedding (t-SNE), to visualise high-dimensional features. Compares embedding-layer representations to assess the isolation and cohesion of these structures in their original high-dimensional space. |
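The probability-convolution step mentioned for Notebook 3 can be sketched as follows: if each per-digit classifier outputs a probability vector over the classes 0-9, the distribution over the sum of the two digits is the discrete convolution of those vectors. This is a hedged illustration of the idea, not the repository's actual implementation:

```python
import numpy as np

def sum_distribution(p_top, p_bottom):
    """Convolve two length-10 digit probability vectors into a
    length-19 probability vector over the possible sums 0..18."""
    return np.convolve(p_top, p_bottom)

# Example: top digit is certainly 2, bottom digit is certainly 5
p_top = np.eye(10)[2]
p_bottom = np.eye(10)[5]
p_sum = sum_distribution(p_top, p_bottom)
predicted_sum = int(np.argmax(p_sum))  # -> 7
```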
The trained models—including the fully connected neural network and the weak linear classifiers—are saved in the Results directory to support reproducibility and enable further analysis or downstream use.
To run the notebooks, please follow these steps:
1. Clone the repository to your local machine:

```bash
git clone https://github.com/JacobTutt/ML_mnist.git
```

2. Use a clean virtual environment to avoid dependency conflicts:

```bash
python -m venv env
source env/bin/activate   # For macOS/Linux
env\Scripts\activate      # For Windows
```

3. Navigate to the repository's root directory and install the package dependencies:

```bash
pip install -r requirements.txt
```

4. To ensure the virtual environment is recognised within Jupyter notebooks, set up a kernel:

```bash
python -m ipykernel install --user --name=env --display-name "MNIST ML"
```

5. Open the notebooks and select the created kernel (MNIST ML) to run the code.
- The associated project report can be found under Project Report.
This project is licensed under the MIT License - see the LICENSE file for details.
If you have any questions, run into issues, or just want to discuss the project, feel free to:
- Open an issue on the GitHub Issues page.
- Reach out to me directly via email.
This project is maintained by Jacob Tutt.