georgeherbert/gpt-mlx

GPT MLX

Introduction

This is a minimal implementation of a GPT-style language model built with Apple's MLX machine learning framework.

Installation

Ensure Poetry is installed and run:

poetry install

Usage

Training

We train the model on OpenWebText, an open-source reproduction of OpenAI's WebText dataset (the corpus used to train GPT-2).

Run the following command to download the dataset, tokenize it, and save the result to disk:

poetry run python prepare.py

This will create a data directory with two files: train.bin and validation.bin.
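nanoGPT-style preparation scripts typically store the tokenized corpus as a flat array of uint16 token IDs with no header; assuming prepare.py follows that convention, the .bin files can be read back lazily with a memory map. A hypothetical sketch (the file layout and names here are assumptions, not taken from prepare.py):

```python
import numpy as np

# Assumption: prepare.py writes token IDs as a flat uint16 array,
# nanoGPT-style, to data/train.bin and data/validation.bin.
def load_tokens(path):
    # np.memmap avoids loading the multi-gigabyte corpus into RAM.
    return np.memmap(path, dtype=np.uint16, mode="r")

# Demo with a small synthetic file standing in for data/train.bin.
tokens = np.array([464, 3290, 318, 257, 922, 3290], dtype=np.uint16)
tokens.tofile("train_demo.bin")
loaded = load_tokens("train_demo.bin")
print(len(loaded))  # 6
```

Reading with `mode="r"` keeps the mapping read-only, so several processes can share the same file safely.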

Once complete, run the following command to train the model:

poetry run python train.py

Checkpoints will be saved to the checkpoints directory.
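Training loops for models of this kind typically draw random contiguous windows of block_size tokens from the corpus and use the one-position shift as the next-token targets. A minimal sketch of that batching step (function and parameter names are hypothetical, not taken from train.py):

```python
import numpy as np

def get_batch(tokens, block_size, batch_size, rng):
    # Pick random start offsets, then slice inputs x and
    # next-token targets y (x shifted right by one position).
    ix = rng.integers(0, len(tokens) - block_size - 1, size=batch_size)
    x = np.stack([tokens[i : i + block_size] for i in ix])
    y = np.stack([tokens[i + 1 : i + 1 + block_size] for i in ix])
    return x, y

rng = np.random.default_rng(0)
tokens = np.arange(1000, dtype=np.uint16)  # toy corpus
x, y = get_batch(tokens, block_size=8, batch_size=4, rng=rng)
print(x.shape, y.shape)  # (4, 8) (4, 8)
```

With the toy corpus above, each target row is simply the input row plus one, which makes the shift easy to verify.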

Generation

Run the following command to generate text using the trained model:

poetry run python generate.py
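generate.py's exact sampling strategy is not documented here, but autoregressive generation generally loops as follows: run the model on the current context, take the logits for the last position, apply temperature-scaled softmax, sample one token, and append it. A self-contained sketch with a toy stand-in for the trained GPT (all names here are illustrative assumptions):

```python
import numpy as np

def sample_next(logits, temperature, rng):
    # Temperature-scaled softmax over the vocabulary,
    # then a categorical draw.
    z = logits / temperature
    z = z - z.max()  # subtract max for numerical stability
    probs = np.exp(z) / np.exp(z).sum()
    return rng.choice(len(probs), p=probs)

def generate(model, context, max_new_tokens, temperature=1.0, seed=0):
    rng = np.random.default_rng(seed)
    tokens = list(context)
    for _ in range(max_new_tokens):
        logits = model(tokens)  # logits for the next-token distribution
        tokens.append(int(sample_next(logits, temperature, rng)))
    return tokens

# Toy "model": overwhelmingly prefers (last token + 1) mod 10,
# standing in for the trained GPT.
def toy_model(tokens, vocab=10):
    logits = np.zeros(vocab)
    logits[(tokens[-1] + 1) % vocab] = 50.0
    return logits

out = generate(toy_model, [0], max_new_tokens=5)
print(out)  # [0, 1, 2, 3, 4, 5]
```

Lower temperatures sharpen the distribution toward the most likely token; higher ones increase diversity at the cost of coherence.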

Acknowledgments

This implementation is inspired by Andrej Karpathy's nanoGPT and minGPT repositories, which are themselves PyTorch reimplementations of GPT-2 with a few modifications.

TODO

  • Configuration improvements (e.g. YAML file)
  • Calculate validation loss
  • Adjust hyperparameters to improve performance
