This is a simple implementation of a GPT using Apple's new MLX library.
Ensure Poetry is installed and run:

```shell
poetry install
```
We train the model on OpenWebText, which is an open-source replication of OpenAI's WebText dataset.
Run the following command to download the dataset, tokenize it, and save it to disk:

```shell
poetry run python prepare.py
```
This will create a `data` directory with two files: `train.bin` and `validation.bin`.
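The exact binary layout depends on `prepare.py`, but nanoGPT-style pipelines typically store the token IDs as a flat array of `uint16` values that can be memory-mapped for batching. A minimal sketch under that assumption (the dtype and the helper names here are illustrative, not the repository's actual API):

```python
import numpy as np

def load_tokens(path):
    # Memory-map the flat array of uint16 token IDs (assumed layout)
    return np.memmap(path, dtype=np.uint16, mode="r")

def get_batch(tokens, batch_size, context_len, rng):
    # Sample random windows; targets are the inputs shifted by one token
    starts = rng.integers(0, len(tokens) - context_len - 1, size=batch_size)
    x = np.stack([tokens[s : s + context_len] for s in starts]).astype(np.int64)
    y = np.stack([tokens[s + 1 : s + 1 + context_len] for s in starts]).astype(np.int64)
    return x, y
```

Memory-mapping avoids loading the full multi-gigabyte token file into RAM; each batch reads only the windows it samples.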
Once complete, run the following command to train the model:

```shell
poetry run python train.py
```
Checkpoints will be saved to the `checkpoints` directory.
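How `train.py` serializes checkpoints is not shown here; one common approach, sketched below with NumPy (the filename pattern and helper names are assumptions, not the repository's actual API), is to periodically dump the parameter arrays to disk:

```python
import os
import numpy as np

def save_checkpoint(params, step, directory="checkpoints"):
    # Write all parameter arrays to one .npz file (assumed naming scheme)
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, f"step_{step}.npz")
    np.savez(path, **params)
    return path

def load_checkpoint(path):
    # Return the parameters as a plain dict of arrays
    with np.load(path) as data:
        return {name: data[name] for name in data.files}
```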
Run the following command to generate text using the trained model:

```shell
poetry run python generate.py
```
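The sampling strategy used by `generate.py` is not shown here, but a typical GPT decoding loop scales the logits by a temperature, optionally keeps only the top-k candidates, and samples the next token. A minimal NumPy sketch under those assumptions (the model is stubbed out as a plain function):

```python
import numpy as np

def sample_next(logits, temperature=1.0, top_k=None, rng=None):
    # Scale logits, optionally keep only the k largest, then sample
    if rng is None:
        rng = np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature
    if top_k is not None:
        cutoff = np.sort(logits)[-top_k]
        logits = np.where(logits < cutoff, -np.inf, logits)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

def generate(model_fn, prompt, steps, **sample_kwargs):
    # Autoregressive loop: feed the growing sequence back into the model
    tokens = list(prompt)
    for _ in range(steps):
        tokens.append(sample_next(model_fn(tokens), **sample_kwargs))
    return tokens
```

Lower temperatures sharpen the distribution toward the most likely tokens, while top-k filtering discards the long tail of unlikely candidates.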
This implementation has been inspired by Andrej Karpathy's nanoGPT and minGPT repositories, which are themselves PyTorch reimplementations of GPT-2 with a few modifications.
Potential future improvements:

- Configuration improvements (e.g. a YAML config file)
- Calculate validation loss during training
- Tune hyperparameters to improve performance
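The configuration item could, for instance, replace hard-coded hyperparameters with a config object populated from a YAML file. A sketch of the idea (all field names and defaults are illustrative, not the repository's actual settings):

```python
from dataclasses import dataclass, fields

@dataclass
class TrainConfig:
    # Illustrative hyperparameters; not the repository's actual values
    n_layer: int = 12
    n_head: int = 12
    n_embd: int = 768
    learning_rate: float = 3e-4
    batch_size: int = 32

    @classmethod
    def from_dict(cls, raw):
        # Accept e.g. the dict produced by yaml.safe_load("config.yaml")
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})
```

Ignoring unknown keys keeps old config files usable as the set of hyperparameters evolves.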