BitsyGPT - An experimental decoder-only GPT transformer.
- Clone the repo
- Create a Python virtual environment in the repo's root directory
- Install the dependencies: pip install torch numpy
- Train the model using scripts/train.sh. The existing configuration in the config directory defines a 1.2M-parameter model and takes about 35 minutes to train on an Apple M1 CPU.
- Generate the outputs using scripts/generate.sh
- Change the hyperparameters in the config directory and experiment further
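
Before changing hyperparameters, it can help to estimate how they affect model size. The following is a rough sketch only: it assumes a GPT-2 style block layout (fused QKV projection, 4x MLP expansion, LayerNorm with bias, output head tied to the token embedding), which may not match BitsyGPT's actual implementation. The function name, argument names, and example values are hypothetical and are not taken from the repo's config files.

```python
def estimate_gpt_params(vocab_size, block_size, n_layer, n_embd, tie_weights=True):
    """Rough parameter count for a GPT-2 style decoder-only transformer.

    Assumed layout (hypothetical, not read from BitsyGPT's code):
    learned token + positional embeddings, n_layer blocks each with
    self-attention, a 4x MLP, and two LayerNorms, then a final LayerNorm.
    """
    # Token embedding table plus learned positional embedding table
    embeddings = vocab_size * n_embd + block_size * n_embd

    # Attention weights: fused QKV projection plus output projection
    attn_weights = n_embd * (3 * n_embd) + n_embd * n_embd
    # MLP weights: expand to 4 * n_embd, then project back
    mlp_weights = n_embd * (4 * n_embd) + (4 * n_embd) * n_embd
    # Biases for the four linear layers: 3n + n + 4n + n
    biases = 9 * n_embd
    # Two LayerNorms per block, each with a weight and a bias vector
    layer_norms = 2 * 2 * n_embd

    per_block = attn_weights + mlp_weights + biases + layer_norms

    # Final LayerNorm after the last block
    final_ln = 2 * n_embd
    # Output head shares the token embedding matrix when weights are tied
    lm_head = 0 if tie_weights else vocab_size * n_embd

    return embeddings + n_layer * per_block + final_ln + lm_head


if __name__ == "__main__":
    # Hypothetical values for illustration only
    total = estimate_gpt_params(vocab_size=65, block_size=128, n_layer=6, n_embd=128)
    print(f"approx. {total / 1e6:.2f}M parameters")
```

Under these assumptions the per-block cost grows with the square of n_embd, so widening the model increases size much faster than adding layers, which increases it linearly.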