Following along with Andrej Karpathy's video on building GPT from scratch, with my own experiments added along the way.
Currently working on medium-sized and larger data with other tokenization methods. Plan to eventually restructure the GitHub folder with /models, /notes and /plots folders.
/scripts - Training scripts
/data - Datasets, or download from the Tiny Codes dataset
🔗 Andrej Karpathy's GPT Video - here
🔗 "Attention Is All You Need" Paper - here