Here we share our ideas and code for building LLMs entirely from scratch, including Transformers, GPT-2, and training methods such as supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and Group Relative Policy Optimization (GRPO). We also provide simple mathematical derivations for DPO and GRPO, along with notes on recent LLM research topics such as reasoning. We hope you find these resources helpful.
tianbingsz/LLM