This repository contains a minimal implementation of GPT-2 for training and inference, using PyTorch.
- Clone the repository:

  ```bash
  git clone git@github.com:TimilsinaBimal/gpt2.git
  cd gpt2
  ```
- Install dependencies: make sure you have Python and pip installed, along with `torch`, `transformers`, and `tiktoken` (example commands are shown after this list).
- Prepare your data: place your training text file (e.g., `tiny_shakespeare.txt`) in the desired location and update `data_path` in your config if needed (a sample download command is shown after this list).
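The commands below are a setup sketch, not an official install script: the package names follow the dependency list above, and the dataset URL points to the widely used copy of tiny_shakespeare in karpathy/char-rnn, which is not part of this repository.

```bash
# Install the runtime dependencies (package names assumed from the list above).
pip install torch transformers tiktoken

# Optionally fetch a copy of tiny_shakespeare to use as training data.
# This URL is the commonly used copy from karpathy/char-rnn, not this repository.
curl -L -o tiny_shakespeare.txt \
  https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt
```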
To train the model, run `python -m gpt2.train`. The training script will:
- Load and tokenize your dataset.
- Train the GPT-2 model using the specified configuration.
- Save the model weights at the end of each epoch to the path specified in your config.
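For orientation, the sketch below shows what a training loop like this typically looks like. It is not the actual contents of `gpt2/train.py`; the `GPT2` class name, the `config.device` field, and the batching details are assumptions made for illustration.

```python
# Illustrative sketch only: GPT2 and the exact Config fields are assumptions,
# not the repository's actual API.
import torch
import tiktoken

def train_sketch(config):
    enc = tiktoken.get_encoding("gpt2")                       # GPT-2 BPE tokenizer
    text = open(config.data_path, encoding="utf-8").read()
    tokens = torch.tensor(enc.encode(text), dtype=torch.long)

    # Chop the token stream into (input, target) pairs of length seq_length.
    n = (len(tokens) - 1) // config.seq_length * config.seq_length
    x = tokens[:n].view(-1, config.seq_length)
    y = tokens[1:n + 1].view(-1, config.seq_length)

    model = GPT2(config).to(config.device)                    # hypothetical model class
    optimizer = torch.optim.AdamW(model.parameters(), lr=config.learning_rate)

    for epoch in range(config.num_epochs):
        for i in range(0, len(x), config.batch_size):
            xb = x[i:i + config.batch_size].to(config.device)
            yb = y[i:i + config.batch_size].to(config.device)
            logits = model(xb)                                 # (B, T, vocab_size)
            loss = torch.nn.functional.cross_entropy(
                logits.view(-1, logits.size(-1)), yb.view(-1))
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        torch.save(model.state_dict(), config.model_path)      # checkpoint each epoch
```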
Configuration:
Edit the `gpt2/config.py` file to adjust parameters such as `data_path`, `batch_size`, `seq_length`, `learning_rate`, `num_epochs`, and `model_path`.
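For illustration only, such a config could be a simple dataclass along these lines; the default values and the `device` field shown here are assumptions, not the repository's actual settings.

```python
# Illustrative sketch; field names follow the parameters listed above,
# while default values and the dataclass layout are assumptions.
from dataclasses import dataclass

@dataclass
class Config:
    data_path: str = "tiny_shakespeare.txt"   # training text file
    model_path: str = "gpt2_model.pt"         # where checkpoints are written
    batch_size: int = 16
    seq_length: int = 128                     # tokens per training example
    learning_rate: float = 3e-4
    num_epochs: int = 5
    device: str = "cuda"                      # or "cpu"
```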
To run inference with a pretrained model:
- Ensure you have a trained model checkpoint at the path specified in your config.
- Use the inference script: `python -m gpt2.inference`. The script will:
  - Load the pretrained GPT-2 model.
  - Generate text based on your input prompt (modify the script to set your prompt).
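As a rough sketch of what the generation step typically involves (the `GPT2` class, `Config` fields, and checkpoint format are assumptions about this repository, not its documented API):

```python
# Illustrative sketch; GPT2, the Config fields, and the checkpoint layout are assumptions.
import torch
import tiktoken

@torch.no_grad()
def generate_sketch(config, prompt, max_new_tokens=100, temperature=1.0):
    enc = tiktoken.get_encoding("gpt2")
    model = GPT2(config).to(config.device)                       # hypothetical model class
    model.load_state_dict(torch.load(config.model_path, map_location=config.device))
    model.eval()

    ids = torch.tensor([enc.encode(prompt)], device=config.device)  # shape (1, T)
    for _ in range(max_new_tokens):
        ctx = ids[:, -config.seq_length:]                        # trim to the context window
        logits = model(ctx)[:, -1, :] / temperature              # logits for the last position
        probs = torch.softmax(logits, dim=-1)
        next_id = torch.multinomial(probs, num_samples=1)        # sample the next token
        ids = torch.cat([ids, next_id], dim=1)
    return enc.decode(ids[0].tolist())
```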
- The tokenizer uses `tiktoken` with the GPT-2 vocabulary (see the example after this list).
- Model checkpoints and configuration files are saved as specified in your config.
- For custom datasets or different model sizes, update the config and scripts accordingly.
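A quick sanity check of the tokenizer, using standard `tiktoken` calls (this snippet is not taken from the repository):

```python
import tiktoken

enc = tiktoken.get_encoding("gpt2")     # GPT-2 BPE vocabulary
ids = enc.encode("To be, or not to be")
print(ids)                              # a list of integer token ids
print(enc.decode(ids))                  # round-trips back to the original text
```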
Project structure:
- `gpt2/train.py`: Training pipeline.
- `gpt2/inference.py`: Inference/generation script.
- `gpt2/model.py`: Model definition and loading.
- `gpt2/config.py`: Configuration class.
- Ensure your data file path is correct in the config.
- If CUDA is available, set `device` in the config to `"cuda"` for faster training (see the snippet below).
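One common pattern, shown here as a generic sketch rather than code from this repository, is to pick the device automatically:

```python
import torch

# Use the GPU when one is available, otherwise fall back to CPU;
# this value can then be used for the `device` setting in gpt2/config.py.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)
```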
Apache 2.0 License
Disclaimer: This README file was generated using AI.