
🦙 Llama Fine-Tuning on Any Dataset [LoRA/DoRA]

Using this repo, you can fine-tune any LLaMA model with LoRA or DoRA in just 6 lines of code.


Features

  • Flexible Dataset Support: Just provide your dataset in JSON format.
  • Configurable Training: Control key parameters via the config file, including:
    • LoRA/DoRA Rank & Alpha
    • Learning Rate
    • Batch Size
    • Sequence Length
    • Epochs
    • Gradient Clipping
    • Mixed Precision (FP16) Support
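The parameters above live in a single JSON config file. A minimal sketch of what such a file might look like is below; the key names here are illustrative assumptions, not the repo's actual schema — check `configs/llama-3.2-1B_finetune_squad.json` for the real field names.

```json
{
  "base_model": "meta-llama/Llama-3.2-1B",
  "rank": 8,
  "alpha": 16,
  "learning_rate": 2e-4,
  "batch_size": 4,
  "max_seq_length": 512,
  "num_epochs": 3,
  "grad_clip": 1.0,
  "fp16": true
}
```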

How to Use This Repository

0. Get the dataset

Prepare your training data as a JSON file in the following format:

Sample Dataset format

[
    {
        "input": "Context: The capital of France is Paris. Question: What is the capital of France?\nAnswer:",
        "label": "Paris"
    },
    {
        "input": "Context: Twenty years from now you will be more disappointed by the things that you didn't do than by the ones you did do. Question: So what u gonna do?\nAnswer:",
        "label": "YES"
    }
]
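Before training, it can be worth sanity-checking that every entry has the expected `input`/`label` string fields. This helper is not part of the repo — it is a small standalone sketch assuming the format shown above:

```python
import json

def validate_dataset(path):
    """Check that every entry is a dict with string "input" and "label" fields.

    Returns the number of valid entries; raises AssertionError on the
    first malformed one.
    """
    with open(path) as f:
        data = json.load(f)
    assert isinstance(data, list), "top-level JSON value must be a list"
    for i, entry in enumerate(data):
        assert isinstance(entry.get("input"), str), f"entry {i}: missing string 'input'"
        assert isinstance(entry.get("label"), str), f"entry {i}: missing string 'label'"
    return len(data)
```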

1. Clone the Repository

git clone https://github.com/seungjun-green/llama-finetuning.git

2. Import Required Modules

import sys
sys.path.append("/path/to/llama-finetuning")
from scripts.fine_tune import Finetuner

3. Fine-Tune the Model

# config_file_path, train_file_path, and dev_file_path should point to your
# config JSON and dataset files.
# To fine-tune with DoRA instead of LoRA, replace 'lora' with 'dora'.
trainer = Finetuner('lora', config_file_path, train_file_path=train_file_path, dev_file_path=dev_file_path)
trainer.train()

Example Usage

Fine-tuning the Llama-3.2-1B model on SQuAD


Directory Structure

llama-finetuning/
β”œβ”€β”€ configs/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ llama-3.2-1B_finetune_squad.json # configuration for fine-tuning
β”‚   β”œβ”€β”€ finetune_config.py
β”œβ”€β”€ data/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ json_data.py  # Data preprocessing for any text data.
β”‚   β”œβ”€β”€ fine_tuned_checkpoints/ # Directory for storing checkpoints
β”œβ”€β”€ models/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ base_model.py  # Code for loading base Llama models
β”‚   β”œβ”€β”€ lora.py        # LoRA (Low-Rank Adaptation) implementation
β”‚   β”œβ”€β”€ loss.py        # Loss functions for fine-tuning
β”œβ”€β”€ scripts/
β”‚   β”œβ”€β”€ eval.py        # Script for evaluating fine-tuned models
β”‚   β”œβ”€β”€ fine_tune.py   # Fine-tuning script
β”œβ”€β”€ utils/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ checkpoint.py  # Utilities for saving/loading checkpoints
β”‚   β”œβ”€β”€ helpers.py     # Helper functions

Experiments

I compared fine-tuning the LLaMA model with the two supported methods, LoRA and DoRA:

  • LoRA: final loss of 1.5245
  • DoRA: final loss of 1.3891

Consistent with the DoRA paper, fine-tuning the LLaMA model with DoRA reached a lower loss than LoRA.
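For context, DoRA (Weight-Decomposed Low-Rank Adaptation) factors the pretrained weight into a magnitude and a direction, and applies the LoRA update only to the directional part:

```latex
W' = m \cdot \frac{W_0 + BA}{\lVert W_0 + BA \rVert_c}
```

Here $W_0$ is the frozen pretrained weight, $BA$ is the low-rank LoRA update, $m$ is a learned magnitude vector, and $\lVert \cdot \rVert_c$ is the column-wise norm. Plain LoRA corresponds to dropping the decomposition and using $W' = W_0 + BA$ directly.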


Requirements

  • Python 3.8+
  • PyTorch
  • Transformers Library

Install dependencies using:

pip install -r requirements.txt

Contributions

Contributions are welcome! Feel free to open an issue or submit a pull request for suggestions, bug fixes, or new features.


Happy fine-tuning! πŸŽ‰