# 🧠 QuantLLM: Lightweight Library for Quantized LLM Fine-Tuning and Deployment

## 📌 Overview

**QuantLLM** is a Python library designed for developers, researchers, and teams who want to fine-tune and deploy large language models (LLMs) **efficiently** using **4-bit and 8-bit quantization** techniques. It provides a modular and flexible framework for:

- **Loading and quantizing models** with advanced configurations
- **LoRA / QLoRA-based fine-tuning** with customizable parameters
- **Dataset management** with preprocessing and splitting
- **Training and evaluation** with comprehensive metrics
- **Model checkpointing** and versioning
- **Hugging Face Hub integration** for model sharing

The goal of QuantLLM is to **democratize LLM training**, especially in low-resource environments, while keeping the workflow intuitive, modular, and production-ready.

## 🎯 Key Features

| Feature | Description |
|----------------------------------|-------------|
| ✅ Quantized Model Loading | Load any HuggingFace model in 4-bit or 8-bit precision with customizable quantization settings |
| ✅ Advanced Dataset Management | Load, preprocess, and split datasets with flexible configurations |
| ✅ LoRA / QLoRA Fine-Tuning | Memory-efficient fine-tuning with customizable LoRA parameters |
| ✅ Comprehensive Training | Advanced training loop with mixed precision, gradient accumulation, and early stopping |
| ✅ Model Evaluation | Flexible evaluation with custom metrics and batch processing |
| ✅ Checkpoint Management | Save, resume, and manage training checkpoints with versioning |
| ✅ Hub Integration | Push models and checkpoints to the Hugging Face Hub with authentication |
| ✅ Configuration Management | YAML/JSON config support for reproducible experiments |
| ✅ Logging and Monitoring | Comprehensive logging and Weights & Biases integration |
## 🚀 Getting Started

### 🔧 Installation

```bash
pip install quantllm
```

### 📦 Basic Usage

```python
from quantllm import (
    ModelLoader,
    DatasetLoader,
    DatasetPreprocessor,
    DatasetSplitter,
    FineTuningTrainer,
    ModelEvaluator,
    TrainingConfig,
    ModelConfig,
    DatasetConfig
)

# Initialize logger
from quantllm.finetune import TrainingLogger
logger = TrainingLogger()

# 1. Dataset Configuration and Loading
dataset_config = DatasetConfig(
    dataset_name_or_path="imdb",
    dataset_type="huggingface",
    text_column="text",
    label_column="label"
)

dataset_loader = DatasetLoader(logger)
dataset = dataset_loader.load_hf_dataset(dataset_config.dataset_name_or_path)

# 2. Model Configuration and Loading
model_config = ModelConfig(
    model_name_or_path="meta-llama/Llama-2-7b-hf",
    load_in_4bit=True,
    use_lora=True
)

model_loader = ModelLoader(
    model_name=model_config.model_name_or_path,
    quantization="4bit" if model_config.load_in_4bit else None,
    use_lora=model_config.use_lora
)
model = model_loader.get_model()
tokenizer = model_loader.get_tokenizer()

# 3. Training Configuration
training_config = TrainingConfig(
    learning_rate=2e-4,
    num_epochs=3,
    batch_size=4
)

# 4. Initialize and Run Trainer
# NOTE: `train_dataloader` and `val_dataloader` must be built from `dataset`
# first (e.g., with DatasetPreprocessor and DatasetSplitter); see the sketch
# after this code block.
trainer = FineTuningTrainer(
    model=model,
    training_config=training_config,
    train_dataloader=train_dataloader,
    eval_dataloader=val_dataloader,
    logger=logger
)
trainer.train()
```
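
The snippet above assumes `train_dataloader` and `val_dataloader` already exist. One minimal way to build them with plain 🤗 Datasets and PyTorch (a sketch shown for completeness; it bypasses QuantLLM's `DatasetPreprocessor`/`DatasetSplitter` helpers and assumes `dataset` exposes a `"train"` split) is:

```python
from torch.utils.data import DataLoader

# Llama-2's tokenizer has no pad token by default; reuse EOS for padding.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Tokenize the raw IMDB text loaded above.
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True)
tokenized.set_format(type="torch", columns=["input_ids", "attention_mask", "label"])

# Hold out 10% of the training data for validation.
split = tokenized.train_test_split(test_size=0.1, seed=42)
train_dataloader = DataLoader(split["train"], batch_size=4, shuffle=True)
val_dataloader = DataLoader(split["test"], batch_size=4)
```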

### ⚙️ Advanced Usage

#### Configuration Files

Create a config file (e.g., `config.yaml`):

```yaml
model:
  model_name_or_path: "meta-llama/Llama-2-7b-hf"
  load_in_4bit: true
  use_lora: true
  lora_config:
    r: 16
    lora_alpha: 32
    target_modules: ["q_proj", "v_proj"]

dataset:
  dataset_name_or_path: "imdb"
  text_column: "text"
  label_column: "label"
  max_length: 512

training:
  learning_rate: 2e-4
  num_epochs: 3
  batch_size: 4
  gradient_accumulation_steps: 4
```
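
To consume this file, one option (a sketch, assuming the `ModelConfig`, `DatasetConfig`, and `TrainingConfig` constructors accept keyword arguments matching the fields above) is to load the YAML with PyYAML and unpack each section:

```python
import yaml

from quantllm import ModelConfig, DatasetConfig, TrainingConfig

# Read the YAML file and build one config object per section.
with open("config.yaml") as f:
    cfg = yaml.safe_load(f)

model_config = ModelConfig(**cfg["model"])        # includes the nested lora_config dict
dataset_config = DatasetConfig(**cfg["dataset"])
training_config = TrainingConfig(**cfg["training"])
```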

#### Hub Integration

```python
import os

from quantllm.hub import HubManager

hub_manager = HubManager(
    model_id="your-username/llama-2-imdb",
    token=os.getenv("HF_TOKEN")
)

if hub_manager.is_logged_in():
    hub_manager.push_model(
        model,
        commit_message="Trained model with custom configuration"
    )
```
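
If `HF_TOKEN` is not set, you can authenticate interactively beforehand with the standard `huggingface_hub` client (a general-purpose step, independent of QuantLLM's `HubManager`):

```python
from huggingface_hub import login

# Prompts for a Hugging Face access token and caches it locally,
# so later pushes to the Hub can authenticate without HF_TOKEN.
login()
```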

#### Evaluation

```python
from quantllm.finetune import ModelEvaluator

evaluator = ModelEvaluator(
    model=model,
    eval_dataloader=test_dataloader,
    metrics=[
        # Accuracy: fraction of argmax predictions that match the labels.
        lambda preds, labels, _: (preds.argmax(dim=-1) == labels).float().mean().item()
    ]
)

metrics = evaluator.evaluate()
```
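
Custom metrics can also be written as named functions with the same `(predictions, labels, batch)` signature implied by the lambda above (a sketch; that signature is an assumption based on the example, not a documented contract):

```python
import torch

def accuracy(preds: torch.Tensor, labels: torch.Tensor, _batch) -> float:
    """Fraction of examples whose argmax prediction matches the label."""
    return (preds.argmax(dim=-1) == labels).float().mean().item()

evaluator = ModelEvaluator(
    model=model,
    eval_dataloader=test_dataloader,
    metrics=[accuracy]
)
```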

## 📚 Documentation

### Model Loading

```python
model_config = ModelConfig(
    model_name_or_path="meta-llama/Llama-2-7b-hf",
    load_in_4bit=True,
    use_lora=True,
    lora_config={
        "r": 16,
        "lora_alpha": 32,
        "target_modules": ["q_proj", "v_proj"]
    }
)
```
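
These LoRA fields mirror the parameters of PEFT's `LoraConfig`. For reference, a standalone PEFT equivalent (shown only to explain the fields, not QuantLLM's internal code) would be:

```python
from peft import LoraConfig

# r: rank of the low-rank update matrices; lora_alpha: scaling factor;
# target_modules: which attention projections receive LoRA adapters.
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"
)
```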

### Dataset Management

```python
dataset_config = DatasetConfig(
    dataset_name_or_path="imdb",
    dataset_type="huggingface",
    text_column="text",
    label_column="label",
    max_length=512,
    train_size=0.8,
    val_size=0.1,
    test_size=0.1
)
```
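
The 80/10/10 split configured above can also be reproduced directly with 🤗 Datasets if you want to inspect the splits yourself (a sketch using plain `datasets` calls, not QuantLLM's `DatasetSplitter`):

```python
from datasets import load_dataset

raw = load_dataset("imdb", split="train")

# First hold out 20% of the data, then split that portion evenly
# into validation and test sets, giving an 80/10/10 split overall.
train_rest = raw.train_test_split(test_size=0.2, seed=42)
val_test = train_rest["test"].train_test_split(test_size=0.5, seed=42)

train_ds = train_rest["train"]
val_ds = val_test["train"]
test_ds = val_test["test"]
```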

### Training Configuration

```python
training_config = TrainingConfig(
    learning_rate=2e-4,
    num_epochs=3,
    batch_size=4,
    gradient_accumulation_steps=4,
    warmup_steps=100,
    logging_steps=50,
    eval_steps=200,
    save_steps=500,
    early_stopping_patience=3
)
```
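
Assuming the usual semantics for these fields, this configuration gives an effective batch size of `batch_size * gradient_accumulation_steps = 4 * 4 = 16` per optimizer update, evaluates every 200 steps, saves a checkpoint every 500 steps, and stops early after 3 evaluations without improvement.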

## 🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.

## 📝 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🙏 Acknowledgments

- [HuggingFace](https://huggingface.co/) for their amazing Transformers library
- [bitsandbytes](https://github.com/TimDettmers/bitsandbytes) for quantization
- [PEFT](https://github.com/huggingface/peft) for parameter-efficient fine-tuning
- [Weights & Biases](https://wandb.ai/) for experiment tracking