This project focuses on fine-tuning the Llama-2-7b model for chat applications, using a custom dataset derived from the Amazon SageMaker FAQ. The fine-tuning process leverages several modern techniques and libraries, including LoRA (Low-Rank Adaptation), 4-bit quantization via bitsandbytes, and the Hugging Face Transformers, PEFT, and TRL libraries.
## Table of Contents

- Introduction
- Installation
- Usage
- Custom Dataset
- Training Details
- Results
- Model Deployment
- Contributing
- License
## Introduction

The aim of this project is to fine-tune a pre-trained Llama-2-7b model to generate human-like responses to questions from a custom dataset. The training data is derived from the Amazon SageMaker FAQ and was curated specifically for this project.
## Installation

To get started with this project, you need to install the required packages:

```
!pip install -q accelerate==0.21.0 peft==0.4.0 bitsandbytes==0.40.2 transformers==4.31.0 trl==0.4.7
```
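Note that 4-bit quantization with bitsandbytes requires a CUDA-capable GPU. A minimal sanity check (not part of the original notebook) before starting training:

```python
import torch

# bitsandbytes 4-bit loading needs a CUDA GPU; confirm one is visible before training
assert torch.cuda.is_available(), "No CUDA GPU detected - 4-bit quantization will not work"
print(torch.cuda.get_device_name(0))
```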
## Usage

- **Load and Prepare the Custom Dataset:**

```python
from datasets import load_dataset

# FAQ dataset hosted on the Hugging Face Hub (see the note at the end of this README)
dataset_name = "Rohandebnath243536287/Amazon_Sagemaker_Faq_llama"
dataset = load_dataset(dataset_name, split="train")
```
- **Model Configuration and Training:** The original snippet uses `bnb_config`, `peft_config`, and `device_map` without defining them; the definitions below are representative 4-bit NF4 and LoRA settings, so adjust them to your setup.

```python
import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer

model_name = "NousResearch/Llama-2-7b-chat-hf"

# 4-bit quantization settings (NF4 with fp16 compute; typical defaults)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA settings (representative values; tune r, alpha, and dropout as needed)
peft_config = LoraConfig(
    r=64,
    lora_alpha=16,
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM",
)

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 has no pad token by default

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map={"": 0},  # load the entire model on GPU 0
)

training_arguments = TrainingArguments(
    output_dir="./results",
    num_train_epochs=1,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=1,
    learning_rate=2e-4,
    weight_decay=0.001,
    logging_steps=25,
    lr_scheduler_type="cosine",
    report_to="tensorboard",
)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    dataset_text_field="text",  # assumes the training text lives in a "text" column
    tokenizer=tokenizer,
    args=training_arguments,
)
trainer.train()
```
- **Inference:**

```python
from transformers import pipeline

prompt = "What security measures does Amazon SageMaker have?"
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=200)
result = pipe(f"[INST] {prompt} [/INST]")
print(result[0]["generated_text"])
```
## Custom Dataset

This project uses a custom dataset derived from the Amazon SageMaker FAQ, specifically curated for training the Llama-2-7b model. The dataset is a collection of questions and corresponding answers covering various aspects of Amazon SageMaker.
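The exact column layout of the dataset is not documented here. As an illustration only, if the raw FAQ were stored as separate question/answer fields, each pair could be folded into the Llama-2 `[INST] ... [/INST]` chat format (the same template used for inference above). The `question` and `answer` field names below are hypothetical:

```python
# Hypothetical preprocessing sketch: wrap each FAQ pair in the Llama-2 chat template.
# The "question"/"answer" column names are illustrative, not taken from the actual dataset.
def to_llama_prompt(example):
    return {"text": f"<s>[INST] {example['question']} [/INST] {example['answer']} </s>"}

# dataset = dataset.map(to_llama_prompt)  # only needed if the data is not already formatted
```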
## Training Details

- Model: Llama-2-7b
- Dataset: Custom Amazon SageMaker FAQ derivative
- Fine-tuning: Utilizes LoRA for parameter-efficient fine-tuning (see the sketch after this list)
- Quantization: 4-bit quantization to reduce memory usage and improve inference speed
- Training Epochs: 1
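To see how few parameters LoRA actually updates, you can wrap the quantized base model with the LoRA configuration from the training step and print the trainable-parameter count. This is an optional sanity check rather than part of the training script, since `SFTTrainer` applies `peft_config` internally:

```python
from peft import get_peft_model

# Attach the LoRA adapters defined in peft_config and report how many parameters
# are actually trainable (a small fraction of the 7B total)
lora_model = get_peft_model(model, peft_config)
lora_model.print_trainable_parameters()
```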
## Results

The fine-tuned model can generate coherent and contextually appropriate responses to questions about Amazon SageMaker. Here is an example:
**Input:** What security measures does Amazon SageMaker have?

**Output:**
Amazon SageMaker provides several security measures to protect your data and models. These include:
1. Data Encryption: Data is encrypted at rest and in transit.
2. IAM Policies: Fine-grained access control using AWS Identity and Access Management (IAM).
3. VPC Support: Secure network isolation using Amazon Virtual Private Cloud (VPC).
4. Logging and Monitoring: Integration with AWS CloudTrail and Amazon CloudWatch for logging and monitoring activities.
## Model Deployment

The fine-tuned model and tokenizer can be pushed to the Hugging Face Hub:

```
!huggingface-cli login
```

```python
model.push_to_hub("Rohandebnath243536287/Llama-2-7b-chat-finetune")
tokenizer.push_to_hub("Rohandebnath243536287/Llama-2-7b-chat-finetune")
```
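Once the weights are on the Hub, they can be pulled back down for inference like any other checkpoint. The sketch below assumes the pushed repository contains full (merged) model weights; if only the LoRA adapters were uploaded, you would instead load the base model and attach the adapters with `peft`:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint directly from the Hub (assumes merged weights were pushed)
pipe = pipeline(
    task="text-generation",
    model="Rohandebnath243536287/Llama-2-7b-chat-finetune",
    device_map="auto",
    max_length=200,
)
print(pipe("[INST] What security measures does Amazon SageMaker have? [/INST]")[0]["generated_text"])
```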
## Contributing

Contributions are welcome! Please open an issue or submit a pull request for any improvements or additions.
## License

This project is licensed under the MIT License.
Make sure to replace `"Rohandebnath243536287/Amazon_Sagemaker_Faq_llama"` with the appropriate dataset identifier you've used or plan to use on Hugging Face Datasets Hub. Adjust any other details or sections based on your specific project requirements.