FineTuning

This repository contains fine-tuned transformer models for several NLP tasks. Details for each trained model are listed below.

🤖 Fine-tuned Models

1. DistilBERT for AG News Classification

  • Model: noviciusss/agnewsDistilt
  • Base Model: DistilBERT Base Uncased
  • Dataset: AG News (SetFit/ag_news)
  • Task: News Article Classification
  • Classes: 4 categories (World, Sports, Business, Sci/Tech)
  • Performance:
    • Accuracy: ~94.7%
    • F1 Macro: ~94.7%
  • Training Details:
    • Epochs: 3
    • Learning Rate: 2e-5
    • Batch Size: 16 (train), 32 (eval)
    • Weight Decay: 0.01
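
A quick sanity check on the run length implied by these settings: the standard AG News train split has 120,000 examples (30,000 per class), so batch size 16 gives 7,500 optimizer steps per epoch. A minimal sketch (the split size comes from the public AG News release, not from this repo):

```python
# Steps-per-epoch arithmetic for the DistilBERT / AG News run.
# 120,000 is the standard AG News train split size (4 classes x 30,000);
# batch size and epoch count are the values listed above.
train_examples = 120_000
batch_size = 16
epochs = 3

steps_per_epoch = train_examples // batch_size
total_steps = steps_per_epoch * epochs
print(steps_per_epoch, total_steps)  # 7500 22500
```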

2. RoBERTa for Banking Intent Classification

  • Model: noviciusss/RoBERTa-base_Banking77
  • Base Model: FacebookAI/roberta-base
  • Dataset: Banking77 (mteb/banking77)
  • Task: Banking Intent Classification
  • Classes: 77 banking-related intents
  • Performance:
    • Accuracy: ~93.7%
    • F1 Macro: ~93.6%
  • Training Details:
    • Epochs: 5
    • Learning Rate: 2e-5
    • Batch Size: 16 (train), 32 (eval)
    • Weight Decay: 0.01

3. FLAN-T5 LoRA for Dialogue Summarization

  • Model: noviciusss/flan-t5-base-samsum
  • Base Model: google/flan-t5-base with LoRA adapters (r=16, alpha=32, dropout=0.05)
  • Dataset: SAMSum (knkarthick/samsum)
  • Task: Dialogue Summarization
  • Performance:
    • ROUGE-1: ~49.0
    • ROUGE-L: ~41.0
    • BERTScore F1: ~72.3
    • METEOR: ~42.5
  • Training Details:
    • Epochs: 3
    • Learning Rate: 1e-4
    • Batch Size: 8 (train), 8 (eval)
    • Generation Max Length: 128
    • predict_with_generate enabled for evaluation

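The LoRA hyperparameters above determine how few weights actually train. For a single 768×768 projection (768 is the flan-t5-base hidden size), rank-16 adapters replace the full weight update with two thin matrices. Which modules the adapters target is not stated in this repo, so the projection below is purely illustrative:

```python
# LoRA trainable-parameter count for one 768x768 projection at r=16.
# A is (r, d_in), B is (d_out, r); only A and B train, the base weight stays frozen.
d_in, d_out, r, alpha = 768, 768, 16, 32

full_update = d_in * d_out          # params a full fine-tune would update
lora_params = r * (d_in + d_out)    # params the adapter actually trains
scaling = alpha / r                 # LoRA scaling factor applied to B @ A

print(lora_params, full_update, scaling)  # 24576 589824 2.0
```

At roughly 4% of the parameters of a full update per projection, this is why the adapter checkpoint stays lightweight.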
📊 Model Comparison

| Model | Dataset | Task | Key Metrics | Base Model |
| --- | --- | --- | --- | --- |
| agnewsDistilt | AG News | News classification | Accuracy: 94.7%; F1 Macro: 94.7% | DistilBERT |
| RoBERTa-base_Banking77 | Banking77 | Intent classification | Accuracy: 93.7%; F1 Macro: 93.6% | RoBERTa |
| flan-t5-base-samsum | SAMSum | Dialogue summarization | ROUGE-1: 49.0; ROUGE-L: 41.0; BERTScore F1: 72.3 | FLAN-T5 + LoRA |

🚀 Usage

You can use these models directly from Hugging Face:

```python
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    AutoModelForSeq2SeqLM,
)

# AG News classification
ag_news_tokenizer = AutoTokenizer.from_pretrained("noviciusss/agnewsDistilt")
ag_news_model = AutoModelForSequenceClassification.from_pretrained("noviciusss/agnewsDistilt")

# Banking intent classification
banking_tokenizer = AutoTokenizer.from_pretrained("noviciusss/RoBERTa-base_Banking77")
banking_model = AutoModelForSequenceClassification.from_pretrained("noviciusss/RoBERTa-base_Banking77")

# Dialogue summarization
sam_tokenizer = AutoTokenizer.from_pretrained("noviciusss/flan-t5-base-samsum")
sam_model = AutoModelForSeq2SeqLM.from_pretrained("noviciusss/flan-t5-base-samsum")
inputs = sam_tokenizer("summarize: Alice met Bob to discuss the launch timeline.", return_tensors="pt")
summary_ids = sam_model.generate(**inputs, max_length=128)
summary = sam_tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```
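
The classification models return raw logits; mapping them to one of the four AG News labels is a softmax plus argmax. A model-free sketch of that post-processing step (the logits below are made-up numbers, and the label order follows SetFit/ag_news):

```python
import math

# AG News label order used by the SetFit/ag_news dataset
LABELS = ["World", "Sports", "Business", "Sci/Tech"]

def logits_to_label(logits):
    """Softmax over raw logits, return (top label, its probability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]   # subtract max for numerical stability
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]

# Made-up logits, standing in for ag_news_model(**inputs).logits[0]
print(logits_to_label([-1.2, 0.3, 4.1, 0.0]))  # ('Business', ...)
```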

πŸ“ Repository Structure

FineTunning/
β”œβ”€β”€ AgNews_DistilBERT_model/
β”‚   └── FineTuning_1.ipynb          # DistilBERT fine-tuning notebook
β”œβ”€β”€ RoBERTa_base_Banking77/
β”‚   └── RoBERTa_base_Banking77.ipynb # RoBERTa fine-tuning notebook
β”œβ”€β”€ flan-t5-base-samsum_lora/
β”‚   └── Text_Summ_t5Base_SAMsum.ipynb # FLAN-T5 LoRA summarization notebook
└── README.md

πŸ› οΈ Training Environment

  • Framework: Transformers (Hugging Face)
  • Hardware: GPU-accelerated training
  • Evaluation Strategy: Per-epoch evaluation, with predict_with_generate for seq2seq runs
  • Adapters: LoRA applied to FLAN-T5 (r=16, alpha=32, dropout=0.05)
  • Metrics: Accuracy and F1 for classification; ROUGE, BERTScore, METEOR, BLEU for summarization
  • Model Selection: Classification runs select the best checkpoint by accuracy; the summarization run selects by ROUGE-L

πŸ“ Notes

  • All training runs use FP16 precision for efficiency
  • Models are saved and pushed to Hugging Face Hub automatically after evaluation
  • Training pipelines track task-appropriate metrics (accuracy/F1 or ROUGE/BERTScore/METEOR/BLEU)
  • Data preprocessing includes consistent tokenization, padding, and task-specific prompts (e.g., summarize: prefix)
  • LoRA adapters keep the FLAN-T5 summarizer lightweight while preserving base model weights
