
Releases: peremartra/optipfair

v0.3.0 - Fairness-Aware Pruning

02 Mar 16:17


🎉 New Features

Fairness-Aware Pruning

  • New Function: analyze_neuron_bias() - Analyze per-neuron bias contributions across multiple demographic prompt pairs
    • Computes activation-based bias scores for individual neurons
    • Supports multiple aggregation methods (mean, max) across sequence positions
    • Works with GLU architecture MLP layers (gate_proj, up_proj)
  • New Function: compute_fairness_pruning_scores() - Combine bias and importance scores for balanced pruning
    • Configurable bias_weight parameter (0.0 to 1.0) to adjust fairness vs. performance trade-offs
    • Returns fairness pruning scores for each layer
    • Enables fairness-aware neuron selection strategies
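
The score combination can be illustrated with a short sketch. This is pure Python, independent of the library, and the exact formula inside `compute_fairness_pruning_scores()` may differ; here both score lists are min-max normalized and blended by `bias_weight` (an assumption based on the documented 0.0–1.0 trade-off semantics):

```python
def fairness_pruning_scores(importance, bias, bias_weight=0.5):
    """Blend per-neuron importance (keep high) with bias (prune high).

    Illustrative only: normalizes both score lists to [0, 1], then
    returns (1 - bias_weight) * importance - bias_weight * bias, so
    biased, unimportant neurons receive the lowest (prune-first) scores.
    """
    def normalize(xs):
        lo, hi = min(xs), max(xs)
        span = (hi - lo) or 1.0
        return [(x - lo) / span for x in xs]

    imp, b = normalize(importance), normalize(bias)
    return [(1 - bias_weight) * i - bias_weight * j for i, j in zip(imp, b)]

# bias_weight=0.0 reduces to pure importance; 1.0 prunes purely by bias
scores = fairness_pruning_scores([0.9, 0.1, 0.5], [0.8, 0.2, 0.1], bias_weight=0.5)
```

With `bias_weight=0.5`, the highly biased first neuron loses its importance advantage, while the low-bias third neuron becomes the strongest keep candidate.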

Enhanced Pruning Integration

  • Modified: prune_model_mlp_glu() - Improved compatibility with fairness-aware workflows
  • Documentation: Added comprehensive fairness-aware pruning guide with examples

📚 Documentation Enhancements

  • Complete guide to fairness-aware pruning workflow
  • Step-by-step tutorials for new functions
  • Explanation of the bias_weight parameter, with recommended configurations
  • Complete end-to-end examples
  • New example notebook: fairness_aware_pruning_demo.ipynb

🧪 Testing & Quality

  • Compatible with existing pruning functionality
  • No breaking changes to existing API
  • All existing tests continue to pass

📦 Installation

pip install optipfair==0.3.0

OptiPFair v0.2.4 - L2 & Universal DataLoader Support

10 Jan 19:05


Bug Fix: Hybrid Pruning

04 Dec 09:15


v0.2.2 - Selective Layer Width Pruning

26 Nov 15:55


🚀 OptiPFair v0.2.2 - Selective Layer Width Pruning

We're excited to announce OptiPFair v0.2.2, bringing powerful new capabilities for fine-grained control over model pruning!

🎯 Headline Features

1️⃣ Selective Layer Width Pruning

The layer_indices parameter now works for both DEPTH and MLP_GLU pruning, giving you unprecedented control over which layers to optimize:

from optipfair import prune_model

# Prune neurons ONLY in specific layers (preserve first & last)
pruned_model = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    pruning_percentage=30,
    layer_indices=[5, 10, 15, 20],  # Only these layers are pruned
    show_progress=True
)

Key Benefits:

  • 🛡️ Preserve Critical Layers: Keep embedding and output layers at full capacity
  • 🎯 Targeted Optimization: Prune only the layers that matter
  • 🔬 Data-Driven Selection: Combine with layer importance analysis
  • Full Feature Support: Works with expansion_rate, expansion_divisor, dataloader, all methods

2️⃣ Optimized Hybrid Importance Calculation

We've streamlined the data-driven pruning algorithm for better performance:

  • Simplified gate_proj & up_proj: Now use the same fast MAW method as static pruning
  • Focused Complexity: Activation-weighted calculation only where it matters (down_proj)
  • Faster Execution: Reduced computational overhead while maintaining effectiveness
  • Consistent Methodology: Same MAW formula across static and hybrid approaches
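
The MAW idea behind the shared formula can be sketched in a few lines. This is an assumption based on the method's name ("maximum absolute weight") and is not the library's exact implementation; `gate_rows`/`up_rows` stand in for the per-neuron rows of gate_proj and up_proj:

```python
def maw_scores(gate_rows, up_rows):
    """Score each GLU neuron pair by its maximum absolute weight.

    Assumption: MAW takes the largest |w| across a pair's gate_proj and
    up_proj rows; the library's exact definition may differ.
    """
    return [
        max(max(abs(w) for w in g), max(abs(w) for w in u))
        for g, u in zip(gate_rows, up_rows)
    ]

# pair 0 is dominated by |-0.9| in gate_proj, pair 1 by |-0.5| in up_proj
scores = maw_scores([[0.1, -0.9], [0.2, 0.3]], [[0.4, 0.0], [-0.5, 0.1]])
```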

📊 What's New

Extended API

  • layer_indices parameter now contextual: removes layers for DEPTH, prunes neurons for MLP_GLU
  • ✅ Comprehensive validation: checks for valid indices, duplicates, empty lists, type errors
  • ✅ Enhanced statistics: reports pruned_layers and total_layers for selective pruning

Improved Performance

  • ⚡ Faster hybrid importance calculation
  • 💾 Selective hook registration (only on specified layers)
  • 🎯 More efficient calibration with layer_indices

Better Documentation

  • 📖 Complete "Selective Layer Width Pruning" guide in README
  • 📝 Extended reference manual with 4+ detailed examples
  • 💻 New example file with 5 practical use cases
  • 🧪 12 comprehensive test cases

💡 Common Use Cases

Use Case 1: Preserve Embedding Layers

# Prune all middle layers, preserve first and last 5
num_layers = len(model.model.layers)
middle_layers = list(range(5, num_layers - 5))

pruned_model = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    pruning_percentage=25,
    layer_indices=middle_layers
)

Use Case 2: Importance-Based Pruning

from optipfair import analyze_layer_importance

# Step 1: Analyze which layers are least important
importance_scores = analyze_layer_importance(model, dataloader)
sorted_layers = sorted(importance_scores.items(), key=lambda x: x[1])
least_important = [idx for idx, score in sorted_layers[:10]]

# Step 2: Prune only those layers
pruned_model = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    pruning_percentage=30,
    layer_indices=least_important
)

Use Case 3: Data-Driven Selective Pruning

# Combine calibration data with selective pruning
pruned_model = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    neuron_selection_method="MAW",
    pruning_percentage=20,
    dataloader=calibration_dataloader,  # Hybrid importance
    layer_indices=[5, 10, 15, 20],      # Only these layers
    show_progress=True
)

🔧 Technical Highlights

Modified Core Functions

  • prune_model(): Now passes layer_indices to MLP_GLU pruning
  • prune_model_mlp_glu(): Full selective pruning implementation with validation
  • setup_mlp_hooks_for_importance(): Selective hook registration
  • compute_neuron_pair_importance_maw_hybrid(): Simplified and optimized
  • get_pruning_statistics(): Detects and reports selective pruning

Enhanced CLI

# CLI now supports layer_indices for both pruning types
optipfair prune \
  --model-path meta-llama/Llama-3.2-1B \
  --pruning-type MLP_GLU \
  --pruning-percentage 30 \
  --layer-indices "5,10,15,20" \
  --output-path ./pruned-model

🧪 Testing & Validation

  • ✅ 12 comprehensive test cases in tests/test_selective_layer_pruning.py
  • ✅ Tested with all neuron selection methods (MAW, VOW, PON)
  • ✅ Verified compatibility with expansion_rate, expansion_divisor, dataloader
  • ✅ Validated error handling and edge cases
  • ✅ Confirmed backward compatibility with v0.2.1

📦 Installation

pip install --upgrade optipfair

Or with visualization support:

pip install --upgrade "optipfair[viz]"

📚 Resources

🙏 Acknowledgments

Thank you to our community for the feedback and suggestions that made this release possible!

📝 Full Changelog

See CHANGELOG.md for detailed changes.


Upgrade today and take control of your model optimization! 🚀

Questions or issues? Open an issue on GitHub.

Hardware-Optimized Width Pruning

24 Nov 10:11


🎉 OptiPFair v0.2.1 - Hardware-Optimized Pruning

This release introduces the expansion_divisor parameter for hardware-optimized model pruning, enabling better GPU/TPU performance through aligned tensor dimensions.

✨ What's New

Hardware-Optimized Pruning with expansion_divisor

The new expansion_divisor parameter allows you to round intermediate layer sizes to specific multiples (32, 64, 128, or 256), optimizing pruned models for modern GPU and TPU architectures.

Quick Example:

from optipfair import prune_model

pruned_model = prune_model(
    model=model,
    pruning_percentage=20,
    expansion_divisor=128,  # Round to multiple of 128
    show_progress=True
)

Key Benefits:

  • 🚀 Better GPU performance through optimized memory access patterns
  • ⚡ Improved tensor core efficiency with aligned dimensions
  • 🎯 Flexible integration with both pruning_percentage and expansion_rate
  • 🔧 Simple to use - just one parameter

📚 New Resources

  • Example Notebook: expansion_divisor_example.ipynb - Complete tutorial with comparisons
  • Test Suite: Comprehensive tests in tests/test_expansion_divisor.py
  • Documentation: Updated README, LLM reference manual, and API docs

🔧 Technical Details

New Functions:

  • round_to_divisor(): Utility function for precise rounding to nearest multiple

Modified Functions:

  • prune_model(): Added expansion_divisor parameter
  • prune_model_mlp_glu(): Integrated validation and rounding logic
  • prune_neuron_pairs(): Applies rounding after pruning calculation

Validation:

  • Valid values: None (default), 32, 64, 128, 256
  • Requires either pruning_percentage or expansion_rate
  • Maintains bounds: result always ≥1 and ≤ original size
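
A minimal sketch of the documented rounding behavior (pure Python; the name `round_to_divisor` matches the utility listed above, but the clamping details here are an assumption from the stated bounds):

```python
def round_to_divisor(size, divisor, original_size):
    """Round a computed intermediate size to the nearest multiple of divisor.

    Illustrative sketch: with divisor=None the size passes through
    unchanged; otherwise the result is a multiple of divisor, clamped
    to stay >= divisor (hence >= 1) and <= the original size.
    """
    if divisor is None:
        return size
    rounded = max(divisor, round(size / divisor) * divisor)
    return min(rounded, original_size)

round_to_divisor(1000, 128, 2048)  # nearest multiple of 128 is 1024
```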

🔄 Compatibility

  • ✅ Fully backward compatible with v0.2.0
  • ✅ Works with all neuron selection methods (MAW, VOW, PON)
  • ✅ Compatible with both static and data-driven pruning
  • ✅ No breaking changes

📦 Installation

pip install --upgrade optipfair
# or
pip install optipfair==0.2.1

📖 Documentation

🙏 Acknowledgments

Thank you to the community for your feedback and contributions!


Full Changelog: https://github.com/peremartra/optipfair/blob/main/CHANGELOG.md

v0.2.0 - Data-Driven Width Pruning

27 Oct 15:05


OptiPFair v0.2.0 - Data-Driven Width Pruning

🌟 Major Features

Data-Driven Width Pruning

This release introduces hybrid importance calculation for neuron pruning, combining static weight analysis with dynamic activation statistics from calibration data.

Key capabilities:

  • Activation-aware pruning: Uses real data to guide neuron selection
  • Domain adaptation: Optimize pruning for your specific use case
  • Research-backed: Based on CFSP methodology (arXiv:2409.13199v2)
  • Easy integration: Just add a dataloader parameter
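
The hybrid idea can be sketched in a few lines. This is an illustration of combining static weight magnitudes with calibration activation statistics, not the library's exact CFSP-based formula:

```python
def hybrid_importance(weight_rows, activation_norms):
    """Combine static weight magnitude with calibration activation norms.

    Sketch of the data-driven idea: a neuron's score is its summed
    absolute weight scaled by how strongly it activates on calibration
    data. The actual formula in OptiPFair may differ.
    """
    return [
        sum(abs(w) for w in row) * act
        for row, act in zip(weight_rows, activation_norms)
    ]

# neuron 0 has larger weights, but neuron 1 fires far more on real data
scores = hybrid_importance([[0.5, 0.5], [0.2, 0.2]], [0.1, 2.0])
```

This is why calibration data matters: a neuron that looks important statically can be outranked by one that actually activates on your domain.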

What's New

API Changes

  • Added dataloader parameter to prune_model() function
  • Automatic switching between static and hybrid pruning
  • Compatible with MAW neuron selection method

New Functions

  • compute_neuron_pair_importance_maw_hybrid(): Hybrid importance calculation
  • setup_mlp_hooks_for_importance(): Activation capture via PyTorch hooks
  • run_calibration_forward_passes(): Calibration workflow with progress tracking
  • get_activation_norms(): Retrieve accumulated activation statistics

Documentation

  • Complete usage guide for data-driven pruning
  • Updated API reference with examples
  • Best practices for calibration data selection
  • Comprehensive CHANGELOG

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
from torch.utils.data import DataLoader, TensorDataset
from optipfair import prune_model

# Load model
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-1B")

# Prepare calibration data
texts = ["Your domain-specific examples..."] * 500
inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
dataset = TensorDataset(inputs['input_ids'], inputs['attention_mask'])
dataloader = DataLoader(dataset, batch_size=8)

# Prune with data-driven method
pruned_model = prune_model(
    model=model,
    neuron_selection_method="MAW",
    pruning_percentage=20,
    dataloader=dataloader,  # ← NEW: Enables hybrid pruning
    show_progress=True
)

Installation

pip install --upgrade optipfair

Breaking Changes

None - This release is fully backward compatible with v0.1.x

🔗 Documentation

Acknowledgments

This implementation is based on the CFSP paper:
"CFSP: An Efficient Structured Pruning Framework for LLMs with Coarse-to-Fine Activation Information" (arXiv:2409.13199v2)


Full Changelog: https://github.com/peremartra/optipfair/blob/main/CHANGELOG.md

OptiPFair v0.1.5 - Layer Importance Analysis

24 Sep 21:34


New Features

Layer Importance Analysis

  • Added analyze_layer_importance() function for analyzing transformer layer importance using cosine similarity
  • Multi-architecture support: automatic detection of layer paths for LLaMA, Qwen, Mistral, GPT-2, and other architectures
  • Integration with depth pruning workflows to inform layer removal decisions
  • Progress tracking and robust error handling
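
The cosine-similarity idea can be sketched for a single layer. This is an illustration of the principle (a layer whose output barely rotates its input is a good removal candidate), not the library's exact aggregation across tokens and batches:

```python
import math

def layer_importance(hidden_in, hidden_out):
    """Importance of a layer as 1 - cosine similarity of its input/output.

    Sketch only: hidden_in/hidden_out stand in for the hidden states
    entering and leaving one transformer layer. A value near 0 means
    the layer changes little and may be safe to remove.
    """
    dot = sum(a * b for a, b in zip(hidden_in, hidden_out))
    norm = (math.sqrt(sum(a * a for a in hidden_in))
            * math.sqrt(sum(b * b for b in hidden_out)))
    return 1.0 - dot / norm

layer_importance([1.0, 0.0], [1.0, 0.0])  # identical direction -> 0.0
```

Note the metric ignores pure scaling: only the direction of the hidden state matters.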

Improvements

  • Enhanced documentation with layer analysis examples
  • Updated API reference with new functionality

Usage Example

from optipfair import analyze_layer_importance
importance_scores = analyze_layer_importance(model, dataloader)

OptiPFair v0.1.4 - Depth Pruning Support

18 Jul 08:55


[0.1.4] - 2025-07-18

Added

  • Depth pruning functionality for removing entire transformer layers
  • Enhanced documentation with complete depth pruning guide
  • Automated documentation deployment via GitHub Actions
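
The core operation can be sketched with a plain list standing in for the model's layer stack (how the library rebuilds the `nn.ModuleList` and updates the model config is not shown here):

```python
def drop_layers(layers, indices_to_remove):
    """Remove whole transformer layers by index (depth pruning sketch).

    `layers` stands in for e.g. model.model.layers; the real
    implementation rebuilds the module list and updates the config's
    layer count accordingly.
    """
    removed = set(indices_to_remove)
    return [layer for i, layer in enumerate(layers) if i not in removed]

drop_layers(["L0", "L1", "L2", "L3"], [1, 3])  # -> ["L0", "L2"]
```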

Changed

  • Updated examples to include depth pruning demonstrations
  • Improved API documentation structure

Fixed

  • Documentation deployment workflow permissions
  • Missing dependencies in CI/CD pipeline

Added Bias Activation Visualizations

20 Apr 22:00
ef1289b


This release adds a comprehensive bias visualization module for analyzing how transformer models process information differently based on protected attributes (race, gender, etc.).

New features:

  • Visualization of activation differences across model layers
  • Heatmap analysis for detailed inspection of bias patterns
  • PCA visualization showing demographic effect on token representations
  • Quantitative bias metrics for consistent evaluation
  • Integration with existing pruning functionality
  • Documentation and examples for bias analysis

This update enables researchers and practitioners to understand where bias manifests in model architectures and evaluate how pruning affects fairness.
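
One kind of quantitative metric this module reports can be sketched as follows. The function name and exact aggregation are illustrative, not the module's API; `acts_a`/`acts_b` stand in for activations captured for two prompts that differ only in a protected attribute:

```python
def mean_activation_difference(acts_a, acts_b):
    """Quantify bias as the mean |activation difference| of a prompt pair.

    Sketch only: compares activations for two prompts identical except
    for a demographic term (e.g. a name or pronoun swap). Larger values
    indicate the model processes the pair more differently.
    """
    assert len(acts_a) == len(acts_b)
    return sum(abs(a - b) for a, b in zip(acts_a, acts_b)) / len(acts_a)

mean_activation_difference([0.2, 0.4, 0.6], [0.2, 0.1, 0.9])
```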

Fixes

13 Apr 20:03


Fixed a bug in the creation of new layers.