Commit 55b5c02

Release v0.2.2: Selective layer width pruning & optimized hybrid importance

1 parent fff73be

File tree: 13 files changed, +1559 −25 lines

.DS_Store

Binary file not shown.

CHANGELOG.md

Lines changed: 79 additions & 0 deletions
## [0.2.2] - 2025-11-26

### 🎉 New Features

#### Selective Layer Width Pruning

- **layer_indices for MLP_GLU**: Extended the `layer_indices` parameter to support selective neuron pruning in specific layers
- **Contextual Usage**: For DEPTH pruning, specifies layers to remove; for MLP_GLU, specifies layers to prune
- **Preservation Strategy**: Allows preserving critical layers (e.g., first/last) at full capacity while pruning the others
- **Full Compatibility**: Works seamlessly with all MLP_GLU features (expansion_rate, expansion_divisor, dataloader, all methods)

#### Simplified Hybrid Importance Calculation

- **Optimized MAW Hybrid**: Simplified `compute_neuron_pair_importance_maw_hybrid()` to use simple MAW for gate_proj and up_proj
- **Focused Complexity**: Keeps the complex activation-weighted calculation only for down_proj, where it has the most impact
- **Better Performance**: Faster execution by eliminating unnecessary calculations
- **Consistent Formula**: Uses the same MAW method (max + |min|) as static pruning for the gate/up components
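The (max + |min|) rule mentioned above can be illustrated with a minimal sketch. This only demonstrates the scoring formula; the function name and the plain-list representation of weight rows are assumptions for clarity, and the library itself operates on the actual gate_proj/up_proj weight tensors.

```python
def maw_importance(weight_rows):
    """Score each neuron as max(weights) + |min(weights)|.

    Illustrative sketch of the MAW (max + |min|) formula named in the
    changelog; plain lists stand in for weight tensor rows.
    """
    return [max(row) + abs(min(row)) for row in weight_rows]

# Two hypothetical neuron weight rows: large-magnitude weights score higher,
# so the second neuron would be pruned before the first.
scores = maw_importance([[0.5, -0.25, 0.1], [0.05, -0.05, 0.02]])
```

Neurons with the lowest scores are the pruning candidates, which is why preserving the formula across static and hybrid modes keeps the two rankings comparable.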
### ✨ Enhancements

- **Extended API**: The `layer_indices` parameter now works for both DEPTH and MLP_GLU pruning types
- **Smart Validation**: Comprehensive error checking for layer indices (range, duplicates, empty lists, types)
- **Enhanced Statistics**: `get_pruning_statistics()` now reports selective pruning info (pruned_layers, total_layers)
- **Selective Calibration**: Hooks are registered only on the selected layers when using data-driven pruning with layer_indices
- **CLI Support**: Updated the `--layer-indices` help text to mention both pruning types
- **Backward Compatible**: `layer_indices=None` maintains the default behavior (prunes all layers)
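The validation categories listed above (range, duplicates, empty lists, types) might look roughly like the following sketch. The function name and error messages are hypothetical, not OptiPFair's actual code; only the categories of checks come from the changelog.

```python
def validate_layer_indices(layer_indices, num_layers):
    """Hypothetical sketch of layer_indices validation: type,
    emptiness, duplicate, and range checks."""
    if layer_indices is None:
        # Default behavior: prune every layer
        return list(range(num_layers))
    if not isinstance(layer_indices, list):
        raise TypeError("layer_indices must be a list of ints or None")
    if not layer_indices:
        raise ValueError("layer_indices must not be an empty list")
    if len(set(layer_indices)) != len(layer_indices):
        raise ValueError("layer_indices must not contain duplicates")
    for idx in layer_indices:
        if not isinstance(idx, int) or not 0 <= idx < num_layers:
            raise ValueError(
                f"layer index {idx} is out of range for {num_layers} layers"
            )
    return sorted(layer_indices)
```

Failing fast here is cheap insurance: a bad index caught before pruning avoids silently corrupting the model.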
### 🔧 Technical Details

#### Modified Functions

- `prune_model()`: Updated docstring and passes `layer_indices` to `prune_model_mlp_glu()`
- `prune_model_mlp_glu()`: Added `layer_indices` parameter with full validation and filtering logic
- `setup_mlp_hooks_for_importance()`: Now accepts `layer_indices` to register hooks only on selected layers
- `compute_neuron_pair_importance_maw_hybrid()`: Simplified to use MAW for gate/up, complex calculation only for down
- `get_pruning_statistics()`: Detects and reports selective pruning information
- CLI `commands.py`: Removed the restriction blocking `layer_indices` for MLP_GLU; added parsing logic
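The selective hook registration described for `setup_mlp_hooks_for_importance()` can be sketched with standard PyTorch forward hooks. Everything below (the `TinyBlock` module, layer count, tensor shapes) is a stand-in for illustration under assumed names, not the library's code; the point is simply that hooks attach only to the layers in `layer_indices`.

```python
import torch
from torch import nn

class TinyBlock(nn.Module):
    """Stand-in for a transformer MLP block with a down_proj layer."""
    def __init__(self):
        super().__init__()
        self.down_proj = nn.Linear(8, 4)

    def forward(self, x):
        return self.down_proj(x)

layers = [TinyBlock() for _ in range(6)]
layer_indices = [1, 3]  # only these layers get calibration hooks
captured = {}

def make_hook(idx):
    def hook(module, inputs, output):
        # Record the activations entering down_proj for this layer only
        captured.setdefault(idx, []).append(inputs[0].detach())
    return hook

handles = [layers[i].down_proj.register_forward_hook(make_hook(i))
           for i in layer_indices]

x = torch.randn(2, 8)
for layer in layers:  # a "calibration" forward pass over all layers
    layer(x)

for h in handles:     # always remove hooks after calibration
    h.remove()
```

Because unselected layers never run a hook, calibration stores activations only where they will actually be used, which is what makes selective data-driven pruning cheaper.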
### 📚 Documentation

- **README.md**: New "Selective Layer Width Pruning" section with examples and use cases
- **Reference Manual**: Comprehensive section with 4+ usage examples and best practices
- **New Example File**: `examples/selective_layer_width_pruning.py` with 5 complete examples
- **Updated Roadmap**: Marked selective pruning as completed in v0.2.2
- **API Documentation**: Updated parameter descriptions for contextual meaning

### 🧪 Testing

- Complete test suite in `tests/test_selective_layer_pruning.py`
- 12 comprehensive test cases covering:
  - Basic selective pruning (single and multiple layers)
  - All neuron selection methods (MAW, VOW, PON)
  - Compatibility with expansion_rate and expansion_divisor
  - Data-driven pruning with layer_indices
  - Invalid input handling and validation
  - Statistics reporting
  - Weight preservation in unpruned layers
  - Result consistency and reproducibility

### 💡 Use Cases

1. **Preserve Critical Layers**: Keep first and last layers at full capacity
2. **Importance-Based**: Target the least important layers identified by analysis
3. **Domain Adaptation**: Implement asymmetric pruning strategies
4. **Experimental**: Test different layer-wise pruning patterns

### 🔒 Compatibility

- Fully backward compatible with v0.2.1
- Works with all neuron selection methods (MAW, VOW, PON)
- Compatible with both static and data-driven pruning
- Integrates with expansion_rate and expansion_divisor

### ⚠️ Important Notes

- `layer_indices` validation ensures indices are valid, unique integers within the model's layer range
- Empty lists raise `ValueError`
- Selective pruning with a dataloader calibrates only the specified layers (more efficient)
- Statistics include `pruned_layers` and `total_layers` when selective pruning is detected

---
## [0.2.1] - 2025-11-24

### 🎉 New Features

README.md

Lines changed: 87 additions & 0 deletions
**Note:** Data-driven pruning is currently only available with `neuron_selection_method="MAW"`. Using a dataloader with "VOW" or "PON" will raise a `ValueError`.
### Selective Layer Width Pruning (NEW in v0.2.2)

Prune neurons only in specific layers while leaving others unchanged. Perfect for preserving critical layers or implementing layer-specific optimization strategies.

```python
from transformers import AutoModelForCausalLM
from optipfair import prune_model

# Load a pre-trained model
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# Prune neurons only in specific layers (e.g., middle layers)
pruned_model, stats = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    neuron_selection_method="MAW",
    pruning_percentage=30,
    layer_indices=[5, 10, 15, 20, 25],  # Only prune these layers
    show_progress=True,
    return_stats=True
)

# Print pruning statistics
print(f"Pruned {stats['pruned_layers']} of {stats['total_layers']} layers")
print(f"Total reduction: {stats['reduction']:,} parameters ({stats['percentage_reduction']:.2f}%)")

# Save the pruned model
pruned_model.save_pretrained("./selective-pruned-llama")
```

**Key Benefits:**

- 🎯 **Precision Control**: Choose exactly which layers to optimize
- 🛡️ **Preserve Critical Layers**: Keep first and last layers at full capacity
- 🔬 **Data-Driven Selection**: Combine with layer importance analysis
- **Full Compatibility**: Works with all MLP_GLU features (expansion_rate, expansion_divisor, dataloader)

**Use Cases:**

- Preserve embedding and output layers while pruning middle layers
- Target specific layer ranges based on importance analysis
- Implement asymmetric pruning strategies for domain adaptation
- Experiment with different layer-wise pruning patterns
### Hardware-Optimized Pruning with expansion_divisor (NEW in v0.2.0)

The `expansion_divisor` parameter ensures that intermediate layer sizes are divisible by specific values (32, 64, 128, or 256), optimizing performance on modern GPUs and TPUs.

**Note:** Cannot be used alone—requires either `pruning_percentage` or `expansion_rate`.

### Pruning Transformer Layers (Depth Pruning)

Remove entire layers from a model for significant efficiency gains. Here, we remove the last 4 layers.

### Future Roadmap

Our goal is to make optipfair the go-to toolkit for efficient and fair model optimization. Key upcoming features include:

* **Selective Layer Width Pruning**: Implemented in v0.2.2 ✓ - Prune neurons in specific layers using layer_indices
* **Data-Driven Width Pruning**: Implemented in v0.2.0 ✓ - Hybrid importance with calibration data
* **Hardware-Optimized Pruning**: Implemented in v0.2.0 ✓ - expansion_divisor for GPU optimization
* **Attention Pruning**: Implementing Attention Bypass and Adaptive Attention Bypass (AAB).
* **Advanced Benchmarks**: Integrating more comprehensive performance and evaluation benchmarks.
* **GPU Optimizations**: Creating a v2.0 with significant GPU-specific optimizations for faster execution.

RELEASE_NOTES_v0.2.2.md

Lines changed: 163 additions & 0 deletions
# 🚀 OptiPFair v0.2.2 - Selective Layer Width Pruning

We're excited to announce **OptiPFair v0.2.2**, bringing powerful new capabilities for fine-grained control over model pruning!

## 🎯 Headline Features

### 1️⃣ Selective Layer Width Pruning

The `layer_indices` parameter now works for **both DEPTH and MLP_GLU pruning**, giving you unprecedented control over which layers to optimize:

```python
from optipfair import prune_model

# Prune neurons ONLY in specific layers (preserve first & last)
pruned_model = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    pruning_percentage=30,
    layer_indices=[5, 10, 15, 20],  # Only these layers are pruned
    show_progress=True
)
```

**Key Benefits:**

- 🛡️ **Preserve Critical Layers**: Keep embedding and output layers at full capacity
- 🎯 **Targeted Optimization**: Prune only the layers that matter
- 🔬 **Data-Driven Selection**: Combine with layer importance analysis
- **Full Feature Support**: Works with expansion_rate, expansion_divisor, dataloader, and all methods

### 2️⃣ Optimized Hybrid Importance Calculation

We've streamlined the data-driven pruning algorithm for better performance:

- **Simplified gate_proj & up_proj**: Now use the same fast MAW method as static pruning
- **Focused Complexity**: Activation-weighted calculation only where it matters (down_proj)
- **Faster Execution**: Reduced computational overhead while maintaining effectiveness
- **Consistent Methodology**: Same MAW formula across static and hybrid approaches

## 📊 What's New

### Extended API

- ✅ `layer_indices` parameter is now contextual: removes layers for DEPTH, prunes neurons for MLP_GLU
- ✅ Comprehensive validation: checks for valid indices, duplicates, empty lists, type errors
- ✅ Enhanced statistics: reports `pruned_layers` and `total_layers` for selective pruning
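The enhanced statistics can be pictured with a small sketch. Only the `pruned_layers` and `total_layers` fields are named in this release; the helper name and the `selective` flag below are illustrative assumptions, not the actual `get_pruning_statistics()` output.

```python
def selective_pruning_stats(total_layers, layer_indices=None):
    """Hypothetical sketch of the selective-pruning fields described
    in the release notes; only pruned_layers/total_layers come from
    the release, the rest is illustrative."""
    pruned = total_layers if layer_indices is None else len(layer_indices)
    return {
        "total_layers": total_layers,
        "pruned_layers": pruned,     # named in the release notes
        "selective": layer_indices is not None,  # assumed extra flag
    }

stats = selective_pruning_stats(16, layer_indices=[5, 10, 15])
```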
### Improved Performance

- ⚡ Faster hybrid importance calculation
- 💾 Selective hook registration (only on specified layers)
- 🎯 More efficient calibration with layer_indices

### Better Documentation

- 📖 Complete "Selective Layer Width Pruning" guide in README
- 📝 Extended reference manual with 4+ detailed examples
- 💻 New example file with 5 practical use cases
- 🧪 12 comprehensive test cases

## 💡 Common Use Cases

### Use Case 1: Preserve Embedding Layers

```python
# Prune all middle layers, preserving the first and last 5
num_layers = len(model.model.layers)
middle_layers = list(range(5, num_layers - 5))

pruned_model = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    pruning_percentage=25,
    layer_indices=middle_layers
)
```

### Use Case 2: Importance-Based Pruning

```python
from optipfair import analyze_layer_importance

# Step 1: Analyze which layers are least important
importance_scores = analyze_layer_importance(model, dataloader)
sorted_layers = sorted(importance_scores.items(), key=lambda x: x[1])
least_important = [idx for idx, score in sorted_layers[:10]]

# Step 2: Prune only those layers
pruned_model = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    pruning_percentage=30,
    layer_indices=least_important
)
```

### Use Case 3: Data-Driven Selective Pruning

```python
# Combine calibration data with selective pruning
pruned_model = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    neuron_selection_method="MAW",
    pruning_percentage=20,
    dataloader=calibration_dataloader,  # Hybrid importance
    layer_indices=[5, 10, 15, 20],      # Only these layers
    show_progress=True
)
```

## 🔧 Technical Highlights

### Modified Core Functions

- `prune_model()`: Now passes layer_indices to MLP_GLU pruning
- `prune_model_mlp_glu()`: Full selective pruning implementation with validation
- `setup_mlp_hooks_for_importance()`: Selective hook registration
- `compute_neuron_pair_importance_maw_hybrid()`: Simplified and optimized
- `get_pruning_statistics()`: Detects and reports selective pruning

### Enhanced CLI

```bash
# The CLI now supports layer_indices for both pruning types
optipfair prune \
  --model-path meta-llama/Llama-3.2-1B \
  --pruning-type MLP_GLU \
  --pruning-percentage 30 \
  --layer-indices "5,10,15,20" \
  --output-path ./pruned-model
```

## 🧪 Testing & Validation

- ✅ 12 comprehensive test cases in `tests/test_selective_layer_pruning.py`
- ✅ Tested with all neuron selection methods (MAW, VOW, PON)
- ✅ Verified compatibility with expansion_rate, expansion_divisor, and dataloader
- ✅ Validated error handling and edge cases
- ✅ Confirmed backward compatibility with v0.2.1

## 📦 Installation

```bash
pip install --upgrade optipfair
```

Or with visualization support:

```bash
pip install --upgrade "optipfair[viz]"
```

## 📚 Resources

- **Documentation**: [https://peremartra.github.io/optipfair/](https://peremartra.github.io/optipfair/)
- **GitHub**: [https://github.com/peremartra/optipfair](https://github.com/peremartra/optipfair)
- **Examples**: Check out `examples/selective_layer_width_pruning.py`
- **Tests**: See `tests/test_selective_layer_pruning.py`

## 🙏 Acknowledgments

Thank you to our community for the feedback and suggestions that made this release possible!

## 📝 Full Changelog

See [CHANGELOG.md](https://github.com/peremartra/optipfair/blob/main/CHANGELOG.md) for detailed changes.

---

**Upgrade today and take control of your model optimization!** 🚀

Questions or issues? Open an issue on [GitHub](https://github.com/peremartra/optipfair/issues).
