**Note:** Data-driven pruning is currently only available with `neuron_selection_method="MAW"`. Using a dataloader with "VOW" or "PON" will raise a `ValueError`.
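To make the constraint concrete, here is a minimal sketch of the check (hypothetical code; the helper `validate_selection_method` is made up and is not part of optipfair's API):

```python
def validate_selection_method(neuron_selection_method, dataloader=None):
    """Hypothetical sketch of the rule above: a dataloader (data-driven
    pruning) is only supported with the "MAW" selection method."""
    if neuron_selection_method not in {"MAW", "VOW", "PON"}:
        raise ValueError(f"Unknown method: {neuron_selection_method}")
    if dataloader is not None and neuron_selection_method != "MAW":
        raise ValueError(
            'A dataloader requires neuron_selection_method="MAW", '
            f'got "{neuron_selection_method}"'
        )

validate_selection_method("MAW", dataloader=[])        # OK
try:
    validate_selection_method("VOW", dataloader=[])    # raises ValueError
except ValueError as err:
    print(err)
```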
### Selective Layer Width Pruning (NEW in v0.2.0)

Prune neurons only in specific layers while leaving others unchanged. Perfect for preserving critical layers or implementing layer-specific optimization strategies.

```python
from transformers import AutoModelForCausalLM

from optipfair import prune_model

# Load a pre-trained model
model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B")

# Prune neurons only in specific layers (e.g., middle layers)
pruned_model, stats = prune_model(
    model=model,
    pruning_type="MLP_GLU",
    neuron_selection_method="MAW",
    pruning_percentage=30,
    layer_indices=[5, 10, 15, 20, 25],  # Only prune these layers
    show_progress=True,
    return_stats=True
)

# Print pruning statistics
print(f"Pruned {stats['pruned_layers']} of {stats['total_layers']} layers")
```

**Key Benefits:**

- 🎯 **Precision Control**: Choose exactly which layers to optimize
- 🛡️ **Preserve Critical Layers**: Keep first and last layers at full capacity
- 🔬 **Data-Driven Selection**: Combine with layer importance analysis
- ⚡ **Full Compatibility**: Works with all MLP_GLU features (`expansion_rate`, `expansion_divisor`, `dataloader`)

**Use Cases:**

- Preserve embedding and output layers while pruning middle layers
- Target specific layer ranges based on importance analysis
- Implement asymmetric pruning strategies for domain adaptation
- Experiment with different layer-wise pruning patterns
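The "combine with layer importance analysis" pattern above can be sketched in plain Python. Everything below is illustrative: the scores are invented and `least_important_layers` is not an optipfair helper.

```python
def least_important_layers(scores, n_prune, protect=frozenset()):
    """Return the indices of the n_prune lowest-scoring layers,
    never selecting the protected indices (e.g. first/last layers)."""
    candidates = [i for i in range(len(scores)) if i not in protect]
    candidates.sort(key=lambda i: scores[i])  # ascending importance
    return sorted(candidates[:n_prune])

# Invented per-layer importance scores for an 8-layer model:
scores = [0.90, 0.40, 0.20, 0.30, 0.10, 0.50, 0.60, 0.95]

# Protect the first and last layers, prune the 4 least important others:
layer_indices = least_important_layers(scores, n_prune=4, protect={0, 7})
print(layer_indices)  # [1, 2, 3, 4]
```

The resulting list can then be passed as `layer_indices=layer_indices` to `prune_model`.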
### Hardware-Optimized Pruning with expansion_divisor (NEW in v0.2.0)

The `expansion_divisor` parameter ensures that intermediate layer sizes are divisible by specific values (32, 64, 128, or 256), optimizing performance on modern GPUs and TPUs.

**Note:** `expansion_divisor` cannot be used on its own; it requires either `pruning_percentage` or `expansion_rate`.
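The arithmetic behind this is ordinary rounding to a multiple. A standalone sketch (assumed behavior, not optipfair's internal code):

```python
def align_to_divisor(size, divisor=64):
    """Round size to the nearest multiple of divisor (at least one multiple),
    so matmul dimensions line up with GPU/TPU tile sizes."""
    return max(divisor, round(size / divisor) * divisor)

# Pruning 30% of an 8192-neuron intermediate layer leaves 5734 neurons;
# aligning to a divisor of 64 yields 5760 (= 90 * 64):
print(align_to_divisor(int(8192 * 0.7), divisor=64))  # 5760
```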