Memory optimization patches for HuggingFace Transformers.
- Memory Reduction - Significantly lowers memory usage in Transformers models
- Zero Configuration - Works automatically after import
Installation:

```bash
pip install git+https://github.com/GeeeekExplorer/transformers-patch.git
```
Just import the patch before loading any Transformers models:

```python
import transformers_patch  # must be imported before any model is loaded
from transformers import AutoModel
```
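After the import, models are loaded exactly as usual; here is a minimal sketch (the model ID and dtype are illustrative choices, not requirements of the patch):

```python
import transformers_patch  # apply the memory patches before loading the model
import torch
from transformers import AutoModelForCausalLM

# Qwen/Qwen3-8B is simply the model used in the benchmark below; any
# Transformers model should load the same way with no extra configuration.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B",
    torch_dtype=torch.bfloat16,
)
```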
Test Configuration:
- 8x GPU machine
- Micro batch size: 1
- Sequence length: 4096
- Gradient checkpointing: Disabled
- Model: Qwen3-8B
| Memory Component | Fixed Allocation | Before Patch | After Patch |
|---|---|---|---|
| Model + Gradients | 30.5 GB | - | - |
| ZeRO Optimizer States | 11.4 GB | - | - |
| Activations | - | 35.4 GB | 17.8 GB |
50% reduction in activation memory!
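As a rough back-of-the-envelope for the fixed rows, assuming roughly 8.2B parameters, bf16 weights and gradients, and fp32 Adam states sharded across the 8 GPUs (these byte counts are assumptions, not taken from the benchmark):

```python
# Assumed: ~8.2B parameters; bf16 weights + bf16 gradients = 4 bytes/param;
# fp32 Adam moments + fp32 master weights = 12 bytes/param, sharded over 8 GPUs.
params = 8.2e9
weights_and_grads = params * 4 / 2**30      # ~30.5 GiB per GPU
zero_optimizer = params * 12 / 8 / 2**30    # ~11.5 GiB per GPU
print(f"{weights_and_grads:.1f} GiB weights+grads, {zero_optimizer:.1f} GiB optimizer")
```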
See the complete example in train.py.
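For illustration, a single-GPU sketch of measuring peak memory at the benchmark settings above; the model ID, dtype, and measurement calls are assumptions rather than a description of train.py, which additionally shards optimizer states with ZeRO across the 8 GPUs:

```python
import transformers_patch  # comment this out to measure the unpatched baseline
import torch
from transformers import AutoModelForCausalLM

# Benchmark settings: micro batch size 1, sequence length 4096,
# gradient checkpointing disabled, Qwen3-8B in bf16.
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-8B", torch_dtype=torch.bfloat16
).cuda()
model.gradient_checkpointing_disable()

input_ids = torch.randint(0, model.config.vocab_size, (1, 4096), device="cuda")
torch.cuda.reset_peak_memory_stats()
out = model(input_ids=input_ids, labels=input_ids)  # forward pass
out.loss.backward()                                 # backward pass
# Peak allocation here covers weights, gradients, and activations together;
# a single large GPU (e.g. 80 GB) is needed to run this without ZeRO sharding.
print(f"peak memory: {torch.cuda.max_memory_allocated() / 2**30:.1f} GiB")
```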