A Python library that generates optimized CUDA code from XGBoost models for high-performance GPU inference.
XGBoost2GPU transforms your trained XGBoost models into highly optimized CUDA code, enabling lightning-fast inference on GPUs with advanced pruning strategies and quantization techniques.
Warning
This library requires an NVIDIA GPU with CUDA support and the TreeLUT dependencies.
Before installing XGBoost2GPU, ensure you have:
- Python 3.8+
- CUDA 11.0+ (Download CUDA)
- NVIDIA GPU with compute capability 6.0+
- Git for dependency installation
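Before cloning, you can sanity-check the prerequisites above with a short script. This helper is not part of the package, just a minimal sketch that treats `nvcc` on the `PATH` as a proxy for a working CUDA toolkit:

```python
# Illustrative prerequisite check (not shipped with XGBoost2GPU)
import shutil
import sys

def check_prerequisites():
    """Return a dict mapping each prerequisite to True/False."""
    return {
        "python_3_8_plus": sys.version_info >= (3, 8),
        # nvcc on PATH suggests the CUDA toolkit is installed
        "cuda_toolkit": shutil.which("nvcc") is not None,
        "git": shutil.which("git") is not None,
    }

if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Note that this cannot verify the GPU's compute capability; use `nvidia-smi` for that.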
# Clone the repository
git clone https://github.com/Olavo-B/XGBoost2GPU.git
cd XGBoost2GPU
# Create virtual environment (recommended)
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install the package
pip install -e .

# Install all dependencies including TreeLUT
pip install -e ".[all]"
# Or install development dependencies
pip install -e ".[dev]"

Note
TreeLUT and LoadDataset are automatically installed from GitHub repositories.
from xgboost2gpu import XGBoost2GPU
from treelut import TreeLUTClassifier
import xgboost as xgb
import numpy as np
# 1. Train your XGBoost model
X_train, y_train = load_your_data() # Your data loading function
xgb_model = xgb.XGBClassifier(n_estimators=10, max_depth=5)
xgb_model.fit(X_train, y_train)
# 2. Convert to TreeLUT
treelut_model = TreeLUTClassifier(
xgb_model=xgb_model,
w_feature=3, # Feature quantization bits
w_tree=3, # Tree quantization bits
quantized=True
)
treelut_model.convert()
# 3. Generate CUDA code
xgb2gpu = XGBoost2GPU(
treelut_model=treelut_model,
w_feature=3,
w_tree=3,
n_samples=1000
)
# 4. Generate optimized CUDA kernel
xgb2gpu.generate_cuda_code("inference.cu")
# 5. Apply pruning for optimization
xgb2gpu.calculate_forest_probabilities(
percentage_to_cut=0.1,
strategy="adaptive"
)
print("✅ CUDA code generated successfully!")

Tip
Check out the complete Jupyter notebook example for a full workflow demonstration.
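The `w_feature` and `w_tree` arguments in the quick-start code control how many bits are used to quantize feature thresholds and tree outputs. The exact scheme is defined by TreeLUT; a simplified sketch of what uniform quantization to `w` bits does looks like this (illustration only, the real scaling and rounding may differ):

```python
import numpy as np

def quantize_uniform(values, w_bits):
    """Uniformly quantize values to 2**w_bits levels over their range.

    Simplified illustration of w-bit quantization; TreeLUT's actual
    scheme may differ in scaling and rounding details.
    """
    levels = 2 ** w_bits
    vmin, vmax = values.min(), values.max()
    scale = (vmax - vmin) / (levels - 1)
    # integer codes in [0, levels - 1]
    codes = np.round((values - vmin) / scale).astype(int)
    # return the codes and the dequantized approximations
    return codes, codes * scale + vmin

thresholds = np.array([0.12, 0.5, 0.33, 0.97])
codes, approx = quantize_uniform(thresholds, w_bits=3)  # 8 levels
```

With `w_bits=3` every threshold is snapped to one of 8 levels, which is why quantization can cost a small amount of accuracy (see Limitations below).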
# Compile the generated CUDA code
nvcc -arch=sm_75 -O3 -o inference inference.cu
# Run inference (requires input.csv and expected_output.csv)
./inference

XGBoost2GPU supports multiple pruning strategies to optimize model performance:
| Strategy | Description | Use Case |
|---|---|---|
| `linear` | Linear probability distribution | Uniform pruning |
| `exponential` | Exponential decay by tree depth | Deep tree optimization |
| `adaptive` | Dynamic pruning based on importance | Best balance |
| `random` | Random pruning | Testing/debugging |
{
"percentage_to_cut": 0.15,
"strategy": "adaptive",
"level_importance": 0.7,
"progress_importance": 0.3,
"level_bias": 1.5,
"max_cut_percentage": 0.25,
"urgency_override_threshold": 1.2
}

| Parameter | Type | Default | Description |
|---|---|---|---|
| `w_feature` | int | 3 | Feature quantization bits (1-8) |
| `w_tree` | int | 3 | Tree quantization bits (1-8) |
| `n_samples` | int | 1000 | Number of inference samples |
| `n_threads` | int | 1024 | CUDA threads per block |
| `n_blocks` | int | 768 | Number of CUDA blocks |
XGBoost2GPU/
├── src/xgboost2gpu/ # Main package
│ ├── __init__.py # Package initialization
│ ├── xgboost2gpu.py # Core XGBoost2GPU class
│ └── treePruningHash.py # Pruning utilities
├── misc/example/ # Usage examples
├── misc/docs/ # Documentation
├── test/ # Unit tests
├── requirements.txt # Dependencies
├── setup.py # Package setup
└── README.md # This file
- CUDA Compute Capability: Requires 6.0+ (Pascal architecture or newer)
- Model Size: Very large models may exceed GPU memory limits
- Precision: Quantization may affect model accuracy (typically <1% loss)
- Dependencies: Requires TreeLUT framework (automatically installed)
| Issue | Solution |
|---|---|
| `ModuleNotFoundError: No module named 'treelut'` | Install with `pip install -e ".[all]"` |
| CUDA compilation errors | Check CUDA version and GPU compute capability |
| Memory errors | Reduce `n_samples` or model size |
| Import errors | Ensure the virtual environment is activated |
We welcome contributions!
This project is licensed under the MIT License - see the LICENSE file for details.
- Olavo Alves Barros Silva - Initial work - GitHub
- Email: olavo.barros@ufv.com
- Institution: Universidade Federal de Viçosa (UFV)
- XGBoost community for the excellent ML library
- NVIDIA for CUDA development tools
- Open Source community for inspiration and feedback
- LoadDataset - Dataset utilities
- XGBoost - Gradient boosting framework
