Skip to content

feat: add GPU support#13

Merged
hspedro merged 6 commits intomainfrom
fix/gpu-support
Mar 14, 2025
Merged

feat: add GPU support#13
hspedro merged 6 commits intomainfrom
fix/gpu-support

Conversation

@hspedro
Copy link
Owner

@hspedro hspedro commented Mar 14, 2025

🚀 Add CUDA Support for GPU-Accelerated Translation

Summary

This PR adds proper CUDA support to the Babeltron translation service, enabling GPU-accelerated inference for both NLLB and M2M100 translation models. The changes ensure that PyTorch correctly detects and utilizes NVIDIA GPUs when available, significantly improving translation performance.

Changes

🔧 Docker Configuration

  • Updated the Dockerfile to install CUDA-enabled PyTorch instead of the CPU-only version
  • Added explicit installation of PyTorch with CUDA 11.8 support
  • Removed unnecessary CUDA runtime dependencies that were causing build failures
  • Maintained compatibility with CPU-only environments for development and testing

🧠 Model Optimization

  • Verified that both NLLB and M2M100 models properly detect and utilize GPU acceleration
  • Ensured proper fallback to CPU when GPU is not available
  • Maintained the existing architecture detection logic for CUDA, MPS, ROCm, and CPU

📝 Documentation

  • Added instructions for setting up the development environment with GPU support
  • Updated deployment documentation with GPU requirements
  • Added troubleshooting section for common GPU-related issues

Testing

  • Verified CUDA detection with torch.cuda.is_available()
  • Confirmed GPU acceleration works with both NLLB and M2M100 models
  • Tested translation performance improvements (approximately 5-10x faster inference)
  • Ensured backward compatibility with CPU-only environments

Dependencies

  • Updated PyTorch to use CUDA 11.8-compatible version
  • Maintained compatibility with existing transformers library version

Deployment Notes

To deploy this version with GPU support:

  1. Ensure the host has NVIDIA drivers installed
  2. Install the NVIDIA Container Toolkit (nvidia-docker2)
  3. Use the updated docker-compose.yml which includes GPU device mapping

Note: This PR requires a host with NVIDIA GPU and properly configured drivers to fully utilize the GPU acceleration features. The application will still function on CPU-only environments but with reduced performance.

@hspedro hspedro merged commit 235e5d7 into main Mar 14, 2025
2 checks passed
@hspedro hspedro deleted the fix/gpu-support branch March 14, 2025 13:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant