We actively maintain and provide security updates for the following versions:
| Version | Supported |
| --- | --- |
| Latest | ✅ |
| < Latest | ❌ |
Flash Dynamic Mask Attention includes CUDA kernels and C++ extensions that execute on your GPU. When using this library:
- Only install from trusted sources (official PyPI releases or verified builds)
- Be cautious when building from source with modifications
- Verify checksums when downloading pre-built binaries
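For pre-built binaries, one way to verify a checksum is to hash the downloaded wheel and compare it against the value published with the release. This is a minimal sketch; the file name and expected hash below are placeholders, not real release artifacts:

```python
import hashlib

# Placeholder values -- substitute the wheel you actually downloaded and
# the checksum published alongside the official release.
wheel_path = "flash_dmattn-x.y.z-cp311-cp311-linux_x86_64.whl"
expected_sha256 = "<hash from the release page>"

with open(wheel_path, "rb") as f:
    actual_sha256 = hashlib.sha256(f.read()).hexdigest()

if actual_sha256 != expected_sha256:
    raise RuntimeError("Checksum mismatch -- do not install this wheel")
```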
This library depends on:
- PyTorch (with CUDA support)
- NVIDIA CUTLASS library
- Standard Python scientific computing libraries
We recommend keeping all dependencies up to date and using virtual environments for isolation.
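As a quick sanity check of an isolated environment, you can confirm which PyTorch and CUDA builds it actually resolves to. This is a small sketch using standard `torch` attributes:

```python
import torch

# Report the versions the current environment resolves to.
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available:  {torch.cuda.is_available()}")
print(f"CUDA toolkit:    {torch.version.cuda}")
if torch.cuda.is_available():
    print(f"GPU device:      {torch.cuda.get_device_name(0)}")
```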
Our CUDA kernels are designed with memory safety in mind:
- Bounds checking is implemented where performance allows
- Memory allocation patterns are tested across different input sizes
- We use established patterns from Flash Attention and CUTLASS
However, as with any low-level CUDA code:
- Very large input tensors may cause out-of-memory errors
- Invalid input shapes may cause undefined behavior
- Custom modifications to kernel code should be thoroughly tested
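Below is a minimal sketch of a defensive calling pattern that rejects unsupported dtypes up front and turns CUDA out-of-memory failures into a clean error. `attention_fn` is a placeholder for whichever attention entry point you use from this library, and catching `torch.cuda.OutOfMemoryError` assumes a recent PyTorch release:

```python
import torch

def safe_attention(attention_fn, query, key, value, **kwargs):
    # Reject unsupported dtypes before the tensors reach the CUDA kernels.
    for name, tensor in (("query", query), ("key", key), ("value", value)):
        if tensor.dtype not in (torch.float16, torch.bfloat16, torch.float32):
            raise TypeError(f"{name} has unsupported dtype {tensor.dtype}")

    try:
        return attention_fn(query, key, value, **kwargs)
    except torch.cuda.OutOfMemoryError:
        # Release cached allocator blocks and re-raise so the caller can
        # retry with a smaller batch size or shorter sequence length.
        torch.cuda.empty_cache()
        raise
```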
If you discover a security vulnerability, please report it responsibly:
For security issues:
- Email: [email protected]
- Subject: [SECURITY] Flash-DMA Vulnerability Report
- Include: Detailed description, reproduction steps, and potential impact
For general bugs:
- Use our GitHub Issues
- Follow our contributing guidelines
Response timeline:
- Acknowledgment: Within 48 hours
- Initial Assessment: Within 1 week
- Resolution: Depends on severity and complexity
Critical security issues will be prioritized and may result in emergency releases.
When using Flash Dynamic Mask Attention:
- **Environment Isolation**

  ```bash
  # Use virtual environments
  python -m venv flash_dma_env
  source flash_dma_env/bin/activate  # Linux/Mac
  # or
  flash_dma_env\Scripts\activate  # Windows
  ```

- **Dependency Management**

  ```bash
  # Keep dependencies updated
  pip install --upgrade torch flash-dmattn
  ```

- **Input Validation**

  ```python
  # Validate tensor shapes and dtypes before processing
  assert query.dtype in [torch.float16, torch.bfloat16, torch.float32]
  assert query.shape == key.shape == value.shape
  ```

- **Resource Monitoring**

  ```python
  # Monitor GPU memory usage
  import torch
  print(f"GPU Memory: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
  ```
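To track peak usage across a whole run rather than a single snapshot, PyTorch's peak-memory counters can be used. This is a small sketch; the workload in the middle is a placeholder:

```python
import torch

# Reset the peak counter before the workload you want to profile.
torch.cuda.reset_peak_memory_stats()

# ... run your attention workload here (placeholder) ...

peak_gb = torch.cuda.max_memory_allocated() / 1e9
print(f"Peak GPU memory during run: {peak_gb:.2f} GB")
```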
Our disclosure policy:
- Confirmed vulnerabilities will be disclosed responsibly
- Security fixes will be released as soon as safely possible
- CVE numbers will be requested for significant vulnerabilities
- Credit will be given to security researchers who report issues responsibly
For security-related questions or concerns:
- Primary: [email protected]
- Project maintainers: See AUTHORS file
For general support:
- GitHub Issues: https://github.com/SmallDoges/flash-dmattn/issues
- Documentation: https://github.com/SmallDoges/flash-dmattn/tree/main/docs/