
Security: SmallDoges/flash-dmattn

Security Policy

Supported Versions

We actively maintain and provide security updates for the following versions:

  Version     Supported
  Latest      Yes
  < Latest    No

Security Considerations

CUDA Code Execution

Flash Dynamic Mask Attention includes CUDA kernels and C++ extensions that execute on your GPU. When using this library:

  • Only install from trusted sources (official PyPI releases or verified builds)
  • Be cautious when building from source with modifications
  • Verify checksums when downloading pre-built binaries
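The checksum step can be sketched with Python's standard `hashlib` module (the helper name and the file path in the comment are illustrative, not part of the library):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 and return its hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Compare the result against the digest published with the release, e.g.:
#   sha256_of("downloaded_wheel.whl") == published_digest
# (the filename here is illustrative)
```

Streaming in fixed-size chunks keeps memory use constant even for large wheel files.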

Dependencies

This library depends on:

  • PyTorch (with CUDA support)
  • NVIDIA CUTLASS library
  • Standard Python scientific computing libraries

We recommend keeping all dependencies up to date and using virtual environments for isolation.

Memory Safety

Our CUDA kernels are designed with memory safety in mind:

  • Bounds checking is implemented where performance allows
  • Memory allocation patterns are tested across different input sizes
  • We use established patterns from Flash Attention and CUTLASS

However, as with any low-level CUDA code:

  • Very large input tensors may cause out-of-memory errors
  • Invalid input shapes may cause undefined behavior
  • Custom modifications to kernel code should be thoroughly tested
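One way to guard against the out-of-memory caveat above is a rough footprint estimate before launching a kernel. This is only a sketch: it counts the Q, K, V and output tensors and ignores any workspace, and the four-tensor layout is an assumption, not the library's actual allocation scheme:

```python
# Bytes per element for the dtypes the library's snippets mention
DTYPE_BYTES = {"float16": 2, "bfloat16": 2, "float32": 4}

def qkv_bytes(batch: int, heads: int, seqlen: int, head_dim: int,
              dtype: str = "float16") -> int:
    """Rough memory footprint of Q, K, V and the output tensor (no workspace)."""
    return 4 * batch * heads * seqlen * head_dim * DTYPE_BYTES[dtype]

# Example: batch=8, heads=16, seqlen=4096, head_dim=64 in fp16
# qkv_bytes(8, 16, 4096, 64) -> 268435456 bytes (~0.27 GB)
```

Comparing such an estimate against free GPU memory before the call turns a hard out-of-memory crash into a clean Python-level error.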

Reporting a Vulnerability

If you discover a security vulnerability, please report it responsibly:

For security issues:

  • Email: [email protected]
  • Subject: [SECURITY] Flash-DMA Vulnerability Report
  • Include: Detailed description, reproduction steps, and potential impact

For general bugs:

  • Please open an issue on the project's GitHub issue tracker rather than using the security email

Response Timeline

  • Acknowledgment: Within 48 hours
  • Initial Assessment: Within 1 week
  • Resolution: Depends on severity and complexity

Critical security issues will be prioritized and may result in emergency releases.

Security Best Practices

When using Flash Dynamic Mask Attention:

  1. Environment Isolation

    # Use virtual environments
    python -m venv flash_dma_env
    source flash_dma_env/bin/activate  # Linux/Mac
    # or
    flash_dma_env\Scripts\activate     # Windows
  2. Dependency Management

    # Keep dependencies updated
    pip install --upgrade torch flash-dmattn
  3. Input Validation

    # Validate tensor shapes and dtypes before processing
    assert query.dtype in [torch.float16, torch.bfloat16, torch.float32]
    # Holds for self-attention; in cross-attention, key/value shapes may differ
    assert query.shape == key.shape == value.shape
  4. Resource Monitoring

    # Monitor GPU memory usage
    import torch
    print(f"GPU Memory: {torch.cuda.memory_allocated() / 1e9:.2f} GB")
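The shape and dtype checks from step 3 can be gathered into a reusable helper that fails fast with a clear error instead of risking the undefined behavior noted earlier. This is a sketch, the function name is hypothetical, and the accepted dtypes are taken from the snippet above:

```python
# Dtypes accepted per the validation snippet above
ALLOWED_DTYPES = {"float16", "bfloat16", "float32"}

def validate_qkv(q_shape: tuple, k_shape: tuple, v_shape: tuple,
                 dtype: str) -> None:
    """Raise ValueError early rather than passing bad inputs to a CUDA kernel."""
    if dtype not in ALLOWED_DTYPES:
        raise ValueError(f"unsupported dtype: {dtype}")
    if not (q_shape == k_shape == v_shape):
        raise ValueError(f"shape mismatch: {q_shape}, {k_shape}, {v_shape}")
    if len(q_shape) != 4:
        raise ValueError("expected a 4-D shape (batch, heads, seqlen, head_dim)")
```

Calling this once at the API boundary keeps the per-call asserts out of hot paths while still rejecting invalid inputs before they reach kernel code.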

Disclosure Policy

  • Confirmed vulnerabilities will be disclosed responsibly
  • Security fixes will be released as soon as safely possible
  • CVE numbers will be requested for significant vulnerabilities
  • Credit will be given to security researchers who report issues responsibly

Contact

For security-related questions or concerns, use the security email listed above.

For general support, please open an issue or discussion on the GitHub repository.
