# v1.2.3 - SageAttention Support (#94)

Enemyx-net announced in Announcements
## ⚡ SageAttention Integration
## 🚀 Performance Improvements

### Speed Comparison
## ✨ How It Works
SageAttention computes attention with quantized (INT8/FP8) matrix operations instead of full-precision ones, trading a small amount of numerical precision for faster kernels.
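As a rough sketch of the idea (not the actual SageAttention kernel, which uses fused CUDA code and finer-grained per-block scaling), quantized attention scores can be illustrated in PyTorch; the function names here are made up for illustration:

```python
import torch

def quantize_int8(x: torch.Tensor):
    # Symmetric per-tensor quantization: map floats onto int8 with one scale.
    scale = x.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp((x / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def int8_attention_scores(q: torch.Tensor, k: torch.Tensor) -> torch.Tensor:
    # Quantize Q and K, compute the score matrix on the low-precision
    # values, then rescale back to float. Real kernels keep the matmul
    # in integer arithmetic; float is used here only for simplicity.
    q_q, q_scale = quantize_int8(q)
    k_q, k_scale = quantize_int8(k)
    scores = (q_q.float() @ k_q.float().T) * (q_scale * k_scale)
    return scores / (q.shape[-1] ** 0.5)
```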
## ⚙️ Requirements

For SageAttention you need an NVIDIA GPU with CUDA (see Notes below) and the upstream `sageattention` package installed.
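A minimal availability check along these lines, assuming the package is importable as `sageattention` (the node's actual detection logic may differ), would be:

```python
import torch

def sage_available() -> bool:
    # SageAttention requires an NVIDIA GPU with CUDA (see Notes below).
    if not torch.cuda.is_available():
        return False
    try:
        import sageattention  # noqa: F401  # assumed upstream package name
    except ImportError:
        return False
    return True
```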
## 🎛️ Usage

Select "sage" from the `attention_type` dropdown. The available options are:
- `auto`: Let transformers choose (default)
- `eager`: Standard implementation
- `sdpa`: Scaled Dot Product Attention
- `flash_attention_2`: Flash Attention 2
- `sage`: SageAttention (NEW) - quantized for speed
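For orientation, the first four values mirror the `attn_implementation` argument that Hugging Face transformers accepts when loading a model; the sketch below is a hypothetical mapping, not the node's actual loader, and `load_model` is a made-up name:

```python
from transformers import AutoModelForCausalLM

# "eager", "sdpa", and "flash_attention_2" are valid transformers
# attn_implementation values; "auto" lets transformers pick, and "sage"
# would be patched in by the node after loading (not shown here).
TRANSFORMERS_BACKENDS = {"eager", "sdpa", "flash_attention_2"}

def load_model(model_id: str, attention_type: str = "auto"):
    kwargs = {}
    if attention_type in TRANSFORMERS_BACKENDS:
        kwargs["attn_implementation"] = attention_type
    return AutoModelForCausalLM.from_pretrained(model_id, **kwargs)
```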
## 💾 Installation
Install via ComfyUI Manager or manually:
```bash
git clone https://github.com/Enemyx-net/VibeVoice-ComfyUI
```
## 📋 Notes
- SageAttention is only available for NVIDIA GPUs with CUDA
- Falls back to standard attention if requirements aren't met (see the sketch after this list)
- Ideal for production environments prioritizing speed
- Compatible with all model variants, including 4-bit quantized models
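The documented fallback can be sketched by reusing the hypothetical `sage_available()` check from the Requirements section; the choice of `"sdpa"` as the fallback target is an assumption, not the node's confirmed behavior:

```python
def resolve_attention(requested: str) -> str:
    # Use "sage" only when its requirements are met; otherwise fall back
    # to a standard implementation, as described in the note above.
    if requested == "sage" and not sage_available():
        return "sdpa"  # assumed fallback; the node's actual choice may differ
    return requested
```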