Description
Motivation
Cache Dit now supports Ascend NPUs as a backend. However, this support currently covers only basic capabilities; the acceleration features need to be extended to achieve an overall performance improvement.
MindIE SD is an acceleration suite for Ascend NPUs in the multimodal domain. It includes core acceleration operators (FA variants, DiTMoE (in planning)), dedicated multimodal fusion operators, and quantization capabilities. With these methods, the performance of flux.1-dev can be improved by a further 20%.
To realize these acceleration benefits, Cache Dit needs to support MindIE SD as an acceleration backend.
Proposed Change
Cache Dit itself provides hardware-agnostic acceleration capabilities such as caching and parallelism. For the Ascend backend, it can be further extended with hardware-specific quantization, FA backends, and operator fusion. These features can be supported as follows:
- Quantization: prioritize support for dynamic quantization capabilities (see the first sketch after this list).
- FA backend: Cache Dit will define standard interfaces for the different FA variants, and hardware vendors will implement these interfaces with their own kernels (see the second sketch after this list).
- Operator fusion: use the torch.compile mechanism to achieve automatic operator fusion. Since vendors support compile to varying degrees, custom backend extensions are required rather than using PyTorch's compile path directly (see the third sketch after this list).
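As a rough illustration of the dynamic-quantization item, the sketch below applies PyTorch's stock `torch.ao.quantization.quantize_dynamic` to the linear layers of a stand-in transformer block. The actual MindIE SD quantization path and any Cache Dit-facing API are assumptions here, not confirmed interfaces:

```python
import torch
import torch.nn as nn

# Stand-in for a DiT block; in practice this would be a block from the
# diffusers pipeline that Cache Dit wraps.
block = nn.Sequential(
    nn.Linear(1024, 4096),
    nn.GELU(),
    nn.Linear(4096, 1024),
)

# Dynamic quantization: weights are quantized ahead of time, activations
# are quantized on the fly per batch, so no calibration dataset is needed.
quantized_block = torch.ao.quantization.quantize_dynamic(
    block,
    {nn.Linear},        # only quantize Linear modules
    dtype=torch.qint8,  # int8 weights
)

x = torch.randn(2, 1024)
print(quantized_block(x).shape)  # torch.Size([2, 1024])
```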
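For the FA-backend item, one possible shape for the vendor-pluggable interface is a small protocol plus a registry that a vendor package fills in at import time. The names `AttnBackend` and `register_attn_backend` are hypothetical, not an existing Cache Dit API:

```python
from typing import Protocol

import torch


class AttnBackend(Protocol):
    """Hypothetical standard interface Cache Dit could define for FA variants."""

    def attention(
        self,
        q: torch.Tensor,
        k: torch.Tensor,
        v: torch.Tensor,
        *,
        causal: bool = False,
    ) -> torch.Tensor: ...


_BACKENDS: dict[str, AttnBackend] = {}


def register_attn_backend(name: str, backend: AttnBackend) -> None:
    """Called by vendor packages (e.g. a MindIE SD plugin) at import time."""
    _BACKENDS[name] = backend


class SDPABackend:
    """Reference implementation on top of stock PyTorch SDPA."""

    def attention(self, q, k, v, *, causal=False):
        return torch.nn.functional.scaled_dot_product_attention(
            q, k, v, is_causal=causal
        )


register_attn_backend("sdpa", SDPABackend())

# A vendor plugin would register its own implementation the same way, e.g.
# register_attn_backend("mindie_fa", MindIEFlashAttention()).
q = k = v = torch.randn(1, 8, 128, 64)
out = _BACKENDS["sdpa"].attention(q, k, v)
```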
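For the operator-fusion item, one way to extend compile without relying on the default Inductor path is torch.compile's custom-backend hook: Dynamo hands the captured FX graph to a user-supplied callable, which can rewrite subgraphs into fused vendor kernels before returning a compiled callable. The pattern rewriting itself is elided below, and `npu_fusion_backend` is a hypothetical name:

```python
import torch
import torch.nn as nn


def npu_fusion_backend(gm: torch.fx.GraphModule, example_inputs):
    # Dynamo calls this with the captured FX graph. A real vendor backend
    # would pattern-match subgraphs (e.g. norm + attention + residual)
    # and replace them with fused NPU kernels before returning a callable.
    print(gm.graph)    # inspect the captured graph
    return gm.forward  # placeholder: run the graph unmodified


model = nn.Sequential(nn.Linear(64, 64), nn.ReLU())
compiled = torch.compile(model, backend=npu_fusion_backend)
print(compiled(torch.randn(4, 64)).shape)  # torch.Size([4, 64])
```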