Development issues with fused_layers #1887

@Eyoel-gebre

Description


I am working on adding support for lite-whisper: #1886, efeslab/LiteASR#7

However, the existing fused-layer logic does not work for the low-rank QKV matrices used in lite-whisper (paper for context). Any one of the QKV projection matrices can fall back to a full layer when the compression algorithm can't compress it without sacrificing accuracy. This means that in lite-whisper, the QKV projections within an encoder layer are a mix of Linear and LowRankLinear layers, and since the two layer types are executed differently, they can't be fused.

Example encoder layer with a mix of Linear and LowRankLinear layers:
[screenshot of the encoder layer's modules]
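
To make the failure mode concrete, here is a minimal PyTorch sketch (the `LowRankLinear` class and all names are illustrative, not LiteASR's actual implementation) of why fusing QKV into one GEMM assumes all three projections have the same form:

```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Illustrative low-rank projection: y = (x @ W1) @ W2, with rank r << d."""
    def __init__(self, d_in, d_out, rank):
        super().__init__()
        self.w1 = nn.Parameter(torch.randn(d_in, rank))
        self.w2 = nn.Parameter(torch.randn(rank, d_out))

    def forward(self, x):
        return (x @ self.w1) @ self.w2

d, r = 8, 2
x = torch.randn(1, d)

# The fused path assumes three plain Linear projections whose weights can
# be concatenated into one (3d, d) matrix and executed as a single GEMM:
q = nn.Linear(d, d, bias=False)
k = nn.Linear(d, d, bias=False)
v = nn.Linear(d, d, bias=False)
w_qkv = torch.cat([q.weight, k.weight, v.weight], dim=0)  # (3d, d)
fused_out = x @ w_qkv.T                                   # one matmul for q, k, v

# In a mixed layer (say low-rank q/v, dense fallback k) there is no single
# weight to concatenate: each low-rank projection is two smaller matmuls,
# so the three projections cannot share one GEMM.
q_lr, v_lr = LowRankLinear(d, d, r), LowRankLinear(d, d, r)
k_full = nn.Linear(d, d, bias=False)
q_out, k_out, v_out = q_lr(x), k_full(x), v_lr(x)         # three separate paths
```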

Would it cause problems if this fused-layer logic were removed? Would it be better to disable it only when lite-whisper runs? Is there another workaround that would let me keep the fused-layer concatenation logic even though the layer types differ?
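
One possibility I can think of (at the cost of the compression's compute savings) would be to densify each low-rank projection at conversion time, since (x @ W1) @ W2 == x @ (W1 @ W2) exactly; then every projection is a plain Linear again and the existing fusion applies unchanged. A rough sketch, reusing the illustrative `LowRankLinear` from above:

```python
def densify(layer: nn.Module) -> nn.Module:
    """Collapse a LowRankLinear into an equivalent dense Linear.

    Exact up to floating-point rounding, but gives up the low-rank
    compute savings at inference time.
    """
    if isinstance(layer, LowRankLinear):
        d_in, d_out = layer.w1.shape[0], layer.w2.shape[1]
        dense = nn.Linear(d_in, d_out, bias=False)
        with torch.no_grad():
            dense.weight.copy_((layer.w1 @ layer.w2).T)  # nn.Linear stores (out, in)
        return dense
    return layer  # already a plain Linear; fusion applies as before

# Sanity check: the densified layer matches the low-rank one.
lr = LowRankLinear(d, d, r)
assert torch.allclose(densify(lr)(x), lr(x), atol=1e-5)
```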

