
mmvq: dynamic nwarps based on matrix width for MoE models

40dc8a1
Open

cuda : dynamic MMVQ nwarps for narrow matrices #20831
