
Conversation

@fabianlim (Contributor) commented Nov 8, 2024

This PR

  • disables the MLP fused ops when the activation function is not SwiGLU; in this implementation, the rules are generated upfront, before it is known which model will be activated (see the sketch after this list)
  • removes fast_quantized_peft to prevent further confusion
  • updates the benchmarks to reflect these changes
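
For illustration, here is a minimal sketch of the activation-function gating. These names are hypothetical, not the plugin's actual API, and it assumes a HF-style model config that exposes the activation under `hidden_act`:

```python
# A minimal sketch of gating MLP fused-op rules on the activation function.
# All names here are hypothetical, not the plugin's actual API.

FUSED_OPS_SUPPORTED_ACTIVATIONS = {"silu"}  # SwiGLU MLPs gate with SiLU


def build_mlp_fused_op_rules(model_config):
    """Return MLP fused-op rules only for SwiGLU-style MLPs.

    Assumes a HF-style config that exposes the activation under
    `hidden_act` (e.g. "silu" for LLaMA-style models); anything else
    falls back to the unfused MLP path by returning no rules.
    """
    hidden_act = getattr(model_config, "hidden_act", None)
    if hidden_act not in FUSED_OPS_SUPPORTED_ACTIVATIONS:
        return []  # not SwiGLU: leave the MLP unfused
    # SwiGLU confirmed: safe to register the fused gate/up-projection rule.
    return [("mlp", "fused_swiglu")]
```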

Updated Benchmarks

Outliers
[benchmark chart: outliers]

Generally, we noticed two things in the benchmark charts:

[four benchmark comparison charts]

outliers.csv

Base automatically changed from fix/lora-drop to main November 8, 2024 11:00
Signed-off-by: Yu Chin Fabian Lim <[email protected]>
@fabianlim changed the title from "Disable MLP Fused Ops if Not SwiGLU and Removed Deprecated Fast Quantized Peft Plugin" to "Disable MLP Fused Ops if Not SwiGLU, Deprecate Fast Quantized Peft Plugin, Update Benchmarks" on Nov 10, 2024
