
JenWei0312/All_things_attention


Attention Mechanisms

  1. Implemented: Multi-Head Attention with scaled dot-product attention, from the "Attention Is All You Need" paper (a rough sketch follows this list)
  2. Implemented: Grouped Query Attention, as in the Llama models; unlike the Llama implementation, mine still uses plain scaled dot-product attention (also sketched below)
  3. Implemented: Multi-Head Latent Attention, as in the DeepSeek models
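
The repository's actual code is not shown on this page, so the following is only a minimal sketch of the ideas behind items 1 and 2: standard scaled dot-product attention, and the key/value-head sharing that defines Grouped Query Attention. All tensor shapes, function names, and the head counts in the demo are illustrative assumptions, not the repository's API.

```python
# Illustrative sketch only -- not the repository's implementation.
import math
import torch
import torch.nn.functional as F


def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, heads, seq, head_dim). Returns (batch, heads, seq, head_dim)."""
    d_k = q.size(-1)
    # Attention scores scaled by sqrt(head_dim), as in "Attention Is All You Need".
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)      # (batch, heads, seq_q, seq_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = F.softmax(scores, dim=-1)
    return weights @ v


def grouped_query_attention(q, k, v, n_kv_heads):
    """Grouped Query Attention: several query heads share one key/value head.

    q: (batch, n_heads, seq, head_dim); k, v: (batch, n_kv_heads, seq, head_dim).
    """
    n_heads = q.size(1)
    assert n_heads % n_kv_heads == 0
    group = n_heads // n_kv_heads
    # Repeat each KV head so every query head has a matching key/value head,
    # then fall back to ordinary scaled dot-product attention.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    return scaled_dot_product_attention(q, k, v)


if __name__ == "__main__":
    batch, seq, head_dim = 2, 16, 64
    q = torch.randn(batch, 8, seq, head_dim)   # 8 query heads
    k = torch.randn(batch, 2, seq, head_dim)   # 2 shared KV heads
    v = torch.randn(batch, 2, seq, head_dim)
    out = grouped_query_attention(q, k, v, n_kv_heads=2)
    print(out.shape)  # torch.Size([2, 8, 16, 64])
```

Item 3, Multi-Head Latent Attention, is not sketched here; its distinguishing step in the DeepSeek models is compressing keys and values into a low-rank latent vector before attention, which shrinks the KV cache.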

About

Comparison of different kinds of attention mechanisms.
