I appreciate your excellent work, especially the example at https://juliamltools.github.io/shakespeare-gpt. There are some existing implementations of MultiHeadAttention and Transformer:

- https://github.com/FluxML/Flux.jl/pull/2146
- https://github.com/chengchingwen/NeuralAttentionlib.jl
- https://github.com/chengchingwen/Transformers.jl

Could you compare this with the existing implementations? Why do you want to implement this again?