Regarding the bi-directional attention you mentioned in the article #48

@1286671

Description

The article says you use bi-directional attention in the Transformer phase, but I don't see bi-directional attention anywhere in /Transformer/model.py — the model appears to use CausalSelfAttention instead. Is bi-directional attention implemented somewhere else, or is the causal variant what's actually used?
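For reference, my understanding is that the only difference between the two is whether a causal mask is applied to the attention scores. Here's a minimal sketch of what I mean (not the repository's code — the class structure below is my own assumption, modeled on a typical nanoGPT-style CausalSelfAttention):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    """Multi-head self-attention.

    causal=True  -> each position only attends to itself and earlier
                    positions (decoder-style, like CausalSelfAttention).
    causal=False -> every position attends to every other position
                    (encoder-style, i.e. bi-directional).
    """

    def __init__(self, n_embd: int, n_head: int, causal: bool = True):
        super().__init__()
        assert n_embd % n_head == 0
        self.n_head = n_head
        self.causal = causal
        self.qkv = nn.Linear(n_embd, 3 * n_embd)  # fused q, k, v projection
        self.proj = nn.Linear(n_embd, n_embd)     # output projection

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).split(C, dim=2)
        # reshape to (B, n_head, T, head_dim) for multi-head attention
        q = q.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        k = k.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        v = v.view(B, T, self.n_head, C // self.n_head).transpose(1, 2)
        # is_causal=True applies the lower-triangular mask;
        # is_causal=False is the bi-directional case.
        y = F.scaled_dot_product_attention(q, k, v, is_causal=self.causal)
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(y)
```

So if that's roughly how CausalSelfAttention is structured, the bi-directional version would just drop the causal mask — is that something you do elsewhere in the code, or did the article describe a setup that differs from this repo?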
