Reason for not using transformer as EBM architecture #6

Open

opened

on Sep 22, 2024

Do you know why EBM architectures are always parametrized as convolutional models and never use self-attention?

Metadata

Assignees

No one assigned

Labels

No labels

No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests