What is the difference between GVA and VA with shared plane? #40

GVA is proposed in PTV2 as a new method, but its implementation looks the same as vector attention with shared planes in PTV1.

Below is a comparison of the two aggregation steps:

PTV1:

        w = self.softmax(w)  # (n, nsample, c//s)
        n, nsample, c = x_v.shape; s = self.share_planes
        x = ((x_v + p_r).view(n, nsample, s, c // s) * w.unsqueeze(2)).sum(1).view(n, c)  # v * A

PTV2:

        value = einops.rearrange(value, "n ns (g i) -> n ns g i", g=self.groups)
        feat = torch.einsum("n s g i, n s g -> n g i", value, weight)
        feat = einops.rearrange(feat, "n g i -> n (g i)")

They are functionally equivalent!
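
A minimal sketch to check this claim numerically, covering only the two aggregation snippets above and not the rest of either attention layer. It uses NumPy in place of PyTorch and einops so it runs standalone (`reshape` stands in for `view` and `rearrange`). The function names `ptv1_aggregate` and `ptv2_aggregate`, the tensor sizes, and the choice `groups = c // share_planes` are assumptions made for illustration, and `p_r` is set to zero so both functions see the same values. Under these assumptions the two results agree once the channels are reordered: PTV1 shares one weight across channels that are `c // s` apart (strided), while PTV2 shares one weight across a contiguous block of channels.

```python
import numpy as np


def ptv1_aggregate(x_v, p_r, w, share_planes):
    """PTV1-style step: w has shape (n, nsample, c // s)."""
    n, nsample, c = x_v.shape
    s = share_planes
    # w[:, :, None, :] mirrors w.unsqueeze(2) in the PTV1 snippet.
    x = (x_v + p_r).reshape(n, nsample, s, c // s) * w[:, :, None, :]
    return x.sum(axis=1).reshape(n, c)


def ptv2_aggregate(value, weight, groups):
    """PTV2-style step: weight has shape (n, nsample, groups)."""
    n, ns, c = value.shape
    value = value.reshape(n, ns, groups, c // groups)  # "n ns (g i) -> n ns g i"
    feat = np.einsum("nsgi,nsg->ngi", value, weight)
    return feat.reshape(n, c)                          # "n g i -> n (g i)"


rng = np.random.default_rng(0)
n, nsample, c, s = 4, 8, 32, 8      # illustrative sizes, not from either repo
g = c // s                          # assumed: groups == c // share_planes

value = rng.standard_normal((n, nsample, c))
logits = rng.standard_normal((n, nsample, g))
# Softmax over the neighbor axis (assumed to match self.softmax in PTV1).
w = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

p_r = np.zeros_like(value)          # zero positional term for the comparison
out_v1 = ptv1_aggregate(value, p_r, w, s)
out_v2 = ptv2_aggregate(value, w, g)

# perm[j * s + a] = a * g + j maps the strided layout to the contiguous one.
perm = np.arange(c).reshape(s, g).T.reshape(-1)
out_v2_perm = ptv2_aggregate(value[:, :, perm], w, g)

print("equal after channel permutation:", np.allclose(out_v2_perm, out_v1[:, perm]))
print("equal without permutation:", np.allclose(out_v1, out_v2))
```

Since the layers before and after the aggregation are learned, a fixed channel permutation like this would not change what the model can represent, which is consistent with reading the two steps as functionally equivalent.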

I may be wrong, since I don't fully understand it. Could you explain the difference between them?
