Question regarding the aggregation strategy in _get_attn_k() #16

@zhangshuoneu

This is a very insightful piece of work! It's impressive to see robust motion segmentation and quality enhancement for dynamic point clouds achieved in a training-free manner.

I have a question regarding the implementation details. I noticed that when aggregating the cross-attention maps for a single image (referring to the _get_attn_k() function in model.py), the aggregation is performed along the query dimension rather than the key dimension.
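To make the question concrete, here is a minimal NumPy sketch of the two aggregation directions as I understand them. It is not taken from model.py: the tensor layout (heads, queries, keys), the softmax over the key axis, the use of a mean as the aggregation, and all variable names are my own assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
num_heads, num_q, num_k, dim = 2, 4, 6, 8  # hypothetical sizes

q = rng.standard_normal((num_heads, num_q, dim))  # query tokens
k = rng.standard_normal((num_heads, num_k, dim))  # key tokens

# attn[h, i, j]: weight that query token i places on key token j.
# Assumed layout (heads, queries, keys); each row sums to 1 over the keys.
attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dim), axis=-1)

# Aggregating along the query dimension (reduce over heads and queries):
# one score per key token, i.e. how much attention each key receives.
per_key = attn.mean(axis=(0, 1))    # shape (num_k,)

# Aggregating along the key dimension (reduce over heads and keys):
# one score per query token.
per_query = attn.mean(axis=(0, 2))  # shape (num_q,)
```

In this sketch, because the map is softmax-normalized over the keys, reducing over the key axis yields the same value 1/num_k for every query, so only the reduction over queries keeps per-token information. I am not sure whether the maps in _get_attn_k() are normalized this way, which is part of why I am asking.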

Is there a specific reason or intuition behind this design choice? I would really appreciate it if you could share your insights. Thanks!
