Question about "enhance_scores" and Attention Scaling

Hi,  

Thank you for sharing this project! I have a few questions about the implementation details and would appreciate your clarification:  

1. In the code, I noticed that `get_enhance_weight()` seems to act like a “temperature” parameter mentioned in the paper, but I couldn’t find any reference in the paper to multiplying `mean_scores.mean()` by `num_frames`. 

   ```python
   enhance_scores = mean_scores.mean() * (num_frames + get_enhance_weight())
   ```  

   - Why is `mean_scores.mean()` used here? What exactly is being averaged, and what does it represent?  
   - How does this averaging influence the final enhancement process?  
   - What is the rationale behind multiplying by `num_frames`, and how does this relate to the method described in the paper?  

2. I also noticed that `enhance_scores` is multiplied by the attention output, and it is mentioned that this improves temporal consistency. Could you explain the theoretical or intuitive reasoning behind this? Specifically, why does scaling the attention output with `enhance_scores` contribute to better temporal consistency?  

Any insights you can share would be greatly appreciated. Thank you in advance for your time!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Question about "enhance_scores" and Attention Scaling #32

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Question about "enhance_scores" and Attention Scaling #32

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions