Upper Face Dynamic Deviation (FDD) Metrics for 3D Talking Heads Evaluation

## 🚀 Feature

I would like to propose adding an evaluation metric for 3D talking heads to the TorchMetrics library: Upper Face Dynamic Deviation (FDD).

### Motivation

TorchMetrics currently has no dedicated metrics for evaluating 3D talking heads, apart from [LVE](https://github.com/Lightning-AI/torchmetrics/issues/3003). I think this metric would also fit in the multimodal folder of the library.

### Pitch

This metric is widely used in speech-driven facial animation research. It measures how the variation of facial dynamics in a generated motion sequence compares with the ground truth. Specifically, it indicates how close the standard deviation of upper-face motion in sequences generated from test-set audio is to the variation observed in the ground truth.

![Image](https://github.com/user-attachments/assets/36f3e113-4641-490c-b6fa-7bc4a15831a9)
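
To make the description above concrete, here is a minimal sketch of how FDD could be computed for a single sequence in PyTorch. It is not the proposed TorchMetrics API. The function name, argument names, and tensor shapes are hypothetical, and several details are assumptions on my part: motion is taken as displacement from a neutral template mesh, the per-frame L2 norm is used (not its square), the temporal standard deviation is the population one, and the difference is signed (ground truth minus prediction). Reference implementations may differ on these points.

```python
import torch


def upper_face_dynamics_deviation(
    vertices_pred: torch.Tensor,  # (T, V, 3) predicted vertex positions
    vertices_gt: torch.Tensor,    # (T, V, 3) ground-truth vertex positions
    template: torch.Tensor,       # (V, 3) neutral face mesh (assumed input)
    upper_face_idx: list[int],    # indices of upper-face vertices (assumed input)
) -> torch.Tensor:
    """Hypothetical single-sequence FDD sketch; returns a scalar tensor."""
    idx = torch.as_tensor(upper_face_idx, dtype=torch.long)

    # Motion = displacement from the neutral template, upper-face vertices only.
    motion_pred = vertices_pred[:, idx] - template[idx]  # (T, U, 3)
    motion_gt = vertices_gt[:, idx] - template[idx]      # (T, U, 3)

    # Per-frame, per-vertex L2 norm of the displacement.
    norm_pred = torch.linalg.norm(motion_pred, dim=-1)   # (T, U)
    norm_gt = torch.linalg.norm(motion_gt, dim=-1)       # (T, U)

    # "Dynamics" = standard deviation over time for each vertex
    # (population std is an assumption here).
    dyn_pred = norm_pred.std(dim=0, unbiased=False)      # (U,)
    dyn_gt = norm_gt.std(dim=0, unbiased=False)          # (U,)

    # Signed difference averaged over the upper-face vertices.
    return (dyn_gt - dyn_pred).mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    template = torch.zeros(5, 3)
    gt = torch.randn(10, 5, 3)
    # Identical sequences have identical dynamics, so the deviation is zero.
    print(upper_face_dynamics_deviation(gt, gt, template, [0, 1]))
```

A value near zero means the generated upper-face motion varies about as much as the ground truth. A prediction with a static upper face would give a positive value under this sign convention.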

### Reference
 
- Paper: [CodeTalker](https://openaccess.thecvf.com/content/CVPR2023/papers/Xing_CodeTalker_Speech-Driven_3D_Facial_Animation_With_Discrete_Motion_Prior_CVPR_2023_paper.pdf)

### Additional context

If this is agreed, I would like to open a PR for it.


Upper Face Dynamic Deviation (FDD) Metrics for 3D Talking Heads Evaluation #3097
