Evaluation scores on the TikTok dataset do not match those stated in the paper. #80

Hi, 

I have been trying to reproduce your evaluation scores, but so far the scores I am getting are considerably worse than those reported in the paper.

For the standard UniAnimate I get L1: 6.36E-04, PSNR*: 13.29, SSIM: 0.549, LPIPS: 0.425
For UniAnimate-Long I get L1: 4.07E-04, PSNR*: 15.99, SSIM: 0.631, LPIPS: 0.309

Neither set matches the reported scores of L1: 2.66E-04, PSNR: 20.58, SSIM: 0.811, LPIPS: 0.231
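To put the PSNR gap in perspective, here is a minimal sketch that converts the difference in dB into a ratio of mean squared errors, using only the numbers quoted above. It assumes the reported PSNR and my PSNR* values are computed the same way (same peak value, same averaging), which is exactly what I am unsure about; `mse_ratio` is just an illustrative helper name.

```python
# PSNR = 10 * log10(MAX^2 / MSE), so for a fixed MAX a difference of d dB
# corresponds to an MSE ratio of 10^(d / 10).
reported_psnr = 20.58  # value stated in the paper
measured_psnr = {      # values from my runs
    "UniAnimate": 13.29,
    "UniAnimate-Long": 15.99,
}


def mse_ratio(psnr_reported: float, psnr_measured: float) -> float:
    """How many times larger the measured MSE is than the reported one."""
    return 10 ** ((psnr_reported - psnr_measured) / 10)


ratios = {name: mse_ratio(reported_psnr, value) for name, value in measured_psnr.items()}
for name, ratio in ratios.items():
    print(f"{name}: measured MSE is about {ratio:.2f}x the reported MSE")
```

Under that assumption the per-pixel error in my runs is several times higher than in the paper, which seems too large to be explained by random seeds alone.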

I am using the benchmark pipeline from [DisCo](https://github.com/Wangt-CN/DisCo) (as mentioned in the paper), with the same 10 TikTok videos and the same 256x256 resolution configuration that DisCo uses.

Could you kindly share more details of your evaluation setup, for example (A) whether the standard or long variant was used, (B) the max frames setting, and any other information that would be useful for recreating the pipeline?

(PSNR* is the corrected PSNR, as per [[Source]](https://github.com/Wangt-CN/DisCo/issues/86))
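For reference, this is roughly how I compute PSNR* per frame. It is a minimal sketch, assuming the correction amounts to casting frames to floating point before squaring so that `uint8` arithmetic cannot wrap around; the function name `psnr` and the `max_value=255.0` default are my own choices, not taken from either repository.

```python
import numpy as np


def psnr(reference, generated, max_value=255.0):
    """PSNR in dB between two same-shaped frames, computed in float64."""
    ref = np.asarray(reference, dtype=np.float64)
    gen = np.asarray(generated, dtype=np.float64)
    mse = np.mean((ref - gen) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10(max_value ** 2 / mse)


# Toy frames that show why the dtype matters.
a = np.full((4, 4, 3), 10, dtype=np.uint8)
b = np.full((4, 4, 3), 250, dtype=np.uint8)

# Staying in uint8, both the subtraction and the square wrap modulo 256,
# so the "MSE" no longer reflects the true pixel difference.
wrapped_mse = np.mean((a - b) ** 2)

# Casting first gives the true squared difference of 240^2 per pixel.
true_mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)

print(wrapped_mse, true_mse, psnr(a, b))
```

If your evaluation computes PSNR differently (for example on a [0, 1] range, or averaged per video rather than per frame), that alone could account for part of the mismatch.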
