I believe this is an important aspect of lip-syncing. Have you ever worked on tasks like calculating the match rate between video segments and audio?