I understand that UniVTG is CLIP-based. Can I use a generic phrase such as 'highlight of the video' as a single, unified query to extract highlights from any video? If so, what performance should I expect?