This repository demonstrates an AI-powered system that generates soccer commentary with synchronized talking head videos using advanced speech synthesis and lip-sync technology.

allanchan339/VLM_Soccer_Commentator_THG

Cantonese Soccer Commentary & Talking Head Video Generation

This repository demonstrates an AI-powered system that generates soccer commentary with synchronized talking head videos using advanced speech synthesis and lip-sync technology.

Demo Videos

Video Source

The source video is extracted from here

Pure Lip Sync Demo

pure_lip_sync.mp4

Commentary Results -- EdgeTTS

commentary_results.mp4

Commentary Results -- SoVITS

realtime_Ball_short.mp4

Full Demo Recording

2025-08-27.14-27-28.mov

Features

  • AI Soccer Commentary: Generate realistic soccer match commentary using an LLM
  • Talking Head Generation: Create synchronized talking head videos with lip-sync
  • High-Quality Audio: Advanced TTS (Text-to-Speech) synthesis
  • Real-time Processing: Optimized for GPU acceleration

Requirements

  1. NVIDIA GPU with CUDA support (a single RTX 4060 is enough)
  2. Ubuntu 20.04 or higher
  3. Driver version >= 570.133
  4. CUDA version >= 12.0
  5. The environment must be created with Python 3.10 (CosyVoice-ttsfrd requires Python 3.10)
  6. ModelScope API key is required for LLM.
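
The Python and driver requirements above can be checked before installing anything. The following is a minimal sketch, not part of the repository: the helper names (`parse_version`, `driver_ok`, `query_driver_version`) are hypothetical, and it assumes `nvidia-smi` is on the PATH when a driver is installed.

```python
import shutil
import subprocess
import sys
from typing import Optional, Tuple

MIN_DRIVER = (570, 133)    # minimum driver version from the requirements list
REQUIRED_PYTHON = (3, 10)  # Python version from the requirements list


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '570.133.07' into (570, 133, 7); non-numeric parts are skipped."""
    return tuple(int(p) for p in text.strip().split(".") if p.isdigit())


def driver_ok(version_text: str, minimum: Tuple[int, ...] = MIN_DRIVER) -> bool:
    """True if the driver version is at least the required minimum."""
    return parse_version(version_text) >= minimum


def query_driver_version() -> Optional[str]:
    """Ask nvidia-smi for the driver version; None if it cannot be read."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    lines = result.stdout.strip().splitlines()
    if result.returncode != 0 or not lines:
        return None
    return lines[0].strip()  # first GPU is enough; all GPUs share one driver


if __name__ == "__main__":
    print("Python 3.10:", sys.version_info[:2] == REQUIRED_PYTHON)
    driver = query_driver_version()
    if driver is None:
        print("NVIDIA driver: not detected")
    else:
        print("NVIDIA driver", driver, "ok:", driver_ok(driver))
```

The script only reports what it finds; it does not check the CUDA toolkit version or the ModelScope API key.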

Installation

Git

  1. Clone the repository:
git clone https://github.com/allanchan339/VLM_Soccer_Commentator_THG --depth 1  
cd VLM_Soccer_Commentator_THG
git submodule update --init --recursive

Conda

  1. Install Miniconda or Anaconda, then create the environment:

conda env create -f environment_torch2.4.yml

  2. Activate the environment:

conda activate SoCommVoice2.4
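
Once the environment is active, a quick check can confirm that PyTorch sees the GPU. This is a small sketch under the assumption that the environment file installs `torch`; the function name `cuda_status` is hypothetical and not part of the repository.

```python
import importlib.util


def cuda_status() -> str:
    """Describe whether PyTorch is importable and whether it can use CUDA."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed in this environment"
    import torch  # imported lazily so the check also runs without torch

    if not torch.cuda.is_available():
        return f"torch {torch.__version__} is installed, but CUDA is unavailable"
    return f"torch {torch.__version__} sees GPU: {torch.cuda.get_device_name(0)}"


if __name__ == "__main__":
    print(cuda_status())
```

If CUDA is reported as unavailable, the driver and CUDA versions in the requirements list are the first things to re-check.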

Additional Dependencies

Install additional dependencies for MuseTalk:

# Install dependencies related to musetalk
pip install --no-cache-dir -U openmim
mim install mmengine 

Install mmdet (for PyTorch 2.4.1 only)

Next, install mmdet and mmpose from source and relax the mmcv compatibility check in mmdet's __init__.py. Otherwise, an AssertionError is raised when mmdet is imported.

cd mmdetection
# Relax the mmcv compatibility check in __init__.py
nano {python_path}/lib/python3.10/site-packages/mmdet/__init__.py 

Change line 17 from:

and mmcv_version < digit_version(mmcv_maximum_version)), \

to:

and mmcv_version <= digit_version(mmcv_maximum_version)), \
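
To see why this one-character change matters, the sketch below reproduces the shape of the range check with a simplified version parser. It is an illustration only: `version_tuple`, `check_strict`, and `check_inclusive` are hypothetical stand-ins rather than mmdet functions, the lower bound is assumed to be inclusive, and the version numbers are example values.

```python
from typing import Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Simplified stand-in for a version parser: '2.2.0' -> (2, 2, 0)."""
    return tuple(int(part) for part in version.split(".")[:3])


def check_strict(installed: str, minimum: str, maximum: str) -> bool:
    """Original form: the maximum version itself is rejected (<)."""
    v = version_tuple(installed)
    return version_tuple(minimum) <= v < version_tuple(maximum)


def check_inclusive(installed: str, minimum: str, maximum: str) -> bool:
    """Patched form: the maximum version itself is accepted (<=)."""
    v = version_tuple(installed)
    return version_tuple(minimum) <= v <= version_tuple(maximum)


if __name__ == "__main__":
    # Example values only: the installed mmcv equals the declared maximum.
    installed, minimum, maximum = "2.2.0", "2.0.0", "2.2.0"
    print("strict check passes:   ", check_strict(installed, minimum, maximum))
    print("inclusive check passes:", check_inclusive(installed, minimum, maximum))
```

When the installed mmcv version equals the declared maximum, the strict comparison fails and the inclusive one passes, which is the case the edit to line 17 addresses.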

Install mmpose

mim install "mmpose>=1.1.0" # not available on conda-forge

Download pre-trained models

Download the pre-trained models and install MuseTalk:

# Download the MuseTalk model
sh ./download_THG_weight.sh

# Download the GPT-SoVITS models:
bash ./download_TTS_weight.sh

Run the demo

python web_ui_all.py
