Home
Folly is an AI-powered pipeline for generating therapeutic content using techniques from affective computing, audio understanding, and image generation. The pipeline has two main components:
- Audio Understanding Modules:
  - Music Analysis System: Processes the input music and extracts its segments, genre, instrumentation, and emotion.
  - Voice-Over Analysis System: Processes the voice-over (speech) file paired with the music file and extracts its transcription, voice activity, and emotion.
- Video Generation Module: Takes the audio files (music and voice-over) from the previous stage, together with inputs such as prompts, timing, and transitions, and produces a coherent video synchronised with the audio. An image generative model generates the video frame by frame.
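To make the frame-by-frame idea concrete, here is a minimal sketch of how prompts, timing, and transitions could be turned into a per-frame schedule for an image model. It is not taken from the Folly source: `Segment`, `frame_schedule`, and the linear cross-fade are hypothetical names and choices used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical container: one audio segment and the prompt shown during it."""
    start: float  # seconds
    end: float    # seconds
    prompt: str


def frame_schedule(segments, fps, transition=0.0):
    """Return one (time, prompt, next_prompt, blend) tuple per video frame.

    `blend` rises linearly from 0 to 1 over the last `transition` seconds of a
    segment, so a generator could interpolate towards the next prompt.
    This is an assumed scheme, not Folly's actual implementation.
    """
    frames = []
    n_frames = round(segments[-1].end * fps)
    idx = 0
    for i in range(n_frames):
        t = i / fps
        # Advance to the segment that contains time t.
        while idx < len(segments) - 1 and t >= segments[idx].end:
            idx += 1
        seg = segments[idx]
        next_prompt, blend = None, 0.0
        if transition > 0 and idx < len(segments) - 1:
            remaining = seg.end - t
            if remaining < transition:
                next_prompt = segments[idx + 1].prompt
                blend = 1.0 - remaining / transition
        frames.append((t, seg.prompt, next_prompt, blend))
    return frames


if __name__ == "__main__":
    demo = [Segment(0.0, 2.0, "calm lake"), Segment(2.0, 4.0, "sunrise")]
    for frame in frame_schedule(demo, fps=2, transition=1.0):
        print(frame)
```

Because every frame's timestamp comes from the audio timeline (`i / fps`), the generated frames stay aligned with the music and voice-over segments no matter how long each image takes to generate.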
The primary goal of Folly is to support therapeutic content creation by combining affective computing with audio understanding and video generation. Emotion extracted from the audio informs the visuals, so the resulting media is emotionally coherent and synchronised, with potential applications in therapeutic settings.
Check out the generated samples in this folder.
Using the source code (requires Python 3.10 or earlier):
apt-get update && apt-get install -y libsndfile1 ffmpeg portaudio19-dev
python3.10 -m venv lucid_env
source lucid_env/bin/activate
git clone https://github.com/thelucidproject/Folly
cd Folly/
pip install -r requirements.txt
python demo.py
Or using the Docker image (run from the cloned Folly/ directory):
docker build -t folly-image .
docker run --gpus all -it --rm folly-image
