Commit 175b762

Merge pull request #82 from intel/update-branch

245 user story add digital avatar use case (#246)

2 parents: 2af1f5c + 8a468f3

200 files changed: +33361 additions, -11 deletions
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
__pycache__
.env

ffmpeg*/
checkpoints
cache/
backend/musetalk/models
backend/musetalk/data/avatars
backend/wav2lip/wav2lip/results
backend/wav2lip/wav2lip/temp
weights/*
backend/liveportrait/templates
Lines changed: 118 additions & 0 deletions
@@ -0,0 +1,118 @@
# Digital Avatar

A digital avatar that utilizes Image-to-Video, Text-to-Speech, Speech-to-Text, and LLM technologies to create an interactive avatar.

![Demo](./docs/demo.gif)


## Table of Contents
- [Architecture Diagram](#architecture-diagram)
- [Requirements](#requirements)
  - [Minimum](#minimum)
- [Application Ports](#application-ports)
- [Setup](#setup)
  - [Prerequisite](#prerequisite)
  - [Setup ENV](#setup-env)
  - [Build Docker Container](#build-docker-container)
  - [Start Docker Container](#start-docker-container)
  - [Access the App](#access-the-app)
- [Notes](#notes)
- [FAQ](#faq)

## Architecture Diagram
![Architecture Diagram](./docs/architecture.png)

## Requirements

### Minimum
- CPU: 13th generation Intel® Core™ i5 or above
- GPU: Intel® Arc™ A770 graphics (16GB)
- RAM: 32GB
- DISK: 128GB

## Application Ports
Please ensure that these ports are available before running the applications; a quick way to check is shown below the table.

| Apps         | Port |
|--------------|------|
| Lipsync      | 8011 |
| LivePortrait | 8012 |
| TTS          | 8013 |
| STT          | 8014 |
| OLLAMA       | 8015 |
| Frontend     | 80   |

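A hedged sketch of that check, looping over the default ports and reporting any that already have a listener (uses `ss` from `iproute2`, present on standard Ubuntu installs; adjust the list if you remap ports):

```bash
# Report any default Digital Avatar port that already has a listener;
# a "port ... is in use" line means you must free or remap that port.
for port in 80 8011 8012 8013 8014 8015; do
  ss -ltn "( sport = :$port )" | grep -q ":$port" && echo "port $port is in use"
done
```
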
## Setup

### Prerequisite
1. **OS**: Ubuntu (validated on 22.04).

   ***Note***: If you are using a different Ubuntu version, please [update the RENDER_GROUP_ID](#1-update-render-group-id).

1. **Docker and Docker Compose**: Ensure Docker and Docker Compose are installed. Refer to the [Docker installation guide](https://docs.docker.com/engine/install/).
1. **Intel GPU Drivers**: Refer to [here](../../../README.md#gpu) to install the Intel GPU drivers.
1. **Download Wav2Lip Model**: Download the [Wav2Lip model](https://iiitaphyd-my.sharepoint.com/:u:/g/personal/radrabha_m_research_iiit_ac_in/EdjI7bZlgApMqsVoEUUXpLsBxqXbn5z8VTmoxp55YNDcIA?e=n9ljGW) and place the file in the `weights` folder.
1. **Create Avatar** (a sketch of the resulting layout follows this list):
   1. Place an `image.png` file containing an image of a person (preferably showing at least the upper half of the body) in the `assets` folder.
   2. Place an `idle.mp4` file of a person with some movement, such as eye blinking (to be used as a reference), in the `assets` folder.

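A minimal sketch of the expected layout after these steps (the checkpoint filename under `weights/` depends on which Wav2Lip file you downloaded; `wav2lip_gan.pth` here is illustrative):

```bash
ls assets weights
# assets:
# idle.mp4  image.png
# weights:
# wav2lip_gan.pth   # illustrative; use the file you actually downloaded
```
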
### Setup ENV
1. Create a `.env` file and copy the contents from `.env.template`:
   ```bash
   cp .env.template .env
   ```
2. Modify `LLM_MODEL` in the `.env` file. Refer to the [Ollama library](https://ollama.com/library) for available models (the default is Qwen2.5).

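A minimal sketch of the resulting `.env` entry, assuming the Ollama-style lowercase model tag (keep any other entries from `.env.template` unchanged):

```bash
# .env -- LLM served through the Ollama backend
LLM_MODEL=qwen2.5   # any tag from https://ollama.com/library
```
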
### Build Docker Container
```bash
docker compose build
```

### Start Docker Container
```bash
docker compose up -d
```

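Once started, the containers can be verified with standard Docker Compose commands:

```bash
# Each service should report "running" (or "healthy" once healthchecks pass).
docker compose ps
# Follow the logs while the models download and warm up (Ctrl+C to detach).
docker compose logs -f
```
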
### Access the App
- Navigate to http://localhost

## Notes
### Device Workload Configurations
You can offload model inference to a specific device by modifying the environment variables in the docker-compose.yml file.

| Workload             | Environment Variable | Supported Devices |
|----------------------|----------------------|-------------------|
| LLM                  | -                    | GPU               |
| STT - Encoder Device | STT_ENCODED_DEVICE   | CPU, GPU, NPU     |
| STT - Decoder Device | STT_DECODED_DEVICE   | CPU, GPU          |
| TTS                  | TTS_DEVICE           | CPU               |
| Lipsync (Wav2Lip)    | DEVICE               | CPU, GPU          |

Example Configuration:

* To run the Wav2Lip lipsync workload on the `CPU`, set `DEVICE` in the `wav2lip` service:

```yaml
wav2lip:
  ...
  environment:
    ...
    DEVICE: CPU
    ...
```

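The same pattern applies to the other workloads in the table. For instance, to offload the STT encoder to the `NPU` (the `stt` service name below is an assumption; match whatever your docker-compose.yml calls that service):

```yaml
stt:                          # hypothetical service name
  ...
  environment:
    ...
    STT_ENCODED_DEVICE: NPU   # see the workload table above
    ...
```
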
## FAQ
### 1. Update Render Group ID
1. Ensure the [Intel GPU driver](#prerequisite) is installed.
2. Check the group ID in `/etc/group`:
   ```bash
   grep render /etc/group
   ```
3. The output will look something like:
   ```
   render:x:110:user
   ```
4. The group ID is the number in the third field (e.g., `110` in the example above).
5. Ensure that `RENDER_GROUP_ID` in the [docker-compose.yml](./docker-compose.yml) file matches the render group ID; a one-step sketch follows.
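On most systems the render group ID can also be read in one step. A sketch, assuming your setup passes `RENDER_GROUP_ID` to Compose via the environment (otherwise edit docker-compose.yml by hand):

```bash
# Extract the numeric GID of the "render" group (third colon-separated field).
export RENDER_GROUP_ID=$(getent group render | cut -d: -f3)
echo "render group ID: ${RENDER_GROUP_ID}"
```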
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
FROM debian:12-slim

ARG DEBIAN_FRONTEND=noninteractive
ARG RENDER_GROUP_ID

# Base packages, Python build dependencies, and a non-root "intel" user
# that belongs to the render group for GPU access.
RUN apt-get update \
    && apt-get upgrade -y \
    && apt-get install --no-install-recommends -y \
        sudo \
        wget \
        ca-certificates \
        ffmpeg \
        libsm6 \
        libxext6 \
        curl \
        git \
        build-essential \
        libssl-dev \
        zlib1g-dev \
        libbz2-dev \
        libreadline-dev \
        libsqlite3-dev \
        llvm \
        libncursesw5-dev \
        xz-utils \
        tk-dev \
        libxml2-dev \
        libxmlsec1-dev \
        libffi-dev \
        liblzma-dev \
    && addgroup --system intel --gid 1000 \
    && adduser --system --ingroup intel --uid 1000 --home /home/intel intel \
    && echo "intel ALL=(ALL:ALL) NOPASSWD:ALL" > /etc/sudoers.d/intel \
    && groupadd -g ${RENDER_GROUP_ID} render \
    && usermod -aG render intel \
    && rm -rf /var/lib/apt/lists/* \
    && mkdir -p /usr/src \
    && chown -R intel:intel /usr/src

# Intel GPU Driver
RUN apt-get update && apt-get install -y gnupg

RUN wget -qO - https://repositories.intel.com/gpu/intel-graphics.key | \
    gpg --yes --dearmor --output /usr/share/keyrings/intel-graphics.gpg && \
    echo "deb [arch=amd64,i386 signed-by=/usr/share/keyrings/intel-graphics.gpg] https://repositories.intel.com/gpu/ubuntu jammy client" | \
    tee /etc/apt/sources.list.d/intel-gpu-jammy.list && \
    apt-get update && \
    apt-get install -y --no-install-recommends libze1 intel-level-zero-gpu intel-opencl-icd clinfo libze-dev intel-ocloc

USER intel
WORKDIR /usr/src/app

# Set environment variables for pyenv
ENV PYENV_ROOT="/usr/src/app/.pyenv"
ENV PATH="$PYENV_ROOT/bin:$PYENV_ROOT/shims:$PATH"

# Install pyenv and use it to build Python 3.10.15
RUN curl https://pyenv.run | bash \
    && echo 'export PYENV_ROOT="$PYENV_ROOT"' >> ~/.bashrc \
    && echo 'export PATH="$PYENV_ROOT/bin:$PATH"' >> ~/.bashrc \
    && echo 'eval "$(pyenv init --path)"' >> ~/.bashrc \
    && echo 'eval "$(pyenv init -)"' >> ~/.bashrc \
    && . ~/.bashrc \
    && pyenv install 3.10.15 \
    && pyenv global 3.10.15

RUN python3 -m pip install --upgrade pip \
    && python3 -m pip install virtualenv

# Isolated virtualenv for the application
RUN python3 -m venv /usr/src/.venv
ENV PATH="/usr/src/.venv/bin:$PATH"

# Copy the LivePortrait backend and fetch its pretrained weights
COPY --chown=intel ./backend/liveportrait .
RUN python3 -m pip install -r requirements.txt \
    && huggingface-cli download KwaiVGI/LivePortrait --local-dir liveportrait/pretrained_weights --exclude "*.git*" "README.md" "docs"

HEALTHCHECK --interval=30s --timeout=180s --start-period=60s --retries=3 \
    CMD sh -c 'PORT=${SERVER_PORT:-8012} && wget --no-verbose -O /dev/null --tries=1 http://localhost:$PORT/healthcheck || exit 1'
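For a manual spot-check of the same endpoint the `HEALTHCHECK` polls, assuming the default `SERVER_PORT` of 8012 and that the port is published to the host:

```bash
# Exits non-zero (and prints nothing) until the service is ready.
curl -fsS http://localhost:8012/healthcheck && echo "LivePortrait is healthy"
```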
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
# Byte-compiled / optimized / DLL files
__pycache__/
**/__pycache__/
*.py[cod]
**/*.py[cod]
*$py.class

# Model weights
**/*.pth
**/*.onnx

pretrained_weights/*.md
pretrained_weights/docs
pretrained_weights/liveportrait
pretrained_weights/liveportrait_animals

# Ipython notebook
*.ipynb

# Temporary files or benchmark resources
animations/*
tmp/*
.vscode/launch.json
**/*.DS_Store
gradio_temp/**

# Windows dependencies
ffmpeg/
LivePortrait_env/

# XPose build files
src/utils/dependencies/XPose/models/UniPose/ops/build
src/utils/dependencies/XPose/models/UniPose/ops/dist
src/utils/dependencies/XPose/models/UniPose/ops/MultiScaleDeformableAttention.egg-info
Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
MIT License

Copyright (c) 2024 Kuaishou Visual Generation and Interaction Center

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

---

The code of InsightFace is released under the MIT License.
The models of InsightFace are for non-commercial research purposes only.

If you want to use the LivePortrait project for commercial purposes, you
should remove and replace InsightFace's detection models to fully comply with
the MIT license.
