Minimal voice cloning experiment. This repository contains a small Python app that runs a voice-cloning workflow (load model, accept input, synthesize audio). The project is lightweight and uses the uv helper/runner (see run instructions) to sync dependencies and run the app.
Clone the repo to your machine:
```shell
git clone https://github.com/arifulislamat/local-voice-cloning-app.git
cd local-voice-cloning-app
```

This project uses uv as the local helper for syncing dependencies and running the app. If you prefer a different tool, the normal Python tooling (pip, poetry, etc.) also works.
To sync dependencies with uv:

```shell
uv sync
```

If you don't have uv available, use your normal environment setup (for example, create a venv and install the packages listed in pyproject.toml).
Run locally (default, not publicly exposed):

```shell
uv run main.py
```

Run in "public" mode (if the app supports exposing an endpoint or external access):

```shell
uv run main.py --public
```

Fallback (if you prefer running directly with Python):

```shell
source .venv/bin/activate  # linux/mac
python main.py
```

Note: the app will automatically use your CUDA GPU (NVIDIA), if available, for faster inference. If no compatible GPU is found, it falls back to CPU mode automatically.
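As a rough sketch of how the `--public` flag and the GPU/CPU fallback could fit together in `main.py` (the function names and the detection heuristic here are assumptions for illustration, not the app's actual code):

```python
import argparse
import shutil


def pick_device() -> str:
    """Heuristic device choice: "cuda" when an NVIDIA driver is visible, else "cpu".

    The real app likely asks its ML framework directly (e.g.
    torch.cuda.is_available()); probing for nvidia-smi keeps this
    sketch dependency-free.
    """
    return "cuda" if shutil.which("nvidia-smi") else "cpu"


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="local voice-cloning demo (sketch)")
    parser.add_argument("--public", action="store_true",
                        help="expose the UI externally instead of localhost only")
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    device = pick_device()
    # With a Gradio UI, --public would typically map to demo.launch(share=args.public).
    print(f"device={device} public={args.public}")
```

The `action="store_true"` pattern means the flag defaults to `False` and flips to `True` only when `--public` is passed, matching the two run modes above.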
A .env file is already included in the repository for your convenience. Review and update as needed before running the app. Do not share sensitive information from .env publicly.
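For illustration, here is a minimal way such `KEY=VALUE # comment` lines could be read at startup. This is a hand-rolled sketch; the real app may use a library like python-dotenv or read `os.environ` directly:

```python
import os


def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=VALUE lines, ignoring blanks and `#` comments."""
    values = {}
    with open(path) as fh:
        for raw in fh:
            line = raw.split("#", 1)[0].strip()  # drop trailing comments
            if "=" not in line:
                continue  # skip blank and comment-only lines
            key, _, val = line.partition("=")
            values[key.strip()] = val.strip()
            # Variables already set in the real environment take precedence.
            os.environ.setdefault(key.strip(), val.strip())
    return values
```

`os.environ.setdefault` ensures values exported in your shell are not silently overwritten by the file.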
Example `.env`:

```env
# Internal environment variables (for transformers/diffusers)
TRANSFORMERS_ATTN_IMPLEMENTATION=eager  # Use eager attention implementation for HuggingFace Transformers (improves compatibility)
TOKENIZERS_PARALLELISM=false            # Disable parallelism in tokenizers to avoid warning spam
TRANSFORMERS_VERBOSITY=error            # Only show error logs from HuggingFace Transformers
DIFFUSERS_VERBOSITY=error               # Only show error logs from HuggingFace Diffusers
```

The following Mermaid diagram shows the core flow of the app. Keep it in this README or render it in a Markdown viewer that supports Mermaid.
```mermaid
flowchart TD
    A[User / Client] -->|Text/Audio Input| B[Gradio UI]
    B --> C[main.py Handler]
    C --> D[Validate & Preprocess Input]
    D --> E[Load TTS Model]
    E --> F[Generate Audio]
    F --> G[Return Audio to User]
```
This diagram is intentionally generic. If you want a more detailed sequence diagram (for async tasks, queues, or third-party API calls), tell me which modules to include and I will expand it.
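The stages in the diagram can be sketched in Python. Everything below is a placeholder (stub model, assumed function names) meant only to mirror the diagram's flow, not the app's actual implementation:

```python
def validate_input(text: str) -> str:
    """"Validate & Preprocess Input" stage: reject empty requests early."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("input text must not be empty")
    return cleaned


def load_tts_model():
    """"Load TTS Model" stage — a stub standing in for the real model load."""
    return lambda text: f"<waveform for: {text}>".encode()


def generate_audio(text: str) -> bytes:
    """"Generate Audio" stage: validate, load, synthesize, return bytes to the UI."""
    model = load_tts_model()
    return model(validate_input(text))
```

In the real app the handler in `main.py` would wire these stages to the Gradio UI, and the model would typically be loaded once at startup rather than per request.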
We welcome contributions. A minimal workflow:

- Fork the repository.
- Create a branch for your change: `git checkout -b feat/your-feature`.
- Make changes and add tests where applicable.
- Run any project linters/tests and ensure they pass.
- Commit with clear messages and push your branch: `git push origin feat/your-feature`.
- Open a Pull Request against the `main` branch, describe the change, and reference any related issues.
- Address any feedback and iterate as needed.

