English | 简体中文
New generation lightweight DiffSinger automatic phoneme annotation tool
For a better solution, please see here.
Currently, the project supports Chinese, English, and Japanese (Japanese recognition is less reliable and requires a larger model).
- Support for Chinese
- Support for English
- Support for Japanese
- torch
- faster-whisper
- pykakasi
fast-phasr-next requires Python 3.8 or later. We strongly recommend creating a virtual environment via Conda or venv before installing dependencies.
- install
```shell
# cpu
pip install -r requirement.txt

# gpu
conda install cudatoolkit -y
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
pip install -r requirement.txt
```

This project uses faster-whisper, which reimplements OpenAI's Whisper model on top of CTranslate2, a fast inference engine for Transformer models. This implementation is up to 4x faster than openai/whisper at the same accuracy while using less memory. Efficiency can be improved further with 8-bit quantization on both CPU and GPU.
In a test environment with an RTX 3060 Laptop GPU (6 GB VRAM), using the large-v3 model in fp16, labeling a 6–10 s audio clip takes only about 0.7 s, and a labeling test on 50 audio clips reached about 98.71% accuracy.
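As a rough sanity check on those figures, the implied real-time factor can be computed directly (a back-of-the-envelope sketch in pure Python, using only the numbers quoted above):

```python
# Figure quoted above: ~0.7 s to label a 6-10 s clip.
LABEL_TIME_S = 0.7

def real_time_factor(clip_seconds: float, label_seconds: float = LABEL_TIME_S) -> float:
    """Seconds of audio labeled per second of compute."""
    return clip_seconds / label_seconds

print(round(real_time_factor(6), 1))   # -> 8.6
print(round(real_time_factor(10), 1))  # -> 14.3
```

So labeling runs roughly 9-14x faster than real time on this hardware, before any batching.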
| Size | Parameters | English-only model | Multilingual model | Required VRAM | Relative speed (compared with the original project) |
|---|---|---|---|---|---|
| tiny | 39 M | tiny.en | tiny | ~1 GB | ~128x |
| base | 74 M | base.en | base | ~1 GB | ~64x |
| small | 244 M | small.en | small | ~2 GB | ~36x |
| medium | 769 M | medium.en | medium | ~5 GB | ~8x |
| large | 1550 M | N/A | large | ~10 GB | ~4x |
```shell
python main.py -d [import directory] -m [model, default="large"] -l [language, default="Chinese"] --device [default="cuda"] --compute_type [default="float16"]
```
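The flags above map naturally onto Python's argparse. The sketch below is a hypothetical reconstruction of how such a CLI could be defined; the option names and defaults come from the usage line above, not from the actual main.py source:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical reconstruction of the CLI; defaults taken from the usage line.
    p = argparse.ArgumentParser(description="DiffSinger automatic phoneme annotation")
    p.add_argument("-d", "--directory", required=True, help="import directory of audio files")
    p.add_argument("-m", "--model", default="large", help="Whisper model size")
    p.add_argument("-l", "--language", default="Chinese", help="language of the audio")
    p.add_argument("--device", default="cuda", help="inference device (cuda or cpu)")
    p.add_argument("--compute_type", default="float16", help="e.g. float16 or int8")
    return p

args = build_parser().parse_args(["-d", "wavs"])
print(args.model, args.device)  # -> large cuda
```

For example, `python main.py -d wavs -m medium --device cpu --compute_type int8` would label the `wavs` directory on CPU with 8-bit quantization.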
