VITS Voice Conversion
This repo will guide you through adding your voice to an existing VITS TTS model, turning it into a high-quality voice converter from your voice to every character voice already in the model.
Feel free to play around with the base model, a Trilingual Anime VITS!
Currently Supported Tasks:
- Convert user's voice to characters listed here
- Chinese, English, Japanese TTS with user's voice
- Chinese, English, Japanese TTS with custom characters...
Currently Supported Characters for TTS & VC:
- Umamusume Pretty Derby
- Sanoba Witch
- Genshin Impact
- Custom characters...
Fine-tuning
It's recommended to perform fine-tuning on Google Colab because the original VITS has some dependencies that are difficult to configure.
How long does it take?
- Install dependencies (2 min)
- Record at least 10 samples of your own voice (5 min)
- Fine-tune (30 min)
After everything is done, download the fine-tuned model & model config.
Inference or Usage (currently supports Windows only)
- Remember to download your fine-tuned model!
- Download the latest release
- Put your model & config file into the folder VC_inference, and make sure to rename the model to G_latest.pth and the config file to finetune_speaker.json
- The file structure should be as follows:
VC_inference
├───VC_inference.exe
├───...
├───finetune_speaker.json
└───G_latest.pth
- Run VC_inference.exe; the browser should open automatically.
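The copy-and-rename step above can also be scripted. The following is only a minimal sketch, not part of this repo: the helper name install_model and the source file names in the usage note are hypothetical, and it assumes the VC_inference folder from the release has already been extracted. The only names taken from the steps above are G_latest.pth and finetune_speaker.json.

```python
import shutil
from pathlib import Path

# File names the inference program expects, as stated in the steps above.
MODEL_NAME = "G_latest.pth"
CONFIG_NAME = "finetune_speaker.json"


def install_model(model_src, config_src, target_dir):
    """Copy a fine-tuned model and its config into target_dir under the expected names.

    install_model is a hypothetical helper written for illustration only.
    """
    target = Path(target_dir)
    if not target.is_dir():
        raise FileNotFoundError(f"folder not found: {target}")

    model_dst = target / MODEL_NAME
    config_dst = target / CONFIG_NAME

    # copyfile overwrites any existing file at the destination.
    shutil.copyfile(model_src, model_dst)
    shutil.copyfile(config_src, config_dst)
    return model_dst, config_dst
```

For example, install_model("Downloads/my_model.pth", "Downloads/my_config.json", "VC_inference") would place both files in the folder with the required names; the two source paths here are placeholders for wherever your downloaded files actually are.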
Description
This repo is a VITS fine-tuning pipeline for fast speaker-adaptation TTS and many-to-many voice conversion.