
Click here for the Chinese documentation

VITS Voice Conversion

This repo guides you through adding your own voice to an existing VITS TTS model, turning it into a high-quality voice converter that can map your voice to any character voice already in the model.

Feel free to play around with the base model, a Trilingual Anime VITS, on Hugging Face Spaces!

Currently Supported Tasks:

  • Convert user's voice to characters listed here
  • Chinese, English, Japanese TTS with user's voice
  • Chinese, English, Japanese TTS with custom characters...

Currently Supported Characters for TTS & VC:

  • Umamusume Pretty Derby
  • Sanoba Witch
  • Genshin Impact
  • Custom characters...

Fine-tuning

It's recommended to perform fine-tuning on Google Colab because the original VITS has some dependencies that are difficult to configure.

How long does it take?

  1. Install dependencies (2 min)
  2. Record at least 10 recordings of your own voice (5 min)
  3. Fine-tune (30 min)
    After everything is done, download the fine-tuned model & model config

Inference or Usage (currently supports Windows only)

  1. Remember to download your fine-tuned model!
  2. Download the latest release
  3. Put your model & config file into the folder VC_inference. Make sure to rename the model to G_latest.pth and the config file to finetune_speaker.json
  4. The file structure should be as follows:
VC_inference
├───VC_inference.exe
├───...
├───finetune_speaker.json
└───G_latest.pth
  5. Run VC_inference.exe; the browser should open automatically.
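
The renaming and placement described in steps 3 and 4 can also be scripted. The following is a minimal sketch, not part of this repo: the helper name `install_model` and the source paths in the usage note are hypothetical, and only the target names G_latest.pth and finetune_speaker.json come from the steps above.

```python
import shutil
from pathlib import Path

# File names the inference program expects, taken from step 3 above.
MODEL_NAME = "G_latest.pth"
CONFIG_NAME = "finetune_speaker.json"


def install_model(model_src, config_src, inference_dir):
    """Copy a fine-tuned model and its config into the inference folder.

    Hypothetical helper: the source paths are wherever you saved the files
    downloaded after fine-tuning. The copies are renamed to the names the
    steps above require; the originals are left untouched.
    """
    model_src, config_src = Path(model_src), Path(config_src)
    inference_dir = Path(inference_dir)

    # Fail early with a clear message instead of producing a partial copy.
    for src in (model_src, config_src):
        if not src.is_file():
            raise FileNotFoundError(f"missing file: {src}")
    if not inference_dir.is_dir():
        raise NotADirectoryError(f"missing folder: {inference_dir}")

    model_dst = inference_dir / MODEL_NAME
    config_dst = inference_dir / CONFIG_NAME
    shutil.copyfile(model_src, model_dst)
    shutil.copyfile(config_src, config_dst)
    return model_dst, config_dst
```

For example, `install_model("Downloads/my_model.pth", "Downloads/my_config.json", "VC_inference")` would place both files under the expected names, assuming those hypothetical download paths exist.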