# GPT-SoVITS Microservice

[GPT-SoVITS](https://github.com/RVC-Boss/GPT-SoVITS) provides zero-shot voice cloning and text-to-speech in multiple languages, including English, Japanese, Korean, Cantonese, and Chinese.

This microservice is validated on Xeon/CUDA. HPU support is under development.

## Build the Image

```bash
docker build -t opea/gpt-sovits:latest --build-arg http_proxy=$http_proxy --build-arg https_proxy=$https_proxy -f comps/tts/src/integrations/dependency/gpt-sovits/Dockerfile .
```

## Start the Service

The container is named `gpt-sovits-service` so that later commands can refer to it.

```bash
docker run -itd --name gpt-sovits-service -p 9880:9880 -e http_proxy=$http_proxy -e https_proxy=$https_proxy opea/gpt-sovits:latest
```

## Test

- Chinese only

```bash
curl localhost:9880/ -XPOST -d '{
    "text": "先帝创业未半而中道崩殂,今天下三分,益州疲弊,此诚危急存亡之秋也。",
    "text_language": "zh"
}' --output out.wav
```

- English only

```bash
curl localhost:9880/ -XPOST -d '{
    "text": "Discuss the evolution of text-to-speech (TTS) technology from its early beginnings to the present day. Highlight the advancements in natural language processing that have contributed to more realistic and human-like speech synthesis. Also, explore the various applications of TTS in education, accessibility, and customer service, and predict future trends in this field. Write a comprehensive overview of text-to-speech (TTS) technology.",
    "text_language": "en"
}' --output out.wav
```

- Auto detection of languages

```bash
curl localhost:9880/ -XPOST -d '{
    "text": "Hi 你好,这里是一个 cross-lingual 的例子。",
    "text_language": "auto"
}' --output out.wav
```

- Change reference audio

This microservice allows you to use the zero-shot voice cloning feature.
For example, you can change the reference audio from the default female voice to a male voice:

```bash
# Fetch the raw file; the /blob/ page URL would download an HTML page instead of audio.
wget https://github.com/OpenTalker/SadTalker/raw/main/examples/driven_audio/chinese_poem1.wav
# gpt-sovits-service is the container name; substitute yours (see `docker ps`) if it differs.
docker cp chinese_poem1.wav gpt-sovits-service:/home/user/chinese_poem1.wav
curl localhost:9880/change_refer -d '{
    "refer_wav_path": "/home/user/chinese_poem1.wav",
    "prompt_text": "窗前明月光,疑是地上霜,举头望明月,低头思故乡。",
    "prompt_language": "zh"
}'
```

- OpenAI-compatible request

```bash
curl localhost:9880/v1/audio/speech -XPOST -d '{"input":"你好呀,你是谁. Hello, who are you?"}' -H 'Content-Type: application/json' --output speech.mp3
```
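
The `out.wav` requests above write whatever the server returns to disk, so a failed request can leave a non-audio file behind. The following is a minimal sketch of a sanity check, assuming a successful response is a standard RIFF/WAVE file as the `out.wav` filename suggests. The helper names `is_wav` and `synthesize` and the `TTS_URL` variable are hypothetical and not part of the microservice; the check does not apply to the `speech.mp3` output of the OpenAI-compatible endpoint.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking synthesis output; not part of the microservice.

# Return 0 if the file starts with a RIFF/WAVE header, 1 otherwise.
is_wav() {
  local file="$1"
  [ -f "$file" ] || return 1
  # In a canonical WAV file, bytes 0-3 are "RIFF" and bytes 8-11 are "WAVE".
  [ "$(head -c 4 "$file")" = "RIFF" ] &&
    [ "$(head -c 12 "$file" | tail -c 4)" = "WAVE" ]
}

# POST a JSON payload file to the service root and verify the result.
# Usage: synthesize payload.json out.wav
synthesize() {
  local payload="$1" out="$2"
  # --fail makes curl exit non-zero on HTTP errors instead of saving the error body.
  curl --silent --show-error --fail "${TTS_URL:-http://localhost:9880/}" \
    -X POST -d @"$payload" --output "$out" || return 1
  is_wav "$out"
}
```

With the service running, save one of the JSON bodies above to `payload.json` and run `synthesize payload.json out.wav && echo "audio saved"`.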