# Avatar Animation Microservice

The avatar animation model is a combination of two models: Wav2Lip and a GAN-based face restorer (GFPGAN). The Wav2Lip model generates lip movements from an audio file, and the GFPGAN model restores a high-quality face image from a low-quality one. The avatar animation microservice takes an audio clip and a low-quality face image/video as input, fuses the mel-spectrogram of the audio with frame(s) from the image/video, and generates a high-quality video of the face with lip movements synchronized to the audio.

# 🚀1. Start Microservice with Docker (option 1)

## 1.1 Build the Docker images

### 1.1.1 Wav2Lip Server image

```bash
git clone https://github.com/opea-project/GenAIComps.git
cd GenAIComps
```

- Xeon CPU

```bash
docker build -t opea/wav2lip:latest -f comps/third_parties/wav2lip/src/Dockerfile .
```

- Gaudi2 HPU

```bash
docker build -t opea/wav2lip-gaudi:latest -f comps/third_parties/wav2lip/src/Dockerfile.intel_hpu .
```

### 1.1.2 Animation server image

```bash
docker build -t opea/animation:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/animation/src/Dockerfile .
```

## 1.2 Set environment variables

- Xeon CPU

```bash
export ip_address=$(hostname -I | awk '{print $1}')
export DEVICE="cpu"
export WAV2LIP_PORT=7860
export ANIMATION_PORT=9066
export INFERENCE_MODE='wav2lip+gfpgan'
export CHECKPOINT_PATH='/usr/local/lib/python3.11/site-packages/Wav2Lip/checkpoints/wav2lip_gan.pth'
export FACE="assets/img/avatar1.jpg"
# export AUDIO='assets/audio/eg3_ref.wav' # audio file path is optional; if 'None', the base64 string in the POST request is used as input
export AUDIO='None'
export FACESIZE=96
export OUTFILE="assets/outputs/result.mp4"
export GFPGAN_MODEL_VERSION=1.4 # latest version, can roll back to v1.3 if needed
export UPSCALE_FACTOR=1
export FPS=10
```

- Gaudi2 HPU

```bash
export ip_address=$(hostname -I | awk '{print $1}')
export DEVICE="hpu"
export WAV2LIP_PORT=7860
export ANIMATION_PORT=9066
export INFERENCE_MODE='wav2lip+gfpgan'
export CHECKPOINT_PATH='/usr/local/lib/python3.10/dist-packages/Wav2Lip/checkpoints/wav2lip_gan.pth'
export FACE="assets/img/avatar1.jpg"
# export AUDIO='assets/audio/eg3_ref.wav' # audio file path is optional; if 'None', the base64 string in the POST request is used as input
export AUDIO='None'
export FACESIZE=96
export OUTFILE="assets/outputs/result.mp4"
export GFPGAN_MODEL_VERSION=1.4 # latest version, can roll back to v1.3 if needed
export UPSCALE_FACTOR=1
export FPS=10
```
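Before starting the containers, you can optionally confirm that the variables above are exported in the current shell. The snippet below is only an illustrative sanity check (not part of the official setup), and the list of variables it inspects is an assumed subset:

```bash
# Optional sanity check (illustrative only): verify key variables are exported
# before running the containers. Uses bash indirect expansion ${!var}.
for var in DEVICE INFERENCE_MODE CHECKPOINT_PATH FACE AUDIO OUTFILE WAV2LIP_PORT ANIMATION_PORT; do
  if [ -z "${!var}" ]; then
    echo "Warning: $var is not set"
  else
    echo "$var=${!var}"
  fi
done
```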
# 🚀2. Run the Docker container

## 2.1 Run Wav2Lip Microservice

- Xeon CPU

```bash
docker run --privileged -d --name "wav2lip-service" -p 7860:7860 --ipc=host \
  -w /home/user/comps/animation/src \
  -e PYTHON=/usr/bin/python3.11 \
  -v $(pwd)/comps/animation/src/assets:/home/user/comps/animation/src/assets \
  -e DEVICE=$DEVICE \
  -e INFERENCE_MODE=$INFERENCE_MODE \
  -e CHECKPOINT_PATH=$CHECKPOINT_PATH \
  -e FACE=$FACE \
  -e AUDIO=$AUDIO \
  -e FACESIZE=$FACESIZE \
  -e OUTFILE=$OUTFILE \
  -e GFPGAN_MODEL_VERSION=$GFPGAN_MODEL_VERSION \
  -e UPSCALE_FACTOR=$UPSCALE_FACTOR \
  -e FPS=$FPS \
  -e WAV2LIP_PORT=$WAV2LIP_PORT \
  opea/wav2lip:latest
```

- Gaudi2 HPU

```bash
docker run --privileged -d --name "wav2lip-gaudi-service" -p 7860:7860 --runtime=habana --cap-add=sys_nice --ipc=host \
  -w /home/user/comps/animation/src \
  -v $(pwd)/comps/animation/src/assets:/home/user/comps/animation/src/assets \
  -e HABANA_VISIBLE_DEVICES=all \
  -e OMPI_MCA_btl_vader_single_copy_mechanism=none \
  -e PYTHON=/usr/bin/python3.10 \
  -e DEVICE=$DEVICE \
  -e INFERENCE_MODE=$INFERENCE_MODE \
  -e CHECKPOINT_PATH=$CHECKPOINT_PATH \
  -e FACE=$FACE \
  -e AUDIO=$AUDIO \
  -e FACESIZE=$FACESIZE \
  -e OUTFILE=$OUTFILE \
  -e GFPGAN_MODEL_VERSION=$GFPGAN_MODEL_VERSION \
  -e UPSCALE_FACTOR=$UPSCALE_FACTOR \
  -e FPS=$FPS \
  -e WAV2LIP_PORT=$WAV2LIP_PORT \
  opea/wav2lip-gaudi:latest
```

## 2.2 Run Animation Microservice

```bash
docker run -d -p 9066:9066 --ipc=host --name "animation-service" \
  -e http_proxy=$http_proxy \
  -e https_proxy=$https_proxy \
  -e WAV2LIP_ENDPOINT=http://$ip_address:7860 \
  opea/animation:latest
```

# 🚀3. Start Microservice with Docker Compose

Alternatively, you can start the Animation microservice with Docker Compose.

- Xeon CPU

```bash
cd comps/animation/deployment/docker_compose
docker compose -f compose.yaml up animation -d
```

- Gaudi2 HPU

```bash
cd comps/animation/deployment/docker_compose
docker compose -f compose.yaml up animation-gaudi -d
```

# 🚀4. Validate Microservice

Once the microservice is running, use the scripts below to validate it.

## 4.1 Validate Wav2Lip service

```bash
cd GenAIComps
python3 comps/third_parties/wav2lip/src/check_wav2lip_server.py
```

## 4.2 Validate Animation service

```bash
cd GenAIComps
export ip_address=$(hostname -I | awk '{print $1}')
curl http://${ip_address}:9066/v1/animation -X POST -H "Content-Type: application/json" -d @comps/animation/src/assets/audio/sample_question.json
```

or

```bash
cd GenAIComps
python3 comps/third_parties/wav2lip/src/check_animation_server.py
```

The expected output is a message similar to the following:

```bash
{'wav2lip_result': '....../GenAIComps/comps/animation/src/assets/outputs/result.mp4'}
```

The generated video is saved to `comps/animation/src/assets/outputs/result.mp4` for reference.
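To quickly confirm that the generated video is a valid MP4, you can inspect it with a probe tool. The check below is a minimal sketch: it assumes `ffprobe` (shipped with FFmpeg) is available on the host and that the output was written to the default `OUTFILE` path configured above.

```bash
# Optional check (assumes ffprobe from FFmpeg is installed on the host).
# Prints the container format, duration, and file size of the generated video.
ffprobe -v error \
  -show_entries format=format_name,duration,size \
  -of default=noprint_wrappers=1 \
  comps/animation/src/assets/outputs/result.mp4
```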