# ASR Microservice ASR (Audio-Speech-Recognition) microservice helps users convert speech to text. When building a talking bot with LLM, users will need to convert their audio inputs (What they talk, or Input audio from other sources) to text, so the LLM is able to tokenize the text and generate an answer. This microservice is built for that conversion stage. ## 🚀1. Start Microservice with Python (Option 1) To start the ASR microservice with Python, you need to first install python packages. ### 1.1 Install Requirements ```bash pip install -r requirements.txt ``` ### 1.2 Start Whisper Service/Test - Xeon CPU ```bash cd integrations/dependency/whisper nohup python whisper_server.py --device=cpu & python check_whisper_server.py ``` Note: please make sure that port 7066 is not occupied by other services. Otherwise, use the command `npx kill-port 7066` to free the port. If the Whisper server is running properly, you should see the following output: ```bash {'asr_result': 'Who is pat gelsinger'} ``` - Gaudi2 HPU ```bash pip install optimum[habana] cd dependency/ nohup python whisper_server.py --device=hpu & python check_whisper_server.py # Or use openai protocol compatible curl command # Please refer to https://platform.openai.com/docs/api-reference/audio/createTranscription wget https://github.com/intel/intel-extension-for-transformers/raw/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav curl http://localhost:7066/v1/audio/transcriptions \ -H "Content-Type: multipart/form-data" \ -F file="@./sample.wav" \ -F model="openai/whisper-small" ``` ### 1.3 Start ASR Service/Test ```bash cd ../../.. python opea_asr_microservice.py python check_asr_server.py ``` While the Whisper service is running, you can start the ASR service. If the ASR service is running properly, you should see the output similar to the following: ```bash {'text': 'who is pat gelsinger'} ``` ## 🚀2. Start Microservice with Docker (Option 2) Alternatively, you can also start the ASR microservice with Docker. ### 2.1 Build Images #### 2.1.1 Whisper Server Image - Xeon CPU ```bash cd ../../.. docker build -t opea/whisper:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/src/integrations/dependency/whisper/Dockerfile . ``` - Gaudi2 HPU ```bash cd ../../.. docker build -t opea/whisper-gaudi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/src/integrations/dependency/whisper/Dockerfile.intel_hpu . ``` #### 2.1.2 ASR Service Image ```bash docker build -t opea/asr:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/asr/src/Dockerfile . ``` ### 2.2 Start Whisper and ASR Service #### 2.2.1 Start Whisper Server - Xeon ```bash docker run -p 7066:7066 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy opea/whisper:latest ``` - Gaudi2 HPU ```bash docker run -p 7066:7066 --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy opea/whisper-gaudi:latest ``` #### 2.2.2 Start ASR service ```bash ip_address=$(hostname -I | awk '{print $1}') docker run -d -p 9099:9099 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e no_proxy=$no_proxy -e ASR_ENDPOINT=http://$ip_address:7066 opea/asr:latest ``` #### 2.2.3 Test ```bash # Use curl or python # curl wget https://github.com/intel/intel-extension-for-transformers/raw/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav curl http://localhost:9099/v1/audio/transcriptions \ -H "Content-Type: multipart/form-data" \ -F file="@./sample.wav" \ -F model="openai/whisper-small" # python python check_asr_server.py ``` ## 🚀3. Start Microservice with Docker Compose (Option 3) Alternatively, you can also start the ASR microservice with Docker Compose. ```bash export ip_address=$(hostname -I | awk '{print $1}') export ASR_ENDPOINT=http://$ip_address:7066 export no_proxy=localhost,$no_proxy # cpu docker compose -f ../deployment/docker_compose/compose.yaml up whisper-service asr-whisper -d # hpu docker compose -f ../deployment/docker_compose/compose.yaml up whisper-gaudi-service asr-whisper-gaudi -d # Test wget https://github.com/intel/intel-extension-for-transformers/raw/main/intel_extension_for_transformers/neural_chat/assets/audio/sample.wav curl http://localhost:9099/v1/audio/transcriptions \ -H "Content-Type: multipart/form-data" \ -F file="@./sample.wav" \ -F model="openai/whisper-small" ```