# Usage guide for genai-microservices-connector (GMC)

genai-microservices-connector (GMC) can be used to compose and adjust GenAI pipelines dynamically. It can leverage the microservices provided by [GenAIComps](https://github.com/opea-project/GenAIComps) and external services to compose GenAI pipelines.

Below are sample use cases:

## Use GMC to compose a chatQnA Pipeline

A sample for chatQnA can be found at config/samples/ChatQnA/chatQnA_dataprep_xeon.yaml

**Deploy chatQnA GMC custom resource**

```sh
kubectl create ns chatqa
kubectl apply -f $(pwd)/config/samples/ChatQnA/chatQnA_dataprep_xeon.yaml
# To use Gaudi device
#kubectl apply -f $(pwd)/config/samples/ChatQnA/chatQnA_dataprep_gaudi.yaml
# To use Nvidia GPU
#kubectl apply -f $(pwd)/config/samples/ChatQnA/chatQnA_nv.yaml
```

**GMC will reconcile chatQnA custom resource and get all related components/services ready**

```sh
kubectl get service -n chatqa
```

**Check GMC chatQnA custom resource to get access URL for the pipeline**

```bash
$ kubectl get gmconnectors.gmc.opea.io -n chatqa
NAME     URL                                                   READY     AGE
chatqa   http://router-service.chatqa.svc.cluster.local:8080   10/0/10   3m
```

`READY 10/0/10` means that GMC deployed 10 services (the last number) and that 10 of them are ready (the first number), so 10 of 10 means the pipeline is all set.
The 0 in the middle means that no external services are used; all the resources are managed by GMC inside the cluster.

You can list the resources with `kubectl` commands:

```
$ kubectl get pods -n chatqa
NAME                                            READY   STATUS    RESTARTS   AGE
data-prep-svc-deployment-68f7c5dcb9-8fbh8       1/1     Running   0          2m41s
embedding-svc-deployment-775bd5dc49-j4ltr       1/1     Running   0          2m43s
llm-svc-deployment-59f756fb56-4xckz             1/1     Running   0          2m41s
redis-vector-db-deployment-587844d666-hbchr     1/1     Running   0          2m42s
reranking-svc-deployment-846c89f79f-gv7b9       1/1     Running   0          2m42s
retriever-svc-deployment-5c44f7d46-m4qgq        1/1     Running   0          2m43s
router-service-deployment-7f6c5f4796-tzchw      1/1     Running   0          2m41s
tei-embedding-svc-deployment-54b58d57cb-9mwvk   1/1     Running   0          2m43s
tei-reranking-svc-deployment-54c5dd5795-b6wcb   1/1     Running   0          2m42s
tgi-service-m-deployment-5ff67f4db7-b7ztj       1/1     Running   0          2m41s
```

You can also get detailed information about these resources by checking the pipeline's status, which lists every ConfigMap, Deployment, and Service along with its status:

```
$ kubectl get gmc -n chatqa chatqa -o json | jq '.status.annotations' | yq -P
ConfigMap:v1:data-prep-config:chatqa: provisioned
ConfigMap:v1:embedding-usvc-config:chatqa: provisioned
ConfigMap:v1:llm-uservice-config:chatqa: provisioned
ConfigMap:v1:reranking-usvc-config:chatqa: provisioned
ConfigMap:v1:retriever-usvc-config:chatqa: provisioned
ConfigMap:v1:tei-config:chatqa: provisioned
ConfigMap:v1:teirerank-config:chatqa: provisioned
ConfigMap:v1:tgi-config:chatqa: provisioned
Deployment:apps/v1:data-prep-svc-deployment:chatqa: |
  Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
  Conditions:
    Type: Available
    Status: True
    Reason: MinimumReplicasAvailable
    Message: Deployment has minimum availability.
    Type: Progressing
    Status: True
    Reason: NewReplicaSetAvailable
    Message: ReplicaSet "data-prep-svc-deployment-7c7c648846" has successfully progressed.
Deployment:apps/v1:embedding-svc-deployment:chatqa: |
  Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
  Conditions:
    Type: Available
    Status: True
    Reason: MinimumReplicasAvailable
    Message: Deployment has minimum availability.
    Type: Progressing
    Status: True
    Reason: NewReplicaSetAvailable
    Message: ReplicaSet "embedding-svc-deployment-775bd5dc49" has successfully progressed.
Deployment:apps/v1:llm-svc-deployment:chatqa: |
  Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
  Conditions:
    Type: Available
    Status: True
    Reason: MinimumReplicasAvailable
    Message: Deployment has minimum availability.
    Type: Progressing
    Status: True
    Reason: NewReplicaSetAvailable
    Message: ReplicaSet "llm-svc-deployment-59f756fb56" has successfully progressed.
Deployment:apps/v1:redis-vector-db-deployment:chatqa: |
  Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
  Conditions:
    Type: Available
    Status: True
    Reason: MinimumReplicasAvailable
    Message: Deployment has minimum availability.
    Type: Progressing
    Status: True
    Reason: NewReplicaSetAvailable
    Message: ReplicaSet "redis-vector-db-deployment-587844d666" has successfully progressed.
Deployment:apps/v1:reranking-svc-deployment:chatqa: |
  Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
  Conditions:
    Type: Available
    Status: True
    Reason: MinimumReplicasAvailable
    Message: Deployment has minimum availability.
    Type: Progressing
    Status: True
    Reason: NewReplicaSetAvailable
    Message: ReplicaSet "reranking-svc-deployment-846c89f79f" has successfully progressed.
Deployment:apps/v1:retriever-svc-deployment:chatqa: |
  Replicas: 1 desired | 1 updated | 2 total | 1 available | 1 unavailable
  Conditions:
    Type: Available
    Status: True
    Reason: MinimumReplicasAvailable
    Message: Deployment has minimum availability.
    Type: Progressing
    Status: True
    Reason: ReplicaSetUpdated
    Message: ReplicaSet "retriever-svc-deployment-95b967c9d" is progressing.
Deployment:apps/v1:router-service-deployment:chatqa: |
  Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
  Conditions:
    Type: Available
    Status: True
    Reason: MinimumReplicasAvailable
    Message: Deployment has minimum availability.
    Type: Progressing
    Status: True
    Reason: NewReplicaSetAvailable
    Message: ReplicaSet "router-service-deployment-79f54548f4" has successfully progressed.
Deployment:apps/v1:tei-embedding-svc-deployment:chatqa: |
  Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
  Conditions:
    Type: Available
    Status: True
    Reason: MinimumReplicasAvailable
    Message: Deployment has minimum availability.
    Type: Progressing
    Status: True
    Reason: NewReplicaSetAvailable
    Message: ReplicaSet "tei-embedding-svc-deployment-54b58d57cb" has successfully progressed.
Deployment:apps/v1:tei-reranking-svc-deployment:chatqa: |
  Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
  Conditions:
    Type: Available
    Status: True
    Reason: MinimumReplicasAvailable
    Message: Deployment has minimum availability.
    Type: Progressing
    Status: True
    Reason: NewReplicaSetAvailable
    Message: ReplicaSet "tei-reranking-svc-deployment-54c5dd5795" has successfully progressed.
Deployment:apps/v1:tgi-service-m-deployment:chatqa: |
  Replicas: 1 desired | 1 updated | 1 total | 1 available | 0 unavailable
  Conditions:
    Type: Available
    Status: True
    Reason: MinimumReplicasAvailable
    Message: Deployment has minimum availability.
    Type: Progressing
    Status: True
    Reason: NewReplicaSetAvailable
    Message: ReplicaSet "tgi-service-m-deployment-5fcff459f5" has successfully progressed.
Service:v1:data-prep-svc:chatqa: http://data-prep-svc.chatqa.svc.cluster.local:6007/v1/dataprep
Service:v1:embedding-svc:chatqa: http://embedding-svc.chatqa.svc.cluster.local:6000/v1/embeddings
Service:v1:llm-svc:chatqa: http://llm-svc.chatqa.svc.cluster.local:9000/v1/chat/completions
Service:v1:redis-vector-db:chatqa: http://redis-vector-db.chatqa.svc.cluster.local:6379
Service:v1:reranking-svc:chatqa: http://reranking-svc.chatqa.svc.cluster.local:8000/v1/reranking
Service:v1:retriever-svc:chatqa: http://retriever-svc.chatqa.svc.cluster.local:7000/v1/retrieval
Service:v1:router-service:chatqa: http://router-service.chatqa.svc.cluster.local:8080
Service:v1:tei-embedding-svc:chatqa: http://tei-embedding-svc.chatqa.svc.cluster.local:80
Service:v1:tei-reranking-svc:chatqa: http://tei-reranking-svc.chatqa.svc.cluster.local:80/rerank
Service:v1:tgi-service-m:chatqa: http://tgi-service-m.chatqa.svc.cluster.local:80/generate
```

**NOTE: if you upgrade from a pre-0.9 version to 0.9 or later, you might encounter the issue below**

If the router-service and its deployment, which are mandatory for every pipeline, are not initialized, you also need to upgrade gmc-router.yaml to the latest version, as described in the [GMC README](/GenAIInfra/microservices-connector/README.md).

**Deploy one client pod for testing the chatQnA application**

```bash
kubectl create deployment client-test -n chatqa --image=python:3.8.13 -- sleep infinity
```

**Access the pipeline using the above URL from the client pod**

```bash
# Name of the client pod created above
export CLIENT_POD=$(kubectl get pod -n chatqa -l app=client-test -o jsonpath='{.items..metadata.name}')
# Access URL reported in the status of the chatqa custom resource
export accessUrl=$(kubectl get gmc -n chatqa -o jsonpath="{.items[?(@.metadata.name=='chatqa')].status.accessUrl}")
# Send a test question to the pipeline from inside the cluster
kubectl exec "$CLIENT_POD" -n chatqa -- curl "$accessUrl" -X POST -d '{"text":"What is the revenue of Nike in 2023?","parameters":{"max_new_tokens":17, "do_sample": true}}' -H 'Content-Type: application/json'
```

## Use GMC to adjust the chatQnA Pipeline

**Modify chatQnA
custom resource to change to another LLM model**

```yaml
- name: Tgi
  internalService:
    serviceName: tgi-svc
    config:
      LLM_MODEL_ID: Llama-2-7b-chat-hf
```

**Check the tgi-svc-deployment has been changed to use the new LLM Model**

```sh
kubectl get deployment tgi-svc-deployment -n chatqa -o jsonpath="{.spec.template.spec.containers[*].env[?(@.name=='LLM_MODEL_ID')].value}"
```

**Access the updated pipeline using the above URL from the client pod**

```bash
kubectl exec "$CLIENT_POD" -n chatqa -- curl "$accessUrl" -X POST -d '{"text":"What is the revenue of Nike in 2023?","parameters":{"max_new_tokens":17, "do_sample": true}}' -H 'Content-Type: application/json'
```

**Remove one step of the pipeline**

To adjust the steps of the pipeline, for example to delete the data preparation step from chatQnA, delete the following part from the yaml file config/samples/ChatQnA/chatQnA_dataprep_xeon.yaml

```
- name: DataPrep
  internalService:
    serviceName: data-prep-svc
    config:
      endpoint: /v1/dataprep
      REDIS_URL: redis-vector-db
      TEI_ENDPOINT: tei-embedding-svc
    isDownstreamService: true
```

and re-apply the yaml file

```
kubectl apply -f $(pwd)/config/samples/ChatQnA/chatQnA_dataprep_xeon.yaml
```

You should see that the `dataprep` step has been removed:

```
$ kubectl get gmc -n chatqa chatqa
NAME     URL                                                   READY   AGE
chatqa   http://router-service.chatqa.svc.cluster.local:8080   9/0/9   3m37s
```

Note that **you have to make sure** the step can be deleted without affecting the pipeline's function.

## Use GMC to delete the chatQnA Pipeline

You can delete all the resources by deleting the GMC custom resource:

```
$ kubectl delete gmc -n chatqa chatqa
gmconnector.gmc.opea.io "chatqa" deleted
$ kubectl get gmc -n chatqa
No resources found in chatqa namespace.
$ kubectl get all -n chatqa
No resources found in chatqa namespace.
```

## Use GMC and Istio to compose an OPEA Pipeline with authentication and authorization enabled

Authentication and authorization are vital to maintaining the integrity and safety of GenAI workloads. Please check the [readme](../authN-authZ/README.md) file for more details.
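
Returning to the `READY` column from the chatQnA walkthrough: a script that waits for a pipeline can compare the first and last numbers of that value. The sketch below is not part of GMC. `gmc_all_ready` is a hypothetical helper name, and the code assumes the three-number layout described earlier in this guide (ready services, external services, services deployed by GMC).

```sh
# Hypothetical helper (not provided by GMC): succeeds when a READY value
# such as "10/0/10" shows that every GMC-deployed service is ready.
gmc_all_ready() {
  _ready=${1%%/*}   # text before the first "/": services that are ready
  _total=${1##*/}   # text after the last "/": services deployed by GMC
  [ -n "$_ready" ] && [ "$_ready" = "$_total" ]
}

# READY value copied from the sample output in this guide
if gmc_all_ready "10/0/10"; then
  echo "pipeline ready"
fi
```

In a real cluster the argument would come from the `READY` column of `kubectl get gmc`, as in the sample output shown earlier.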