Deploy ChatQnA on a Kubernetes cluster

Deploy on Xeon

export HFTOKEN="insert-your-huggingface-token-here"
helm install chatqna oci://ghcr.io/opea-project/charts/chatqna --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" -f cpu-values.yaml
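
The export line uses a placeholder string, which is easy to pass to the chart by accident. As a minimal sketch, a guard like the following could be run before the install. The function name `check_hf_token` is hypothetical and not part of the chart.

```shell
# Hypothetical helper (not part of the chart): fail early if HFTOKEN is
# unset, empty, or still the placeholder string from the example above.
check_hf_token() {
  if [ -z "${HFTOKEN:-}" ] || [ "${HFTOKEN:-}" = "insert-your-huggingface-token-here" ]; then
    echo "HFTOKEN is not set to a real Hugging Face token" >&2
    return 1
  fi
}
```

It can be chained in front of the install, for example `check_hf_token && helm install ...`, so that the release is only created when a token has been supplied.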

Deploy on Gaudi

export HFTOKEN="insert-your-huggingface-token-here"
helm install chatqna oci://ghcr.io/opea-project/charts/chatqna --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" -f gaudi-values.yaml
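
After either install it can be useful to confirm that the release exists and to look at its pods. The sketch below assumes that `helm` and `kubectl` are configured for the target cluster and that the release was installed into the current namespace. The function name `verify_release` is hypothetical.

```shell
# Hypothetical helper: print the Helm release status, then list the pods
# in the current namespace. The release name defaults to "chatqna", the
# name used in the install commands above.
verify_release() {
  release="${1:-chatqna}"
  helm status "$release" && kubectl get pods
}
```

Calling `verify_release` with no argument checks the `chatqna` release. Pods may take some time to become ready, so the listing may need to be repeated.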

Deploy variants of ChatQnA

ChatQnA is configurable: you can enable or disable features by providing a values.yaml file. For example, to run with vLLM instead of TGI on Gaudi hardware, use the gaudi-vllm-values.yaml file:

export HFTOKEN="insert-your-huggingface-token-here"
helm install chatqna oci://ghcr.io/opea-project/charts/chatqna --set global.HUGGINGFACEHUB_API_TOKEN="${HFTOKEN}" -f gaudi-vllm-values.yaml

See the other *-values.yaml files in this directory for more examples.
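
The three install commands in this section differ only in the values file passed with `-f`. As a rough sketch, a small wrapper could make that choice explicit. The function name `values_file_for` and the target names `xeon`, `gaudi`, and `gaudi-vllm` are illustrative assumptions. Only the file names come from this document.

```shell
# Hypothetical helper: map a target name to one of the values files named
# in this document. Unknown targets print an error and return non-zero.
values_file_for() {
  case "$1" in
    xeon)       echo "cpu-values.yaml" ;;
    gaudi)      echo "gaudi-values.yaml" ;;
    gaudi-vllm) echo "gaudi-vllm-values.yaml" ;;
    *)          echo "unknown target: $1" >&2; return 1 ;;
  esac
}
```

The result can be substituted into the install command, for example `-f "$(values_file_for gaudi-vllm)"`. Additional `*-values.yaml` files from this directory would need their own `case` entries.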