AgentQnA

Helm chart for deploying the AgentQnA example. It demonstrates how the agent works, using prepared data and questions. See the AgentQnA overview for details.

Using different datasets, models, and questions may produce different results.

Agents usually require larger models to perform well. We used Llama-3.3-70B-Instruct for testing, which requires 4x Gaudi devices for local deployment.

The Helm chart also provides an option to run a smaller model (Meta-Llama-3-8B-Instruct) in a Xeon CPU-only environment, with reduced performance, for you to try (see the CPU-only sketch under Deploy with Helm chart below).

Deploy

The deployment includes preparing the tools and the SQL data.

Prerequisites

A volume is required to hold the tools configuration used by the agents and the database file used by the SQL agent.

We use hostPath in this README, which is convenient for a single-worker-node deployment. A PVC is recommended for a larger cluster. If you want to use a PVC, comment out toolHostPath and replace it with toolPVC in values.yaml, as sketched below.
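If you go the PVC route, the change can be expressed as a small override file. This is only a sketch: the key names toolHostPath and toolPVC come from this README, but their exact nesting and the PVC name are assumptions, so verify them against the chart's values.yaml.

# Sketch only: verify key names and nesting against agentqna/values.yaml before using
cat > pvc-override.yaml <<'EOF'
# toolHostPath: /mnt/tools      # hostPath setting commented out
toolPVC: agent-tools-pvc        # a pre-created PersistentVolumeClaim holding the tools and data
EOF
# Pass it at install time with an extra "-f pvc-override.yaml" in the Deploy step below.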

Create the directory /mnt/tools on the worker node; this is the default path in values.yaml. We use the same directory for all three agents for easy configuration.

sudo mkdir /mnt/tools
sudo chmod 777 /mnt/tools

Download the tools and their configuration to /mnt/tools:

# tools used by supervisor
wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/tools/supervisor_agent_tools.yaml -O /mnt/tools/supervisor_agent_tools.yaml
wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/tools/tools.py -O /mnt/tools/tools.py
wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/tools/pycragapi.py -O /mnt/tools/pycragapi.py

# tools used by rag agent
wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/tools/worker_agent_tools.yaml -O /mnt/tools/worker_agent_tools.yaml
wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/tools/worker_agent_tools.py -O /mnt/tools/worker_agent_tools.py

Download the SQLite database file:

wget https://raw.githubusercontent.com/lerocha/chinook-database/refs/heads/master/ChinookDatabase/DataSources/Chinook_Sqlite.sqlite -O /mnt/tools/Chinook_Sqlite.sqlite
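Optionally, sanity-check the downloaded database with the sqlite3 CLI (assuming it is installed on the node); the Album table is part of the standard Chinook schema:

# Optional check: count the rows in the Album table of the Chinook database
sqlite3 /mnt/tools/Chinook_Sqlite.sqlite "SELECT COUNT(*) FROM Album;"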

Deploy with Helm chart

Deploy everything on a Gaudi-enabled Kubernetes cluster:

If you want to try the latest version, use helm pull oci://ghcr.io/opea-project/charts/agentqna --version 0-latest --untar instead of the plain helm pull below.

export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
helm pull oci://ghcr.io/opea-project/charts/agentqna --untar
helm install agentqna agentqna -f agentqna/gaudi-values.yaml --set global.HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
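To try the smaller Meta-Llama-3-8B-Instruct model on a Xeon CPU-only cluster instead, the install might look like the following sketch; the cpu-values.yaml file name is an assumption, so check the values files shipped in the pulled chart:

# CPU-only sketch: cpu-values.yaml is assumed to be the Xeon values file in the chart
helm install agentqna agentqna -f agentqna/cpu-values.yaml --set global.HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}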

Verify

To verify the installation, run the command kubectl get pod to make sure all pods are running.
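A minimal sketch (pod names depend on your release, and the model-serving pod can take several minutes to become ready while it downloads the model on first start):

kubectl get pod
# Optionally block until all pods in the namespace report Ready; adjust the timeout as needed
kubectl wait --for=condition=Ready pod --all --timeout=600s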

Ingest data for RAG

Ingest data used by RAG.

# Download the indexing script and the sample documents
wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/retrieval_tool/index_data.py -O /mnt/tools/index_data.py
wget https://raw.githubusercontent.com/opea-project/GenAIExamples/refs/heads/main/AgentQnA/example_data/test_docs_music.jsonl -O /mnt/tools/test_docs_music.jsonl
# Look up the ClusterIP of the data-prep service and ingest the documents
host_ip=$(kubectl get svc -o jsonpath="{.items[].spec.clusterIP}" --selector app.kubernetes.io/name=data-prep)
cd /mnt/tools
python3 index_data.py --filedir /mnt/tools --filename test_docs_music.jsonl --host_ip $host_ip

Verify the workload with a curl command

Run the command kubectl port-forward svc/agentqna-supervisor 9090:9090 to expose the service for access.

Open another terminal and run the following command to verify that the service is working:

curl http://localhost:9090/v1/chat/completions \
    -X POST \
    -H "Content-Type: application/json" \
    -d '{"messages": "How many albums does Iron Maiden have?"}'