# GraphRAG Application
While naive RAG works well for fetching precise information, it fails on global questions directed at an entire text corpus, such as "What are the main themes in the dataset?".
GraphRAG was introduced in the Microsoft paper "From Local to Global: A Graph RAG Approach to Query-Focused Summarization". The key elements are:
- Uses an LLM to derive an entity knowledge graph from the source documents
- Uses the hierarchical Leiden algorithm to identify communities of closely-related entities, and extracts a summary for each community
- For an input query, the relevant communities are identified and a partial answer is generated from each community summary by a retrieval LLM (query-focused summarization, QFS)
- A final generation stage (last LLM) responds to the query based on the intermediate community answers (QFS). See [GraphRAG Model Notes](GraphRAG_LLM_notes.md)
- In this app three LLMs are used: dataprep (knowledge graph extraction), retriever (query-focused summaries), and final generation. The final generation LLM and the embedding model run on CPU (Xeon), while the dataprep and retriever LLMs are served by OpenAI-like endpoints.
## Deploy GraphRAG Service
Quick Start Deployment Steps:
1. Set up the environment variables.
2. Run Docker Compose.
3. Consume the GraphRAG Service.
Note: If you do not have Docker installed, you can run this script to install it: `bash docker_compose/install_docker.sh`
## Pre-requisites
Build images:
```bash
cd ~/
git clone https://github.com/opea-project/GenAIExamples.git
git clone https://github.com/vllm-project/vllm.git
git clone https://github.com/opea-project/GenAIComps.git
# vllm-service
cd vllm/
VLLM_VER=v0.8.3
git checkout "${VLLM_VER}"
docker build --no-cache -f docker/Dockerfile.cpu -t opea/vllm-cpu:"${TAG:-latest}" --shm-size=128g .
# opea/dataprep
cd ~/GenAIComps
docker build -t opea/dataprep:latest \
--build-arg "no_proxy=${no_proxy}" \
--build-arg "https_proxy=${https_proxy}" \
--build-arg "http_proxy=${http_proxy}" \
-f comps/dataprep/src/Dockerfile .
# opea/retrievers
cd ~/GenAIComps
docker build -t opea/retriever:latest \
--build-arg "no_proxy=${no_proxy}" \
--build-arg "https_proxy=${https_proxy}" \
--build-arg "http_proxy=${http_proxy}" \
-f comps/retrievers/src/Dockerfile .
# opea/graphrag-ui
cd ~/GenAIExamples/GraphRAG/ui
docker build -t opea/graphrag-ui:latest \
--build-arg "no_proxy=${no_proxy}" \
--build-arg "https_proxy=${https_proxy}" \
--build-arg "http_proxy=${http_proxy}" \
-f docker/Dockerfile .
# opea/graphrag
cd ~/GenAIExamples/GraphRAG
docker build -t opea/graphrag:latest .
# Note: it is important to be in the correct path before builds so that docker has the correct context to COPY relevant code to containers.
```
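Once the builds complete, you can verify that the five images exist locally (image names and tags as used in the build commands above):
```bash
# Expect one line per freshly built image
docker images --format "{{.Repository}}:{{.Tag}}" | grep -E "opea/(vllm-cpu|dataprep|retriever|graphrag-ui|graphrag)"
```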
### Quick Start: 1. Set Up Environment Variables
To set up environment variables for deploying GraphRAG services, follow these steps:
1. Set the required private environment variables:
```bash
# For simplicity Openrouter.ai is used as an endpoint for both dataprep and retriever components.
# These endpoints could be configured to any openAI-like endpoint.
export OPENROUTER_KEY="mykey"
export HUGGINGFACEHUB_API_TOKEN="mytoken"
source set_env.sh
# Below will override some of these defaults in set_env.sh
export host_ip=$(hostname -I | awk '{print $1}')
export NEO4J_PORT1=11631
export NEO4J_PORT2=11632
export NEO4J_URI="bolt://${host_ip}:${NEO4J_PORT2}"
export NEO4J_URL="bolt://${host_ip}:${NEO4J_PORT2}"
export NEO4J_USERNAME="neo4j"
export NEO4J_PASSWORD="neo4jtest"
export DATAPREP_SERVICE_ENDPOINT="http://${host_ip}:5000/v1/dataprep/ingest"
# Must explicitly override default to not use OpenAI.
export OPENAI_LLM_MODEL=""
export OPENAI_EMBEDDING_MODEL=""
# Embedder endpoint
export TEI_EMBEDDER_PORT=6006
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:${TEI_EMBEDDER_PORT}"
# LLM for dataprep is used to extract knowledge graph
export DATAPREP_LLM_ENDPOINT="https://openrouter.ai/api"
export DATAPREP_LLM_MODEL_ID="anthropic/claude-3-haiku"
export DATAPREP_LLM_ENDPOINT_KEY=${OPENROUTER_KEY}
# LLM for retriever performs community summaries at retrieval time
export RETRIEVER_LLM_ENDPOINT="https://openrouter.ai/api"
export RETRIEVER_LLM_MODEL_ID="anthropic/claude-3-haiku"
export RETRIEVER_LLM_ENDPOINT_KEY=${OPENROUTER_KEY}
# Final LLM formulates the response based on the relevant community summaries.
export FINAL_LLM_MODEL_ID="Qwen/Qwen2.5-0.5B-Instruct"
export LOGFLAG=True
export MAX_INPUT_TOKENS=4096
export MAX_TOTAL_TOKENS=8192
export DATAPREP_PORT=11103
export RETRIEVER_PORT=11635
export MEGA_SERVICE_PORT=8888
```
2. If you are in a proxy environment, also set the proxy-related environment variables:
```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPs_Proxy"
export no_proxy=$no_proxy,${host_ip} # important: include ${host_ip} so containers can communicate with each other
```
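Once the variables above are exported (including any proxy settings), a quick sanity check helps catch typos before starting the services. This is just a convenience sketch using the variable names defined above:
```bash
# Print the key variables the compose file and services depend on; an empty value means a missing export
for v in host_ip NEO4J_URI NEO4J_USERNAME TEI_EMBEDDING_ENDPOINT \
         DATAPREP_LLM_ENDPOINT RETRIEVER_LLM_ENDPOINT FINAL_LLM_MODEL_ID; do
  echo "$v=${!v}"
done
```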
### Quick Start: 2. Run Docker Compose
If the microservice images are available in Docker Hub they will be pulled; otherwise you will need to build the container images manually. Please refer to the 'Build Docker Images' section in the [Guide](../../../../../ChatQnA/docker_compose/intel/cpu/xeon/README.md). [test_compose_on_xeon.sh](../../../../../ChatQnA/tests/test_compose_on_xeon.sh) is also a good resource, as it shows how to build the images, start the services, and validate each microservice and the megaservice. This is the script used in CI/CD.
```bash
cd GraphRAG/docker_compose/intel/cpu/xeon
NGINX_PORT=8080 docker compose -f compose.yaml up -d
```
Here `NGINX_PORT=8080` is used because port 80 is typically already taken for regular web browsing on the host.
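Startup can take several minutes while the vLLM service downloads and loads the final LLM. A simple way to wait for it is to poll its health endpoint (assuming the default 9009 host port mapping and vLLM's standard `/health` route):
```bash
# Poll vLLM until it reports healthy
until curl -sf http://localhost:9009/health > /dev/null; do
  echo "waiting for vllm-service..."
  sleep 10
done
echo "vllm-service is ready"
```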
#### Check the Deployment Status
After running docker compose, check if all the containers launched via docker compose have started:
```bash
docker ps -a
```
The following containers should have started:
```bash
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
740d0061fce2 opea/nginx:latest "/docker-entrypoint.…" 3 hours ago Up 3 hours 0.0.0.0:8080->80/tcp, [::]:8080->80/tcp graphrag-xeon-nginx-server
3010243786cd opea/graphrag-ui:latest "docker-entrypoint.s…" 3 hours ago Up 3 hours 0.0.0.0:5173->5173/tcp, :::5173->5173/tcp graphrag-ui-server
f63d10453e22 opea/graphrag:latest "python graphrag.py" 3 hours ago Up 3 hours 0.0.0.0:8888->8888/tcp, :::8888->8888/tcp graphrag-xeon-backend-server
a48d0fba13e6 opea/dataprep:latest "sh -c 'python $( [ …" 3 hours ago Up 3 hours 0.0.0.0:6004->5000/tcp, [::]:6004->5000/tcp dataprep-neo4j-server
9301a833f220 opea/retriever:latest "python opea_retriev…" 3 hours ago Up 3 hours 0.0.0.0:7000->7000/tcp, :::7000->7000/tcp retriever-neo4j-server
eda369268406 ghcr.io/huggingface/text-embeddings-inference:cpu-1.7 "text-embeddings-rou…" 3 hours ago Up 3 hours 0.0.0.0:6006->80/tcp, [::]:6006->80/tcp tei-embedding-server
f21e82efa1fa opea/vllm-cpu:latest "python3 -m vllm.ent…" 3 hours ago Up 3 hours (healthy) 0.0.0.0:9009->80/tcp, [::]:9009->80/tcp vllm-service
3b541ceeaf9f neo4j:latest "tini -g -- /startup…" 3 hours ago Up 3 hours 7473/tcp, 0.0.0.0:11631->7474/tcp, [::]:11631->7474/tcp, 0.0.0.0:11632->7687/tcp, [::]:11632->7687/tcp neo4j-apoc
```
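A quick way to confirm everything is up without scanning the full listing (container names taken from the output above):
```bash
# Count running containers from this deployment; expect 8
docker ps --filter "status=running" --format "{{.Names}}" \
  | grep -cE "graphrag|dataprep-neo4j|retriever-neo4j|tei-embedding|vllm-service|neo4j-apoc"
```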
##### Test Final vLLM
```bash
curl http://localhost:9009/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"'${FINAL_LLM_MODEL_ID}'","messages":[{"role":"user","content":"Tell me a joke?"}]}'
```
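##### Test Embedding Service
The TEI embedding service can be tested the same way (host port `${TEI_EMBEDDER_PORT}`, 6006 by default, maps to the server's internal port 80; `/embed` is the standard TEI route):
```bash
curl http://localhost:6006/embed \
  -H "Content-Type: application/json" \
  -d '{"inputs":"What is GraphRAG?"}'
```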
### Quick Start: 3. Upload RAG Files and Consume the GraphRAG Service
To chat with retrieved information, you need to upload a file using the `Dataprep` service.
Here is an example of uploading sample graph data (which can also be uploaded via the UI):
```bash
cd ~/GenAIExamples/GraphRAG/example_data
# First file
curl -X POST "http://${host_ip}:6004/v1/dataprep/ingest" \
-H "Content-Type: multipart/form-data" \
-F "files=@./programming_languages.txt"
# Second file
curl -X POST "http://${host_ip}:6004/v1/dataprep/ingest" \
-H "Content-Type: multipart/form-data" \
-F "files=@./programming_languages2.txt"
```
To log in to the Neo4j browser UI, navigate to `http://localhost:{NEO4J_PORT1}/browser` and sign in with the `NEO4J_USERNAME` and `NEO4J_PASSWORD` defined in the environment variables section.
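Alternatively, you can verify from the command line that ingestion populated the graph. This is a minimal check assuming the `neo4j-apoc` container name and the credentials set earlier:
```bash
# A non-zero node count indicates the knowledge graph was built
docker exec neo4j-apoc cypher-shell -u "${NEO4J_USERNAME}" -p "${NEO4J_PASSWORD}" \
  "MATCH (n) RETURN count(n) AS nodes;"
```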
The backend graphrag service can be queried via curl:
```bash
curl http://${host_ip}:8888/v1/graphrag \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user","content": "what are the main themes of the programming dataset?"}]}'
```
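The exact response format depends on how the megaservice is configured. If it returns a non-streaming, OpenAI-style chat completion, the answer text can be extracted with `jq`; the field path below is an assumption to adjust against your actual payload:
```bash
# Assumes a non-streaming OpenAI-style JSON response; adjust the jq path if the payload differs
curl -s http://${host_ip}:8888/v1/graphrag \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user","content": "what are the main themes of the programming dataset?"}]}' \
  | jq -r '.choices[0].message.content'
```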
## Architecture and Deployment Details
The GraphRAG example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different microservices for this example.
```mermaid
---
config:
flowchart:
nodeSpacing: 400
rankSpacing: 100
curve: linear
themeVariables:
fontSize: 50px
---
flowchart LR
%% Colors %%
classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
classDef invisible fill:transparent,stroke:transparent;
style GraphRAG-MegaService stroke:#000000
%% Subgraphs %%
subgraph GraphRAG-MegaService["GraphRAG MegaService "]
direction LR
RET([Retrieval MicroService]):::blue
LLM([LLM MicroService]):::blue
EM([Embedding MicroService]):::blue
end
subgraph UserInterface[" User Interface "]
direction LR
a([User Input Query]):::orchid
Ingest([Ingest data]):::orchid
UI([UI server
]):::orchid
end
GDB{{Graph DB
}}
DP([Data Preparation MicroService]):::blue
GW([GraphRAG GateWay
]):::orange
%% Data Preparation flow
%% Ingest data flow
direction LR
Ingest[Ingest data] --> UI
UI --> DP
%% interactions buried inside the DP and RET microservice implementations
DP <-.-> EM
DP <-.-> LLM
RET <-.-> EM
RET <-.-> LLM
%% Questions interaction
direction LR
a[User Input Query] --> UI
UI --> GW
GW <==> GraphRAG-MegaService
RET ==> LLM
direction TB
%% Graph DB interaction
RET <-.-> |d|GDB
DP <-.-> |d|GDB
linkStyle 2 stroke:#000000,stroke-width:2px;
linkStyle 3 stroke:#000000,stroke-width:2px;
linkStyle 4 stroke:#000000,stroke-width:2px;
linkStyle 5 stroke:#000000,stroke-width:2px;
```
Xeon default configuration:
| MicroService | Open Source Project | Serving HW / Endpoint | Default Port | Endpoint             |
| ------------ | ------------------- | --------------------- | ------------ | -------------------- |
| Dataprep     | Neo4j, LlamaIndex   | OpenAI-like endpoint  | 6004         | /v1/dataprep/ingest  |
| Embedding    | LlamaIndex, TEI     | Xeon (CPU)            | 6006         | /v1/embeddings       |
| Retriever    | LlamaIndex, Neo4j   | OpenAI-like endpoint  | 7000         | /v1/retrieval        |
| Final LLM    | vLLM                | Xeon (CPU)            | 9009         | /v1/chat/completions |
### Models Selection
[GraphRAG Model Notes](GraphRAG_LLM_notes.md)
## Consume GraphRAG Service with RAG
### 1. Check Service Status
Before consuming GraphRAG Service, make sure each microservice is ready by checking the docker logs of each microservice.
```bash
docker logs container_name
```
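For example, to scan a few key services for obvious startup failures (container names as listed in the deployment status section; the error pattern is only a starting point):
```bash
# Tail the most relevant logs and highlight common failure keywords
for c in graphrag-xeon-backend-server dataprep-neo4j-server retriever-neo4j-server; do
  echo "=== $c ==="
  docker logs "$c" 2>&1 | tail -n 20 | grep -iE "error|exception|traceback" || echo "no recent errors"
done
```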
### 2. Access via frontend
To access the frontend, open the following URL in your browser: `http://{host_ip}:{NGINX_PORT}`
In the above example, `NGINX_PORT` was set to 8080, so the frontend is available at `http://{host_ip}:8080`.
## Monitoring OPEA Service with Prometheus and Grafana dashboard
OPEA microservice deployments can easily be monitored through Grafana dashboards in conjunction with Prometheus data collection. Follow the [README](/GenAIEval/evals/benchmark/grafana/README.md) to set up the Prometheus and Grafana servers and import dashboards to monitor the OPEA service.