Retriever Microservice with ArangoDB¶
🚀Start Microservice with Docker¶
Start ArangoDB Server¶
To launch ArangoDB locally, first make sure Docker is installed. Then start the database with the following command.
docker run -d -p 8529:8529 -e ARANGO_ROOT_PASSWORD=test arangodb/arangodb:latest
Setup Environment Variables¶
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_https_proxy}
export ARANGO_URL=${your_arango_url} # e.g. http://localhost:8529
export ARANGO_USERNAME=${your_arango_username} # e.g. root
export ARANGO_PASSWORD=${your_arango_password} # e.g. test
export ARANGO_DB_NAME=${your_db_name} # e.g. _system
export TEI_EMBEDDING_ENDPOINT=${your_tei_embedding_endpoint}
export HF_TOKEN=${your_huggingface_api_token}
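Before building the image, it can help to confirm that the variables exported above are actually set in the current shell. The following is a minimal sketch in Python, assuming the variable names listed above; the `missing_vars` helper is hypothetical and not part of GenAIComps, and the proxy variables are left out because they only matter behind a proxy.

```python
import os

# Variables exported in the step above (proxy settings intentionally omitted).
EXPECTED_VARS = [
    "ARANGO_URL",
    "ARANGO_USERNAME",
    "ARANGO_PASSWORD",
    "ARANGO_DB_NAME",
    "TEI_EMBEDDING_ENDPOINT",
    "HF_TOKEN",
]


def missing_vars(env=None, expected=EXPECTED_VARS):
    """Return the names in `expected` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in expected if not env.get(name)]


if __name__ == "__main__":
    missing = missing_vars()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All expected environment variables are set.")
```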
Build Docker Image¶
cd ~/GenAIComps/
docker build -t opea/retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/src/Dockerfile .
Run via CLI¶
docker run -d --name="retriever-arango-server" -p 7000:7000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e ARANGO_URL="http://localhost:8529" -e RETRIEVER_COMPONENT_NAME="OPEA_RETRIEVER_ARANGODB" opea/retriever:latest
Run Docker with Docker Compose¶
cd ~/GenAIComps/comps/retrievers/deployment/docker_compose/
docker compose up retriever-arangodb -d
See below for additional environment variables that can be set.
🚀3. Consume Retriever Service¶
curl http://${your_ip}:7000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
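The container may need a few seconds before it answers, so a script that depends on the service can poll the health endpoint first. This is a sketch only: it assumes that an HTTP 200 from `/v1/health_check` means the service is ready, and the `is_healthy` and `wait_until_healthy` helpers are hypothetical names, not part of the microservice.

```python
import time
import urllib.error
import urllib.request


def is_healthy(base_url, timeout=5.0):
    """Return True if GET {base_url}/v1/health_check answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/v1/health_check", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(base_url, attempts=10, delay=2.0, probe=is_healthy):
    """Poll the health endpoint until it succeeds or the attempts run out."""
    for attempt in range(attempts):
        if probe(base_url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

For example, `wait_until_healthy("http://localhost:7000")` returns `True` once the service responds, or `False` after ten failed attempts.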
3.2 Consume Embedding Service¶
To consume the Retriever Microservice, you can generate a mock embedding vector of length 768 with Python.
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://${your_ip}:7000/v1/retrieval \
-X POST \
-d "{\"input\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding}}" \
-H 'Content-Type: application/json'
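The same request can be built from Python instead of the shell. The sketch below mirrors the curl call above using only the standard library; the helper names are hypothetical, and it assumes the service replies with a JSON body.

```python
import json
import random
import urllib.request


def mock_embedding(dim=768, seed=None):
    """Random vector with values in [-1, 1], like the shell one-liner above."""
    rng = random.Random(seed)
    return [rng.uniform(-1, 1) for _ in range(dim)]


def build_retrieval_request(base_url, text, embedding):
    """Build (but do not send) the POST request for /v1/retrieval."""
    body = json.dumps({"input": text, "embedding": embedding}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/retrieval",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and parse the reply, assuming the service returns JSON."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())
```

With the service running, `send(build_retrieval_request("http://localhost:7000", "What is the revenue of Nike in 2023?", mock_embedding()))` issues the same call as the curl command.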
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"input\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity\", \"k\":4}" \
-H 'Content-Type: application/json'
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"input\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity_distance_threshold\", \"k\":4, \"distance_threshold\":1.0}" \
-H 'Content-Type: application/json'
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"input\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"similarity_score_threshold\", \"k\":4, \"score_threshold\":0.2}" \
-H 'Content-Type: application/json'
export your_embedding=$(python -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
curl http://localhost:7000/v1/retrieval \
-X POST \
-d "{\"input\":\"What is the revenue of Nike in 2023?\",\"embedding\":${your_embedding},\"search_type\":\"mmr\", \"k\":4, \"fetch_k\":20, \"lambda_mult\":0.5}" \
-H 'Content-Type: application/json'
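The four requests above differ only in `search_type` and the tuning fields that go with it. As a rough summary, the sketch below builds the request body for each variant and rejects fields that the examples do not pair with the chosen search type. That strictness is an assumption of this sketch; the service itself may accept other combinations. `retrieval_payload` is a hypothetical helper.

```python
# Extra fields that accompany each search_type in the curl examples above.
SEARCH_TYPE_FIELDS = {
    "similarity": {"k"},
    "similarity_distance_threshold": {"k", "distance_threshold"},
    "similarity_score_threshold": {"k", "score_threshold"},
    "mmr": {"k", "fetch_k", "lambda_mult"},
}


def retrieval_payload(text, embedding, search_type="similarity", **params):
    """Build a /v1/retrieval request body for the given search type."""
    if search_type not in SEARCH_TYPE_FIELDS:
        raise ValueError(f"unknown search_type: {search_type!r}")
    unexpected = set(params) - SEARCH_TYPE_FIELDS[search_type]
    if unexpected:
        raise ValueError(f"{sorted(unexpected)} not used with {search_type!r}")
    return {"input": text, "embedding": embedding, "search_type": search_type, **params}


# Same fields as the last curl example (a short vector stands in for the embedding).
mmr_body = retrieval_payload(
    "What is the revenue of Nike in 2023?",
    [0.0, 0.1],
    search_type="mmr",
    k=4,
    fetch_k=20,
    lambda_mult=0.5,
)
```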
The following additional options can be set through environment variables (default values are defined in the config.py file):
ArangoDB Connection configuration
ARANGO_URL: The URL for the ArangoDB service.
ARANGO_USERNAME: The username for the ArangoDB service.
ARANGO_PASSWORD: The password for the ArangoDB service.
ARANGO_DB_NAME: The name of the database to use for the ArangoDB service.
ArangoDB Vector configuration
ARANGO_GRAPH_NAME: The name of the graph to use for the ArangoDB service. Defaults to GRAPH.
ARANGO_DISTANCE_STRATEGY: The distance strategy to use for the ArangoDB service. Defaults to COSINE. Another option is "EUCLIDEAN_DISTANCE".
ARANGO_USE_APPROX_SEARCH: If set to True, the microservice uses approximate nearest neighbor search as part of the retrieval step. Defaults to False, which means the microservice uses exact search.
ARANGO_NUM_CENTROIDS: The number of centroids to use for the approximate nearest neighbor search. Defaults to 1.
ARANGO_SEARCH_START: The starting point for the search. Defaults to node. Other options are "edge" and "chunk".
ARANGO_SEARCH_MODE: The search method to use for ArangoDB Vector Search. Defaults to vector. Another option is "hybrid", which combines Vector Search and Full Text Search via Reciprocal Rank Fusion (RRF).
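To give a feel for what the "hybrid" search mode's Reciprocal Rank Fusion does, here is a minimal sketch of the generic RRF formula. It is not taken from the microservice or from ArangoDB, whose implementation may differ; the constant `k=60` is a conventional choice and an assumption here, and the document IDs are made up.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of document IDs; each list is ordered best-first.

    Every document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]  # hypothetical vector search ranking
text_hits = ["b", "d", "a"]    # hypothetical full text search ranking
fused = reciprocal_rank_fusion([vector_hits, text_hits])
print(fused)  # ['b', 'a', 'd', 'c']
```

Documents that rank well in both lists ("b" and "a") rise to the top, while documents found by only one search keep a lower fused score.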
ArangoDB Traversal configuration
ARANGO_TRAVERSAL_ENABLED: If set to True, the microservice traverses the graph starting from the documents matched by similarity and returns additional context (i.e., nodes, edges, or chunks) from the graph. Defaults to False. See the fetch_neighborhoods method in the arangodb.py file for more details.
ARANGO_TRAVERSAL_MAX_DEPTH: The maximum depth for the traversal. Defaults to 1.
ARANGO_TRAVERSAL_MAX_RETURNED: The maximum number of nodes/edges/chunks to return per matched document from the traversal. Defaults to 3.
ARANGO_TRAVERSAL_SCORE_THRESHOLD: The score threshold for the traversal. Defaults to 0.5.
ARANGO_TRAVERSAL_QUERY: An optional query that defines custom traversal logic for the ArangoDB service. If not set, the default traversal logic is used. See the fetch_neighborhoods method in the arangodb.py file for more details.
Embedding configuration
TEI_EMBEDDING_ENDPOINT: The endpoint for the TEI service.
TEI_EMBED_MODEL: The model to use for the TEI service. Defaults to BAAI/bge-base-en-v1.5.
HF_TOKEN: The API token for Hugging Face access.
Summarizer Configuration
SUMMARIZER_ENABLED: If set to True, the microservice applies summarization after retrieval. Defaults to False. Requires the vLLM service to be running or a valid OPENAI_API_KEY to be set. See the vLLM Configuration section or the OpenAI Configuration section below.
vLLM Configuration
VLLM_API_KEY: The API key for the vLLM service. Defaults to "EMPTY".
VLLM_ENDPOINT: The endpoint for the vLLM service. Defaults to http://localhost:80.
VLLM_MODEL_ID: The model ID for the vLLM service. Defaults to Intel/neural-chat-7b-v3-3.
VLLM_MAX_NEW_TOKENS: The maximum number of new tokens to generate. Defaults to 512.
VLLM_TOP_P: If set to a value below 1, only the smallest set of most probable tokens whose probabilities add up to top_p or higher is kept for generation. Defaults to 0.9.
VLLM_TEMPERATURE: The temperature for the sampling. Defaults to 0.8.
VLLM_TIMEOUT: The timeout for the vLLM service. Defaults to 600.
OpenAI Configuration
Note: This configuration can replace the vLLM and TEI services for text generation and embeddings.
OPENAI_API_KEY: The API key for the OpenAI service. If not set, the microservice will not use the OpenAI service.
OPENAI_CHAT_MODEL: The chat model to use for the OpenAI service. Defaults to gpt-4o.
OPENAI_CHAT_TEMPERATURE: The temperature for the OpenAI service. Defaults to 0.
OPENAI_EMBED_MODEL: The embedding model to use for the OpenAI service. Defaults to text-embedding-3-small.
OPENAI_EMBED_DIMENSION: The embedding dimension for the OpenAI service. Defaults to 768.
OPENAI_CHAT_ENABLED: If set to True, the microservice will use the OpenAI service for text generation, as long as OPENAI_API_KEY is also set. Defaults to True.
OPENAI_EMBED_ENABLED: If set to True, the microservice will use the OpenAI service for text embeddings, as long as OPENAI_API_KEY is also set. Defaults to True.
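The interaction between OPENAI_API_KEY and the two *_ENABLED flags can be summarized in a few lines. This is an illustrative sketch, not the microservice's code: the accepted truthy strings and the fallback to vLLM and TEI when OpenAI is not active are assumptions, and `env_flag`, `chat_backend`, and `embed_backend` are hypothetical helpers.

```python
import os

# Strings treated as "true" here; the service's own parsing may differ.
_TRUE_VALUES = {"1", "true", "yes"}


def env_flag(env, name, default):
    """Interpret an environment variable as a boolean, with a default."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUE_VALUES


def chat_backend(env=None):
    """Return "openai" if the key is set and chat is enabled, else "vllm"."""
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY") and env_flag(env, "OPENAI_CHAT_ENABLED", True):
        return "openai"
    return "vllm"


def embed_backend(env=None):
    """Return "openai" if the key is set and embeddings are enabled, else "tei"."""
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY") and env_flag(env, "OPENAI_EMBED_ENABLED", True):
        return "openai"
    return "tei"
```

Under these assumptions, setting only OPENAI_API_KEY selects OpenAI for both roles, because both flags default to True.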
Some of these options can also be passed as parameters in the API call. When set, they override the equivalent environment variables:
from pydantic import BaseModel


class RetrievalRequest(BaseModel): ...


class RetrievalRequestArangoDB(RetrievalRequest):
    graph_name: str | None = None
    search_start: str | None = None  # "node", "edge", "chunk"
    search_mode: str | None = None  # "vector", "hybrid"
    num_centroids: int | None = None
    distance_strategy: str | None = None  # "COSINE", "EUCLIDEAN_DISTANCE"
    use_approx_search: bool | None = None
    enable_traversal: bool | None = None
    enable_summarizer: bool | None = None
    traversal_max_depth: int | None = None
    traversal_max_returned: int | None = None
    traversal_score_threshold: float | None = None
    traversal_query: str | None = None
See the comps/cores/proto/api_protocol.py file for more details on the API request and response models.
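The override rule described above (request parameter first, then environment variable, then default) can be sketched as follows. `resolve_option` is a hypothetical helper written for illustration; the actual resolution logic lives in the microservice and may differ in detail.

```python
import os


def resolve_option(request_value, env_name, default, env=None, cast=str):
    """Pick the request value if given, else the environment variable, else the default."""
    if request_value is not None:
        return request_value
    env = os.environ if env is None else env
    raw = env.get(env_name)
    return default if raw is None else cast(raw)


# The environment asks for hybrid search, but this request opts back into vector search.
env = {"ARANGO_SEARCH_MODE": "hybrid", "ARANGO_TRAVERSAL_MAX_DEPTH": "2"}
search_mode = resolve_option("vector", "ARANGO_SEARCH_MODE", "vector", env=env)
max_depth = resolve_option(None, "ARANGO_TRAVERSAL_MAX_DEPTH", 1, env=env, cast=int)
print(search_mode, max_depth)  # vector 2
```

In a request body, this corresponds to adding fields such as "search_mode" or "traversal_max_depth" next to "input" and "embedding"; fields left out (or null) fall back to the environment.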