Edge Craft Retrieval-Augmented Generation API Guide¶
Base URLs
EC-RAG Server:
http://${HOST_IP}:16010EC-RAG Mega Service:
http://${HOST_IP}:16011
Pipeline Management¶
Create a pipeline¶
curl -X POST http://${HOST_IP}:16010/v1/settings/pipelines \
-H "Content-Type: application/json" \
-d @tests/test_pipeline_local_llm.json | jq '.'
Update a pipeline¶
curl -X PATCH http://${HOST_IP}:16010/v1/settings/pipelines/{name} \
-H "Content-Type: application/json" \
-d @tests/test_pipeline_local_llm.json | jq '.'
Check all pipelines¶
curl -X GET http://${HOST_IP}:16010/v1/settings/pipelines \
-H "Content-Type: application/json" | jq '.'
Check a specific pipeline¶
curl -X GET http://${HOST_IP}:16010/v1/settings/pipelines/{name} \
-H "Content-Type: application/json" | jq '.'
Activate a pipeline¶
curl -X PATCH http://${HOST_IP}:16010/v1/settings/pipelines/{name} \
-H "Content-Type: application/json" \
-d '{"active": "true"}' | jq '.'
Remove a pipeline¶
# Firstly, deactivate the pipeline if the pipeline status is active
curl -X PATCH http://${HOST_IP}:16010/v1/settings/pipelines/{name} \
-H "Content-Type: application/json" \
-d '{"active": "false"}' | jq '.'
# Then delete the pipeline
curl -X DELETE http://${HOST_IP}:16010/v1/settings/pipelines/{name} \
-H "Content-Type: application/json" | jq '.'
Get pipeline JSON¶
curl -X GET http://${HOST_IP}:16010/v1/settings/pipelines/{name}/json \
-H "Content-Type: application/json" | jq '.'
Import pipeline from a JSON file¶
curl -X POST http://${HOST_IP}:16010/v1/settings/pipelines/import \
-H "Content-Type: multipart/form-data" \
-F "file=@your_pipeline.json" | jq '.'
Enable and check benchmark for pipelines¶
⚠️ NOTICE ⚠️¶
Benchmarking activities may significantly reduce system performance.
DO NOT perform benchmarking in a production environment.
# Set ENABLE_BENCHMARK as true before launching services
export ENABLE_BENCHMARK="true"
# Check the benchmark data for the active pipeline
curl -X GET http://${HOST_IP}:16010/v1/settings/pipeline/benchmark \
-H "Content-Type: application/json" | jq '.'
# Check the benchmark data for a specific pipeline
curl -X GET http://${HOST_IP}:16010/v1/settings/pipelines/{pipeline_name}/benchmarks \
-H "Content-Type: application/json" | jq '.'
Model Management¶
Load a model¶
curl -X POST http://${HOST_IP}:16010/v1/settings/models \
-H "Content-Type: application/json" \
-d '{"model_type": "reranker", "model_id": "BAAI/bge-reranker-large", "model_path": "./models/bge_ov_reranker", "device": "cpu", "weight": "INT4"}' | jq '.'
It will take some time to load the model.
Check all models¶
curl -X GET http://${HOST_IP}:16010/v1/settings/models \
-H "Content-Type: application/json" | jq '.'
Check a specific model¶
curl -X GET http://${HOST_IP}:16010/v1/settings/models/BAAI/bge-reranker-large \
-H "Content-Type: application/json" | jq '.'
Update a model¶
curl -X PATCH http://${HOST_IP}:16010/v1/settings/models/BAAI/bge-reranker-large \
-H "Content-Type: application/json" \
-d '{"model_type": "reranker", "model_id": "BAAI/bge-reranker-large", "model_path": "./models/bge_ov_reranker", "device": "gpu", "weight": "INT4"}' | jq '.'
Delete a model¶
curl -X DELETE http://${HOST_IP}:16010/v1/settings/models/BAAI/bge-reranker-large \
-H "Content-Type: application/json" | jq '.'
Get available model weights¶
Query the available compression weights (INT4 / INT8 / FP16) for a given model path:
curl -X GET "http://${HOST_IP}:16010/v1/settings/weight/BAAI/bge-reranker-large" \
-H "Content-Type: application/json" | jq '.'
Get available model IDs by type¶
Supported model_type values: LLM, vLLM, reranker, embedding, vLLM_embedding, kbadmin_embedding_model
# List available local LLM models
curl -X GET "http://${HOST_IP}:16010/v1/settings/avail-models/LLM" \
-H "Content-Type: application/json" | jq '.'
# List models served by a vLLM server (optional server_address parameter)
curl -X GET "http://${HOST_IP}:16010/v1/settings/avail-models/vLLM?server_address=http://localhost:8086" \
-H "Content-Type: application/json" | jq '.'
# List available embedding models
curl -X GET "http://${HOST_IP}:16010/v1/settings/avail-models/embedding" \
-H "Content-Type: application/json" | jq '.'
Get available models from a vLLM server¶
curl -X GET "http://${HOST_IP}:16010/v1/available_models?server_address=http://localhost:8086" \
-H "Content-Type: application/json" | jq '.'
Knowledge Base Management¶
Create a knowledge base¶
curl -X POST http://${HOST_IP}:16010/v1/knowledge \
-H "Content-Type: application/json" \
-d @tests/configs/test_kb.json | jq '.'
Check all knowledge bases¶
curl -X GET http://${HOST_IP}:16010/v1/knowledge \
-H "Content-Type: application/json" | jq '.'
Check a specific knowledge base¶
curl -X GET http://${HOST_IP}:16010/v1/knowledge/default_kb \
-H "Content-Type: application/json" | jq '.'
Get knowledge base JSON¶
curl -X GET http://${HOST_IP}:16010/v1/knowledge/default_kb/json \
-H "Content-Type: application/json" | jq '.'
Get knowledge base file map (paginated)¶
curl -X GET "http://${HOST_IP}:16010/v1/knowledge/default_kb/filemap?page_num=1&page_size=20" \
-H "Content-Type: application/json" | jq '.'
Update a knowledge base¶
curl -X PATCH http://${HOST_IP}:16010/v1/knowledge/patch \
-H "Content-Type: application/json" \
-d '{"name": "default_kb", "active": "True", "description": "Your knowledge base description"}' | jq '.'
Activate a knowledge base¶
curl -X PATCH http://${HOST_IP}:16010/v1/knowledge/patch \
-H "Content-Type: application/json" \
-d '{"name": "default_kb", "active": true}' | jq '.'
Remove a knowledge base¶
curl -X DELETE http://${HOST_IP}:16010/v1/knowledge/default_kb \
-H "Content-Type: application/json" | jq '.'
Add file to knowledge base¶
curl -X POST http://${HOST_IP}:16010/v1/knowledge/default_kb/files \
-H "Content-Type: application/json" \
-d '{"local_path": "/home/user/ui_cache/#REPLACE WITH YOUR FILE OR DIR PATH#"}' | jq '.'
Delete file from knowledge base¶
curl -X DELETE http://${HOST_IP}:16010/v1/knowledge/default_kb/files \
-H "Content-Type: application/json" \
-d '{"local_path": "/home/user/ui_cache/#REPLACE WITH YOUR FILE PATH#"}' | jq '.'
Experience Knowledge Base Management¶
Experience knowledge bases store curated Q&A pairs that can be retrieved to augment pipeline responses.
Get all experiences¶
curl -X GET http://${HOST_IP}:16010/v1/experiences \
-H "Content-Type: application/json" | jq '.'
Get experience by ID or question¶
curl -X POST http://${HOST_IP}:16010/v1/experience \
-H "Content-Type: application/json" \
-d '{"idx": "your-experience-id"}' | jq '.'
Update an experience¶
curl -X PATCH http://${HOST_IP}:16010/v1/experiences \
-H "Content-Type: application/json" \
-d '{"idx": "your-experience-id", "question": "Updated question?", "content": "Updated answer"}' | jq '.'
Delete an experience¶
curl -X DELETE http://${HOST_IP}:16010/v1/experiences \
-H "Content-Type: application/json" \
-d '{"idx": "your-experience-id"}' | jq '.'
Add experiences from a file¶
curl -X POST http://${HOST_IP}:16010/v1/experiences/files \
-H "Content-Type: application/json" \
-d '{"local_path": "/home/user/ui_cache/experiences.json"}' | jq '.'
Check and add multiple experiences (duplicate check first)¶
# Step 1: Check for duplicates; if none, experiences are added automatically
curl -X POST http://${HOST_IP}:16010/v1/multiple_experiences/check \
-H "Content-Type: application/json" \
-d '[{"question": "What is EC-RAG?", "content": "EdgeCraft RAG is ..."}]' | jq '.'
# Step 2 (only if duplicates detected): Confirm with overwrite flag
# flag=true → overwrite duplicates; flag=false → append
curl -X POST "http://${HOST_IP}:16010/v1/multiple_experiences/confirm?flag=true" \
-H "Content-Type: application/json" \
-d '[{"question": "What is EC-RAG?", "content": "EdgeCraft RAG is ..."}]' | jq '.'
Agent Management¶
Check all agents¶
curl -X GET http://${HOST_IP}:16010/v1/agents \
-H "Content-Type: application/json" | jq '.'
Check a specific agent¶
curl -X GET http://${HOST_IP}:16010/v1/agents/{name} \
-H "Content-Type: application/json" | jq '.'
Get default configs for an agent type¶
curl -X GET http://${HOST_IP}:16010/v1/agents/configs/{agent_type} \
-H "Content-Type: application/json" | jq '.'
Create an agent¶
curl -X POST http://${HOST_IP}:16010/v1/agents \
-H "Content-Type: application/json" \
-d '{"name": "my_agent", "type": "react_llm", "pipeline_idx": "your-pipeline-idx"}' | jq '.'
Update an agent¶
curl -X PATCH http://${HOST_IP}:16010/v1/agents/{name} \
-H "Content-Type: application/json" \
-d '{"name": "my_agent", "active": true}' | jq '.'
Delete an agent¶
curl -X DELETE http://${HOST_IP}:16010/v1/agents/{name} \
-H "Content-Type: application/json" | jq '.'
System Prompt Management¶
Get system prompt¶
curl -X GET http://${HOST_IP}:16010/v1/chatqna/prompt \
-H "Content-Type: application/json" | jq '.'
Get tagged system prompt¶
curl -X GET http://${HOST_IP}:16010/v1/chatqna/prompt/tagged \
-H "Content-Type: application/json" | jq '.'
Get default system prompt¶
curl -X GET http://${HOST_IP}:16010/v1/chatqna/prompt/default \
-H "Content-Type: application/json" | jq '.'
Update system prompt¶
curl -X POST http://${HOST_IP}:16010/v1/chatqna/prompt \
-H "Content-Type: application/json" \
-d '{"prompt": "This is a custom prompt template"}' | jq '.'
Upload system prompt from file¶
curl -X POST http://${HOST_IP}:16010/v1/chatqna/prompt-file \
-H "Content-Type: multipart/form-data" \
-F "file=@your_prompt_file.txt" | jq '.'
Reset system prompt to default¶
curl -X POST http://${HOST_IP}:16010/v1/chatqna/prompt/reset \
-H "Content-Type: application/json" | jq '.'
Data Management¶
Get all nodes (chunks) in the active knowledge base¶
curl -X GET http://${HOST_IP}:16010/v1/data/nodes \
-H "Content-Type: application/json" | jq '.'
Get nodes by document name¶
curl -X GET "http://${HOST_IP}:16010/v1/data/{document_name}/nodes" \
-H "Content-Type: application/json" | jq '.'
Get all document names in the active knowledge base¶
curl -X GET http://${HOST_IP}:16010/v1/data/documents \
-H "Content-Type: application/json" | jq '.'
Get all files¶
curl -X GET http://${HOST_IP}:16010/v1/data/files \
-H "Content-Type: application/json" | jq '.'
Get a specific file¶
curl -X GET http://${HOST_IP}:16010/v1/data/files/{name} \
-H "Content-Type: application/json" | jq '.'
Upload a file (from UI)¶
curl -X POST "http://${HOST_IP}:16010/v1/data/file/{file_name}" \
-H "Content-Type: multipart/form-data" \
-F "file=@/path/to/your/document.pdf" | jq '.'
Session Management¶
Get all sessions¶
curl -X GET http://${HOST_IP}:16010/v1/sessions \
-H "Content-Type: application/json" | jq '.'
Get a session by ID¶
curl -X GET http://${HOST_IP}:16010/v1/session/{session_id} \
-H "Content-Type: application/json" | jq '.'
System Information¶
Get system status¶
Returns CPU usage, memory usage, disk usage, OS info, and current time.
curl -X GET http://${HOST_IP}:16010/v1/system/info \
-H "Content-Type: application/json" | jq '.'
Get available inference devices¶
Returns OpenVINO-available devices (e.g., CPU, GPU, AUTO).
curl -X GET http://${HOST_IP}:16010/v1/system/device \
-H "Content-Type: application/json" | jq '.'
ChatQnA¶
Retrieval API¶
Retrieve relevant context chunks from the active knowledge base without running LLM generation.
curl -X POST http://${HOST_IP}:16010/v1/retrieval \
-H "Content-Type: application/json" \
-d '{"messages": "Your question here", "top_n": 5, "max_tokens": 512}' | jq '.'
ChatQnA API (Mega Service)¶
Send a question through the full RAG pipeline (retrieval + LLM generation).
curl -X POST http://${HOST_IP}:16011/v1/chatqna \
-H "Content-Type: application/json" \
-d '{"messages": "Your question here", "top_n": 5, "max_tokens": 512}' | jq '.'
RAGQnA API (with contexts in response)¶
Returns the LLM answer together with the retrieved context chunks.
# Non-streaming
curl -X POST http://${HOST_IP}:16010/v1/ragqna \
-H "Content-Type: application/json" \
-d '{"messages": "Your question here", "top_n": 5, "max_tokens": 512, "stream": false}' | jq '.'
# Streaming
curl -X POST http://${HOST_IP}:16010/v1/ragqna \
-H "Content-Type: application/json" \
-d '{"messages": "Your question here", "top_n": 5, "max_tokens": 512, "stream": true}'
Check vLLM server connection¶
curl -X POST http://${HOST_IP}:16010/v1/check/vllm \
-H "Content-Type: application/json" \
-d '{"server_address": "http://localhost:8086", "model_name": "Qwen/Qwen3-8B"}' | jq '.'