OPEA 2024 - 2025 Roadmap¶
May 2024¶
Contribution¶
Components
ASR
Data Prep
Embedding
Guardrails
LLM (Gaudi TGI)
Rerank
Retrieval
TTS
VectorDB
Use Cases/Examples
ChatQnA
CodeGen
CodeTrans
Cloud Native
OneClick OPEA on ChatQnA
OneClick OPEA on CodeGen
GenAI microservice connector
Evaluation & Others
CICD & Validation
Eval: E2E (GenAIComps & GenAIExamples), lm-eval-harness, bigcode-eval-harness
RAGAS evaluation service
AI Models¶
LLM: Llama 2 (7B, 13B, 70B), Llama 3 (8B, 70B), Code Llama, Llama Guard
Embedding: BGE-base
AI Tools Integration¶
VectorDB: Chroma
Framework: LangChain
Deployment Type¶
On Prem, IDC (Xeon, Gaudi)
June 2024¶
Contribution¶
Components
LLM (Xeon vLLM & Ray, Ollama)
OVMS
Prompting
User feedback management
Mega Component (MI6 RAG service)
Use Cases/Examples
DocSum
SearchQnA
Cloud Native
OneClick OPEA for 2 more examples
GMC with switch support (dynamic pipelines)
Helm charts/templates for custom yamls (refactoring)
Evaluation & Others
CICD & Validation
Eval: E2E (GenAIComps & GenAIExamples) on Gaudi (2) and CPUs in CICD cluster
AI Models¶
LLM: mistral-7B, mixtral-8x7B
Embedding: E5-mistral-7b-instruct, all-mpnet-base-v2
AI Tools Integration¶
VectorDB: Pinecone, Redis
Framework: LlamaIndex, Haystack
Deployment Type¶
On Prem, IDC (Xeon, Gaudi)
July 2024¶
Contribution¶
Components
LVM (Gaudi vLLM & Ray)
VectorDB (SVS)
Gateway guardrail, AuthZ/AuthN
Use Cases/Examples
FAQGen
Cloud Native
OpenShift enablement for OPEA
OneClick OPEA for 3 more examples
Security (Service Mesh, guardrails)
Evaluation & Others
CICD & Validation
Eval: E2E (GenAIComps & GenAIExamples)
AI Models¶
LLM: Phi, Gemma
Embedding: all-MiniLM-L6-v2, paraphrase-albert-small-v2
AI Tools Integration¶
VectorDB: PGVector, Qdrant
Deployment Type¶
Aug 2024¶
Contribution¶
Components
Documentation
Test automation script
Telemetry
Use Cases/Examples
Documentation
Test automation script
Cloud Native
Demo K8s resource management
Documentation on autoscaler analysis
Evaluation & Others
CICD & Validation
Eval: E2E (GenAIComps & GenAIExamples)
AI Models¶
Vision: llava
Mixtral-8x22B
AI Tools Integration¶
VectorDB: Milvus
Deployment Type¶
Public Cloud AWS (Xeon CPU & NVIDIA GPU)
Sep 2024¶
Contribution¶
Components
Microservice for Image and Video
Use Cases/Examples
Text to Image generation
Image to Video generation
Playground (composable and configurable)
Cloud Native
Evaluation & Others
CICD & Validation
Eval: E2E (GenAIComps & GenAIExamples)
AI Models¶
Diffusion model:
Stable Diffusion XL
Stable Diffusion 3M
Stable Video Diffusion
AI Tools Integration¶
VectorDB: Weaviate
Deployment Type¶
Q4 2024¶
Contribution¶
Components
Fine-tuning E2E pipeline
Knowledge Graph
Use Cases/Examples
Fine-tuning (LoRA)
AI Agent (single agent with text and audio as user interface)
Closed source LLM
GraphRAG
Cloud Native
Static tuning of resource management for deployment
Evaluation & Others
CICD & Validation
Eval: E2E (GenAIComps & GenAIExamples)
AI Models¶
LLM open: Grok 1
LLM closed: GPT-3.5/4/4o, Claude 3/3.5
AWS Bedrock endpoint
AI Tools Integration¶
Knowledge graph: Neo4j
Agent: LangGraph
Deployment Type¶
Public Cloud (Azure, GCP, Oracle, AWS)
AI PC (Intel)
OPEA 2025 Roadmap¶
Release Cadence¶
Release cycles extended from 2 months to 3 months (TSC approved)
Upcoming versions:
| Version | Release Date | Key Features |
|---|---|---|
| v1.6 | Jan 2026 | Domain-specific AI Agent Blueprints with Partners, Leading open source LLM, Image and Video Diffusion models |
| v1.5 | Oct 2025 | RouteLLM, Finetuning (Advanced), Next Agent Example |
| v1.4 | Jul 2025 | Agent (human in the loop), Finance Agent Advanced, GraphRAG (ArangoDB), AI Resource Optimizer |
| v1.3 | Apr 2025 | Agent (multi-turn message), Advanced AgentQnA, Finance Agent Basic, DocSum (performance, accuracy and stability) |
| v1.2 | Jan 2025 | vLLM Arc GPU via OpenVINO, LangChain Integration, LlamaIndex Integration, Eval Benchmark for ChatQnA |
Q1 2025 (v1.2 release)¶
Contribution¶
GenAI Component
vLLM Arc GPU via OpenVINO
OpenSearch vector DB
Elasticsearch
POC for Model Context Protocol
GenAI Examples
LangChain Integration
LlamaIndex Integration
GenAI Infra
OPEA on k8s guide
HPA support in GMC
Istio mTLS integration
GenAIEval
Eval benchmark for China ecosystem
K8s conformance test
Long context benchmark enhancement
ChatQnA Benchmark (performance, accuracy and stability)
AI Models¶
BAAI/bge-base-zh-v1.5
AWS Bedrock endpoint
Q2 2025 (v1.3 release)¶
Contribution¶
GenAI Component
Agent (multi-turn message)
GenAI Examples
Advanced AgentQnA
Finance Agent Basic
vLLM enablement for GenAI examples
Haystack Integration
GenAI Infra
Enhance existing Helm charts (8 GenAI Examples)
OIM basic (container structure)
HPA Scaling for IaaS
GenAIEval
Initial DocSum Benchmark Support (performance, accuracy and stability)
Long context benchmark enhancement (vLLM-Gaudi)
AI Models¶
DeepSeek V3, R1, 6 distilled LLMs
Mistral (Large, Small)
Granite (IBM)
Q3 2025 (v1.4 release)¶
Contribution¶
GenAI Component
Agent (human in the loop)
GenAI Examples
Finance Agent Advanced
GraphRAG (ArangoDB)
Finetuning (Basic)
Model Context Protocol
GenAI Infra
OIM enhancement
AI Resource Optimizer
GenAIEval
AI Agent (performance, accuracy and stability)
AI Models¶
DeepSeek upgrades
Llama 4
Falcon LLM
Falcon LVM
Finetuned Financial model
Q4 2025 (v1.5 release)¶
Contribution¶
GenAI Component
RouteLLM
GenAI Examples
Finetuning (Advanced)
Next Agent Example
GenAI Infra
OIM advanced
GenAIEval
Performance, accuracy and stability benchmarks for more GenAI Examples (continuous)
AI Models¶
Next advanced LLMs
Q1 2026 (v1.6 release)¶
Contribution¶
GenAI Examples
Domain-specific AI Agent Blueprint backed by customers/partners
Leading open source LLM (reasoning model, FM)
Image, Video Diffusion model