OPEA 2024 - 2025 Roadmap

May 2024

Contribution

  • Components

    • ASR

    • Data Prep

    • Embedding

    • Guardrails

    • LLM (Gaudi TGI)

    • Rerank

    • Retrieval

    • TTS

    • VectorDB

  • Use Cases/Examples

    • ChatQnA

    • CodeGen

    • CodeTrans

  • Cloud Native

    • OneClick OPEA on ChatQnA

    • OneClick OPEA on CodeGen

    • GenAI microservice connector

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples), lm-eval-harness, bigcode-eval-harness

    • RAGAS evaluation service

AI Models

  • LLM: llama2 (7b, 13b, 70b), llama3 (8b, 70b), code-llama, Llama guard

  • Embedding: BGE-base

AI Tools Integration

  • VectorDB: Chroma

  • Framework: Langchain

Deployment Type

  • On Prem,IDC (Xeon, Gaudi)

June 2024

Contribution

  • Components

    • LLM (Xeon vLLM & Ray, Ollama)

    • OVMS

    • prompting

    • user feedback management

    • Mega Component (MI6 RAG service)

  • Use Cases/Examples

    • DocSum

    • SearchQnA

  • Cloud Native

    • OneClick OPEA for 2 more examples

    • GMC with switch support (dynamic pipelines)

    • Helm charts/templates for custom yamls (refactoring)

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples) Gaudi (2) and CPUs in CICD cluster

AI Models

  • LLM: mistral-7B, mixtral-8x7B

  • Embedding: E5-mistral-7b-instruct, all-mpnet-base-v2

AI Tools Integration

  • VectorDB: Pinecone, Redis

  • Framework: Llamaindex, Haystack

Deployment Type

  • On Prem,IDC (Xeon, Gaudi)

July 2024

Contribution

  • Components

    • LVM (Gaudi vLLM & Ray)

    • vectordb (svs)

    • Gateway guardrail, Auth Z/N

  • Use Cases/Examples

    • FAQGen

  • Cloud Native

    • OpenShift enablement for OPEA

    • OneClick OPEA for 3 more examples

    • Security (Service Mesh, guardrails)

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples)

AI Models

  • LLM: Phi, Gemma

  • Embedding: all-MiniLM-L6-v2, paraphrase-albert-small-v2

AI Tools Integration

  • VectorDB: PGVector, Qdrant

Deployment Type

Aug 2024

Contribution

  • Components

    • Documentation

    • Test automation script

    • Telemetry

  • Use Cases/Examples

    • Documentation

    • Test automation script

  • Cloud Native

    • Demo K8s resource management

    • Documentation on autoscaler analysis

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples)

AI Models

  • Vision: llava

  • Mixtral-8x22B

AI Tools Integration

  • VectorDB: Milvus

Deployment Type

  • Public Cloud AWS (Xeon CPU & NV GPU)

Sep 2024

Contribution

  • Components

    • Microservice for Image and Video

  • Use Cases/Examples

    • Text to Image generation

    • Image to Video generation

    • Playground (composable and configurable)

  • Cloud Native

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples)

AI Models

  • Diffusion model:

    • Stable Diffusion XL

    • Stable Diffusion 3M

    • Stable Video Diffusion

AI Tools Integration

  • VectorDB: Weaviate

Deployment Type

Q4 2024

Contribution

  • Components

    • Fine-tuning E2E pipeline

    • Knowledge Graph

  • Use Cases/Examples

    • Fine-tuning (Lora)

    • AI Agent (single Agent with text and Audio as user interface)

    • Closed source LLM

    • GraphRAG

  • Cloud Native

    • Static tuning on Resource management for deployment

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples)

AI Models

  • LLM open: Grok 1

  • LLM Close: GPT3.5/4/4o, Claude 3/3.5

  • AWS Bedrock endpoint

AI Tools Integration

  • Knowledge graph: Neo4j

  • Agent: LangGraph

Deployment Type

  • Public Cloud (Azure, GCP, Oracle, AWS)

  • AI PC (Intel)

OPEA 2025 Roadmap

Release Cadence

  • Release cycles extended from 2 months to 3 months (TSC approved)

  • Upcoming versions:

Version

Release Date

Key Features

v1.6

Jan 2026

Domain-specific AI Agent Blueprints with Partners, Leading open source LLM, Image and Video Diffusion models

v1.5

Oct 2025

RouteLLM, Finetuning (Advanced), Next Agent Example

v1.4

Jul 2025

Agent (human in the loop), Finance Agent Advanced, GraphRAG (Arango DB), AI Resource Optimizer

v1.3

Apr 2025

Agent (multi-turn message), Advanced AgentQnA, Finance Agent Basic, DocSum (Performance, accuracy and stability)

v1.2

Jan 2025

vLLM Arc GPU via OpenVINO, Langchain Integration, Llamaindex Integration, Eval Benchmark for ChatQnA

Q1 2025 (v1.2 release)

Contribution

  • GenAI Component

    • vLLM Arc GPU via OpenVINO

    • Opensearch vector DB

    • Elastic search

    • POC for Model context protocol

  • GenAI Examples

    • Langchain Integration

    • Llamaindex Integration

  • GenAI Infra

    • OPEA on k8s guide

    • HPA support in GMC

    • Istio m/TLS integration

  • GenAIEval

    • Eval benchmark for China ecosystem

    • K8s conformance test

    • Long context benchmark enhancement

    • ChatQnA Benchmark (performance, accuracy and stability)

AI Models

  • BAAI/bge-base-zh-v1.5

  • AWS bedrock endpoint

Q2 2025 (v1.3 release)

Contribution

  • GenAI Component

    • Agent (multi-turn message)

  • GenAI Examples

    • Advanced AgentQnA

    • Finance Agent Basic

    • vLLM enablement for GenAI examples

    • *Haystack Integration

  • GenAI Infra

    • Enhance existing HELM charts (8 GenAI Examples)

    • OIM basic (container structure)

    • HPA Scaling for IAAS

  • GenAIEval

    • Initial DocuSum Benchmark Support (Performance, accuracy and stability)

    • Long context benchmark enhancement (vLLM-Gaudi)

AI Models

  • Deepseek v3, R1, 6 distilled LLM

  • Mistral (Large, Small)

  • Granite (IBM)

Q3 2025 (v1.4 release)

Contribution

  • GenAI Component

    • Agent (human in the loop)

  • GenAI Examples

    • Finance Agent Advanced

    • GraphRAG (Arango DB)

    • Finetuning (Basic)

    • Model Context Protocol

  • GenAI Infra

    • OIM enhancement

    • AI Resource Optimizer

  • GenAIEval

    • AI Agent (performance, accuracy and stability)

AI Models

  • Deepseek upgrades

  • Llama4

  • Falcon LLM

  • Falcon LVM

  • Finetuned Financial model

Q4 2025 (v1.5 release)

Contribution

  • GenAI Component

    • RouteLLM

  • GenAI Examples

    • Finetuning (Advanced)

    • Next Agent Example

  • GenAI Infra

    • OIM advanced

  • GenAIEval

    • More GenAI Example performance, accuracy and stability (continuous)

AI Models

  • Next advanced LLMs

Q1 2026 (v1.6 release)

Contribution

  • GenAI Examples

    • Domain specific AI Agent Blueprint backed by customers/partners

    • Leading open source LLM (reasoning model, FM)

    • Image, Video Diffusion model