OPEA 2024 - 2025 Roadmap

May 2024

Contribution

  • Components

    • ASR

    • Data Prep

    • Embedding

    • Guardrails

    • LLM (Gaudi TGI)

    • Rerank

    • Retrieval

    • TTS

    • VectorDB

  • Use Cases/Examples

    • ChatQnA

    • CodeGen

    • CodeTrans

  • Cloud Native

    • OneClick OPEA on ChatQnA

    • OneClick OPEA on CodeGen

    • GenAI microservice connector

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples), lm-eval-harness, bigcode-eval-harness

    • RAGAS evaluation service

AI Models

  • LLM: llama2 (7b, 13b, 70b), llama3 (8b, 70b), code-llama, Llama guard

  • Embedding: BGE-base

AI Tools Integration

  • VectorDB: Chroma

  • Framework: Langchain

Deployment Type

  • On Prem,IDC (Xeon, Gaudi)

June 2024

Contribution

  • Components

    • LLM (Xeon vLLM & Ray, Ollama)

    • OVMS

    • prompting

    • user feedback management

    • Mega Component (MI6 RAG service)

  • Use Cases/Examples

    • DocSum

    • SearchQnA

  • Cloud Native

    • OneClick OPEA for 2 more examples

    • GMC with switch support (dynamic pipelines)

    • Helm charts/templates for custom yamls (refactoring)

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples) Gaudi (2) and CPUs in CICD cluster

AI Models

  • LLM: mistral-7B, mixtral-8x7B

  • Embedding: E5-mistral-7b-instruct, all-mpnet-base-v2

AI Tools Integration

  • VectorDB: Pinecone, Redis

  • Framework: Llamaindex, Haystack

Deployment Type

  • On Prem,IDC (Xeon, Gaudi)

July 2024

Contribution

  • Components

    • LVM (Gaudi vLLM & Ray)

    • vectordb (svs)

    • Gateway guardrail, Auth Z/N

  • Use Cases/Examples

    • FAQGen

  • Cloud Native

    • OpenShift enablement for OPEA

    • OneClick OPEA for 3 more examples

    • Security (Service Mesh, guardrails)

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples)

AI Models

  • LLM: Phi, Gemma

  • Embedding: all-MiniLM-L6-v2, paraphrase-albert-small-v2

AI Tools Integration

  • VectorDB: PGVector, Qdrant

Deployment Type

Aug 2024

Contribution

  • Components

    • Documentation

    • Test automation script

    • Telemetry

  • Use Cases/Examples

    • Documentation

    • Test automation script

  • Cloud Native

    • Demo K8s resource management

    • Documentation on autoscaler analysis

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples)

AI Models

  • Vision: llava

  • Mixtral-8x22B

AI Tools Integration

  • VectorDB: Milvus

Deployment Type

  • Public Cloud AWS (Xeon CPU & NV GPU)

Sep 2024

Contribution

  • Components

    • Microservice for Image and Video

  • Use Cases/Examples

    • Text to Image generation

    • Image to Video generation

    • Playground (composable and configurable)

  • Cloud Native

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples)

AI Models

  • Diffusion model:

    • Stable Diffusion XL

    • Stable Diffusion 3M

    • Stable Video Diffusion

AI Tools Integration

  • VectorDB: Weaviate

Deployment Type

Q4 2024

Contribution

  • Components

    • Fine-tuning E2E pipeline

    • Knowledge Graph

  • Use Cases/Examples

    • Fine-tuning (Lora)

    • AI Agent (single Agent with text and Audio as user interface)

    • Closed source LLM

    • GraphRAG

  • Cloud Native

    • Static tuning on Resource management for deployment

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples)

AI Models

  • LLM open: Grok 1

  • LLM Close: GPT3.5/4/4o, Claude 3/3.5

  • AWS Bedrock endpoint

AI Tools Integration

  • Knowledge graph: Neo4j

  • Agent: LangGraph

Deployment Type

  • Public Cloud (Azure, GCP, Oracle, AWS)

  • AI PC (Intel)

Q1 2025

Contribution

  • Components

    • more Microservice request from community

    • Confidential Container

  • Use Cases/Examples

    • AI Agent (Multi Agent)

    • Fine-tuning (Adpative)

    • Long context window (>1M)

    • GenAI Studio

  • Cloud Native

    • Dynamic tuning on Resource management through K8s

  • Evaluation & Others

    • CICD & Validation

    • Eval: E2E (GenAIComps & GenAIExamples)

AI Models

  • LLM: SetFit

  • More to be defined

AI Tools Integration

  • AutoGen, CrewAI

Deployment Type

  • Public Cloud (tier2 CSP)

  • AI PC (others)