OPEA Release Notes v1.4

We are excited to announce the release of OPEA version 1.4, which includes significant contributions from the open-source community. This release incorporates over 330 pull requests.

More information about how to get started with OPEA v1.4 can be found on the Getting Started page. All project source code is maintained in the opea-project organization. Docker images can be pulled from Docker Hub, and instructions for deploying Helm charts are available in the Helm Charts guide.


What’s New in OPEA v1.4

This release includes new features, optimizations, and user-focused updates.

Advanced Agent Capabilities

Components as MCP Servers

OPEA components can now serve as Model Context Protocol (MCP) servers, allowing external MCP-compatible frameworks and applications to integrate with OPEA seamlessly. (GenAIComps#1652)
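
As an illustration, the minimal client sketch below connects to an OPEA component running as an MCP server. It uses the official `mcp` Python SDK; the SSE endpoint URL and the `embed` tool name are hypothetical placeholders, not part of the release.

```python
# Minimal sketch: list and call tools on an OPEA component exposed as an MCP
# server. The URL and the tool name are hypothetical placeholders.
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client

async def main():
    # Assumption: the component serves MCP over SSE at this address.
    async with sse_client("http://localhost:6000/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("available tools:", [t.name for t in tools.tools])
            # Call a hypothetical tool published by the component.
            result = await session.call_tool("embed", {"text": "What is OPEA?"})
            print(result.content)

asyncio.run(main())
```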

KubeAI Operator for OPEA

The KubeAI Operator now features an improved autoscaler, monitoring support, optimized resource placement via NRI plugins, and expanded support for new models on Gaudi. (GenAIInfra#, GenAIInfra#, GenAIInfra#, GenAIInfra#, GenAIInfra#, GenAIInfra#, GenAIInfra#)

New GenAI Capabilities

  • Fine-Tuning of Reasoning Models: This feature is compatible with the dataset format used in FreedomIntelligence/medical-o1-reasoning-SFT, enabling you to customize models with your own data. (GenAIComps#)

  • HybridRAG: Combined GraphRAG (knowledge graph-based retrieval) and VectorRAG (vector database retrieval) for enhanced accuracy and contextual relevance. (GenAIExamples#1968)

  • LLM Router: Routes each incoming prompt to the downstream LLM serving endpoint best suited to handle it; a minimal sketch of the routing idea follows this list. (GenAIComps#1716)

  • OPEA Store: Redis and MongoDB have been integrated into OPEA Store. (GenAIComps#1816, GenAIComps#)

  • Guardrails: Added Input/Output Guardrails to enforce content safety and prevent the creation of inappropriate outputs. (GenAIComps#1798)

  • Language Detection: This microservice ensures that the pipeline’s response is in the same language as the query. (GenAIComps#1774)

  • Prompt Template: This microservice dynamically generates system and user prompts based on structured inputs and document context. (GenAIComps#1826)

  • Air-gapped Environment Support: Some OPEA microservices can now be deployed in an air-gapped Docker environment. (GenAIComps#1480)

  • Remote Inference Endpoints Support: OPEA examples can now use remote inference endpoints. (GenAIExamples#1973)
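
To make the LLM Router idea concrete, here is a minimal, self-contained sketch. It is not the OPEA implementation: the keyword heuristic, endpoint URLs, and model name are illustrative assumptions, and the real router (GenAIComps#1716) uses far more signal than keywords.

```python
# Sketch of prompt routing across two hypothetical OpenAI-compatible endpoints.
import requests

ENDPOINTS = {
    "code": "http://localhost:8001/v1/chat/completions",     # code-tuned model
    "general": "http://localhost:8002/v1/chat/completions",  # general model
}

def route(prompt: str) -> str:
    """Trivial stand-in for a router model: keyword-based classification."""
    code_markers = ("def ", "class ", "import ", "traceback")
    return "code" if any(m in prompt.lower() for m in code_markers) else "general"

def ask(prompt: str) -> str:
    resp = requests.post(
        ENDPOINTS[route(prompt)],
        json={
            "model": "served-model",  # placeholder model name
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("Fix this Python traceback: ..."))
```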

Better User Experience

  • One-click Deployment: You can now deploy 8 OPEA examples with one click, and ChatQnA can also be deployed in an air-gapped Docker environment. (GenAIExamples#1727)

  • GenAIStudio: Added support for drag-and-drop creation of document summarization and code generation applications. (GenAIStudio#61)

  • Documentation Refinement: Refined READMEs for key examples and components to help readers easily locate documentation tailored to deployment, customization, and hardware. (GenAIExamples#1673, GenAIComps#)

Newly Supported Models

OPEA introduces support for the following models in this release.

| Model | TGI-Gaudi | vLLM-CPU | vLLM-Gaudi | vLLM-ROCm | OVMS | Optimum-Habana | PredictionGuard | SGLANG-CPU |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | - | - | ✓ | - | - | - | - | - |
| meta-llama/Llama-4-Maverick-17B-128E-Instruct | - | - | ✓ | - | - | - | - | - |

(✓: supported; -: not validated; x: unsupported)

Newly Supported Hardware

  • AMD® EPYC™ processors

Newly Supported OS

  • openEuler

Updated Dependencies

| Dependency | Hardware | Scope | Version | Version in OPEA v1.3 | Comments |
| --- | --- | --- | --- | --- | --- |
| huggingface/text-embeddings-inference | all | all supported examples | cpu-1.7 | cpu-1.6 | |
| vllm | Xeon | all supported examples except EdgeCraftRAG | v0.10.0 | v0.8.3 | |

Changes to Default Behavior

  • CodeTrans: The default model changed from mistralai/Mistral-7B-Instruct-v0.3 to Qwen/Qwen2.5-Coder-7B-Instruct on Xeon and Gaudi.
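
If you depend on a specific model, pin it explicitly rather than relying on the default. As a hedged sketch (assuming the underlying vLLM serving endpoint of your deployment is reachable at localhost:8000), you can confirm which model is actually being served through the OpenAI-compatible model listing:

```python
# Query the OpenAI-compatible /v1/models endpoint exposed by vLLM to confirm
# the served model. Host and port are assumptions about your deployment.
import requests

resp = requests.get("http://localhost:8000/v1/models", timeout=10)
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])  # e.g. Qwen/Qwen2.5-Coder-7B-Instruct for CodeTrans
```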

Validated Hardware

  • Intel® Gaudi® AI Accelerators (2nd Gen)

  • Intel® Xeon® Scalable processors (3rd Gen)

  • Intel® Arc™ Graphics GPU (A770)

  • AMD® EPYC™ processors (4th and 5th Gen)

Validated Software

  • Docker version 28.3.3

  • Docker Compose version v2.39.1

  • Intel® Gaudi® software and drivers v1.21

  • Kubernetes v1.32.7

  • TEI v1.7

  • TGI v2.4.0 (Xeon, EPYC), v2.3.1 (Gaudi), v2.4.1 (ROCm)

  • Torch v2.5.1

  • Ubuntu 22.04

  • vLLM v0.10.0 (Xeon, EPYC), v0.6.6.post1+Gaudi-1.20.0 (Gaudi)

Known Issues

Full Changelogs

Contributors

This release would not have been possible without the contributions of the following organizations and individuals.

Contributing Organizations

  • AMD: AMD EPYC support.

  • Bud: Components as MCP Servers.

  • Intel: Development and improvements to GenAI examples, components, infrastructure, evaluation, and studio.

  • MariaDB: Added ChatQnA docker-compose example on Intel Xeon using MariaDB Vector.

  • openEuler: openEuler OS support.

Individual Contributors

For a comprehensive list of individual contributors, please refer to the Full Changelogs section.